
ChatGPT makes materials research much more efficient

April 20, 2023 By Jason Daley

This image was generated using Stable Diffusion, a text-to-image generator, using the prompt “researchers working with huge piles of data.” UW–Madison’s Dane Morgan and Maciej Polak have published their solution for training ChatGPT to read academic articles, tabulate key data and check the results for accuracy, thereby saving valuable research time.

The artificial intelligence developer OpenAI promises to reshape the way people work and learn with its new chatbot called ChatGPT. At the University of Wisconsin–Madison, in fact, the large language model is already aiding materials engineers, who are harnessing its power to quickly and cost-effectively extract information from scientific literature.

For several years, Dane Morgan, a professor of materials science and engineering at UW–Madison, has used machine learning, a type of data-based AI, in his lab to evaluate and search for new types of materials with great success. Maciej Polak, a staff scientist who works closely with Morgan, brainstormed other tasks AI might help with.

“AI can increasingly help with tasks that are quite complex and time consuming,” says Polak. “And we thought, ‘What is something materials scientists do very often that we wish we had more time for?’ One key thing is reading papers to get data.”

Polak says materials scientists often download and then comb through long research papers to search for one small group of numbers to add to their data sets.

“We thought we could just offload all of these time-consuming tasks onto an AI that could read those papers for us and give us that information,” says Polak.

Asking chatbots, even powerful ones like ChatGPT, to simply find and extract data from the full text of a paper remains beyond their capabilities. So Polak refined the approach: he asked the bots to review a paper sentence by sentence and decide whether each sentence contained relevant data, a task that boiled papers down to one or two key sentences. He then asked the bots to present the information in table form, at which point a human researcher could review the table and the source sentences to confirm they were correct and relevant. The technique yielded an accuracy rate of about 90%, allowing the researchers to extract data from a set of papers and build a database of critical cooling rates for metallic glasses.
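The filtering step described above can be pictured in a short Python sketch. The prompt wording and the `ask_llm` helper are illustrative assumptions rather than the authors' actual code, and the model call is stubbed with a crude keyword heuristic so the example runs without an API key; in practice `ask_llm` would call a ChatGPT-style chat API.

```python
# Sketch of the sentence-by-sentence filtering step: each sentence is sent
# to the model with a yes/no question, and only sentences judged to contain
# the target data are kept for tabulation.

def ask_llm(prompt: str) -> str:
    """Placeholder for a ChatGPT-style API call (an assumption, not the
    real API). The stub answers with a simple keyword heuristic."""
    sentence = prompt.split("Sentence:", 1)[1].strip()
    keywords = ("cooling rate", "k/s", "measured")
    return "yes" if any(k in sentence.lower() for k in keywords) else "no"

def relevant_sentences(paper_text: str) -> list[str]:
    """Keep only the sentences the model judges to report target data."""
    sentences = [s.strip() for s in paper_text.split(". ") if s.strip()]
    kept = []
    for s in sentences:
        prompt = (
            "Does the following sentence report a critical cooling rate "
            "for a metallic glass? Answer yes or no.\nSentence: " + s
        )
        if ask_llm(prompt).lower().startswith("yes"):
            kept.append(s)
    return kept

paper = (
    "Metallic glasses are amorphous alloys. "
    "The measured critical cooling rate of Zr41Ti14Cu12Ni10Be23 was 1.4 K/s. "
    "The samples were prepared by arc melting"
)
print(relevant_sentences(paper))
```

A real run would replace the heuristic with an actual chat-completion request; the surrounding loop and yes/no prompt structure are what boil a full paper down to its one or two data-bearing sentences.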

In February 2023, Polak, Morgan and colleagues posted a paper about the technique on the arXiv preprint server.

While the technique reduced the researchers' paper-reading workload by about 99%, Polak wanted to improve it further.

“I was the person still doing that last, manual step — checking the accuracy of the tables,” he says. “So, I wanted to find a way to fully automate this process.”

To get to that point, the team engaged in prompt engineering: figuring out the exact questions, and the sequence in which to ask them, that would lead the bot to extract and then double-check the information they wanted. They applied their initial approach to extract the data table, then asked the bot a series of follow-up questions that raised the possibility that the data set was wrong. That forced the AI to double back, recheck the data and flag mistakes. In the vast majority of cases, the AI identified the faulty information.
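One way to picture this self-checking pass is a second round of prompts that challenges each entry in the model's own table. The conversation format and the `verify_table` helper below are illustrative assumptions, not the authors' published code, and the model reply is again stubbed so the sketch runs offline; here the stub "rechecks" by testing whether the claimed value actually appears in the quoted source sentence.

```python
# Sketch of the follow-up questioning step: each extracted table entry is
# challenged with a question that presumes it might be wrong, forcing the
# model to re-check the value against the source sentence and flag mistakes.

def ask_llm(messages: list[dict]) -> str:
    """Placeholder for a chat-model call (an assumption). The stub confirms
    an entry only if the claimed value appears in the quoted sentence."""
    last = messages[-1]["content"]
    value = last.split("value ", 1)[1].split(" ")[0]
    sentence = last.split("sentence: ", 1)[1]
    return "confirmed" if value in sentence else "mistake"

def verify_table(table: list[dict]) -> list[dict]:
    """Ask the model to re-check each row; keep only confirmed entries."""
    checked = []
    for row in table:
        messages = [
            {"role": "system", "content": "You verify extracted materials data."},
            {"role": "user", "content": (
                f"Are you sure the value {row['rate']} is correct? "
                f"Re-read the source sentence: {row['sentence']}"
            )},
        ]
        row["flag"] = ask_llm(messages)
        if row["flag"] == "confirmed":
            checked.append(row)
    return checked

table = [
    {"material": "Vitreloy 1", "rate": "1.4 K/s",
     "sentence": "The critical cooling rate of Vitreloy 1 is 1.4 K/s."},
    {"material": "Pd40Ni40P20", "rate": "90 K/s",
     "sentence": "Pd40Ni40P20 has a critical cooling rate of 0.09 K/s."},
]
good = verify_table(table)
```

The second row is deliberately wrong: the extracted rate does not match its source sentence, so the check flags it as a mistake rather than passing faulty data downstream, mirroring Polak's point that the model may not fix an error but can at least admit it.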

“That’s the most important thing; it can admit it made a mistake,” says Polak. “Maybe it doesn’t know how to fix it, but at least we’re not getting factually incorrect information.”

The team released a separate paper on this iteration of the technique on arXiv in March 2023.

Morgan says this type of prompt engineering with ChatGPT and other large language models feels unusual at first.

“This isn’t programming in the traditional sense; the method of interacting with these bots is through language,” Morgan says. “Asking the program to extract data and then asking it to check if it is sure with normal sentences feels closer to how I train my children to get correct answers than how I usually train computers. It’s such a different way to ask a computer to do things. It really changes how you think about what your computer can do.”

Importantly, the technique doesn’t require a lot of effort or deep knowledge, according to Polak.

“Previously, people had to write hundreds of lines of code to do something like this, and the results often weren’t great,” Polak says. “Now we have this huge improvement in capabilities with tools like ChatGPT.”

Morgan is quick to note that integrating AI into research does not replace graduate students and scientists. Instead, these tools could allow researchers to pursue projects they previously didn’t have the time, money or people-power to undertake.

“I think these tools will change the way we do research, analogously to how Google changed the way we did research,” Morgan says. “Today we typically explore a field by using Google and other search tools to help us find papers and related resources, and then we read those papers and resources to extract information and data. Now you can go to one of these large language models to collect information around a topic and, using techniques like those we’ve been developing, build a database for review within hours.”

Morgan is the Harvey D. Spangler Professor in materials science and engineering. Maciej Polak is a research scientist in materials science and engineering.
