UM Research: AI tests into top 1% for creative thinking
Gary Shimek
(UM News Service) New research from the University of Montana and its partners suggests artificial intelligence can match the top 1% of human thinkers on a standard test for creativity.
The study was directed by Dr. Erik Guzik, an assistant clinical professor in UM’s College of Business. He and his partners used the Torrance Tests of Creative Thinking, a well-known tool used for decades to assess human creativity.
The researchers submitted eight responses generated by ChatGPT, the application powered by the GPT-4 artificial intelligence engine. They also submitted answers from a control group of 24 UM students taking Guzik’s entrepreneurship and personal finance classes. These scores were compared with those of 2,700 college students nationally who took the TTCT in 2016. All submissions were scored by Scholastic Testing Service, which didn’t know AI was involved.
The results placed ChatGPT in elite company for creativity. The AI application was in the top percentile for fluency – the ability to generate a large volume of ideas – and for originality – the ability to come up with new ideas. The AI slipped a bit – to the 97th percentile – for flexibility, the ability to generate different types and categories of ideas.
“For ChatGPT and GPT-4, we showed for the first time that it performs in the top 1% for originality,” Guzik said. “That was new.”
He was gratified to note that some of his UM students also performed in the top 1%. However, ChatGPT outperformed the vast majority of college students nationally.
Guzik tested the AI and his students during spring semester. He was assisted in the work by Christian Gilde of UM Western and Christian Byrge of Vilnius University. The researchers presented their work in May at the Southern Oregon University Creativity Conference.
“We were very careful at the conference to not interpret the data very much,” Guzik said. “We just presented the results. But we shared strong evidence that AI seems to be developing creative ability on par with or even exceeding human ability.”
Guzik said he asked ChatGPT what it would indicate if it performed well on the TTCT. The AI gave a strong answer, which they shared at the conference:
“ChatGPT told us we may not fully understand human creativity, which I believe is correct,” he said. “It also suggested we may need more sophisticated assessment tools that can differentiate between human and AI-generated ideas.”
He said the TTCT is protected proprietary material, so ChatGPT couldn’t “cheat” by accessing information about the test on the internet or in a public database.
Guzik has long been interested in creativity. As a seventh grader growing up in the small town of Palmer, Massachusetts, he was in a program for talented-and-gifted students. That experience introduced him to the Future Problem Solving process developed by Ellis Paul Torrance, the pioneering psychologist who also created the TTCT. Guzik said that was when he fell in love with brainstorming and with how it taps into human imagination. He remains active with the Future Problem Solving organization and even met his wife at one of its conferences.
Guzik and his team decided to test the creativity of ChatGPT after playing around with it during the past year.
“We had all been exploring with ChatGPT, and we noticed it had been doing some interesting things that we didn’t expect,” he said. “Some of the responses were novel and surprising. That’s when we decided to put it to the test to see how creative it really is.”
Guzik said the TTCT uses prompts that mimic real-life creative tasks, such as thinking of new uses for a product or ways to improve it.
“Let’s say it’s a basketball,” he said. “Think of as many uses of a basketball as you can. You can shoot it in a hoop and use it in a display. If you force yourself to think of new uses, maybe you cut it up and use it as a planter. Or with a brick you can build things, or it can be used as a paperweight. But maybe you grind it up and reform it into something completely new.”
Guzik had some expectation that ChatGPT would be good at creating a lot of ideas (fluency), because that’s what generative AI does. And it excelled at responding to the prompt with many ideas that were relevant, useful and valuable in the eyes of the evaluators.
He was more surprised at how well it did generating original ideas, which is a hallmark of human imagination. The test evaluators are given lists of common responses for a prompt – ones that are almost expected to be submitted. However, the AI landed in the top percentile for coming up with fresh responses.
“At the conference, we learned of previous research on GPT-3 that was done a year ago,” Guzik said. “At that time, ChatGPT did not score as well as humans on tasks that involved original thinking. Now with the more advanced GPT-4, it’s in the top 1% of all human responses.”
With AI advances speeding up, he expects it to become a key tool for the world of business going forward and a significant new driver of regional and national innovation.
“For me, creativity is about doing things differently,” Guzik said. “One of the definitions of entrepreneurship I love is that to be an entrepreneur is to think differently. So AI may help us apply the world of creative thinking to business and the process of innovation, and that’s just fascinating to me.”
He said the UM College of Business is open to teaching about AI and incorporating it into coursework.
“I think we know the future is going to include AI in some fashion,” Guzik said. “We have to be careful about how it’s used and consider needed rules and regulations. But businesses already are using it for many creative tasks. In terms of entrepreneurship and regional innovation, this is a game changer.”