
IQ ranking of AI Models: Claude-3 is more intelligent than ChatGPT-4

Georg Haas, 11 March 2024, 09:00

AI Chatbots get ranked by their IQ © Canva


With all the hype surrounding AI models, the question often arises as to how “intelligent” they really are. The most common measure of intelligence in humans is the intelligence quotient (IQ), which expresses a person’s intellectual performance relative to a reference group. The average human IQ is usually 100. Blogger Maxim Lott has now run an IQ test on current AI models. In his test, Anthropic’s Claude-3 beat its rivals, including OpenAI’s ChatGPT-4, and was also the first AI to exceed the average human IQ of 100.

Claude-3 achieves average human IQ

In his test, Maxim Lott focused on how AI models reason rather than on how they see and interpret images, an area in which all models still have weaknesses. To do so, Lott created a verbal translation of the 35 matrix-style questions of the Norwegian Mensa IQ test. The goal was to describe each problem in such detail that even a blind person could theoretically draw the question accurately.

When Lott described the matrices to ChatGPT-4 in words, the answers yielded an assessable IQ. OpenAI’s model answered an average of 13 out of 35 questions correctly on the Norwegian Mensa test, giving an estimated IQ of 85. But its rival Claude-3, which only became available in the EU a few days ago, performed much better. Claude-3 achieved an IQ of 101, putting it in first place among the common models.


AI models are improving at lightning speed

Anthropic has made massive strides with each release of its Claude models. Claude-1, which was released only in March 2023, achieved an IQ of 64 in the test, putting it in ninth place in the ranking. Claude-2, released last July, has an IQ of 82, putting it in third place, just behind ChatGPT-4. In fourth place is Bing Copilot from Microsoft with an IQ of 79, and in fifth place is Gemini from Google with an IQ of 77.5. Interestingly, Gemini Advanced performed marginally worse than the basic version, with an IQ of 77. GPT-3.5, OpenAI’s previous model, is on par with Claude-1 with an IQ of 64.

ChatGPT-4 answered an average of 13 out of 35 questions correctly per test run, compared to twelve for Claude-2. Bing Copilot answered eleven questions correctly, Gemini 10.5. Above all, the ranking shows how much progress individual providers are making from one model version to the next. Anthropic and OpenAI in particular improve their models substantially with each new version.
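The ranking described above can be reproduced with a few lines of Python. This is only a sketch: the dictionary `iq_scores` holds the estimated IQ values quoted in this article, and the variable names are chosen for illustration. Because the article does not name every model in Lott's ranking, the places printed for the lower entries will not match his full list (Claude-1, for example, is ninth there).

```python
# Estimated IQ scores as quoted in the article.
# Assumption: only the models named in the article are included,
# so this is a partial view of the full ranking.
iq_scores = {
    "Claude-3": 101,
    "ChatGPT-4": 85,
    "Claude-2": 82,
    "Bing Copilot": 79,
    "Gemini": 77.5,
    "Gemini Advanced": 77,
    "GPT-3.5": 64,
    "Claude-1": 64,
}

# Sort from highest to lowest estimated IQ.
# sorted() is stable, so tied models keep their insertion order.
ranking = sorted(iq_scores.items(), key=lambda item: item[1], reverse=True)

for place, (model, iq) in enumerate(ranking, start=1):
    print(f"{place}. {model}: {iq}")
```

The first five places of the output match the order given in the text: Claude-3, ChatGPT-4, Claude-2, Bing Copilot, Gemini.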

Claude could become highly gifted in the future

Based on this trend, Maxim Lott expects Anthropic’s next Claude model, which would be due in 12 to 16 months according to the release pattern so far, to reach an IQ of 120. The version after that, which could come in three to six years, could reach an IQ of 140. A person is usually considered gifted with an IQ of 130 or higher. By comparison, ChatGPT could achieve an IQ of around 106 with its next version.
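The arithmetic behind such projections can be illustrated with a naive linear extrapolation. This is a sketch under an assumption: it simply adds the average gain per release to the latest score, which may not be the method Lott actually used. The function name `extrapolate_next` is hypothetical, and the input scores are the ones quoted in this article.

```python
def extrapolate_next(scores, steps=1):
    """Project future scores by adding the mean gain per release.

    Assumption: progress is linear from release to release. This is
    an illustrative simplification, not necessarily Lott's method.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to compute a gain")
    mean_gain = (scores[-1] - scores[0]) / (len(scores) - 1)
    return [scores[-1] + mean_gain * k for k in range(1, steps + 1)]


# Estimated IQ scores quoted in the article, oldest release first.
claude_scores = [64, 82, 101]  # Claude-1, Claude-2, Claude-3
gpt_scores = [64, 85]          # GPT-3.5, ChatGPT-4

claude_projection = extrapolate_next(claude_scores, steps=2)
gpt_projection = extrapolate_next(gpt_scores, steps=1)

print(claude_projection)  # [119.5, 138.0]
print(gpt_projection)     # [106.0]
```

Under this simple assumption, the results land close to the figures in the text: roughly 120 and 140 for the next two Claude versions, and 106 for the next ChatGPT version.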