Human-like property induction is a challenge for large language models


The impressive recent performance of large language models such as GPT-3 has led many to wonder to what extent they can serve as models of general intelligence or are similar to human cognition. We address this issue by applying GPT-3 to a classic problem in human inductive reasoning known as property induction. Our results suggest that while GPT-3 can qualitatively mimic human performance for some inductive phenomena (especially those that depend primarily on similarity relationships), it reasons in a qualitatively distinct way on phenomena that require more theoretical understanding. We propose that this emerges due to the reasoning abilities of GPT-3 rather than its underlying representations, and suggest that increasing its scale is unlikely to change this pattern.

In J Culbertson and A Perfors and H Rabagliati and V Ramenzoni (Eds.) Proceedings of the 44th Annual Conference of the Cognitive Science Society: 2782-2788