Summary
The study compares pre-ChatGPT models with ChatGPT models on tasks that embed intuitive traps, akin to psychological assessments of human cognitive processing. While the more capable pre-ChatGPT models, particularly GPT-3-davinci-003, showed a propensity for fast, intuitive, system 1-like responses that led to errors, ChatGPT models answered correctly far more often, suggesting an evolution in how they process such tasks.
The study proposes that this shift could be attributed to ChatGPT models’ ability to engage in “chain-of-thought” reasoning and to enhancements in their training, including exposure to a wider variety of tasks and the use of reinforcement learning from human feedback.
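To make the chain-of-thought point concrete, the sketch below probes a chat model with a classic cognitive-reflection “trap” item, once directly and once with an explicit step-by-step instruction. This is a minimal illustration, assuming the OpenAI Python client (v1 interface); the model name and prompt wording are illustrative, not necessarily the study’s exact materials.

```python
# Minimal sketch: probe a chat model with a classic CRT "trap" item,
# once directly and once with an explicit step-by-step instruction.
# Assumes the OpenAI Python client (v1 interface); the model name and
# prompts are illustrative, not the study's exact materials.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CRT_ITEM = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)  # intuitive (wrong) answer: 10 cents; correct answer: 5 cents


def ask(question: str, step_by_step: bool = False) -> str:
    """Return the model's answer, optionally prompting for explicit reasoning."""
    prompt = question + ("\nThink step by step before answering." if step_by_step else "")
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative; any chat model works here
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep answers as stable as possible for comparison
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print("Direct:       ", ask(CRT_ITEM))
    print("Step by step: ", ask(CRT_ITEM, step_by_step=True))
```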
Key Findings
Increased Susceptibility to Intuitive Traps in Advanced Pre-ChatGPT Models: As LLMs grew more capable, they became more likely to fall for the intuitive traps embedded in the tasks, a pattern likened to humans’ automatic, system 1 cognitive processing; a simplified sketch of how such responses can be scored follows this list.
Significant Performance Shift with ChatGPT Models: ChatGPT models answered the tasks correctly far more often, avoiding the intuitive traps that earlier models fell for. This suggests an evolution beyond mere system 1 processing.
Chain-of-Thought Reasoning and Enhanced Training as Possible Contributors: The improved performance of ChatGPT models may stem from their capacity for chain-of-thought reasoning, from exposure to a wider variety of training tasks, and from enhanced reinforcement learning from human feedback.
Intuitive Decision-Making in LLMs and Normative Evaluation: The authors question whether intuitive decision-making is desirable in LLMs, proposing that their effectiveness should perhaps be judged by “ecological rationality,” that is, how well their decision strategies fit the structure of the real-world environments in which they operate.
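As a simplified illustration of how free-text answers to such trap items can be scored, the toy snippet below labels responses to the bat-and-ball problem as correct, intuitive-but-wrong, or atypical. The keyword heuristic is an assumption for illustration only, not the authors’ actual scoring procedure.

```python
import re

# Toy heuristic for scoring free-text answers to the bat-and-ball item into
# three categories: correct, intuitive-but-wrong, or atypical. This is an
# illustration only, not the study's scoring procedure.
CORRECT = {"5", "0.05", "$0.05", "five"}
INTUITIVE = {"10", "0.10", "0.1", "$0.10", "ten"}


def score(answer: str) -> str:
    """Label an answer as 'correct', 'intuitive', or 'atypical'."""
    tokens = set(re.findall(r"\$?\d+(?:\.\d+)?|[a-z]+", answer.lower()))
    if tokens & CORRECT:
        return "correct"
    if tokens & INTUITIVE:
        return "intuitive"
    return "atypical"


if __name__ == "__main__":
    for reply in ["The ball costs 10 cents.", "It costs $0.05.", "Not enough information."]:
        print(f"{reply!r:40} -> {score(reply)}")
```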
Implications
Understanding LLMs’ Cognitive Processes: The findings suggest that as LLMs evolve, their methods of processing and responding to information may become more sophisticated, potentially mimicking more advanced cognitive processes. This has broad implications for how we understand, utilize, and interact with these models.
Training and Development of LLMs: The possible impact of enhanced training processes, including reinforcement learning from human feedback, indicates that the methodologies used in developing these models significantly influence their performance. This insight is crucial for future LLM development strategies.
Normative Considerations in LLM Deployment: The study raises important questions about the normative expectations placed on LLMs, especially regarding their decision-making processes. It invites a re-evaluation of what is considered “correct” behavior or processing in the context of the environments in which these models are deployed.
Methodological Approach to Studying LLMs: The study underscores the value of utilizing methodologies from psychology to understand the behaviors and potential cognitive processes of LLMs, marking a significant interdisciplinary approach that can enrich both fields.
Anticipating LLM Behavior: As LLMs advance, their behaviors and responses become harder to predict, pointing to a need for continuous research and for methodologies that treat LLMs as evolving systems with potentially emergent properties not directly attributable to their programming or architecture.
Source
Hagendorff, T., Fabi, S. & Kosinski, M. Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT. Nat. Comput. Sci. (2023). https://doi.org/10.1038/s43588-023-00527-x
Citation
@article{infoepi_lab2023,
author = {InfoEpi Lab},
publisher = {Information Epidemiology Lab},
title = {Human-Like Intuitive Behavior and Reasoning Biases Emerged in
Large Language Models but Disappeared in {ChatGPT}},
journal = {InfoEpi Lab},
date = {2023-10-12},
url = {https://infoepi.org/posts/2023/10/13-LLM-nature.html},
langid = {en}
}