Why do some people think AI will change the world, while others think it’s ordinary? Karpathy’s two diagnoses

ChainNewsAbmedia

OpenAI founding team member and Tesla’s former AI Director Andrej Karpathy posted a long article on X about the “AI capability perception gap,” responding to a community phenomenon: polar opposite levels of amazement at AI. One group believes AI has already rewritten the world, while the other thinks AI only hallucinates, is boring, and has been overhyped. Karpathy offers two diagnoses and explains why these two groups are living in “parallel worlds,” each misunderstanding the other’s basis for judgment. This article summarizes his arguments and the lessons for tech readers in Taiwan.

Diagnosis 1: Which year—and which layer—of AI are you using?

Karpathy’s first observation is direct and sharp: “A lot of people tried the free version of ChatGPT last year, and that experience alone ended up driving their view of AI.” These people typically respond by mocking the model’s strange behaviors, hallucinations, clumsiness, and sharing videos where the upgraded OpenAI voice mode gets derailed by simple questions like “Should I drive to wash the car or walk?”

But Karpathy points out that these free, old, or effectively abandoned models simply cannot reflect the capabilities of the most advanced agentic models in 2026 (especially OpenAI Codex and Claude Code). In short: using free ChatGPT from 2024 to judge whether AI can write code is like using a Nokia E71 from 2008 to judge whether smartphones work.

For many readers in Taiwan, this is also reality: subscribing to ChatGPT Plus ($20/month) is fairly common, but very few subscribe to ChatGPT Pro ($200/month) or Claude Max ($100/month). If you haven’t run agent tasks on the most advanced paid tier, you mostly see AI as a fun toy that can’t be relied on; if you have, you see AI as something that fully rewrites your workflow. The same technology, two different worlds.

Diagnosis 2: Capability progress is “asymmetric” across different domains

Karpathy’s second diagnosis is even more interesting: “Even if you pay $200 per month for the most advanced models, progress in capability is ‘spike-like,’ concentrated in highly technical fields.”

He argues that search, writing, and recommendation, the typical “query”-style uses, aren’t where this year’s AI improvements are most dramatic. He gives two reasons:

Reinforcement learning (RL) depends on verifiable reward functions: writing code has clear signals like “did the unit tests pass,” while prose writing has no comparable objective grading criteria, so the gap in RL training speed between the two domains can be huge.

The biggest commercial value for companies like OpenAI and Anthropic is in B2B code/research/engineering settings, so resources, headcount, and priorities concentrate in those areas; other use cases aren’t the biggest profit sources.
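The first reason above hinges on what a “verifiable reward” actually is. A minimal sketch, with illustrative names that don’t come from any real RL framework: a grader runs a model-written function against unit tests and returns a binary score. Code gets this cheap, objective signal; an essay does not.

```python
# Minimal sketch of a verifiable reward for code generation:
# run the candidate solution against unit tests and return a
# binary score. Names here are illustrative, not from any
# real RL training stack.

def verifiable_reward(candidate_fn, test_cases):
    """Return 1.0 if the candidate passes every test, else 0.0."""
    for args, expected in test_cases:
        try:
            if candidate_fn(*args) != expected:
                return 0.0
        except Exception:
            # A crash is just as gradable as a wrong answer.
            return 0.0
    return 1.0

# Example: grading two model-written implementations of "add".
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
print(verifiable_reward(lambda a, b: a + b, tests))  # 1.0
print(verifiable_reward(lambda a, b: a - b, tests))  # 0.0
```

There is no equivalent one-liner for “is this paragraph well written,” which is Karpathy’s point: domains with a checkable pass/fail signal can be RL-trained far faster.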

This observation is crucial—it explains the confusion many people have: why AI coding skills advance rapidly, but AI writing still often looks ordinary. It’s not that AI companies can’t do it; it’s that their gold mines are elsewhere, so their attention follows the money.

Who gets the “AI cognition shock” the most? People who meet two conditions

Combining the two diagnoses, Karpathy describes the group most likely to be hit by “AI cognition shock”—people who meet both conditions:

Paying to use the most advanced agentic models (OpenAI Codex, Claude Code)

Professional use in highly technical domains (programming, math, research)

This group is most affected by what Karpathy calls “AI Psychosis”: once you personally watch an LLM solve, in a few hours, programming problems that would normally take days or weeks, your judgment of AI capability and its slope (rate of improvement) leads you to a very different view of the technology landscape for the coming years.

For the other group (those who don’t pay and don’t use AI in technical fields), this kind of talk sounds like “overexcitement,” like a Silicon Valley clique fantasy. But Karpathy believes it’s not a myth—it’s a real judgment born from personal experience.

Two groups “talk about their own worlds to each other”

Karpathy’s core conclusion: these two groups are talking past each other, each describing their own world. He notes that two things can be simultaneously true:

OpenAI’s free (and, he believes, half-abandoned) “advanced voice mode” gets derailed by the simplest questions in Instagram Reels clips

At the same time, OpenAI’s top-tier paid Codex model can spend an hour consistently refactoring an entire codebase, or finding and exploiting vulnerabilities in computer systems

Both things are true and don’t conflict. But each group sees only one side, and each concludes the other is “overexcited” or “ignorant.” Karpathy wrote the post to bridge that gap.

Lessons for readers in Taiwan: Which group are you in?

Karpathy’s discussion is especially meaningful for Taiwan readers because the tech discourse here also splits into extremes: one side says “AI has already taken over,” and the other says “it’s just a chatbot.” To figure out which group you belong to, consider three self-check questions:

When did you last personally prompt the most advanced paid model (GPT-5.5 Pro, Claude Opus 4.7)?

Have you let an agent run for more than 30 minutes and actually complete a production-grade task (refactoring code, writing a research review, debugging a complex system)?

On what basis are you judging AI capability—media reports, community memes, or firsthand use?

People who answer “yes, recently, firsthand use” to all three questions will land in Karpathy’s second group and will understand his “AI Psychosis” description more easily. People who answer “no, long ago, seen in the media” to all three questions will land in the first group and may significantly underestimate the speed of AI progress.

This isn’t saying which group is “right.” It’s that the basis for judgment differs fundamentally. When you see the next article saying “AI is a bubble” or “AI will replace all jobs,” first confirm which group the author is in, then decide how to read it.

Karpathy’s “OpenClaw moment” add-on

In a follow-up post, Karpathy added: “Someone recently told me the reason the OpenClaw moment is so huge is that it reached a non-technical crowd, and this was their first personal experience with the most advanced agentic models.” This observation shows that the perception gap isn’t only a gap in degree; it’s also a gap between personal experience and hearsay.

For abmedia readers, the most practical solution is this: spend $20 on a month of ChatGPT Plus or Claude Pro, pick a real task you actually care about (writing a research report, compiling a financial analysis, debugging a coding project), run it end-to-end with the agent, and then reassess what AI means for your work. That will be more useful than reading a hundred AI news pieces.

Why do some people think AI changes the world, while others think it’s ordinary? Karpathy’s two diagnoses first appeared on the blockchain news ABMedia.
