LLMs Fail to Generalize Beyond Training Data
2025-08-12

Researchers tested how well large language models (LLMs) generalize to tasks, formats, and input lengths outside their training data. Accuracy fell sharply as tasks diverged from the training distribution. Even when the models produced correct answers, their reasoning was often illogical or inconsistent with those answers, suggesting that chain-of-thought (CoT) reasoning reflects the replication of patterns learned during training rather than genuine text understanding. Performance also degraded markedly when the models were given inputs of unfamiliar lengths or symbols, further underscoring the limits of their generalization.
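
The kind of length-generalization probe described above can be illustrated with a minimal sketch: generate a simple symbolic task (here, multi-digit addition) at operand lengths inside and outside an assumed training range, query a model, and compare accuracy. The `query_model` stub and the chosen length ranges are hypothetical placeholders for illustration, not the researchers' actual harness.

```python
import random


def make_addition_prompt(num_digits: int) -> tuple[str, str]:
    """Build a synthetic addition problem with operands of a given length."""
    a = random.randint(10 ** (num_digits - 1), 10 ** num_digits - 1)
    b = random.randint(10 ** (num_digits - 1), 10 ** num_digits - 1)
    return f"What is {a} + {b}? Answer with the number only.", str(a + b)


def query_model(prompt: str) -> str:
    """Placeholder for an LLM call (local model or API).

    Hypothetical stub: replace with a real call; as written it returns
    an empty answer so the script still runs end to end.
    """
    return ""


def accuracy_at_length(num_digits: int, n_trials: int = 50) -> float:
    """Fraction of correct answers for problems of a given operand length."""
    correct = 0
    for _ in range(n_trials):
        prompt, gold = make_addition_prompt(num_digits)
        if query_model(prompt).strip() == gold:
            correct += 1
    return correct / n_trials


if __name__ == "__main__":
    # Assumed in-distribution lengths (2-4 digits) vs. longer,
    # out-of-distribution ones (8, 16 digits).
    for digits in (2, 3, 4, 8, 16):
        print(f"{digits}-digit operands: accuracy = {accuracy_at_length(digits):.2f}")
```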
AI