Can LLMs Accurately Recall the Bible?
2024-12-29
This article investigates how accurately Large Language Models (LLMs) can recall biblical scripture. The author devised six tests evaluating how well LLMs of different sizes reproduce verses. Larger models (Llama 405B, GPT-4o, and Claude Sonnet) performed best, correctly recalling verses and even entire chapters. Smaller models (7B parameter range) frequently mixed translations or hallucinated text. Medium-sized models (70B range) generally preserved the meaning but often blended translations or paraphrased slightly. The author concludes that larger models are preferable for accurate biblical quotations, and that their output should still be verified against an actual Bible.
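The verification step can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's test harness: the helper names `normalize`, `recall_score`, and `is_verbatim` are hypothetical, and the reference text is assumed to come from a trusted copy of the translation being checked. It compares a model's quotation against the reference after stripping case and punctuation, so that differences in wording (such as blended translations) are what the score reflects.

```python
import difflib
import re


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so only wording is compared."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def recall_score(model_output: str, reference: str) -> float:
    """Return a similarity ratio between 0 and 1 for the normalized texts."""
    matcher = difflib.SequenceMatcher(None, normalize(model_output), normalize(reference))
    return matcher.ratio()


def is_verbatim(model_output: str, reference: str) -> bool:
    """True only when the wording matches exactly after normalization."""
    return normalize(model_output) == normalize(reference)


# Reference: Genesis 1:1 in the King James Version.
kjv_reference = "In the beginning God created the heaven and the earth."

# A hypothetical model answer that drifts toward another translation's wording.
model_answer = "In the beginning God created the heavens and the earth."

print(is_verbatim(model_answer, kjv_reference))
print(round(recall_score(model_answer, kjv_reference), 3))
```

A high but imperfect score alongside a failed verbatim check is the pattern to watch for: the meaning is preserved, but the quotation is not faithful to the requested translation.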