OpenAI Admits: Even the Most Advanced AI Models Can't Replace Human Coders

2025-02-24

A new OpenAI paper reports that even the most advanced large language models (LLMs), such as GPT-4o and Claude 3.5 Sonnet, cannot handle the majority of software engineering tasks. The researchers used a new benchmark, SWE-Lancer, comprising more than 1,400 freelance software engineering tasks from Upwork. The results showed that the models could solve only surface-level problems and failed to find bugs or their root causes in larger projects. While LLMs work quickly, their accuracy and reliability are not sufficient to replace human coders, which contradicts predictions by OpenAI CEO Sam Altman.

Development