DeepSeek-Prover-V2: Revolutionizing Formal Mathematical Reasoning with Reinforcement Learning

2025-04-30

DeepSeek-Prover-V2 is an open-source large language model for formal theorem proving in Lean 4. Its training combines a recursive theorem-proving pipeline powered by DeepSeek-V3 with reinforcement learning, so that informal and formal mathematical reasoning are integrated in a single model. To build the initial data for reinforcement learning, DeepSeek-V3 decomposes complex problems into subgoals, and the proofs of those subgoals are then synthesized into training examples. DeepSeek-Prover-V2-671B achieves state-of-the-art performance, reaching an 88.9% pass ratio on MiniF2F-test and solving 49 problems from PutnamBench. The release also introduces ProverBench, a new benchmark of 325 formalized problems drawn from high school competitions and textbooks.
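
To make the idea of subgoal decomposition concrete, here is a minimal Lean 4 sketch. The theorem is a toy chosen for illustration and is not taken from the paper or its benchmarks, and the names `toy_sketch` and `toy_solved` are hypothetical. The first version states each subgoal as a `have` step whose proof is left as `sorry`. The second version fills in each subgoal separately, which is roughly the shape of a proof assembled from independently solved subgoals. It assumes only core Lean 4 lemmas (`Nat.mul_le_mul`, `Nat.le_refl`, `Nat.le_trans`) and no Mathlib.

```lean
-- Step 1: a proof sketch. The overall goal is split into two subgoals,
-- each stated with `have` and left open with `sorry`.
theorem toy_sketch (a b : Nat) (h : a ≤ b) : a * a ≤ b * b := by
  have h1 : a * a ≤ a * b := by sorry   -- subgoal 1, still open
  have h2 : a * b ≤ b * b := by sorry   -- subgoal 2, still open
  exact Nat.le_trans h1 h2              -- combine the two subgoals

-- Step 2: the same skeleton with every subgoal proved.
theorem toy_solved (a b : Nat) (h : a ≤ b) : a * a ≤ b * b := by
  -- subgoal 1: multiply a ≤ a and a ≤ b
  have h1 : a * a ≤ a * b := Nat.mul_le_mul (Nat.le_refl a) h
  -- subgoal 2: multiply a ≤ b and b ≤ b
  have h2 : a * b ≤ b * b := Nat.mul_le_mul h (Nat.le_refl b)
  exact Nat.le_trans h1 h2
```

Lean accepts the first theorem with a warning that it depends on `sorry`, so the sketch can be checked for structural soundness before any subgoal is proved. The second theorem contains no `sorry` and is a complete proof.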