Extracting Training Data from LLMs: Reversing Knowledge Compression
Researchers have developed a technique for extracting structured datasets from large language models (LLMs), effectively reversing the process by which LLMs compress massive amounts of training data into their parameters. The method uses hierarchical topic exploration to traverse the model's knowledge space systematically, generating training examples that capture both factual knowledge and reasoning patterns. Applied to open-source models such as Qwen3-Coder, GPT-OSS, and Llama 3, the technique has yielded tens of thousands of structured training examples. The resulting datasets have applications in model analysis, knowledge transfer, training-data augmentation, and model debugging, opening new avenues for model interpretability and cross-model knowledge transfer.
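To make the traversal concrete, here is a minimal sketch of hierarchical topic exploration: the model is recursively asked to enumerate subtopics of a root topic, and at the leaves it is prompted to produce instruction/response pairs that are written out as a JSONL dataset. The `llm(prompt) -> str` completion function, the prompt wording, the depth and branching limits, and the record schema are all illustrative assumptions, not the researchers' exact method.

```python
import json
from typing import Callable, Iterator


def explore(llm: Callable[[str], str], topic: str,
            depth: int = 0, max_depth: int = 2,
            branching: int = 5) -> Iterator[dict]:
    """Depth-first walk of the model's topic hierarchy.

    Interior nodes are expanded into subtopics; leaf nodes are turned
    into structured instruction/response training examples.
    """
    if depth < max_depth:
        # Ask the model to enumerate subtopics, one per line.
        listing = llm(
            f"List {branching} specific subtopics of '{topic}', one per line."
        )
        subtopics = (line.strip("-* ").strip() for line in listing.splitlines())
        for sub in filter(None, subtopics):
            yield from explore(llm, sub, depth + 1, max_depth, branching)
    else:
        # At a leaf, elicit a question and the model's own step-by-step
        # answer, capturing both factual content and reasoning patterns.
        question = llm(f"Write one challenging question about: {topic}")
        answer = llm(f"Answer step by step: {question}")
        yield {"topic": topic, "instruction": question, "response": answer}


def extract_dataset(llm: Callable[[str], str], root: str, path: str) -> None:
    """Stream extracted examples to a JSONL file, one record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in explore(llm, root):
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Because the walk is depth-first and the records are streamed to disk as they are generated, memory use stays bounded even when the topic tree fans out to tens of thousands of leaves; a production pipeline would likely add deduplication and quality filtering on top of this skeleton.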