Generating Prompts via Activation Maximization: 95.9% Accuracy on Yelp Review Polarity
2025-08-16
This article presents an approach to prompt engineering based on activation maximization. By optimizing the input rather than the model weights, a 4-token prompt was generated that achieved 95.9% accuracy on the Yelp Review Polarity sentiment classification task with Llama-3.2-1B-Instruct, far outperforming a hand-written prompt baseline (57%). The method works in the LLM's embedding space: the prompt is represented as a differentiable tensor and optimized directly with gradient descent. This technique shows potential for more efficient task switching in large language models, especially under GPU memory constraints.
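The core idea, a prompt held as a trainable tensor in embedding space while the model stays frozen, can be sketched with a toy stand-in model. This is a minimal illustration of the optimization loop only: the tiny embedding-plus-linear "model", the hyperparameters, and the final nearest-neighbor projection back to discrete tokens are all assumptions, not the article's actual Llama-3.2-1B-Instruct setup.

```python
import torch

torch.manual_seed(0)

# Toy stand-in for a frozen LM: embedding lookup -> mean pool -> classifier.
# (Assumed architecture for illustration; the article uses Llama-3.2-1B-Instruct.)
vocab_size, emb_dim, n_classes, prompt_len = 100, 16, 2, 4
embedding = torch.nn.Embedding(vocab_size, emb_dim)
classifier = torch.nn.Linear(emb_dim, n_classes)
for p in list(embedding.parameters()) + list(classifier.parameters()):
    p.requires_grad_(False)  # model weights stay frozen

# The prompt lives in embedding space as a differentiable tensor,
# not as a sequence of discrete token ids.
prompt = torch.randn(prompt_len, emb_dim, requires_grad=True)
optimizer = torch.optim.Adam([prompt], lr=0.1)

# Tiny synthetic batch standing in for labeled reviews.
inputs = torch.randint(0, vocab_size, (8, 10))   # (batch, seq) token ids
labels = torch.randint(0, n_classes, (8,))       # sentiment labels

for step in range(200):
    optimizer.zero_grad()
    x = embedding(inputs)                              # (batch, seq, emb)
    p = prompt.unsqueeze(0).expand(x.size(0), -1, -1)  # broadcast prompt
    h = torch.cat([p, x], dim=1).mean(dim=1)           # prepend prompt, pool
    loss = torch.nn.functional.cross_entropy(classifier(h), labels)
    loss.backward()    # gradients flow only into the prompt tensor
    optimizer.step()

# Project each optimized prompt vector to its nearest token embedding to get
# a discrete prompt (a common projection step; assumed here, not confirmed
# by the article).
dists = torch.cdist(prompt.detach(), embedding.weight)
token_ids = dists.argmin(dim=1)
print(token_ids.tolist())
```

Because only the 4x16 prompt tensor receives gradients, the memory and compute cost of switching tasks is a single small tensor per task rather than a fine-tuned copy of the model.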