Insights into the Structure of Neural Embeddings
This article explores the structure of the embeddings (latent spaces) produced by deep neural networks. Several key hypotheses are summarized: the Manifold Hypothesis (high-dimensional data lies on a low-dimensional manifold); Hierarchical Organization (features organize hierarchically across layers); the Linear Hypothesis (networks represent features as linear directions in activation space); the Superposition Hypothesis (a layer represents more independent features than it has neurons); the Universality Hypothesis (the same circuits reappear across different models trained on similar data); Adversarial Vulnerability (small input changes can cause large embedding shifts); and Neural Collapse (by the end of training, the features of each class cluster tightly around their class mean). Together, these hypotheses illuminate the complexity and potential limitations of deep neural network embeddings.
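As a rough illustration of two of these ideas, the sketch below uses synthetic activations to (1) probe a feature as a linear direction, taken here as the difference of two class means, and (2) compute a simple neural-collapse statistic comparing within-class scatter to between-class scatter. The array shapes, the synthetic data, and all variable names are assumptions made for illustration, not details taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "penultimate-layer" activations: 3 classes, 200 samples each,
# 64-dimensional features drawn around separated class means.
# (Purely illustrative data, not from any real network.)
n_classes, n_per_class, dim = 3, 200, 64
class_means = rng.normal(scale=3.0, size=(n_classes, dim))
features = np.concatenate(
    [mean + rng.normal(scale=1.0, size=(n_per_class, dim)) for mean in class_means]
)
labels = np.repeat(np.arange(n_classes), n_per_class)
mu = np.stack([features[labels == c].mean(axis=0) for c in range(n_classes)])

# --- Linear Hypothesis sketch -------------------------------------------
# Treat "class 0 vs. class 1" as a feature represented by a direction:
# the (normalized) difference of the two empirical class means.
# Projecting activations onto this direction acts as a linear probe.
direction = mu[0] - mu[1]
direction /= np.linalg.norm(direction)
scores = features @ direction
print("mean projection, class 0:", scores[labels == 0].mean().round(2))
print("mean projection, class 1:", scores[labels == 1].mean().round(2))

# --- Neural Collapse sketch ---------------------------------------------
# A common collapse statistic: within-class covariance measured relative
# to between-class covariance; it shrinks toward zero as each class's
# features cluster ever more tightly around their class mean.
global_mean = features.mean(axis=0)
within = sum(
    np.cov(features[labels == c].T, bias=True) for c in range(n_classes)
) / n_classes
between = sum(
    np.outer(m - global_mean, m - global_mean) for m in mu
) / n_classes
collapse_ratio = np.trace(within @ np.linalg.pinv(between)) / n_classes
print("within/between collapse ratio:", collapse_ratio.round(3))
```

Running the sketch, the class-0 and class-1 projections separate cleanly along the probe direction, and the collapse ratio would move toward zero if the within-class noise were reduced, mimicking what the Neural Collapse hypothesis describes at the end of training.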