Softmax: Forever? A Deep Dive into Log-Harmonic Functions

2025-02-20

A decade ago, while teaching an NLP course, the author was challenged by a student about alternatives to softmax. A recent paper proposing a log-harmonic function as a replacement sparked a deeper investigation. The author analyzes the partial derivatives of both softmax and the log-harmonic function, showing that softmax's gradient is well-behaved and interpretable, while the log-harmonic function's gradient exhibits a singularity near the origin that can cause training difficulties. Although powerful optimizers might overcome this, the author concludes that the log-harmonic approach nonetheless warrants further exploration and refinement.
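
For readers skimming the summary, a rough sketch of the gradient contrast it describes. The softmax expression is standard; the log-harmonic form below (probabilities proportional to $d_i^{-n}$ for some distance-like quantity $d_i$ and exponent $n$) is an assumed shape used only for illustration, not the paper's exact definition.

$$p_i=\frac{e^{z_i}}{\sum_j e^{z_j}},\qquad \frac{\partial \log p_i}{\partial z_k}=\delta_{ik}-p_k\in[-1,1],$$

$$q_i=\frac{d_i^{-n}}{\sum_j d_j^{-n}},\qquad \frac{\partial \log q_i}{\partial d_k}=-\frac{n\,(\delta_{ik}-q_k)}{d_k}.$$

Under this assumed form, the softmax log-probability gradient stays bounded, while the $1/d_k$ factor in the log-harmonic case can blow up as any $d_k$ approaches zero, which is the kind of near-origin singularity the post attributes to training difficulties.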

Read more

NeurIPS'24: Anxiety and Shifts in the AI Job Market

2024-12-24

At NeurIPS'24, many graduating PhD students and postdocs expressed anxiety and frustration about the AI job market. This stems from the rapid development of deep learning over the past decade, during which large tech companies aggressively recruited AI PhDs with lucrative salaries and research freedom. With the maturation and productization of technologies like large language models, however, demand for PhDs has decreased, and universities have begun training undergraduates and master's students in the relevant skills. This shift has left many PhD students feeling left behind, with research directions out of sync with market demands and uncertain career prospects. The author expresses understanding and regret, noting that many important research directions in AI remain beyond large language models.

Read more