400x Faster Static Embedding Models with Sentence Transformers

This blog post introduces a method for training static embedding models that run 100x to 400x faster on CPU than state-of-the-art embedding models, while retaining most of their quality. This speedup unlocks exciting use cases such as on-device and in-browser execution. Two highly efficient models are presented: sentence-transformers/static-retrieval-mrl-en-v1 for English retrieval and sentence-transformers/static-similarity-mrl-multilingual-v1 for multilingual similarity. These models achieve at least 85% of the performance of counterparts like all-mpnet-base-v2 and multilingual-e5-small, at a fraction of the CPU inference cost.
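Since both models are published as regular Sentence Transformers checkpoints, a minimal sketch of how one of them might be used is shown below. The model ID comes from the paragraph above; the example sentences and the truncate_dim value are illustrative assumptions, not values taken from the post.

```python
from sentence_transformers import SentenceTransformer

# Load the static English retrieval model on CPU.
# truncate_dim is optional: the "mrl" in the model name refers to Matryoshka
# Representation Learning, so embeddings can be truncated to fewer dimensions.
# 256 is an illustrative choice, not a value from the post.
model = SentenceTransformer(
    "sentence-transformers/static-retrieval-mrl-en-v1",
    device="cpu",
    truncate_dim=256,
)

# Illustrative query and documents.
query = "What is the capital of France?"
documents = [
    "Paris is the capital and largest city of France.",
    "Static embedding models run very fast on CPU.",
]

# Encode and score with the model's default similarity function.
query_embedding = model.encode(query)
document_embeddings = model.encode(documents)
scores = model.similarity(query_embedding, document_embeddings)
print(scores)
```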