When you type a query into a search bar and results appear in milliseconds, you're interacting with one of the most complex pieces of software engineering in the world. Modern web search is no longer about simple keyword matching; it's about semantic understanding and massive scale.
The Inverted Index
At the core of most full-text search engines (like Elasticsearch or Algolia) is the Inverted Index. Think of it like the index at the back of a textbook. Instead of listing documents and the words they contain, it lists words and the documents that contain them. A lookup for "fox" goes straight to the list of documents that mention it, with no need to scan every document.
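To make the textbook analogy concrete, here is a minimal sketch of an inverted index in Python. It assumes whitespace tokenization and lowercase matching only; the function names `build_inverted_index` and `search` and the sample documents are made up for illustration, and production engines add stemming, stop-word handling, and compressed posting lists.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return IDs of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    # One posting set per query term; a term missing from the index matches nothing.
    postings = [set(index.get(term, ())) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical sample corpus: document ID -> text.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick brown dogs and a lazy fox",
}
index = build_inverted_index(docs)
print(index["quick"])             # posting list for one term
print(search(index, "lazy fox"))  # documents containing both terms
```

Note that the query only touches the posting lists for its own terms, which is why lookups stay fast as the collection grows.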
The Ranking Problem: BM25 and Beyond
Finding documents is easy; ranking them is hard. Many search engines, including Lucene-based ones such as Elasticsearch, default to an algorithm called BM25 (Best Matching 25), which is an evolution of TF-IDF (Term Frequency-Inverse Document Frequency). It scores a document against a query by how often the query terms appear in it (with diminishing returns for repeated occurrences), normalized by document length and weighted by how rare each term is across the collection.
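The scoring idea can be sketched in a few lines of Python. This is an illustrative implementation of one common BM25 formulation, not the code of any particular engine: the IDF term uses the `log(1 + ...)` variant, and `k1=1.2` and `b=0.75` are commonly cited defaults assumed here. The function name `bm25_scores` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in a non-empty list `docs` against `query`."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Longer-than-average documents are penalized, shorter ones boosted.
        length_norm = 1 - b + b * len(tokens) / avg_len
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Rare terms get a larger weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates: the tenth occurrence adds less than the first.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical corpus: "apple" is common, "banana" is rare.
fruit_docs = ["apple banana", "apple cherry", "apple date"]
print(bm25_scores("banana", fruit_docs))
print(bm25_scores("apple", fruit_docs))
```

In this toy corpus the rare term "banana" earns the first document a higher score than the ubiquitous "apple" does, which is the rarity adjustment at work.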
Vector Search and Embeddings
The newest frontier in search is Vector Search. By using machine learning models to turn text into high-dimensional vectors (embeddings), we can perform "semantic search": text with similar meaning maps to nearby vectors, so the engine retrieves the documents whose vectors lie closest to the query's vector. This allows a search for "dog" to return results for "canine" even if the word "dog" never appears in the document.
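As a rough illustration of the retrieval step, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors are hand-picked toy values standing in for real embeddings (which come from a trained model and typically have hundreds of dimensions), and the names `cosine_similarity` and `nearest` are hypothetical. It does an exact brute-force scan; large systems generally use approximate nearest-neighbor indexes instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vec, doc_vecs, k=2):
    """Return the k document keys most similar to query_vec, best first."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors (assumed, not produced by a real model).
doc_vecs = {
    "canine care guide": [0.9, 0.1, 0.0],
    "cat grooming tips": [0.1, 0.9, 0.0],
    "stock market news": [0.0, 0.1, 0.9],
}
query_vec = [1.0, 0.2, 0.0]  # stands in for the embedding of "dog"

print(nearest(query_vec, doc_vecs, k=1))
```

The query shares no keyword with "canine care guide", yet that document ranks first because its vector points in nearly the same direction.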
Architecture at Scale
To handle very high query volumes over indexes too large for a single machine, search architectures rely on:
- Sharding: Splitting the index into smaller pieces across multiple servers.
- Replication: Copying those shards to provide high availability and read scalability.
- Caching: Storing the results of common queries in memory (Redis/Memcached).
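The three techniques above can be combined in a small in-process simulation. This is a sketch under loud assumptions: plain dictionaries stand in for servers, another dictionary stands in for Redis/Memcached, cache invalidation is a blunt "clear everything on write", and the names `Cluster`, `shard_for`, `NUM_SHARDS`, and `REPLICAS_PER_SHARD` are invented for the example.

```python
import random
import zlib

NUM_SHARDS = 3
REPLICAS_PER_SHARD = 2


def shard_for(doc_id):
    """Stable hash so a given document always lands on the same shard."""
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS


class Cluster:
    def __init__(self):
        # shards[s][r] is replica r of shard s: a dict of doc_id -> text.
        self.shards = [
            [{} for _ in range(REPLICAS_PER_SHARD)] for _ in range(NUM_SHARDS)
        ]
        self.cache = {}       # stand-in for an external cache
        self.shard_reads = 0  # counts replica reads, to make cache hits visible

    def index(self, doc_id, text):
        # Replication: write the document to every replica of its shard.
        for replica in self.shards[shard_for(doc_id)]:
            replica[doc_id] = text
        self.cache.clear()  # crude invalidation: drop all cached results

    def search(self, term):
        term = term.lower()
        # Caching: repeated queries are answered without touching any shard.
        if term in self.cache:
            return self.cache[term]
        hits = []
        # Sharding: scatter the query to every shard, then gather the results.
        for replicas in self.shards:
            replica = random.choice(replicas)  # any replica can serve a read
            self.shard_reads += 1
            hits.extend(
                doc_id for doc_id, text in replica.items()
                if term in text.lower().split()
            )
        result = sorted(hits)
        self.cache[term] = result
        return result


cluster = Cluster()
cluster.index("doc-a", "the quick brown fox")
cluster.index("doc-b", "the lazy dog")
cluster.index("doc-c", "a lazy fox")
print(cluster.search("fox"), cluster.shard_reads)
print(cluster.search("fox"), cluster.shard_reads)  # second call is served from cache
```

The second search returns the same result without increasing `shard_reads`, which is the load reduction a query cache is meant to provide. A real deployment would also need to handle replica lag, shard rebalancing, and finer-grained cache expiry, none of which this sketch attempts.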