Implement Basic Semantic Search in Python

Build a simple Semantic Search engine. This technique finds documents based on meaning, not just keywords: it converts text into vector embeddings and ranks documents by how similar their embeddings are to the query's embedding. It's the retrieval step at the core of most RAG (retrieval-augmented generation) systems.

beginner | 15 min | 5 steps

The play

Install Dependencies
Set up your Python environment by installing `sentence-transformers` for embedding generation and `torch` as its backend, for example with `pip install sentence-transformers torch`. These libraries handle model loading and vectorization for you.
Prepare Documents and Model
Define a list of documents (your 'corpus') and load a pre-trained sentence transformer model. We'll use 'all-MiniLM-L6-v2', a small and efficient model perfect for getting started.
Generate Document Embeddings
Use the loaded model to convert your text documents into numerical vectors (embeddings). This is the key step in Semantic Search, as these embeddings capture the 'meaning' of the text.
Encode Query and Find Similarity
Define a search query and encode it into an embedding using the same model, so that the query and the documents share one vector space. Then use `util.cos_sim` to calculate the cosine similarity between the query embedding and every document embedding.
Retrieve Top Result
Identify the document with the highest similarity score. Cosine similarity ranges from -1 to 1, and a higher score means a closer match, so the top-scoring document is the one semantically closest to your query. This completes the Semantic Search process.
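
To make the similarity step concrete, here is a minimal pure-Python sketch of the quantity that a cosine similarity function computes: the dot product of two vectors divided by the product of their lengths. The helper name `cosine_similarity` and the 3-dimensional vectors are made up for illustration. Real embeddings from 'all-MiniLM-L6-v2' have 384 dimensions, and in practice you would let the library compute this for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", not output from a real model.
query_vec = [1.0, 0.0, 1.0]
doc_vecs = {
    "same direction": [2.0, 0.0, 2.0],
    "partly related": [1.0, 1.0, 0.0],
    "unrelated": [0.0, 1.0, 0.0],
}

for label, vec in doc_vecs.items():
    print(f"{label}: {cosine_similarity(query_vec, vec):.4f}")
```

The three toy documents score roughly 1.0, 0.5, and 0.0. Note that the first vector is twice as long as the query but still scores about 1.0: cosine similarity compares direction only, not magnitude.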

Starter code

import torch
from sentence_transformers import SentenceTransformer, util

# 1. Load a pre-trained model
model = SentenceTransformer('all-MiniLM-L6-v2')

# 2. Define your document corpus
documents = [
    'A man is eating food.',
    'Someone is playing a guitar.',
    'The cat is chasing a mouse.',
    'A woman is reading a book in the library.',
    'Global warming is a significant environmental issue.',
    'Deep learning models require large amounts of data for training.'
]

# 3. Encode the documents into embeddings
print("Encoding documents...")
doc_embeddings = model.encode(documents, convert_to_tensor=True)

# 4. Define a search query
query = 'What are neural networks trained on?'

# 5. Encode the query
query_embedding = model.encode(query, convert_to_tensor=True)

# 6. Perform Semantic Search: Compute cosine similarity
# and find the most similar document
cosine_scores = util.cos_sim(query_embedding, doc_embeddings)

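# Find the best match. cosine_scores has shape (1, num_documents):
# one row for the single query, one column per document.
top_score, top_index = torch.max(cosine_scores, dim=1)

print("\n--- Search Results ---")
print(f"Query: {query}\n")
# .item() converts the one-element tensors to a plain Python int and float
print(f"Best match: '{documents[top_index.item()]}'\nScore: {top_score.item():.4f}")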
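
The starter code returns only the single best match. A search engine usually shows a ranked list, so the following is a minimal sketch of one way to get the top few results. The helper `top_k` and the example scores are hypothetical and not part of the starter code. With the starter code, the scores could be obtained as `cosine_scores[0].tolist()`.

```python
def top_k(scores, documents, k=3):
    """Return the k (score, document) pairs with the highest scores, best first."""
    if len(scores) != len(documents):
        raise ValueError("need exactly one score per document")
    ranked = sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)
    return ranked[:k]


# Made-up scores for illustration only; real values come from the model.
example_docs = [
    'A man is eating food.',
    'Global warming is a significant environmental issue.',
    'Deep learning models require large amounts of data for training.',
]
example_scores = [0.05, 0.02, 0.60]

for score, doc in top_k(example_scores, example_docs, k=2):
    print(f"{score:.4f}  {doc}")
```

Sorting a plain list keeps the example free of dependencies. If you would rather stay with tensors, `torch.topk` does the same job directly on `cosine_scores`.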