semantic-search, embeddings, vector-search, python, sentence-transformers, nlp, rag

Implement Basic Semantic Search in Python

Build a simple semantic search engine. This technique finds documents by meaning rather than exact keywords: it converts text into vector embeddings and ranks documents by their similarity to the query. It is the retrieval step at the core of most RAG systems.

beginner · 15 min · 5 steps
The play
  1. Install Dependencies
    Set up your Python environment by installing `sentence-transformers` for embedding generation and `torch` as its backend. These libraries will handle the complex model loading and vectorization.
  2. Prepare Documents and Model
    Define a list of documents (your 'corpus') and load a pre-trained sentence transformer model. We'll use 'all-MiniLM-L6-v2', a small and efficient model perfect for getting started.
  3. Generate Document Embeddings
    Use the loaded model to convert your text documents into numerical vectors (embeddings). This is the key step in Semantic Search, as these embeddings capture the 'meaning' of the text.
  4. Encode Query and Find Similarity
    Define a search query and encode it into an embedding using the same model. Then use `util.cos_sim` from `sentence-transformers` to calculate the cosine similarity between the query embedding and every document embedding.
  5. Retrieve Top Result
    Identify the document with the highest similarity score. This score indicates which document is semantically closest to your query, completing the Semantic Search process.
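
To make step 4 concrete, here is a minimal sketch of what cosine similarity computes, written in plain Python rather than with the library. The three-dimensional vectors are made-up toy values for illustration (real `all-MiniLM-L6-v2` embeddings have 384 dimensions), and `cosine_similarity` is a hypothetical helper name, not part of `sentence-transformers`.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (made up for illustration, not real embeddings)
query_vec = [1.0, 2.0, 0.0]
doc_vecs = {
    "same direction": [2.0, 4.0, 0.0],
    "orthogonal": [0.0, 0.0, 3.0],
    "opposite": [-1.0, -2.0, 0.0],
}

# Score every toy document against the query
scores = {name: cosine_similarity(query_vec, vec) for name, vec in doc_vecs.items()}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

A score near 1 means the two vectors point the same way, a score near 0 means they are unrelated, and a score near -1 means they point in opposite directions. Vector length does not affect the result, only direction.
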
Starter code
import torch
from sentence_transformers import SentenceTransformer, util

# 1. Load a pre-trained model
model = SentenceTransformer('all-MiniLM-L6-v2')

# 2. Define your document corpus
documents = [
    'A man is eating food.',
    'Someone is playing a guitar.',
    'The cat is chasing a mouse.',
    'A woman is reading a book in the library.',
    'Global warming is a significant environmental issue.',
    'Deep learning models require large amounts of data for training.'
]

# 3. Encode the documents into embeddings
print("Encoding documents...")
doc_embeddings = model.encode(documents, convert_to_tensor=True)

# 4. Define a search query
query = 'What are neural networks trained on?'

# 5. Encode the query
query_embedding = model.encode(query, convert_to_tensor=True)

# 6. Perform Semantic Search: Compute cosine similarity
# and find the most similar document
cosine_scores = util.cos_sim(query_embedding, doc_embeddings)

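# Find the best match. cosine_scores has shape (1, len(documents)),
# so reduce over dim=1 and unwrap the single-element tensors.
top_score, top_index = torch.max(cosine_scores, dim=1)
best_index = int(top_index.item())

print("\n--- Search Results ---")
print(f"Query: {query}\n")
print(f"Best match: '{documents[best_index]}'\nScore: {top_score.item():.4f}")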
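
The starter code returns only the single best match. As a hedged extension, a small helper can rank several results instead. `top_k_matches` is a hypothetical name, and the scores below are made-up values standing in for what `cosine_scores[0].tolist()` would give you in the starter code.

```python
def top_k_matches(documents, scores, k=3):
    """Return the k (document, score) pairs with the highest scores."""
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Made-up scores for illustration; with the starter code you could pass
# cosine_scores[0].tolist() as the scores argument instead.
example_docs = [
    'A man is eating food.',
    'Someone is playing a guitar.',
    'Deep learning models require large amounts of data for training.',
]
example_scores = [0.05, 0.02, 0.61]

results = top_k_matches(example_docs, example_scores, k=2)
for doc, score in results:
    print(f"{score:.4f}  {doc}")
```

Sorting in plain Python is enough for a handful of documents. For a large corpus, a tensor-based top-k or a vector index would likely be a better fit.
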