rag, python, automation, vector-database, data-ingestion, langchain, openai

Deploy a Production RAG Pipeline with a Setup Script

Use the RAG Pipeline Setup script to automate the deployment of a complete retrieval-augmented generation (RAG) system. This guide walks you through configuring your environment, provisioning infrastructure, ingesting documents, and testing the final retrieval endpoint.

intermediate | 30 min | 4 steps
The play
  1. Configure Your Environment
    Before running the RAG Pipeline Setup script, create a `.env` file to store your API keys. The script needs these to connect to your LLM provider (e.g., OpenAI) and your vector database (e.g., Pinecone).
  2. Provision the Infrastructure
    Execute the main RAG Pipeline Setup script. This command reads your configuration, provisions a new vector database index, and prepares the environment for document ingestion. Monitor the output for any errors.
  3. Ingest and Embed Documents
    Run the ingestion component of the RAG Pipeline Setup script, pointing it to your local document directory. The script will chunk the files, generate embeddings, and upload them to the newly provisioned vector store.
  4. Query the Retrieval Endpoint
    The script deploys a basic API endpoint for retrieval. Use a client script or a tool like cURL to send a query and confirm that the RAG pipeline returns relevant document chunks based on semantic similarity.
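
Step 1 relies on a `.env` file, so it can help to confirm that every key is present before anything is provisioned. The following is a minimal sketch using only the standard library. The variable names `OPENAI_API_KEY` and `PINECONE_API_KEY` and the helpers `parse_env_file` and `missing_keys` are assumptions for illustration; the setup script may expect different names or load the file in its own way.

```python
import tempfile
from pathlib import Path

# Hypothetical key names; substitute whatever your providers require.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY")


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Demo against a throwaway file so no real secrets are touched.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        '# API keys\nOPENAI_API_KEY="sk-placeholder"\nPINECONE_API_KEY=\n',
        encoding="utf-8",
    )
    config = parse_env_file(env_path)
    demo_missing = missing_keys(config)
    print("Missing keys:", demo_missing)
```

Failing fast on an empty or missing key is usually cheaper than discovering it halfway through provisioning. In a real project a dedicated loader such as python-dotenv is a common alternative to hand-rolled parsing.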
Starter code
# Local stand-in for the pipeline: uses an in-memory FAISS index rather than a hosted vector database.
import os
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import TextLoader

# --- 1. Setup Environment ---
# Make sure to `pip install langchain langchain-openai langchain-community faiss-cpu`
# The import paths above assume LangChain 0.2 or 0.3.
# Prefer exporting OPENAI_API_KEY in your shell; the placeholder below is used only if it is unset.
os.environ.setdefault("OPENAI_API_KEY", "sk-YOUR_API_KEY_HERE")

# --- 2. Create a dummy document ---
with open("state_of_the_union.txt", "w") as f:
    f.write("The President said the economy is strong. He also mentioned infrastructure projects.")

# --- 3. Load and Chunk Document ---
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
texts = text_splitter.split_documents(documents)

# --- 4. Embed and Store in Vector DB ---
print("Creating embeddings and storing in FAISS...")
embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(texts, embeddings)
retriever = db.as_retriever()

# --- 5. Setup QA Chain ---
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    chain_type="stuff",
    retriever=retriever
)

# --- 6. Query the RAG system ---
query = "What did the president say about infrastructure?"
print(f"\nQuery: {query}")
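# `run` is deprecated in recent LangChain releases; `invoke` returns a dict with the answer under "result".
result = qa_chain.invoke({"query": query})["result"]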
print(f"Answer: {result}")

# Cleanup the dummy file
os.remove("state_of_the_union.txt")
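
To make the `chunk_size` and `chunk_overlap` parameters in the starter code more concrete, here is a simplified sketch of overlapping chunking. It uses fixed-size character windows, which is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers natural separators such as paragraph and line breaks); the function `chunk_text` is a hypothetical helper meant only to show how overlap carries context from one chunk into the next.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
# Prints:
# abcdefghij
# hijklmnopq
# opqrstuvwx
# vwxyz
```

Each chunk repeats the last three characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough. The dummy document in the starter code is far shorter than 500 characters, so it ends up as a single chunk.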