Build a GraphRAG Pipeline with Neo4j & LLMs

Use the llm-graph-builder to transform unstructured documents into a Neo4j knowledge graph. This enables the GraphRAG pattern: you ask a natural language question, the pipeline retrieves a relevant subgraph, and an LLM generates an answer grounded in that subgraph.

Intermediate · 1 hour · 4 steps

The play

Set Up Your Environment
Clone the llm-graph-builder repository, install Python dependencies, and set environment variables for your Neo4j database and OpenAI API key. This prepares your system to run the pipeline.
Ingest Documents into Neo4j
Create a sample text file, then run the main script to process it. The script uses spaCy for entity extraction and an LLM to infer relationships, building your knowledge graph from the documents.
Explore the Generated Graph
Connect to your Neo4j instance using the Neo4j Browser. Run a basic Cypher query to visualize the nodes and relationships created from your documents. This step confirms that the ingestion process worked correctly.
Ask a Question (GraphRAG)
Use the script's query mode to ask a natural language question about your documents. The tool converts your question to a Cypher query, retrieves a relevant subgraph, and feeds it to an LLM to generate a final answer.
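The exploration step can also be run from a terminal rather than Neo4j Browser. The sketch below assumes `cypher-shell` is installed and that the `NEO4J_*` variables from the setup step are exported. The names `SAMPLE_QUERY` and `show_graph_sample` are hypothetical, and the query is deliberately generic because the labels and relationship types the builder creates are not specified here.

```shell
# Generic sample query: makes no assumption about node labels or relationship types.
SAMPLE_QUERY='MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 25'

# Hypothetical helper: runs the sample query against the configured database.
# -a (address), -u (username), and -p (password) are standard cypher-shell options.
show_graph_sample() {
  cypher-shell -a "$NEO4J_URI" -u "$NEO4J_USERNAME" -p "$NEO4J_PASSWORD" "$SAMPLE_QUERY"
}
```

Call `show_graph_sample` after ingestion finishes. The same query text can be pasted into Neo4j Browser to see the result as a graph visualization.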

Starter code

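#!/bin/bash
# This script provides a complete end-to-end example of using the llm-graph-builder GraphRAG pipeline.

# Stop at the first failing command so later steps never run in the wrong directory.
set -euo pipefail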

# 1. Clone the repository and navigate into it
git clone https://github.com/neo4j-labs/llm-graph-builder.git
cd llm-graph-builder

# 2. Install required Python packages
pip install -r requirements.txt

# 3. Set environment variables (replace with your actual credentials)
export OPENAI_API_KEY="sk-YOUR_OPENAI_API_KEY"
export NEO4J_URI="neo4j+s://YOUR_NEO4J_AURA_INSTANCE.databases.neo4j.io"
export NEO4J_USERNAME="neo4j"
export NEO4J_PASSWORD="YOUR_NEO4J_PASSWORD"

# 4. Create a sample data file
mkdir -p data
echo "Alice is a software engineer at Acme Inc. Bob is a project manager at the same company. They both work on the 'Phoenix' project." > data/sample.txt

# 5. Run the ingestion process to build the knowledge graph
# This will read the text file, extract entities/relationships, and load them into Neo4j.
# Ensure your Neo4j database is empty or you're okay with adding new nodes/relationships.
printf '\n--- Starting data ingestion... ---\n'
python main.py --path "data" --model "gpt-3.5-turbo-16k"

# 6. Ask a question using the GraphRAG pattern
# The script will convert the question to Cypher, query the graph, and generate an answer.
printf '\n--- Asking a question to the graph... ---\n'
python main.py --question "Who works on the Phoenix project?"
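
The starter script exports four variables but never checks them, so a forgotten value only surfaces once the Python process tries to connect. A small guard can fail fast instead. This is a minimal sketch and not part of llm-graph-builder; `require_env` is a hypothetical helper name.

```shell
# Hypothetical helper: report every required variable that is unset or empty.
# Returns 0 when all are present, 1 otherwise.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup via eval keeps this portable to POSIX sh;
    # only pass trusted variable names.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One way to use it is to place `require_env OPENAI_API_KEY NEO4J_URI NEO4J_USERNAME NEO4J_PASSWORD || exit 1` just before the ingestion command.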