Tags: text-classification, nlp, zero-shot, prompt-engineering, llm, python, openai, categorization
Classify Text Instantly with Zero-Shot LLM Prompts
Use a large language model for text classification without any training data. This guide shows how to craft prompts that assign one or more labels to a piece of text and return structured JSON output for easy integration.
Beginner | 15 min | 4 steps
The play
- Define Your Classification Task: First, decide on the categories (labels) you want to sort your text into. Then identify the input text you need to classify. For example, to classify customer feedback, your labels might be 'Bug Report', 'Feature Request', and 'Pricing Issue'.
- Craft a Basic Zero-Shot Prompt: Instruct the LLM to act as a classifier. Provide the text and the list of possible categories, and ask it to choose the single most relevant one. This is the core of zero-shot text classification: the model relies on its general pretraining, so you do not train or fine-tune anything for your labels.
- Adapt for Multi-Label Classification: For text that fits multiple categories, adjust your prompt to allow more than one answer. Instruct the model to return a list of all applicable labels. This is useful for complex documents such as product reviews, which may touch on several topics.
- Enforce Structured JSON Output: To make the output easy to parse in your application, explicitly ask the model to respond in JSON. You can specify a schema that includes the label(s) and a confidence score. The score is the model's own estimate, so treat it as a rough signal rather than a calibrated probability. Some APIs, including the OpenAI Chat Completions API used in the starter code, provide a dedicated JSON mode for this.
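The difference between the single-label and multi-label prompts in steps 2 and 3 comes down to one instruction. The sketch below builds both variants as plain strings without calling any API. The helper name `build_classification_prompt`, the `multi_label` flag, and the exact wording of the instructions are illustrative assumptions, not part of any library.

```python
def build_classification_prompt(text: str, labels: list[str], multi_label: bool = False) -> str:
    """Build a zero-shot classification prompt (hypothetical helper for illustration)."""
    if not labels:
        raise ValueError("At least one label is required.")
    label_list = ", ".join(labels)
    if multi_label:
        # Multi-label: allow several answers and define what to do when none fit.
        instruction = (
            "Return every category that applies as a comma-separated list. "
            "If none apply, return the word NONE."
        )
    else:
        # Single-label: force exactly one choice.
        instruction = "Return only the single most relevant category, with no extra text."
    return (
        "You are a text classifier.\n"
        f"Categories: {label_list}\n"
        f"{instruction}\n"
        "Text to classify:\n"
        f"'''{text}'''"
    )


if __name__ == "__main__":
    labels = ["Bug Report", "Feature Request", "Pricing Issue"]
    feedback = "The export button crashes the app, and the plan costs too much."
    print(build_classification_prompt(feedback, labels))
    print()
    print(build_classification_prompt(feedback, labels, multi_label=True))
```

Keeping the label list and the text delimiter identical across both variants means only the instruction line changes, which makes it easier to compare how the model behaves in each mode.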
Starter code
import json
import os
from typing import Optional

from openai import OpenAI

# --- Configuration ---
# The client reads the OPENAI_API_KEY environment variable.
# Alternatively, uncomment the line below and set the value directly.
# os.environ['OPENAI_API_KEY'] = 'YOUR_API_KEY'
client = OpenAI()


# --- Text Classification Function ---
def perform_text_classification(text_to_classify: str, labels: list[str]) -> Optional[dict]:
    """Performs zero-shot text classification using an LLM.

    Returns the parsed JSON response, or None if the request or parsing fails.
    """
    system_prompt = (
        "You are an expert text classifier. Your task is to categorize the user's text. "
        "Respond ONLY with a valid JSON object matching this schema: "
        '{"classifications": [{"label": "string", "confidence": "float"}]}. '
        "The 'label' must be one of the provided categories."
    )
    user_prompt = f"""Classify the following text using only these categories: {', '.join(labels)}.
Text to classify:
'''{text_to_classify}'''"""
    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            # JSON mode expects the word "JSON" to appear in the messages;
            # the system prompt above takes care of that.
            response_format={"type": "json_object"},
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=0.0,
        )
        classification_result = json.loads(response.choices[0].message.content)
        return classification_result
    except Exception as e:
        # Covers API errors as well as malformed JSON in the response.
        print(f"An error occurred: {e}")
        return None


# --- Example Usage ---
if __name__ == "__main__":
    customer_feedback = (
        "I love the new dashboard, it's so much faster! However, I think the price is a bit too high for small teams like ours. "
        "Also, it would be great if you could add a CSV export feature."
    )
    possible_labels = [
        "Bug Report",
        "Feature Request",
        "Pricing Feedback",
        "Positive Feedback",
        "Customer Support",
    ]
    print(f"Classifying text: \n'{customer_feedback}'\n")
    result = perform_text_classification(customer_feedback, possible_labels)
    if result:
        print("--- Classification Result ---")
        print(json.dumps(result, indent=2))
        print("---------------------------")
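JSON mode helps ensure the response parses, but it does not guarantee that the labels come from your list or that the confidence values are sensible. As a minimal sketch, the response could be checked before it reaches the rest of your application. The helper name `validate_classifications`, the `min_confidence` parameter, and the assumption that confidence falls between 0 and 1 are all illustrative; the starter prompt does not specify a range.

```python
import json


def validate_classifications(
    raw_json: str, allowed_labels: list[str], min_confidence: float = 0.0
) -> list[dict]:
    """Keep only well-formed entries with an allowed label (hypothetical helper)."""
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict):
        return []
    items = payload.get("classifications", [])
    if not isinstance(items, list):
        return []

    allowed = set(allowed_labels)
    cleaned = []
    for item in items:
        if not isinstance(item, dict):
            continue
        label = item.get("label")
        # Drop labels the model invented or returned in an unexpected type.
        if not isinstance(label, str) or label not in allowed:
            continue
        try:
            confidence = float(item.get("confidence"))
        except (TypeError, ValueError):
            continue
        # Assumption: confidence is meant to lie in [0, 1], so clamp it.
        confidence = max(0.0, min(1.0, confidence))
        if confidence >= min_confidence:
            cleaned.append({"label": label, "confidence": confidence})
    return cleaned


if __name__ == "__main__":
    sample = (
        '{"classifications": ['
        '{"label": "Feature Request", "confidence": 0.9}, '
        '{"label": "Spam", "confidence": 0.8}]}'
    )
    print(validate_classifications(sample, ["Bug Report", "Feature Request"]))
```

Filtering against the allowed set is the important part: a label outside your list is discarded rather than passed downstream.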