Characterising LLM-Generated Competency Questions: a Cross-Domain Empirical Study using Open and Closed Models

Automate the generation of Competency Questions (CQs) for ontology engineering using LLMs. This action pack guides you through defining requirements and crafting effective prompts, speeding up a traditionally labor-intensive knowledge engineering task. The generated CQs still need expert review before they are used as ontology requirements.

Intermediate | 30 min | 4 steps

The play

Define Ontology Domain & Scope
Clearly document the purpose, key entities, and specific problems the ontology aims to address. For example, for a 'Medical Diagnosis Ontology,' your scope might include 'diseases, symptoms, treatments for respiratory conditions.'
Choose Your LLM
Select an LLM (e.g., OpenAI's GPT-4, Anthropic's Claude, or an open-source model like Llama 2 via Hugging Face) based on output quality, cost, data privacy, and fine-tuning options. Closed models often perform well out of the box, while open models give you more control over deployment and data handling.
Engineer Your Prompt for CQ Generation
Design a detailed prompt for the LLM. Include a specific role/persona, a clear task definition, a definition of Competency Questions, desired output format, and any constraints. Optionally, provide 2-3 high-quality few-shot examples relevant to your domain to guide the LLM's style and content.
Generate Competency Questions
Execute your crafted prompt using the chosen LLM API or local model to generate the list of Competency Questions.
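Steps 1 and 3 can be sketched as a small prompt builder that turns the scope notes into the prompt components listed above (role, task, CQ definition, constraints, output format, optional few-shot examples). This is a minimal sketch, not a prescribed structure: `OntologyScope`, `build_cq_prompt`, and the field names are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyScope:
    """Hypothetical container for the scope notes written in step 1."""
    name: str
    purpose: str
    key_entities: list[str]
    example_cqs: list[str] = field(default_factory=list)


CQ_DEFINITION = (
    "Competency Questions are natural language questions that an ontology "
    "should be able to answer."
)


def build_cq_prompt(scope: OntologyScope, n_questions: int = 7) -> str:
    """Assemble role, task, definition, constraints, format, and optional examples."""
    parts = [
        f"You are an expert ontology engineer working on a {scope.name}.",
        f"Task: generate {n_questions} distinct Competency Questions (CQs) "
        f"for an ontology whose purpose is to {scope.purpose}.",
        CQ_DEFINITION,
        "Key entities in scope: " + ", ".join(scope.key_entities) + ".",
        "Constraints: each CQ must be clear, unambiguous, non-redundant, and testable.",
        "Output format: a numbered list, one question per line.",
    ]
    # Few-shot examples are optional; they are only appended when provided.
    if scope.example_cqs:
        examples = "\n".join(
            f"{i}. {q}" for i, q in enumerate(scope.example_cqs, start=1)
        )
        parts.append("Example CQs (for style only; do not repeat them):\n" + examples)
    return "\n\n".join(parts)


scope = OntologyScope(
    name="Medical Diagnosis Ontology",
    purpose="identify diseases, symptoms, and treatments for respiratory conditions",
    key_entities=["diseases", "symptoms", "treatments"],
    example_cqs=["What are the common symptoms associated with pneumonia?"],
)
print(build_cq_prompt(scope))
```

Keeping the scope in one object makes it easier to reuse the same prompt template across domains and to compare models on identical input.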

Starter code

import openai
import os

# Requires the openai package, version 1.0 or later.
# Set OPENAI_API_KEY in your environment; avoid hardcoding the key in source files.
openai.api_key = os.getenv("OPENAI_API_KEY")

domain_description = "a medical diagnosis ontology focused on respiratory conditions"
ontology_goal = "identify diseases, symptoms, and potential treatments for common respiratory ailments"

prompt_text = f"""
You are an expert ontology engineer and a domain expert in {domain_description}.
Your task is to generate 7 distinct Competency Questions (CQs) for an ontology focused on {ontology_goal}.
Competency Questions are natural language questions that an ontology should be able to answer. They help define the scope and requirements of the ontology.
Ensure CQs are clear, unambiguous, non-redundant, and testable. Present them as a numbered list.

Example CQs (for style only; do not repeat these):
1. What are the common symptoms associated with pneumonia?
2. Which medications are typically prescribed for asthma?
3. What diagnostic tests are used to confirm bronchitis?
"""

try:
    response = openai.chat.completions.create(
        model="gpt-3.5-turbo", # Consider "gpt-4" for higher quality
        messages=[
            {"role": "system", "content": "You are a helpful assistant that generates competency questions for ontologies."},
            {"role": "user", "content": prompt_text}
        ],
        max_tokens=400,
        temperature=0.7
    )
    generated_cqs = response.choices[0].message.content
    print("Generated Competency Questions:")
    print(generated_cqs)
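except openai.OpenAIError as e:
    # Base class for errors raised by the SDK (authentication, rate limits, connection issues).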
    print(f"An error occurred: {e}")
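Because the prompt asks for a numbered list, the raw response can be split into individual questions before expert review. The following is a minimal sketch under that assumption; `parse_cqs` is a hypothetical helper, and the duplicate check only catches exact repeats after lowercasing and whitespace normalisation, not paraphrases.

```python
import re


def parse_cqs(raw_text: str) -> list[str]:
    """Extract numbered-list items and drop exact (case-insensitive) duplicates."""
    questions = []
    seen = set()
    for line in raw_text.splitlines():
        # Accept "1. question" or "1) question"; skip any other line.
        match = re.match(r"^\s*\d+[.)]\s+(.*\S)\s*$", line)
        if not match:
            continue
        question = match.group(1)
        key = re.sub(r"\s+", " ", question).lower()
        if key not in seen:
            seen.add(key)
            questions.append(question)
    return questions


sample = """Here are the questions:
1. What are the common symptoms of pneumonia?
2. Which medications are prescribed for asthma?
3. what are the common symptoms of pneumonia?
"""
for cq in parse_cqs(sample):
    print(cq)
```

In the starter code, the same helper could be applied to `generated_cqs`; the resulting list is a convenient unit for reviewers to accept, edit, or reject one question at a time.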