Interpretable Stylistic Variation in Human and LLM Writing Across Genres, Models, and Decoding Strategies

Move beyond simple LLM text detection by analyzing stylistic variations influenced by genre, model, and decoding strategies. Gain interpretable insights to develop robust detection systems and fine-tune LLMs for specific stylistic outcomes.

intermediate · 2 hours · 5 steps

The play

Acknowledge LLM Content Proliferation
Understand the growing volume of LLM-generated text and its potential misuse (spam, phishing, academic fraud). Recognize that this content is becoming increasingly sophisticated.
Shift from Binary Detection
Stop relying solely on 'human or AI' binary classification. This approach is often brittle and easily circumvented. Focus on understanding *how* LLM text differs stylistically.
Analyze Stylistic Variation Factors
Investigate how different factors shape writing style. Specifically, consider the target genre (e.g., news, academic, creative), the specific model used (e.g., GPT-3, LLaMA), and the decoding strategy (e.g., greedy decoding, top-k sampling, temperature sampling).
Develop Interpretable Stylistic Metrics
Identify and measure specific linguistic features that differentiate human from LLM writing, and track how these features change with the model and decoding settings. Examples include lexical diversity, sentence complexity, coherence, and specific grammatical patterns. Because each feature has a direct linguistic reading, this moves beyond 'black box' detection.
Apply Insights for Robust Systems
Use your understanding of stylistic nuances to build more resilient LLM detection systems that are harder to bypass. Alternatively, apply this knowledge to fine-tune LLMs to generate text with desired stylistic properties, enhancing control and ethical deployment.
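As a rough illustration of the interpretable metrics described in the steps above, the sketch below computes three simple features from raw text: type-token ratio (a proxy for lexical diversity), mean sentence length (a crude proxy for sentence complexity), and function-word rate. The function names, the regex tokenizer, and the small FUNCTION_WORDS set are assumptions made for this example and are not taken from the source paper; a real study would likely use a proper tokenizer and a much richer feature set.

```python
import re

# Small illustrative set of English function words.
# This is an assumption for the example, not a standard list.
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "to", "in", "that", "it", "is"}


def tokenize(text):
    """Return lowercase word tokens using a deliberately simple regex."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naively split on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def style_features(text):
    """Compute a few interpretable style features for one text."""
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0:
        return {
            "type_token_ratio": 0.0,
            "mean_sentence_length": 0.0,
            "function_word_rate": 0.0,
        }
    sentences = split_sentences(text)
    return {
        # Distinct words divided by total words: higher means more varied vocabulary.
        "type_token_ratio": len(set(tokens)) / n,
        # Average number of word tokens per sentence.
        "mean_sentence_length": n / len(sentences),
        # Share of tokens that are common function words.
        "function_word_rate": sum(t in FUNCTION_WORDS for t in tokens) / n,
    }


if __name__ == "__main__":
    sample = "The cat sat. The cat ran away!"
    for name, value in style_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Running the same function over human-written and generated samples from the same genre gives feature values that can be compared side by side. Note that type-token ratio is sensitive to text length, so compare texts of similar length.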

Starter code

from transformers import pipeline, set_seed

generator = pipeline('text-generation', model='gpt2')
set_seed(42)  # make the sampled outputs reproducible across runs
prompt = "The quick brown fox"

# Generate text with different decoding strategies to observe stylistic variations.
# temperature only affects the output when sampling is enabled (do_sample=True).
print("\n--- Low Temperature (more focused) ---")
print(generator(prompt, max_new_tokens=30, num_return_sequences=1, do_sample=True, temperature=0.7)[0]['generated_text'])

print("\n--- High Temperature (more creative/random) ---")
print(generator(prompt, max_new_tokens=30, num_return_sequences=1, do_sample=True, temperature=1.2)[0]['generated_text'])

# Greedy decoding is selected with do_sample=False; temperature=0.0 is not a valid sampling temperature.
print("\n--- Greedy Decoding (most probable token) ---")
print(generator(prompt, max_new_tokens=30, num_return_sequences=1, do_sample=False)[0]['generated_text'])
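To make the effect of the decoding settings more concrete, here is a minimal, library-free sketch of what temperature scaling, greedy selection, and top-k sampling do to a next-token distribution. The toy logits and all function names are hypothetical and chosen only for illustration; this is not how transformers implements decoding internally, just the underlying arithmetic.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; use greedy_pick instead")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Greedy decoding: always take the index with the highest score."""
    return max(range(len(logits)), key=lambda i: logits[i])


def top_k_sample(logits, k, temperature, rng):
    """Sample only among the k highest-scoring indices."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax_with_temperature([logits[i] for i in top], temperature)
    return rng.choices(top, weights=probs, k=1)[0]


def entropy(probs):
    """Shannon entropy in nats; higher means a flatter, less predictable choice."""
    return -sum(p * math.log(p) for p in probs if p > 0)


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for four tokens
    for t in (0.7, 1.2):
        probs = softmax_with_temperature(toy_logits, t)
        print(f"temperature={t}: entropy={entropy(probs):.3f}")
    print("greedy index:", greedy_pick(toy_logits))
    rng = random.Random(0)
    print("top-2 sample:", top_k_sample(toy_logits, k=2, temperature=1.0, rng=rng))
```

The entropy printed for the higher temperature is larger, which is one way to quantify why high-temperature output tends to read as more varied, while greedy decoding always makes the same choice for the same scores.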

Source

Paper: arxiv.org