Parallax: Why AI Agents That Think Must Never Act

Implement the 'Parallax' principle for AI agents: design them to reason and propose actions, but never to execute them autonomously. A human stays in control of every real-world action, which reduces the risk of unintended consequences in enterprise AI deployments.

intermediate · 30 min · 5 steps

The play

Define Agent Scope: 'Think, Don't Act'
Establish a fundamental design constraint for your AI agent: it will process information, analyze situations, and propose solutions or actions, but it will never have direct execution privileges for real-world operations.
Architect for Proposal Generation
Structure your agent's output to be a clear, structured proposal for an action. This proposal should include the action details, rationale, expected outcome, and any potential risks, rather than directly invoking an API or system command.
Implement Human-in-the-Loop Approval
Develop a dedicated interface or workflow where all agent-generated proposals are presented to a human operator for explicit review and approval. This mechanism is mandatory before any action can proceed to execution.
Design Clear Communication for Proposals
Ensure agent proposals are easy to understand for a human. Use plain language, clear formatting, and provide all necessary context for a human to make an informed decision quickly. Avoid jargon or ambiguous phrasing.
Integrate Audit Trails and Fail-Safes
Log every agent proposal, every human decision (approve or reject), and the eventual outcome of each approved action. Implement fail-safe protocols: a proposal that receives no human approval within a defined timeframe expires without being executed, and an approved action that fails is halted or rolled back automatically.
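As a rough illustration of the proposal structure and plain-language presentation described in the steps above, the sketch below models a proposal as a small record and renders it for a reviewer. The `Proposal` class, its field names, and `render_for_reviewer` are hypothetical names chosen for this example; they are not taken from the source paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Proposal:
    """Hypothetical proposal record; the field names are illustrative only."""
    action_type: str
    target: str
    rationale: str
    expected_outcome: str
    risks: List[str] = field(default_factory=list)


def render_for_reviewer(p: Proposal) -> str:
    """Format a proposal as short, labeled plain-text lines for a human reviewer."""
    risks = "; ".join(p.risks) if p.risks else "none identified"
    return "\n".join([
        f"Proposed action: {p.action_type} on {p.target}",
        f"Why: {p.rationale}",
        f"Expected outcome: {p.expected_outcome}",
        f"Risks: {risks}",
    ])


example = Proposal(
    action_type="shutdown_service",
    target="payment_gateway",
    rationale="Unusual traffic spike and potential breach attempt.",
    expected_outcome="Temporary service outage (5-10 min).",
    risks=["Payments fail during the outage"],
)
print(render_for_reviewer(example))
```

The record carries data only and has no method that performs the action, so the agent's output cannot execute anything by itself.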

Starter code

```python
class AI_Agent:
    def __init__(self, name):
        self.name = name

    def analyze_and_propose(self, context):
        # Simulate the agent's reasoning and proposal generation.
        # Check the flag's value, not just the key's presence, so a False alert proposes nothing.
        if context.get("critical_system_alert"):
            proposed_action = {
                "type": "shutdown_service",
                "service_id": "payment_gateway",
                "reason": "Detected unusual traffic spike and potential breach attempt.",
                "estimated_impact": "Temporary service outage (5-10 min)."
            }
            print(f"[{self.name}] Proposing action: {proposed_action['type']} for {proposed_action['service_id']}")
            return proposed_action
        else:
            return {"type": "no_action_needed", "reason": "All systems nominal."}

def human_approval_workflow(proposal):
    if proposal["type"] == "no_action_needed":
        print("No action needed. Exiting workflow.")
        # Nothing was approved, so return False to keep callers from executing anything.
        return False

    print("\n--- HUMAN APPROVAL REQUIRED ---")
    print(f"Agent proposes: {proposal['type']} on {proposal.get('service_id', 'N/A')}")
    print(f"Reason: {proposal['reason']}")
    print(f"Impact: {proposal['estimated_impact']}")

    decision = input("Approve action? (yes/no): ").strip().lower()
    if decision == 'yes':
        print("Action approved by human. Proceeding to execution...")
        return True
    else:
        print("Action rejected by human. Aborting.")
        return False

def execute_action(action):
    if action["type"] == "shutdown_service":
        print(f"Executing: Shutting down service {action['service_id']}...")
        # In a real system, this would call an API or script
        print(f"Service {action['service_id']} shut down successfully.")
    else:
        print(f"Unknown action type: {action['type']}")

# --- Usage Example ---
agent = AI_Agent("SecurityCopilot")

# Scenario 1: Agent proposes a critical action
context_alert = {"critical_system_alert": True, "traffic": "high"}
proposal_critical = agent.analyze_and_propose(context_alert)

if human_approval_workflow(proposal_critical):
    execute_action(proposal_critical)
else:
    print("Critical action was not executed due to human rejection.")

print("\n---")

# Scenario 2: Agent finds no action needed
context_normal = {"critical_system_alert": False, "traffic": "normal"}
proposal_normal = agent.analyze_and_propose(context_normal)
human_approval_workflow(proposal_normal)
```
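The starter code does not cover the audit trail or the fail-safes from the last step. The following is a minimal sketch of one way to add them, under stated assumptions: `AuditLog`, `gated_execute`, and the `get_decision`, `execute`, and `rollback` callables are hypothetical names, and `get_decision` is assumed to block for at most `timeout_s` seconds and return `"approve"`, `"reject"`, or `None` if no human responded in time.

```python
import time
from typing import Callable, Optional


class AuditLog:
    """In-memory, append-only log; a real system would likely use durable storage."""

    def __init__(self):
        self.entries = []

    def record(self, event: str, **details) -> None:
        self.entries.append({"ts": time.time(), "event": event, **details})


def gated_execute(
    proposal: dict,
    get_decision: Callable[[dict, float], Optional[str]],
    execute: Callable[[dict], None],
    log: AuditLog,
    timeout_s: float = 300.0,
    rollback: Optional[Callable[[dict], None]] = None,
) -> str:
    """Run `execute` only after explicit approval; log every step along the way."""
    log.record("proposed", proposal=proposal)

    # Assumption: get_decision returns None when the approval window expires.
    decision = get_decision(proposal, timeout_s)
    if decision is None:
        log.record("timed_out", proposal=proposal)
        return "timed_out"  # Fail safe: no approval means no execution.
    if decision != "approve":
        log.record("rejected", proposal=proposal)
        return "rejected"

    log.record("approved", proposal=proposal)
    try:
        execute(proposal)
    except Exception as exc:
        log.record("failed", proposal=proposal, error=str(exc))
        if rollback is not None:
            rollback(proposal)  # Best-effort revert of a failed action.
            log.record("rolled_back", proposal=proposal)
        return "failed"

    log.record("executed", proposal=proposal)
    return "executed"
```

Anything other than an explicit `"approve"` is treated as a refusal, so a missing, late, or malformed decision never leads to execution.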

Source

Paper: arxiv.org