Paper · arxiv.org
ai-agents, automation, security, infrastructure, deployment
Parallax: Why AI Agents That Think Must Never Act
Implement the 'Parallax' principle for AI agents: design them to reason and propose actions, but never execute autonomously. This ensures critical human oversight, mitigating risks and unintended consequences in enterprise AI deployments.
intermediate · 30 min · 5 steps
The play
- Define Agent Scope: 'Think, Don't Act'. Establish a fundamental design constraint for your AI agent: it processes information, analyzes situations, and proposes solutions or actions, but it never has direct execution privileges for real-world operations.
- Architect for Proposal Generation. Structure your agent's output as a clear, structured proposal for an action. The proposal should include the action details, rationale, expected outcome, and any potential risks, rather than directly invoking an API or system command.
- Implement Human-in-the-Loop Approval. Develop a dedicated interface or workflow where every agent-generated proposal is presented to a human operator for explicit review and approval. This step is mandatory before any action can proceed to execution.
- Design Clear Communication for Proposals. Make agent proposals easy for a human to understand. Use plain language and clear formatting, and provide all the context a human needs to make an informed decision quickly. Avoid jargon and ambiguous phrasing.
- Integrate Audit Trails and Fail-Safes. Log every agent proposal, every human decision (approve/reject), and the eventual outcome of each approved action. Implement fail-safe protocols that automatically halt or revert operations if human approval is not received within a defined timeframe or if an approved action fails.
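One way to picture the structured proposal described in the steps above is as a plain data record that carries no executable behavior. The sketch below is illustrative only: the `Proposal` class, its field names, and the `render_for_review` helper are hypothetical names chosen for this example, not something defined by the paper or by any library.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class Proposal:
    """Hypothetical proposal record: plain data only, nothing callable."""
    action: str                   # what the agent suggests, e.g. "shutdown_service"
    target: str                   # what it applies to, e.g. "payment_gateway"
    rationale: str                # why the agent recommends it
    expected_outcome: str         # what should happen if a human approves
    risks: Tuple[str, ...] = ()   # known downsides the reviewer should weigh


def render_for_review(p: Proposal) -> str:
    """Format a proposal in plain language for a human reviewer."""
    risks = "; ".join(p.risks) if p.risks else "none identified"
    return "\n".join([
        f"Proposed action: {p.action} on {p.target}",
        f"Why: {p.rationale}",
        f"Expected outcome: {p.expected_outcome}",
        f"Risks: {risks}",
    ])


example = Proposal(
    action="shutdown_service",
    target="payment_gateway",
    rationale="Unusual traffic spike and potential breach attempt.",
    expected_outcome="Temporary service outage (5-10 min).",
    risks=("Payments unavailable during the outage",),
)

print(render_for_review(example))
# The same record serializes cleanly, so it can be queued for review or logged.
print(json.dumps(asdict(example), indent=2))
```

Making the record frozen and free of methods that touch external systems keeps the agent's output on the "think" side of the boundary: something else, gated by a human, has to turn it into an action.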
Starter code
```python
class AI_Agent:
    """Agent that reasons about a context and returns a proposal; it never executes."""

    def __init__(self, name):
        self.name = name

    def analyze_and_propose(self, context):
        # Simulate the agent's reasoning and proposal generation.
        # Check the flag's value, not just the key's presence, so that
        # {"critical_system_alert": False} is not treated as an alert.
        if context.get("critical_system_alert"):
            proposed_action = {
                "type": "shutdown_service",
                "service_id": "payment_gateway",
                "reason": "Detected unusual traffic spike and potential breach attempt.",
                "estimated_impact": "Temporary service outage (5-10 min).",
            }
            print(f"[{self.name}] Proposing action: {proposed_action['type']} for {proposed_action['service_id']}")
            return proposed_action
        return {"type": "no_action_needed", "reason": "All systems nominal."}


def human_approval_workflow(proposal):
    """Show the proposal to a human and return True only on explicit approval."""
    if proposal["type"] == "no_action_needed":
        print("No action needed. Exiting workflow.")
        return False  # nothing to execute, so nothing is approved

    print("\n--- HUMAN APPROVAL REQUIRED ---")
    print(f"Agent proposes: {proposal['type']} on {proposal.get('service_id', 'N/A')}")
    print(f"Reason: {proposal['reason']}")
    print(f"Impact: {proposal['estimated_impact']}")
    decision = input("Approve action? (yes/no): ").strip().lower()
    if decision == "yes":
        print("Action approved by human. Proceeding to execution...")
        return True
    print("Action rejected by human. Aborting.")
    return False


def execute_action(action):
    """Executor kept separate from the agent; only called after approval."""
    if action["type"] == "shutdown_service":
        print(f"Executing: Shutting down service {action['service_id']}...")
        # In a real system, this would call an API or script
        print(f"Service {action['service_id']} shut down successfully.")
    else:
        print(f"Unknown action type: {action['type']}")


# --- Usage Example ---
agent = AI_Agent("SecurityCopilot")

# Scenario 1: Agent proposes a critical action
context_alert = {"critical_system_alert": True, "traffic": "high"}
proposal_critical = agent.analyze_and_propose(context_alert)
if human_approval_workflow(proposal_critical):
    execute_action(proposal_critical)
else:
    print("Critical action was not executed due to human rejection.")

print("\n---")

# Scenario 2: Agent finds no action needed
context_normal = {"critical_system_alert": False, "traffic": "normal"}
proposal_normal = agent.analyze_and_propose(context_normal)
human_approval_workflow(proposal_normal)
```
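The starter code above does not cover the audit-trail and fail-safe step. A minimal sketch of that step follows, under stated assumptions: reviewer decisions arrive on a `queue.Queue` (a stand-in for a real review interface), the audit log is an in-memory list rather than durable storage, and the names `audit`, `await_decision`, and `run_with_oversight` are hypothetical. It fails closed, meaning that silence from the reviewer is treated as a rejection, and it halts and logs on executor failure but does not implement a revert.

```python
import queue
import time

AUDIT_LOG = []  # in-memory stand-in; a real system would use append-only storage


def audit(event, **details):
    """Append one timestamped record to the audit trail."""
    record = {"ts": time.time(), "event": event, **details}
    AUDIT_LOG.append(record)
    return record


def await_decision(decisions, timeout_s):
    """Wait for a reviewer's answer; return "timeout" if none arrives in time."""
    try:
        return decisions.get(timeout=timeout_s)
    except queue.Empty:
        return "timeout"


def run_with_oversight(proposal, decisions, executor, timeout_s=1.0):
    """Log the proposal, wait for a human decision, and execute only on approval."""
    audit("proposed", proposal=proposal)
    decision = await_decision(decisions, timeout_s)
    audit("decision", decision=decision)
    if decision != "approve":
        # Rejection and timeout are handled the same way: nothing runs.
        return "not_executed"
    try:
        executor(proposal)
    except Exception as exc:
        # Halt and record the failure; a revert step would be added here.
        audit("execution_failed", error=str(exc))
        return "failed"
    audit("executed")
    return "executed"


# Usage: one approved proposal, then one that times out with no reviewer answer.
proposal = {"type": "shutdown_service", "service_id": "payment_gateway"}

approved = queue.Queue()
approved.put("approve")
print(run_with_oversight(proposal, approved, lambda p: print("executing", p["type"])))

unanswered = queue.Queue()
print(run_with_oversight(proposal, unanswered, lambda p: print("executing", p["type"]), timeout_s=0.1))

print([record["event"] for record in AUDIT_LOG])
```

Passing the executor in as an argument keeps execution privileges outside the agent: the agent only ever produces the `proposal` dictionary, and the function that can act is supplied by the surrounding system.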