Microsoft AutoGen Tried to Buy 500 MacBooks. I Built the Layer That Stopped It.
By The Civic Stack
A developer built a safety layer called OntoGuard that prevented Microsoft AutoGen from making an unauthorized purchase of 500 MacBooks. The incident demonstrates the critical need for guardrails in AI agent systems to prevent unintended autonomous actions. OntoGuard acts as an intermediary validation layer that checks agent decisions against business rules and ontologies before execution, ensuring AI agents operate within defined boundaries.
Key Points
- AI agents can autonomously execute actions without proper safeguards, leading to unintended consequences like unauthorized bulk purchases
- OntoGuard is a validation layer that intercepts agent actions before execution to verify compliance with business rules
- Implement ontology-based constraints to define what agents are allowed to do within specific domains and contexts
- Use semantic validation to ensure agent decisions align with organizational policies and financial limits
- Create approval workflows for high-risk actions (purchases, deletions, system changes) even in autonomous systems
- Monitor and log all agent decisions for audit trails and debugging when unexpected behavior occurs
- Design agent systems with explicit boundaries rather than relying on implicit safety assumptions
- Separate agent reasoning from action execution to enable validation checkpoints in between
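The last two points boil down to one pattern: a validation checkpoint sits between the agent's reasoning loop and the tool layer, so no proposed action reaches execution unchecked. A minimal sketch of that checkpoint follows; the names (`validate`, `run_step`, `execute`) and the quantity threshold are illustrative, not AutoGen APIs:

```python
# Minimal "validate before execute" checkpoint between reasoning and tools.
# The threshold and function names are illustrative assumptions.

BULK_APPROVAL_QTY = 100  # assumed: quantities above this need a human

def validate(action):
    """Gate an agent-proposed action before it touches any tool."""
    if action["type"] == "purchase" and action["params"]["quantity"] > BULK_APPROVAL_QTY:
        return {"status": "pending_approval"}
    return {"status": "approved"}

def run_step(action, execute):
    """Only approved actions are handed to the execution layer."""
    verdict = validate(action)
    if verdict["status"] != "approved":
        return verdict  # the tool layer is never reached
    return execute(action)

# The infamous scenario: 500 MacBooks never reach checkout.
result = run_step(
    {"type": "purchase", "params": {"item": "MacBook", "quantity": 500}},
    execute=lambda a: {"status": "executed"},
)
```

The key design choice is that `validate` runs outside the agent's own reasoning, so a confused or manipulated model cannot skip it.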
Artifacts (2)
OntoGuard Safety Layer Architecture (YAML config)
# OntoGuard - AI Agent Safety Layer
# Validates agent actions against business rules and ontologies
validation_rules:
  purchase_limits:
    max_single_item: 10000          # cap on a single purchase total
    max_daily_total: 50000
    requires_approval_above: 5000
    bulk_purchase_threshold: 100    # quantities above this need approval
  restricted_actions:
    - system_deletion
    - credential_access
domain_constraints:
  ecommerce:
    allowed_actions: [search, add_to_cart, checkout]
    blocked_actions: [modify_inventory, change_pricing]
  finance:
    allowed_actions: [view_balance, transfer_within_limit]
    blocked_actions: [modify_ledger, delete_transactions]
ontology:
  entity_types:
    - purchase_order
    - user_profile
    - inventory_item
  relationships:
    "purchase_order.quantity > 50": requires_approval
    "purchase_order.total > 5000": requires_approval
    "user_profile.role == 'agent'": restricted_actions_apply
OntoGuard Validation Middleware (Python script)
# Imports added so the middleware runs as-is.
import uuid
from datetime import datetime


class OntoGuard:
    """Safety layer for AI agent action validation."""

    def __init__(self, rules_config, ontology):
        self.rules = rules_config
        self.ontology = ontology
        self.audit_log = []

    def validate_action(self, agent_action):
        """Intercept and validate agent actions before execution."""
        action_type = agent_action.get('type')
        action_params = agent_action.get('params', {})
        # Check against ontology constraints
        if not self._check_ontology_compliance(action_type, action_params):
            return self._block_action(agent_action, 'ontology_violation')
        # Check against business rules
        if not self._check_business_rules(action_type, action_params):
            return self._block_action(agent_action, 'rule_violation')
        # Check for high-risk actions requiring human approval
        if self._is_high_risk(action_type, action_params):
            return self._require_approval(agent_action)
        # Action is safe to execute
        return self._approve_action(agent_action)

    def _check_ontology_compliance(self, action_type, params):
        """Verify the action aligns with the domain ontology."""
        if action_type not in self.ontology.get('allowed_actions', []):
            return False
        # Check entity relationships
        for constraint in self.ontology.get('constraints', []):
            if not self._evaluate_constraint(constraint, params):
                return False
        return True

    def _evaluate_constraint(self, constraint, params):
        """Evaluate one ontology constraint (modelled here as a predicate over params)."""
        return constraint(params)

    def _check_business_rules(self, action_type, params):
        """Verify the action complies with business rules."""
        if action_type == 'purchase':
            limits = self.rules['purchase_limits']
            total = params.get('total', 0)
            if total > limits['max_single_item']:   # single-purchase cap
                return False
            if total > limits['max_daily_total']:   # per-action here; production
                return False                        # code would accumulate per day
        return True

    def _is_high_risk(self, action_type, params):
        """Identify actions requiring human approval."""
        if action_type == 'purchase':
            limits = self.rules['purchase_limits']
            if params.get('quantity', 0) > limits['bulk_purchase_threshold']:
                return True
            if params.get('total', 0) > limits['requires_approval_above']:
                return True
        return action_type in self.rules.get('restricted_actions', [])

    def _block_action(self, action, reason):
        """Block an unsafe action and log it."""
        self.audit_log.append({
            'action': action,
            'status': 'blocked',
            'reason': reason,
            'timestamp': datetime.now(),
        })
        return {'status': 'blocked', 'reason': reason}

    def _require_approval(self, action):
        """Flag an action for human approval."""
        action_id = str(uuid.uuid4())
        self.audit_log.append({
            'action': action,
            'action_id': action_id,
            'status': 'pending_approval',
            'timestamp': datetime.now(),
        })
        return {'status': 'pending_approval', 'action_id': action_id}

    def _approve_action(self, action):
        """Approve a safe action for execution."""
        self.audit_log.append({
            'action': action,
            'status': 'approved',
            'timestamp': datetime.now(),
        })
        return {'status': 'approved', 'execute': True}
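The middleware flags high-risk actions as `pending_approval` but leaves the resolution step open. A hedged sketch of that missing half, a pending queue that only a human decision can release, might look like this (the approve/reject flow and names are assumptions, not part of the original artifact):

```python
# Hypothetical approval-resolution flow for pending_approval actions.
# The pending-queue design is an assumption layered on the OntoGuard artifact.
import uuid
from datetime import datetime, timezone

pending = {}  # action_id -> action awaiting a human decision

def flag_for_approval(action):
    """Park a high-risk action until a human rules on it."""
    action_id = str(uuid.uuid4())
    pending[action_id] = {"action": action, "flagged_at": datetime.now(timezone.utc)}
    return action_id

def resolve(action_id, approved, execute):
    """Apply the human decision: execute on approval, drop on rejection."""
    entry = pending.pop(action_id)
    if approved:
        return execute(entry["action"])
    return {"status": "rejected"}

# A 500-unit purchase stays parked until someone says no.
action_id = flag_for_approval({"type": "purchase", "params": {"quantity": 500}})
result = resolve(action_id, approved=False, execute=lambda a: {"status": "executed"})
```

Returning an `action_id` from the validator (as `_require_approval` does) is what makes this asynchronous handoff possible: the agent can park the action and continue, while a human resolves it out of band.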