Stage 6: Autonomous Deployment Guide
Complete guide for deploying configurations to real network devices using Overgrowth's autonomous deployment engine.
Architecture
┌──────────────────────────────────────────────────┐
│             Deployment Orchestration             │
│  ┌────────────────────────────────────────────┐  │
│  │ 1. Config Generation (Jinja2 Templates)    │  │
│  │ 2. Pre-Deployment Validation               │  │
│  │ 3. Device Connection (Netmiko/NAPALM)      │  │
│  │ 4. Configuration Deployment                │  │
│  │ 5. Post-Deployment Verification            │  │
│  │ 6. Automatic Rollback (on failure)         │  │
│  └────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────┘
        │                 │                 │
        ▼                 ▼                 ▼
  ┌──────────┐      ┌──────────┐      ┌──────────┐
  │  Cisco   │      │  Arista  │      │ Juniper  │
  │ IOS/NXOS │      │   EOS    │      │  JunOS   │
  └──────────┘      └──────────┘      └──────────┘
Components
1. DeviceDriver (agent/device_driver.py)
Manages connections and config deployment to network devices.
Supported Platforms:
- Cisco IOS
- Cisco NXOS (Nexus)
- Cisco IOS-XE (ASR, ISR, etc.)
- Arista EOS
- Juniper JunOS
Features:
- Connection pooling and management
- Config backup before deployment
- Automatic rollback on failure
- Dry-run mode (validate without deploying)
- Mock mode (testing without devices)
2. ConfigTemplateEngine (agent/config_templates.py)
Generates device configs from Jinja2 templates.
Built-in Templates:
- cisco_ios_l2_switch - Layer 2 access switch
- cisco_ios_l3_router - Layer 3 router with OSPF/BGP
- arista_eos - Arista EOS switch/router
- juniper_junos - Juniper router/switch
Features:
- Variable substitution from NetworkModel
- Custom template support
- Template validation
- Vendor-specific config generation
3. DeploymentEngine (agent/deployment_engine.py)
Orchestrates the entire deployment workflow.
Workflow:
- Generate config from template
- Connect to device
- Run pre-deployment checks
- Backup current config
- Deploy new config
- Run post-deployment checks
- Rollback if checks fail
- Record deployment history
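The workflow above can be sketched in a few lines of plain Python. This is a minimal illustration, not the DeploymentEngine implementation: `FakeDevice`, `WorkflowResult`, and `run_workflow` are hypothetical names, the device is an in-memory stand-in, and checks are simple callables rather than the check strings used elsewhere in this guide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeDevice:
    """Hypothetical in-memory stand-in for a network device."""
    running_config: str = "hostname old-name"

    def get_config(self) -> str:
        return self.running_config

    def load_config(self, config: str) -> None:
        self.running_config = config


@dataclass
class WorkflowResult:
    status: str  # "success", "failed", or "rolled_back"
    steps: List[str] = field(default_factory=list)
    error: str = ""


def run_workflow(device, new_config, pre_checks, post_checks):
    """Run the steps in order; restore the backup if a post-check fails."""
    result = WorkflowResult(status="failed")

    # Pre-deployment checks: abort before touching the device.
    for check in pre_checks:
        if not check(device):
            result.error = "pre-check failed"
            return result
    result.steps.append("pre_checks")

    # Back up the current config so it can be restored later.
    backup = device.get_config()
    result.steps.append("backup")

    device.load_config(new_config)
    result.steps.append("deploy")

    # Post-deployment checks: any failure triggers a rollback.
    for check in post_checks:
        if not check(device):
            device.load_config(backup)
            result.steps.append("rollback")
            result.status = "rolled_back"
            result.error = "post-check failed"
            return result
    result.steps.append("post_checks")
    result.status = "success"
    return result


device = FakeDevice()
hostname_ok = lambda d: "core-sw-1" in d.get_config()
ok = run_workflow(device, "hostname core-sw-1", [lambda d: True], [hostname_ok])
bad = run_workflow(device, "hostname broken", [lambda d: True], [hostname_ok])
print(ok.status, bad.status, device.running_config)
```

The second run fails its post-check, so the device ends up with the config it had before that run started.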
Usage Examples
Example 1: Deploy Single Device
from agent.deployment_engine import DeploymentEngine, DeploymentTask, DeviceType
# Initialize engine
deployer = DeploymentEngine(use_napalm=True)
# Create deployment task
task = DeploymentTask(
device_id='core-sw-1',
device_type=DeviceType.CISCO_IOS,
hostname='192.168.1.10',
username='admin',
password='admin123',
config="""
hostname core-sw-1
!
vlan 10
name DATA
vlan 20
name VOICE
!
interface GigabitEthernet0/1
switchport mode access
switchport access vlan 10
no shutdown
!
""",
dry_run=False,
pre_checks=['command:show version'],
post_checks=['interface:GigabitEthernet0/1']
)
# Deploy
result = deployer.deploy_single_device(task)
print(f"Status: {result.status.value}")
print(f"Duration: {result.duration_seconds:.1f}s")
if result.status.value == 'success':
    print("✅ Deployment successful!")
else:
    print(f"❌ Deployment failed: {result.error}")
    if result.rolled_back:
        print("⚠ Configuration rolled back")
Example 2: Generate and Deploy from Template
from agent.deployment_engine import DeploymentEngine
from agent.pipeline_engine import NetworkModel, Device, NetworkIntent
# Create network model
model = NetworkModel(
name="campus-network",
version="1.0",
intent=NetworkIntent(
description="Campus network deployment",
business_requirements=["High availability", "VLAN segmentation"],
constraints=["Budget friendly"]
),
devices=[
Device(
name="access-sw-1",
role="access",
model="Catalyst 2960",
vendor="Cisco",
mgmt_ip="192.168.1.20",
location="Building A",
interfaces=[
{
"name": "GigabitEthernet0/1",
"description": "Uplink to core",
"mode": "trunk",
"enabled": True
},
{
"name": "GigabitEthernet0/2",
"description": "Workstation port",
"mode": "access",
"vlan": 10,
"enabled": True
}
]
)
],
vlans=[
{"id": 10, "name": "DATA"},
{"id": 20, "name": "VOICE"},
{"id": 99, "name": "MANAGEMENT"}
],
subnets=[
{"network": "10.0.10.0/24", "vlan": 10},
{"network": "10.0.20.0/24", "vlan": 20}
],
routing={},
services=["DHCP", "NTP"]
)
# Initialize deployer
deployer = DeploymentEngine(use_napalm=True)
# Network context for templates
network_context = {
'vlans': model.vlans,
'routing': model.routing,
'domain_name': 'campus.local',
'ntp_servers': ['0.pool.ntp.org'],
'dns_servers': ['8.8.8.8']
}
# Credentials
credentials = {
'username': 'admin',
'password': 'secure123'
}
# Deploy each device
for device in model.devices:
    result = deployer.generate_and_deploy(
        device=device,
        network_context=network_context,
        credentials=credentials,
        dry_run=False,
        pre_checks=['command:show version'],
        post_checks=['command:show running-config']
    )
    print(f"{device.name}: {result.status.value}")
Example 3: Dry-Run Mode (Test Without Deploying)
from agent.pipeline_engine import OvergrowthPipeline
pipeline = OvergrowthPipeline()
# Generate network model
intent = pipeline.stage1_consultation("Deploy 3-tier campus network")
model = pipeline.stage2_generate_sot(intent)
# Dry-run deployment (generates configs, validates, but doesn't deploy)
results = pipeline.stage6_autonomous_deploy(
model=model,
credentials={'username': 'admin', 'password': 'admin'},
dry_run=True, # No actual changes to devices
parallel=False
)
print(f"Dry-run complete: {results['successful']}/{results['total_devices']} would succeed")
for r in results['results']:
    print(f"  {r['device_id']}: {r['status']}")
    if r['status'] == 'failed':
        print(f"    Error: {r['error']}")
Example 4: Parallel Deployment with Ray
from agent.pipeline_engine import OvergrowthPipeline
pipeline = OvergrowthPipeline()
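# Generate network model (same calls as Example 3)
intent = pipeline.stage1_consultation("Deploy 3-tier campus network")
model = pipeline.stage2_generate_sot(intent)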
# Enable parallel mode
pipeline.enable_parallel_mode()
# Deploy to all devices in parallel
results = pipeline.stage6_autonomous_deploy(
model=model,
credentials={'username': 'admin', 'password': 'admin'},
dry_run=False,
parallel=True # Use Ray for concurrent deployment
)
print(f"Deployed to {results['successful']}/{results['total_devices']} devices")
print(f"Success rate: {results['success_rate']:.1f}%")
print(f"Rolled back: {results['rolled_back']}")
Example 5: Custom Pre/Post Validation Checks
from agent.deployment_engine import DeploymentEngine, DeploymentTask, DeviceType
deployer = DeploymentEngine()
task = DeploymentTask(
device_id='border-rtr-1',
device_type=DeviceType.CISCO_XE,
hostname='10.0.0.1',
username='admin',
password='admin',
config=router_config,  # config string prepared beforehand, e.g. rendered from a template
dry_run=False,
# Pre-deployment checks
pre_checks=[
'command:show version',
'command:show ip interface brief',
'ping:8.8.8.8', # Check internet connectivity
],
# Post-deployment checks
post_checks=[
'interface:GigabitEthernet0/0', # Verify interface up
'ping:10.0.1.1', # Verify internal connectivity
'command:show ip bgp summary', # Verify BGP
]
)
result = deployer.deploy_single_device(task)
# Check which validations passed/failed
print("Pre-checks:", result.pre_check_results)
print("Post-checks:", result.post_check_results)
Configuration Templates
Cisco IOS L2 Switch Template
Located in config_templates.py as CISCO_IOS_L2_SWITCH_TEMPLATE.
Variables:
- device.name - Hostname
- device.mgmt_ip - Management IP
- vlans - List of VLAN dicts (id, name)
- device.interfaces - List of interface dicts
- default_gateway - Default gateway IP
- ntp_servers - List of NTP server IPs
- dns_servers - List of DNS server IPs
Example:
from agent.config_templates import generate_cisco_ios_config
config = generate_cisco_ios_config(
device=my_device,
vlans=[
{"id": 10, "name": "DATA"},
{"id": 20, "name": "VOICE"}
],
ntp_servers=['0.pool.ntp.org'],
dns_servers=['8.8.8.8'],
default_gateway='192.168.1.1'
)
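Before rendering, it can help to confirm that the template context actually carries the variables listed above. The sketch below is an assumption-laden illustration, not part of config_templates.py: `REQUIRED_L2_VARIABLES`, `missing_variables`, and `invalid_vlans` are hypothetical names, and the only outside fact relied on is that 802.1Q VLAN IDs range from 1 to 4094.

```python
# Hypothetical list of top-level variables, taken from the L2 template's variable list.
REQUIRED_L2_VARIABLES = ("device", "vlans", "default_gateway", "ntp_servers", "dns_servers")


def missing_variables(context, required=REQUIRED_L2_VARIABLES):
    """Return the required top-level variable names absent from the context."""
    return [name for name in required if name not in context]


def invalid_vlans(vlans):
    """Return VLAN dicts that lack a name or have an ID outside 1-4094."""
    bad = []
    for vlan in vlans:
        vlan_id = vlan.get("id")
        id_ok = isinstance(vlan_id, int) and 1 <= vlan_id <= 4094
        if not id_ok or not vlan.get("name"):
            bad.append(vlan)
    return bad


context = {
    "device": {"name": "access-sw-1", "mgmt_ip": "192.168.1.20", "interfaces": []},
    "vlans": [{"id": 10, "name": "DATA"}, {"id": 5000, "name": "BAD"}],
    "ntp_servers": ["0.pool.ntp.org"],
    "dns_servers": ["8.8.8.8"],
}
print(missing_variables(context))
print(invalid_vlans(context["vlans"]))
```

Running a check like this ahead of a dry-run surfaces a missing default gateway or an out-of-range VLAN before any device is contacted.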
Cisco IOS L3 Router Template
Includes routing protocols (OSPF, BGP, static routes).
Variables:
- All L2 variables, plus:
- routing.protocol - 'ospf', 'bgp', or 'static'
- routing.process_id - OSPF process ID
- routing.networks - List of networks to advertise
- routing.asn - BGP AS number
- routing.neighbors - List of BGP neighbor dicts
Example:
config = generate_cisco_ios_config(
device=router_device,
vlans=[],
routing={
'protocol': 'ospf',
'process_id': 1,
'networks': ['10.0.0.0 0.0.255.255'],
'area': 0
}
)
Custom Templates
Create custom Jinja2 templates:
from agent.config_templates import ConfigTemplateEngine
engine = ConfigTemplateEngine()
# Add custom template
custom_template = """
hostname {{ device.name }}
!
{% for vlan in vlans %}
vlan {{ vlan.id }}
name {{ vlan.name }}
{% endfor %}
!
"""
engine.add_custom_template('my_custom_template', custom_template)
# Use it
config = engine.render_template('my_custom_template', {
'device': {'name': 'my-switch'},
'vlans': [{'id': 10, 'name': 'DATA'}]
})
Validation Checks
Check Types
Command checks:
'command:show version' # Run command, pass if no error
Ping checks:
'ping:8.8.8.8' # Ping target, pass if successful
Interface checks:
'interface:GigabitEthernet0/1' # Check interface status, pass if up
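Each check is a `kind:target` string. One way such strings might be parsed is shown below; this is a sketch, not the engine's own parser, and `Check`, `KNOWN_CHECK_TYPES`, and `parse_check` are hypothetical names. Splitting on the first colon only keeps targets that contain colons (an IPv6 address, for example) intact.

```python
from typing import NamedTuple

# The three check types described in this guide.
KNOWN_CHECK_TYPES = {"command", "ping", "interface"}


class Check(NamedTuple):
    kind: str
    target: str


def parse_check(spec):
    """Split 'kind:target' on the first colon and validate the kind."""
    kind, sep, target = spec.partition(":")
    kind, target = kind.strip(), target.strip()
    if not sep or not target:
        raise ValueError(f"check must look like 'kind:target', got {spec!r}")
    if kind not in KNOWN_CHECK_TYPES:
        raise ValueError(f"unknown check type {kind!r} in {spec!r}")
    return Check(kind, target)


checks = [
    parse_check(spec)
    for spec in ("command:show version", "ping:8.8.8.8", "interface:GigabitEthernet0/1")
]
for check in checks:
    print(check.kind, "->", check.target)
```

Rejecting unknown kinds at parse time means a typo such as `pign:8.8.8.8` fails before deployment rather than being silently skipped.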
Check Timing
Pre-checks: Run before config deployment
- Verify device accessible
- Check current state
- Validate prerequisites
Post-checks: Run after config deployment
- Verify config applied
- Test connectivity
- Validate services
Error Handling & Rollback
Automatic Rollback
If post-deployment checks fail, the engine automatically rolls back:
- Detect check failure
- Log error details
- Deploy previous config (from backup)
- Mark deployment as ROLLED_BACK
result = deployer.deploy_single_device(task)
if result.rolled_back:
    print("Deployment failed and was rolled back")
    print(f"Reason: {result.error}")
    print(f"Config restored to: {result.config_before[:100]}...")
Manual Rollback
from agent.device_driver import DeviceDriver, DeviceCredentials, DeviceType
driver = DeviceDriver()
# Connect
creds = DeviceCredentials(
hostname='192.168.1.10',
username='admin',
password='admin',
device_type=DeviceType.CISCO_IOS
)
conn = driver.connect(creds)
# Get current config
backup = driver.get_config('192.168.1.10', 'running')
# ... something goes wrong ...
# Rollback
driver.rollback_config('192.168.1.10', backup)
Multi-Vendor Support
Cisco IOS/IOS-XE
from agent.device_driver import DeviceType
# Cisco Catalyst, ISR, ASR
device_type = DeviceType.CISCO_IOS # or CISCO_XE
Supported Features:
- Config merge and replace
- Running/startup config backup
- Auto-save on Cisco IOS
Cisco NXOS (Nexus)
device_type = DeviceType.CISCO_NXOS
Features:
- Checkpoint/rollback support
- Config replace via NAPALM
Arista EOS
device_type = DeviceType.ARISTA_EOS
Features:
- Config sessions
- Atomic commits
- Fast boot times
Juniper JunOS
device_type = DeviceType.JUNIPER_JUNOS
Features:
- Candidate config
- Commit confirmed
- Rollback points
Testing Without Devices (Mock Mode)
All components support mock mode for development/testing:
from agent.device_driver import DeviceDriver
# Initialize in mock mode (auto-detected if Netmiko/NAPALM not installed)
driver = DeviceDriver()
print(f"Mock mode: {driver.mock_mode}") # True if no libraries
# Mock connections always succeed
conn = driver.connect(credentials)
print(f"Connected: {conn.status}") # CONNECTED
# Mock deployments simulate success
result = driver.deploy_config('device-1', config)
print(f"Deployed: {result.success}") # True
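The auto-detection mentioned above could work along these lines. This is a sketch under stated assumptions, not DeviceDriver's actual logic: `libraries_available` and `should_use_mock` are hypothetical helpers, and the rule "use mock mode only when none of the libraries is installed" is an assumption about how the fallback is decided.

```python
import importlib.util


def libraries_available(names=("netmiko", "napalm")):
    """Map each top-level package name to whether it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def should_use_mock(names=("netmiko", "napalm")):
    """Assumed rule: fall back to mock mode when none of the libraries is installed."""
    return not any(libraries_available(names).values())


print(libraries_available())
print("Mock mode:", should_use_mock())
```

`importlib.util.find_spec` looks a package up without importing it, so the probe has no side effects when the libraries are present.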
Deployment History & Metrics
from agent.deployment_engine import DeploymentEngine
deployer = DeploymentEngine()
# ... deploy devices ...
# Get summary
summary = deployer.get_deployment_summary()
print(f"Total deployments: {summary['total_deployments']}")
print(f"Success rate: {summary['success_rate']:.1f}%")
print(f"Average duration: {summary['avg_duration']:.1f}s")
# Recent deployments
for dep in summary['latest_deployments']:
    print(f"{dep['device_id']}: {dep['status']} ({dep['duration']:.1f}s)")
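For intuition, the summary fields used above can be derived from a plain list of deployment records. The function below is a hypothetical sketch rather than `get_deployment_summary` itself; the record layout and the choice of five "latest" entries are assumptions.

```python
def summarize(history, latest=5):
    """Compute summary metrics from a list of deployment record dicts."""
    total = len(history)
    if total == 0:
        # Avoid dividing by zero when nothing has been deployed yet.
        return {"total_deployments": 0, "success_rate": 0.0,
                "avg_duration": 0.0, "latest_deployments": []}
    successes = sum(1 for rec in history if rec["status"] == "success")
    return {
        "total_deployments": total,
        "success_rate": 100.0 * successes / total,
        "avg_duration": sum(rec["duration"] for rec in history) / total,
        "latest_deployments": history[-latest:],
    }


history = [
    {"device_id": "core-sw-1", "status": "success", "duration": 10.0},
    {"device_id": "access-sw-1", "status": "success", "duration": 20.0},
    {"device_id": "border-rtr-1", "status": "rolled_back", "duration": 30.0},
    {"device_id": "access-sw-2", "status": "success", "duration": 40.0},
]
summary = summarize(history)
print(f"Success rate: {summary['success_rate']:.1f}%")
print(f"Average duration: {summary['avg_duration']:.1f}s")
```

In this sketch a rolled-back deployment counts against the success rate.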
Troubleshooting
Connection Failures
Symptom: Failed to connect: timeout
Solutions:
from agent.device_driver import DeviceDriver, DeviceCredentials, DeviceType
# Increase timeout
credentials = DeviceCredentials(
    hostname='192.168.1.10',
    username='admin',
    password='admin',
    device_type=DeviceType.CISCO_IOS,
    timeout=60  # Increase from default 30s
)
# Check network connectivity
driver = DeviceDriver()
driver.verify_connectivity('device-id')
# Enable session logging for debugging
credentials.session_log = '/tmp/device-session.log'
Authentication Failures
Symptom: Failed to connect: authentication failed
Solutions:
# For devices requiring enable password
credentials = DeviceCredentials(
hostname='192.168.1.10',
username='admin',
password='admin',
secret='enable_password', # Enable secret
device_type=DeviceType.CISCO_IOS
)
Config Deployment Failures
Symptom: Deployment failed: command error
Solutions:
# Use dry-run to validate first
result = driver.deploy_config(
    device_id='device-1',
    config=config,
    dry_run=True  # Test without applying
)
print(f"Would work: {result.success}")
# Check diff before deploying
if result.output:
    print(f"Changes:\n{result.output}")
Post-Check Failures
Symptom: Post-deployment check failed: ping:8.8.8.8
Solutions:
# Add delay before post-checks
import time
time.sleep(5) # Wait for config to take effect
# Use more specific checks
post_checks=[
'command:show ip interface brief', # More specific than ping
'interface:GigabitEthernet0/1'
]
# Disable rollback for troubleshooting
# (manually verify and fix)
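A fixed sleep either waits too long or not long enough. An alternative is to poll the check until it passes or a deadline expires. The helper below is a generic sketch: `wait_for` is a hypothetical name, and the `check` callable stands in for whatever runs a post-check against the device.

```python
import time


def wait_for(check, timeout=30.0, interval=2.0, sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Simulated check that only passes on the third attempt,
# e.g. an interface that needs a moment to come up.
attempts = []


def flaky_check():
    attempts.append(1)
    return len(attempts) >= 3


print(wait_for(flaky_check, timeout=5, interval=0.01))
```

The check always runs at least once, even with a zero timeout, and `sleep` and `clock` are injectable so the helper can be tested without real delays.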
Production Best Practices
1. Always Use Dry-Run First
# Test deployment
dry_result = pipeline.stage6_autonomous_deploy(model, dry_run=True)
# Review results
if dry_result['success_rate'] == 100.0:
    # Now deploy for real
    real_result = pipeline.stage6_autonomous_deploy(model, dry_run=False)
2. Use Pre-Flight Validation
# Run Stage 0 validation before deployment
preflight = pipeline.stage0_preflight(model)
if not preflight['ready_to_deploy']:
    print("Pre-flight failed - aborting")
    print(f"Errors: {preflight['errors']}")
    raise SystemExit(1)
# Deploy only after validation passes
pipeline.stage6_autonomous_deploy(model)
3. Implement Change Windows
from datetime import datetime, time as dt_time
def in_change_window():
    """Return True if the current local time is inside the approved change window."""
    now = datetime.now()
    # Only deploy between 2 AM and 4 AM
    return dt_time(2, 0) <= now.time() <= dt_time(4, 0)
if not in_change_window():
    print("Outside change window - aborting")
    raise SystemExit(1)
# Deploy during approved window
pipeline.stage6_autonomous_deploy(model)
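The window check is easier to test, and to reuse for windows that cross midnight, if the current time is passed in as an argument. A small sketch follows; `in_window` is a hypothetical helper, not part of the pipeline.

```python
from datetime import datetime, time as dt_time


def in_window(now, start, end):
    """Return True if now's time of day lies within [start, end]."""
    current = now.time()
    if start <= end:
        return start <= current <= end
    # Window wraps past midnight, e.g. 23:00 to 01:00.
    return current >= start or current <= end


print(in_window(datetime(2024, 1, 1, 3, 0), dt_time(2, 0), dt_time(4, 0)))
print(in_window(datetime(2024, 1, 1, 0, 30), dt_time(23, 0), dt_time(1, 0)))
```

Note that `datetime.now()` returns naive local time, so the window is evaluated in the time zone of the machine running the deployment.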
4. Use Parallel Deployment Carefully
# Start with small batch
results = pipeline.parallel_deploy_fleet(
model=model,
staggered=True,
stages=[0.01, 0.05, 0.1, 1.0] # 1%, 5%, 10%, 100%
)
# Circuit breaker stops on high failure rate
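To make the staged idea concrete, here is a minimal sketch of how fractions might translate into batches and how a circuit breaker could halt the rollout. It is not `parallel_deploy_fleet`: `plan_stages` and `staged_deploy` are hypothetical, the fractions are assumed to be cumulative shares of the fleet, and the 20% failure threshold is an arbitrary illustrative value.

```python
import math


def plan_stages(devices, fractions=(0.01, 0.05, 0.1, 1.0)):
    """Split devices into batches; each fraction is the cumulative share reached."""
    batches, done = [], 0
    for fraction in fractions:
        # round() guards against float noise before taking the ceiling.
        target = math.ceil(round(len(devices) * fraction, 9))
        target = min(len(devices), max(done, target))
        if target > done:
            batches.append(devices[done:target])
            done = target
    return batches


def staged_deploy(devices, deploy, fractions=(0.01, 0.05, 0.1, 1.0), max_failure_rate=0.2):
    """Deploy batch by batch; halt once a batch's failure rate exceeds the threshold."""
    completed = []
    for batch in plan_stages(devices, fractions):
        results = [(name, deploy(name)) for name in batch]
        completed.extend(results)
        failures = sum(1 for _, ok in results if not ok)
        if failures / len(batch) > max_failure_rate:
            return {"halted": True, "completed": completed}
    return {"halted": False, "completed": completed}


devices = [f"sw-{i}" for i in range(100)]
print([len(batch) for batch in plan_stages(devices)])

# Simulated deploy function: only the first device succeeds.
outcome = staged_deploy(devices, deploy=lambda name: name == "sw-0")
print(outcome["halted"], len(outcome["completed"]))
```

With 100 devices the canary batch is a single switch, and the simulated failures in the second batch stop the rollout after five devices instead of a hundred.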
5. Maintain Deployment Audit Trail
deployer = DeploymentEngine()
# Deploy
result = deployer.deploy_single_device(task)
# Log to external system
import json
import os
with open(f'/var/log/deployments/{result.device_id}.json', 'w') as f:
    json.dump({
        'device_id': result.device_id,
        'status': result.status.value,
        'timestamp': result.timestamp.isoformat(),
        'config_before': result.config_before,
        'config_after': result.config_after,
        'deployed_by': os.environ.get('USER'),
        'duration': result.duration_seconds
    }, f, indent=2)
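Opening the file with mode 'w' replaces any earlier record for the same device. If the goal is a history rather than a snapshot, an append-only log with one JSON object per line is a common alternative. The sketch below uses hypothetical helper names and a temporary directory in place of /var/log/deployments.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def append_audit_record(path, record):
    """Append one JSON object per line so earlier records are never overwritten."""
    entry = dict(record, logged_at=datetime.now(timezone.utc).isoformat())
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, sort_keys=True) + "\n")


def read_audit_records(path):
    """Load every record back, oldest first."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Temporary location for the demo only.
log_path = os.path.join(tempfile.mkdtemp(), "deployments.jsonl")
append_audit_record(log_path, {"device_id": "core-sw-1", "status": "success"})
append_audit_record(log_path, {"device_id": "core-sw-1", "status": "rolled_back"})
records = read_audit_records(log_path)
print([rec["status"] for rec in records])
```

Both deployments of core-sw-1 remain in the log, each stamped with a UTC time.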
Next Steps
- ✅ Multi-vendor device support
- ✅ Config templating
- ✅ Pre/post validation
- ✅ Automatic rollback
- 🚧 Full Ray parallel deployment integration
- 🚧 Advanced validation (pyATS test cases)
- 🚧 Change request workflow
- 🚧 Approval gates for production
You're now ready to deploy to real network devices!