Risk Model (Draft)
1. Overview
Lightning mode translates a preset change request into a deterministic risk score (0–100) and a risk level. The model uses intent metadata only, with no MAESTRO telemetry in Phase 1, so that judges can see how the MCP server reasons about risk in milliseconds. Each preset captures pre-change health, magnitude, and post-change signals, and the scoring engine turns those inputs into the same risk JSON exposed by the FastAPI MCP endpoint.
2. Inputs
For every (change_type, preset_id) we define the following fields:
| Field | Description |
|---|---|
| `change_type` | `vlan`, `interface`, or `bgp_neighbor`. Determines base impact weight. |
| `preset_id` | Scenario identifier (e.g. `leaf_tor_vlan_stage`, `tor_uplink_shutdown`). |
| `pre_core_healthy` | True/False flag indicating control-plane health before the change. |
| `pre_interface_errors` | Whether interface errors already exist on affected devices. |
| `pre_existing_alarms` | Whether any alarms are active in the change scope. |
| `num_devices_touched` | How many devices the change modifies. Used for impact magnitude. |
| `post_lost_adjacencies` | Count of fabric adjacencies that disappear after the change. |
| `post_new_alarms` | Whether new alarms fire after the change. |
| `post_interface_errors` | Whether interface errors appear after the change. |
| `blast_radius_summary` | Human-readable description of the scope. |
| `context_note` | Short narrative used to build the explanation string. |
These values live in server/app/mcp.py inside the PRESETS mapping.
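As a rough illustration of how one preset could be represented, here is a minimal sketch. The `Preset` dataclass and the tuple-keyed layout are assumptions made for this example; the actual `PRESETS` mapping in `server/app/mcp.py` may be shaped differently. The boolean and numeric values mirror the `leaf_tor_vlan_stage` scenario, and the two string fields are placeholders, not the real text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    """Hypothetical container; field names follow the table above."""
    change_type: str            # "vlan", "interface", or "bgp_neighbor"
    preset_id: str
    pre_core_healthy: bool
    pre_interface_errors: bool
    pre_existing_alarms: bool
    num_devices_touched: int
    post_lost_adjacencies: int
    post_new_alarms: bool
    post_interface_errors: bool
    blast_radius_summary: str
    context_note: str


# Assumed layout: one entry per (change_type, preset_id) pair.
PRESETS = {
    ("vlan", "leaf_tor_vlan_stage"): Preset(
        change_type="vlan",
        preset_id="leaf_tor_vlan_stage",
        pre_core_healthy=True,
        pre_interface_errors=False,
        pre_existing_alarms=False,
        num_devices_touched=2,
        post_lost_adjacencies=0,
        post_new_alarms=False,
        post_interface_errors=False,
        blast_radius_summary="<placeholder: scope description>",
        context_note="<placeholder: narrative note>",
    ),
}
```

Keying on the pair keeps lookups to a single dictionary access and makes it obvious when a `preset_id` is requested under the wrong `change_type`.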
3. Scoring algorithm
Baseline pre-change (0–30)

    baseline = 0
    +15 if pre_core_healthy is False
    +10 if pre_interface_errors is True
    +10 if pre_existing_alarms is True
    clamp 0–30

Change impact (10–55)

    impact_type_base = 10 (VLAN) | 25 (interface) | 35 (BGP neighbor)
    impact_magnitude = min(20, 2 * num_devices_touched)
    change_impact = impact_type_base + impact_magnitude

Post-change penalties (0–40)

    post_penalty = 0
    +20 if post_lost_adjacencies > 0
    +10 if post_new_alarms is True
    +10 if post_interface_errors is True
    clamp 0–40

Final score + level

    risk_score_raw = baseline + change_impact + post_penalty
    risk_score = clamp(risk_score_raw, 0, 100)

Levels:
- 0–30 → `low`
- 31–70 → `medium`
- 71–100 → `high`
The FastAPI server uses the same logic in simulate_network_change.
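The rules in this section can be sketched as a single Python function. This is an illustration only, not the body of `simulate_network_change`: the names `score_change`, `IMPACT_TYPE_BASE`, and `clamp`, along with the keys of the returned dictionary, are assumptions made for this example.

```python
# Base impact per change type, taken from the "Change impact" rules.
IMPACT_TYPE_BASE = {"vlan": 10, "interface": 25, "bgp_neighbor": 35}


def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def score_change(
    change_type,
    num_devices_touched,
    *,
    pre_core_healthy=True,
    pre_interface_errors=False,
    pre_existing_alarms=False,
    post_lost_adjacencies=0,
    post_new_alarms=False,
    post_interface_errors=False,
):
    # Baseline pre-change (0-30): the raw sum can reach 35, hence the clamp.
    baseline = 0
    if not pre_core_healthy:
        baseline += 15
    if pre_interface_errors:
        baseline += 10
    if pre_existing_alarms:
        baseline += 10
    baseline = clamp(baseline, 0, 30)

    # Change impact: type base plus a magnitude term capped at 20.
    impact_magnitude = min(20, 2 * num_devices_touched)
    change_impact = IMPACT_TYPE_BASE[change_type] + impact_magnitude

    # Post-change penalties (0-40).
    post_penalty = 0
    if post_lost_adjacencies > 0:
        post_penalty += 20
    if post_new_alarms:
        post_penalty += 10
    if post_interface_errors:
        post_penalty += 10
    post_penalty = clamp(post_penalty, 0, 40)

    # Final score and level.
    risk_score = clamp(baseline + change_impact + post_penalty, 0, 100)
    if risk_score <= 30:
        risk_level = "low"
    elif risk_score <= 70:
        risk_level = "medium"
    else:
        risk_level = "high"

    # Hypothetical result shape; the real risk JSON may use other keys.
    return {
        "risk_score": risk_score,
        "risk_level": risk_level,
        "baseline": baseline,
        "change_impact": change_impact,
        "post_penalty": post_penalty,
    }
```

For instance, `score_change("interface", 1, post_lost_adjacencies=1, post_new_alarms=True)` yields a score of 57 at level `medium`. Because the function has no hidden state or randomness, the same inputs always give the same result.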
4. Worked examples
VLAN: leaf_tor_vlan_stage
- Inputs: healthy core, no alarms, 2 devices touched, no post-change penalties.
- Scores: baseline 0, impact 14, post 0 → risk 14 (`low`).
- Interpretation: localized change with clean pre/post checks → safe to stage.
Interface: tor_uplink_shutdown
- Inputs: healthy pre-state, 1 device, but 1 adjacency lost + new alarms after shutdown.
- Scores: baseline 0, impact 27, post 30 → risk 57 (`medium`).
- Interpretation: redundancy keeps risk from going `high`, but alarms + lost adjacency matter.
BGP: leaf_bgp_fabric_neighbor_add
- Inputs: healthy pre-state, 1 device, no penalties.
- Scores: baseline 0, impact 37, post 0 → risk 37 (`medium`).
- Interpretation: even clean BGP adds carry control-plane sensitivity, so Lightning keeps risk mid-range.
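Expanding each example with the formulas from section 3 shows where the component scores come from. Each line reads baseline + (type base + magnitude) + post-change penalty:

```latex
\begin{aligned}
\text{leaf\_tor\_vlan\_stage:} \quad & 0 + \bigl(10 + \min(20,\ 2 \cdot 2)\bigr) + 0 = 0 + 14 + 0 = 14 \\
\text{tor\_uplink\_shutdown:} \quad & 0 + \bigl(25 + \min(20,\ 2 \cdot 1)\bigr) + (20 + 10) = 0 + 27 + 30 = 57 \\
\text{leaf\_bgp\_fabric\_neighbor\_add:} \quad & 0 + \bigl(35 + \min(20,\ 2 \cdot 1)\bigr) + 0 = 0 + 37 + 0 = 37
\end{aligned}
```

None of the three totals reaches the 0–100 clamp, so each raw sum is also the final score.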
5. Limitations / future work
- Presets emulate checks; future phases will populate them from MAESTRO telemetry.
- Only three change types are modeled. WAN/core workflows will add more bases and penalties.
- Full mode is still a placeholder; Lightning simply annotates that `mode=full` is not implemented yet.
- No randomness; this phase is deterministic by design so MCP judges can validate outputs offline.