Ralph Red vs Blue v4 - Final Report

📊 Executive Summary

Total Rounds: 15
Blue Wins: 14
Red Wins: 0
Draws: 1
Network Nodes: 18
VRFs Tested: 3


Challenge Overview

This challenge pitted autonomous AI agents against each other in a 15-round network chaos engineering exercise. The Red Team executed CCIE+ expert-level attacks, including multi-layer issues, misdirection tactics, and silent policy failures. The Blue Team diagnosed and fixed 14 of the 15 attacks; the remaining round was scored a draw because the attack did not break connectivity.

Win Rate

Blue Team: 93.3% Win Rate

🎯 Round-by-Round Results

ROUND 1: Triple-Layer Attack (ML-E6)
Target: PE3 / GAMMA VRF
Difficulty: ULTIMATE
Result: BLUE WINS
Blue fixed all 3 components: ISIS interface, MPLS Node-SID, BGP extended community

ROUND 2: One-Way ACL (MIS-E4)
Target: PE2 / BETA VRF
Difficulty: EXPERT
Result: BLUE WINS
Asymmetric ACL blocking return path - Blue identified direction correctly

ROUND 3: SRGB Mismatch (SR-E2)
Target: P3 Core Router
Difficulty: EXPERT
Result: DRAW
Attack didn't break connectivity - suboptimal state only

ROUND 4: MTU Black Hole (INT-E4)
Target: P2 Core Link
Difficulty: MEDIUM
Result: BLUE WINS
1400 MTU causing large packet drops - Blue found and fixed

ROUND 5: Multi-VRF Cascade (XV-E6)
Target: ALL VRFs
Difficulty: ULTIMATE
Result: BLUE WINS
RT manipulation across 3 VRFs - Blue traced all mismatches

ROUND 6: Route-Map Sequence (POL-E2)
Target: PE1 / ALPHA VRF
Difficulty: EXPERT
Result: BLUE WINS
Inverted deny/permit sequence - Blue found policy logic error

ROUND 7: Community Strip (MIS-E5)
Target: PE5 / BETA VRF
Difficulty: EXPERT
Result: BLUE WINS
Silent community removal causing filtering - Blue identified

ROUND 8: Cross-VRF Route Leak (ML-E5)
Target: ALPHA → BETA
Difficulty: EXPERT
Result: BLUE WINS
RT export leak between VRFs - Blue found extra RT

ROUND 9: Next-Hop-Self on RR (TE-E5)
Target: RR1 + RR2
Difficulty: EXPERT
Result: BLUE WINS
VPN label mismatch from RR next-hop-self - Blue fixed

ROUND 10: BGP Policy Inversion (POL-E3)
Target: PE1 / ALPHA VRF
Difficulty: EXPERT
Result: BLUE WINS
Deny-only route-map on PE-CE - Blue identified and removed

ROUND 11: ISIS Metric Oscillation (INT-E6)
Target: P1 + P3
Difficulty: HARD
Result: BLUE WINS
Suboptimal routing from high metrics - Blue normalized

ROUND 12: Metric Maze (MIS-E3)
Target: P1 + P2 + P3
Difficulty: HARD
Result: BLUE WINS
Distinguished P3 max-metric from P1/P2 decoys

ROUND 13: Anycast SID Conflict (SR-E6)
Target: PE1 + PE2
Difficulty: EXPERT
Result: BLUE WINS
Duplicate anycast loopback with same SID - Blue removed

ROUND 14: VRF RT Import Removal (VRF-E4)
Target: PE5 / BETA VRF
Difficulty: EXPERT
Result: BLUE WINS
Missing RT import - Blue restored configuration

ROUND 15 - FINAL BOSS: RR Policy Corruption + Decoy (ML-E4)
Target: PE5 (real) + PE6 (decoy)
Difficulty: ULTIMATE
Result: BLUE WINS
Blue ignored PE6 decoy, found silent policy rejection on PE5 via PolicyReject status
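
The headline numbers in the Executive Summary can be recomputed from the per-round outcomes. The sketch below is illustrative only: the `RESULTS` mapping and the `summarize` helper are hypothetical names, and the outcomes are transcribed by hand from the rounds listed above. It assumes the win rate is computed over all 15 rounds, so the draw counts in the denominator.

```python
from collections import Counter

# Outcome of each round, transcribed from the round-by-round results.
RESULTS = {
    1: "blue", 2: "blue", 3: "draw", 4: "blue", 5: "blue",
    6: "blue", 7: "blue", 8: "blue", 9: "blue", 10: "blue",
    11: "blue", 12: "blue", 13: "blue", 14: "blue", 15: "blue",
}


def summarize(results):
    """Return (outcome counts, Blue win rate in percent, one decimal)."""
    counts = Counter(results.values())
    # Assumption: draws stay in the denominator (14 / 15, not 14 / 14).
    win_rate = round(100 * counts["blue"] / len(results), 1)
    return counts, win_rate


counts, win_rate = summarize(RESULTS)
print(counts["blue"], counts["red"], counts["draw"], win_rate)
```

Excluding the draw from the denominator would give a different figure, so the choice is worth stating wherever the 93.3% number is quoted.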

📁 Attack Categories Analysis

Categories Used

Category	Count
Multi-Layer	3
Misdirection	3
Policy/RCF	3
SR-MPLS	2
Intermittent	2
VRF/RT	2
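
The category counts can be cross-checked against the attack IDs from the round-by-round results. This is a rough sketch under stated assumptions: the report never defines the ID prefixes, so the `PREFIX_TO_CATEGORY` mapping is inferred, and in particular mapping `XV` to VRF/RT and `TE` to Policy/RCF is a guess chosen because it reproduces the totals above.

```python
from collections import Counter

# Attack IDs in round order, copied from the round-by-round results.
ATTACK_IDS = [
    "ML-E6", "MIS-E4", "SR-E2", "INT-E4", "XV-E6",
    "POL-E2", "MIS-E5", "ML-E5", "TE-E5", "POL-E3",
    "INT-E6", "MIS-E3", "SR-E6", "VRF-E4", "ML-E4",
]

# Inferred mapping (not stated in the report). XV and TE are assumptions.
PREFIX_TO_CATEGORY = {
    "ML": "Multi-Layer",
    "MIS": "Misdirection",
    "POL": "Policy/RCF",
    "TE": "Policy/RCF",   # assumption
    "SR": "SR-MPLS",
    "INT": "Intermittent",
    "VRF": "VRF/RT",
    "XV": "VRF/RT",       # assumption
}


def category_counts(attack_ids):
    """Count attacks per category using the ID prefix before the dash."""
    return Counter(PREFIX_TO_CATEGORY[a.split("-")[0]] for a in attack_ids)


print(dict(category_counts(ATTACK_IDS)))
```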

Difficulty Distribution

Difficulty	Count
ULTIMATE	3
EXPERT	9
HARD	2
MEDIUM	1

🛡️ Blue Team Performance Analysis

Diagnostic Skills Rating

Skill	Stars	Rating
Layer-by-layer diagnosis	★★★★★	Excellent
Silent failure detection	★★★★★	Excellent
Misdirection resistance	★★★★★	Excellent
Multi-component fixes	★★★★★	Excellent
Policy analysis	★★★★★	Excellent
SR-MPLS understanding	★★★★☆	Good

Key Diagnostic Commands Used
show bgp vpn-ipv4 detail          # Found PolicyReject status
show ip route vrf <name>          # Identified missing routes
show vrf <name>                   # Checked RT configuration
show isis interface brief         # Found metric issues
show bgp summary                  # Verified session state vs route acceptance
show bgp neighbors X received-routes  # Compared received vs accepted
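
The last two commands matter because a session can be up while inbound policy drops routes. A minimal sketch of that comparison follows; the function names are hypothetical, the prefixes are borrowed from the VRF mapping table for illustration, and the inputs are assumed to be prefix lists already parsed from the device output (no real CLI output is reproduced here).

```python
def rejected_prefixes(received, accepted):
    """Prefixes the neighbor advertised that did not survive inbound policy."""
    return sorted(set(received) - set(accepted))


def looks_silently_filtered(session_state, received, accepted):
    """True when the session is up yet some received routes were not accepted."""
    return session_state == "Established" and bool(
        rejected_prefixes(received, accepted)
    )


# Illustrative data only, not captured from the lab.
received = ["192.168.2.0/24", "192.168.5.0/24"]
accepted = ["192.168.2.0/24"]

print(rejected_prefixes(received, accepted))
print(looks_silently_filtered("Established", received, accepted))
```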
                    

⚔️ Red Team Analysis

Attack Effectiveness

Impact Level	Count	Rounds
Service Outage	10	1,2,4,5,6,7,9,10,14,15
Suboptimal Only	4	3,8,11,13
Partial	1	12

Misdirection Results

Round	Decoy	Blue Distracted?
12	P1/P2 metrics	No
15	PE6 ISIS	No

Most Effective Attack Patterns

Silent Policy Rejection - BGP sessions UP, routes rejected with no error
RT Manipulation - Breaks VPN routing without obvious symptoms
Multi-Layer Attacks - Require fixing multiple issues to restore service

🔬 Technical Insights

For Network Engineers

BGP session UP ≠ routes accepted - Always check received vs accepted routes
Visible alarms may be decoys - Don't assume obvious issues are root cause
RT mismatches are silent - Routes just don't appear, with no error messages
Check policy in both directions - Inbound filtering is easy to miss
ISIS metrics affect SR paths - High metrics change the entire label stack

For Chaos Engineering

Multi-layer attacks are most effective - Require deeper investigation to resolve
Decoys work better with real issues - Pure decoys are quickly dismissed
Silent failures are hardest - No error messages to find
Policy attacks are subtle - Sessions stay healthy while routes fail
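
Why policy attacks stay quiet can be seen in how a route-map is evaluated: entries are checked in sequence order, the first match decides, and a prefix that matches nothing falls to an implicit deny. The sketch below models only that behavior. The `evaluate` helper, the tuple layout, and the three example route-maps are assumptions for illustration, not the configurations used in Rounds 6 and 10.

```python
from ipaddress import ip_network


def evaluate(route_map, prefix):
    """Return 'permit' or 'deny' for prefix using first-match semantics.

    route_map is a list of (sequence, action, match_prefix) tuples, where
    match_prefix is a covering prefix or None for "match anything".
    """
    net = ip_network(prefix)
    for _seq, action, match in sorted(route_map, key=lambda entry: entry[0]):
        if match is None or net.subnet_of(ip_network(match)):
            return action
    return "deny"  # implicit deny when no entry matches


# Hypothetical policies, loosely modeled on the attack descriptions.
intended = [(10, "permit", "192.168.1.0/24"), (20, "deny", None)]
inverted = [(10, "deny", None), (20, "permit", "192.168.1.0/24")]
deny_only = [(10, "deny", "192.168.99.0/24")]

for name, rm in (("intended", intended), ("inverted", inverted), ("deny_only", deny_only)):
    print(name, evaluate(rm, "192.168.1.0/24"))
```

In both broken variants the BGP session itself is untouched, which is consistent with the report's observation that the session state alone says nothing about route acceptance.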

🌐 Network Topology Reference

                                           SITE 1                                      SITE 2
┌─────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                                                 │
│     ┌──────┐         ┌──────┐                           ┌──────┐         ┌──────┐              │
│     │  CE1 │         │  CE2 │                           │  CE4 │         │  CE5 │              │
│     │ALPHA │         │ BETA │                           │ALPHA │         │ BETA │              │
│     └──┬───┘         └──┬───┘                           └──┬───┘         └──┬───┘              │
│        │                │       ┌──────┐                   │                │                  │
│        │                │       │  CE3 │                   │                │     ┌──────┐     │
│        │                │       │GAMMA │                   │                │     │  CE6 │     │
│        │                │       └──┬───┘                   │                │     │GAMMA │     │
│     ┌──┴───┐         ┌──┴───┐     │                     ┌──┴───┐         ┌──┴───┐ └──┬───┘     │
│     │ PE1  │         │ PE2  │  ┌──┴───┐                 │ PE4  │         │ PE5  │    │         │
│     │      │         │      │  │ PE3  │                 │      │         │      │ ┌──┴───┐     │
│     └──┬───┘         └──┬───┘  └──┬───┘                 └──┬───┘         └──┬───┘ │ PE6  │     │
│        │                │         │                        │                │     └──┬───┘     │
│        └───────┬────────┴─────────┘                        └────────┬───────┴────────┘         │
│                │                                                    │                          │
│             ┌──┴───┐      ┌──────┐      ┌──────┐      ┌──────┐   ┌──┴───┐                      │
│             │  P1  │──────│  P2  │──────│  P3  │──────│  P4  │───│      │                      │
│             └──────┘      └──┬───┘      └──────┘      └──────┘   └──────┘                      │
│                             │                                                                  │
│                    ┌────────┴────────┐                                                         │
│                    │                 │                                                         │
│                 ┌──┴───┐         ┌───┴──┐                                                      │
│                 │ RR1  │         │ RR2  │     Route Reflectors                                 │
│                 └──────┘         └──────┘                                                      │
│                                                                                                │
└─────────────────────────────────────────────────────────────────────────────────────────────────┘

VRF Mapping:
┌─────────┬────────────┬────────────┬──────────────────┬──────────────────┐
│   VRF   │     RT     │   PE (S1)  │    Subnet (S1)   │    Subnet (S2)   │
├─────────┼────────────┼────────────┼──────────────────┼──────────────────┤
│  ALPHA  │ 65000:100  │  PE1 → PE4 │  192.168.1.0/24  │  192.168.4.0/24  │
│  BETA   │ 65000:200  │  PE2 → PE5 │  192.168.2.0/24  │  192.168.5.0/24  │
│  GAMMA  │ 65000:300  │  PE3 → PE6 │  192.168.3.0/24  │  192.168.6.0/24  │
└─────────┴────────────┴────────────┴──────────────────┴──────────────────┘
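
The RT column explains why RT attacks fail silently: a VPN route is installed in a VRF only when one of its route targets matches one of that VRF's import targets, and a non-matching route is simply not imported. A minimal sketch of that rule follows, assuming each VRF imports and exports the single RT shown in the table (the table does not distinguish import from export). The `importing_vrfs` helper is a hypothetical name.

```python
# Import RTs per VRF, taken from the VRF mapping table.
# Assumption: the same RT is used for both import and export.
VRF_IMPORT = {
    "ALPHA": {"65000:100"},
    "BETA": {"65000:200"},
    "GAMMA": {"65000:300"},
}


def importing_vrfs(route_rts, vrf_import=VRF_IMPORT):
    """Return the VRFs whose import RTs overlap the route's RTs."""
    return sorted(v for v, rts in vrf_import.items() if rts & set(route_rts))


print(importing_vrfs({"65000:200"}))               # normal BETA route
print(importing_vrfs({"65000:100", "65000:200"}))  # extra RT leaks ALPHA into BETA
print(importing_vrfs({"65000:999"}))               # mismatched RT: imported nowhere

# Removing an import RT has the same silent effect on the receiving side.
without_beta_import = {**VRF_IMPORT, "BETA": set()}
print(importing_vrfs({"65000:200"}, without_beta_import))
```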
                

⚙️ Environment & Execution Notes

Lab Infrastructure

Platform	ContainerLab 0.71.1
Image	Arista cEOS 4.35.1F
Nodes	18 (2 RR, 4 P, 6 PE, 6 CE)
Framework	RALPH + GAIT

Lab Recovery Issues

During execution, several baseline configuration bugs and one platform limitation surfaced and were addressed:

P2-P3 IP mismatch (fixed: P3 Ethernet2)
PE3-P2 IP mismatch (fixed: PE3 Ethernet1)
RR next-hop-self causing VPN label issues (removed from both RRs)
RCF not available in cEOS (adapted attacks to use route-maps)

🏆 Conclusion

Blue Team Dominant Victory

14 wins out of 15 rounds including the ULTIMATE-level Final Boss

The Blue Team demonstrated exceptional CCIE-level diagnostic skills throughout this challenge, successfully defending against expert-level attacks including multi-layer issues, silent policy failures, and sophisticated misdirection tactics.

Key accomplishments include:

Fixed all 3 components of the Triple-Layer Attack in Round 1
Traced RT mismatches across all VRFs in Round 5
Distinguished real issues from decoys in Rounds 12 and 15
Identified silent policy rejection using PolicyReject status in the Final Boss

The autonomous RALPH framework successfully executed both attack and defense operations with full GAIT compliance, demonstrating the viability of AI-driven network chaos engineering.

"Expert-level attacks require expert-level diagnosis, and this Blue Team proved capable of navigating complex multi-layer attacks, silent failures, and misdirection attempts."

🏗️ RALPH/GAIT Architecture

Framework Overview

RALPH (Rapid Autonomous Lab Protocol Handler) + GAIT (Git-Aware Iterative Tasking) enables fully autonomous Red and Blue teams competing in chaos engineering exercises with complete audit trails.

System Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                           REFEREE (Orchestrator)                             │
│                                                                              │
│  1. Restores baseline between rounds    4. Captures evidence at each phase  │
│  2. Updates PROMPT.md with specs        5. Scores results                   │
│  3. Launches RALPH autonomous loops     6. Documents outcomes               │
└─────────────────────────────────────────────────────────────────────────────┘
         │                                              │
         │ scripts/run-round.sh                         │
         ▼                                              ▼
┌────────────────────────────┐         ┌────────────────────────────────────┐
│       ralph-red/           │         │           ralph-blue/               │
│    (Autonomous Attacker)   │         │      (Autonomous Defender)          │
│                            │         │                                      │
│  ├── PROMPT.md (attack)    │         │  ├── PROMPT.md (NOC ticket)         │
│  ├── @AGENT.md (identity)  │         │  ├── @AGENT.md (identity)           │
│  ├── @fix_plan.md (tasks)  │ ──────► │  ├── @fix_plan.md (tasks)           │
│  ├── expert-catalog.json   │ creates │  ├── diagnostic-playbook.md         │
│  └── status.json           │ ticket  │  └── status.json                    │
└────────────────────────────┘         └────────────────────────────────────┘
         │                                              │
         │  SSH to ContainerLab                         │  SSH to ContainerLab
         ▼                                              ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                            ContainerLab Server ()                            │
│                                                                              │
│    ┌──────────────────────────────────────────────────────────────────┐     │
│    │                    18 Arista cEOS 4.35.1F Nodes                   │     │
│    │   RR: RR1, RR2  │  Core: P1-P4  │  PE: PE1-PE6  │  CE: CE1-CE6   │     │
│    │   VRFs: ALPHA (65000:100), BETA (65000:200), GAMMA (65000:300)    │     │
│    └──────────────────────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────────────────────┘
                

Project Structure

projects/ralph-red-vs-blue-v4/
├── ralph-red/                    # Red Team autonomous agent
│   ├── PROMPT.md                 # Per-round attack specification
│   ├── @AGENT.md                 # Agent identity & instructions
│   ├── @fix_plan.md              # GAIT task tracking
│   ├── expert-catalog.json       # All 15 expert attacks
│   └── status.json               # RALPH loop state
│
├── ralph-blue/                   # Blue Team autonomous agent
│   ├── PROMPT.md                 # Per-round NOC ticket
│   ├── @AGENT.md                 # Agent identity & instructions
│   ├── @fix_plan.md              # GAIT task tracking
│   └── diagnostic-playbook.md    # Troubleshooting procedures
│
├── rounds/                       # Per-round evidence
│   └── round-XX/
│       ├── noc-ticket.md         # NOC ticket for Blue
│       ├── red-attack.log        # RALPH-Red execution log
│       ├── blue-diagnosis.log    # RALPH-Blue execution log
│       ├── summary.md            # Round results
│       └── evidence/
│           ├── baseline-state/
│           ├── post-attack-state/
│           └── post-fix-state/
│
├── scripts/
│   ├── run-round.sh              # Full round orchestration
│   ├── restore-baseline.sh       # Reset lab to known-good
│   └── capture-evidence.sh       # Snapshot network state
│
└── topology/ → symlink           # Lab definition
                

GAIT Iron Laws

1. NO NETWORK CHANGES WITHOUT A GIT BRANCH
2. NO ACTIONS WITHOUT COMMITS
3. NO CHANGES WITHOUT VERIFICATION
4. NO COMPLETION WITHOUT SUMMARY

Agent Configuration Files

File	Purpose
`PROMPT.md`	Per-round task instructions (attack spec or NOC ticket)
`@AGENT.md`	Agent identity, role, and execution instructions
`@fix_plan.md`	GAIT task tracking with checkboxes
`status.json`	RALPH loop state (calls, status, exit reason)
`progress.json`	Iteration progress for resume capability

GAIT Commit Types

Prefix	Purpose
`gait:`	RALPH loop iteration marker
`baseline:`	State verification before changes
`attack:`	Red team action executed
`fix:`	Blue team correction applied
`complete:`	Round finished successfully
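
Because every GAIT commit subject starts with one of these prefixes, a commit log can be bucketed by type with a simple split on the first colon. The sketch below is one way to do that. The helper names are hypothetical, `plan` is included because it appears in the compliance list later in this report, and the sample subjects mix two subjects quoted in the Branch Information section with invented placeholders.

```python
from collections import Counter

GAIT_PREFIXES = ("gait", "baseline", "attack", "fix", "complete", "plan")


def commit_type(subject):
    """Return the GAIT prefix of a commit subject, or 'other' if none matches."""
    prefix, sep, _rest = subject.partition(":")
    return prefix if sep and prefix in GAIT_PREFIXES else "other"


def distribution(subjects):
    """Count commit subjects per GAIT type."""
    return Counter(commit_type(s) for s in subjects)


# Sample subjects; the attack and fix lines are placeholders.
subjects = [
    "plan: Initialize Ralph Red vs Blue v4 - CCIE+ Expert Challenge",
    "gait: Loop #1 - starting",
    "attack: <action taken>",
    "gait: Loop #1 - EXIT_SIGNAL",
    "fix: <diagnosis and fix>",
    "docs: Add extensive HTML report with full challenge results",
]

print(dict(distribution(subjects)))
```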

Per-Round GAIT Workflow

Round N Start
    │
    ├─► git checkout -b round-N-<attack-name>
    │
    ├─► [RALPH-Red starts]
    │   ├─► gait: Loop #1 - starting
    │   ├─► attack: <action taken>
    │   └─► gait: Loop #1 - EXIT_SIGNAL
    │
    ├─► capture-evidence.sh post-attack
    │   └─► baseline: post-attack state captured
    │
    ├─► [RALPH-Blue starts]
    │   ├─► gait: Loop #1 - starting
    │   ├─► fix: <diagnosis and fix>
    │   └─► gait: Loop #1 - EXIT_SIGNAL
    │
    ├─► capture-evidence.sh post-fix
    │   └─► complete: Round N - <WINNER>
    │
    └─► Baseline restored for next round
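
The ordering in this workflow can be checked mechanically from a round's commit subjects. The following sketch assumes the strict sequence shown above (attack, post-attack baseline, fix, complete) once the `gait:` loop markers are ignored. `follows_workflow` is a hypothetical helper, and a real checker would probably need to tolerate rounds such as a draw, where no fix is committed.

```python
EXPECTED_ORDER = ["attack", "baseline", "fix", "complete"]


def follows_workflow(subjects):
    """True if the non-gait commits appear exactly in the workflow order."""
    types = [s.split(":", 1)[0] for s in subjects if not s.startswith("gait:")]
    return types == EXPECTED_ORDER


# Subjects copied from the workflow diagram, placeholders included.
round_log = [
    "gait: Loop #1 - starting",
    "attack: <action taken>",
    "gait: Loop #1 - EXIT_SIGNAL",
    "baseline: post-attack state captured",
    "gait: Loop #1 - starting",
    "fix: <diagnosis and fix>",
    "gait: Loop #1 - EXIT_SIGNAL",
    "complete: Round N - <WINNER>",
]

print(follows_workflow(round_log))
```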
                

Evidence Capture System

Three Phases per Round

./scripts/capture-evidence.sh <round> baseline     # Before attack
./scripts/capture-evidence.sh <round> post-attack  # After Red
./scripts/capture-evidence.sh <round> post-fix     # After Blue
                        

Evidence Files Captured

`connectivity.txt`	VRF ping tests (CE→CE)
`isis-adjacencies.txt`	ISIS neighbor state
`bgp-summary.txt`	BGP session state
`mpls-lfib.txt`	MPLS label tables
`vrf-routes.txt`	VRF routing tables
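
Combining the three phases with the five files gives a predictable path for every piece of evidence. The sketch below builds those paths. It assumes the zero-padded `round-XX` naming and the `<phase>-state` directory names from the Project Structure section, and it assumes all five files are written for every phase, which the report does not state explicitly. `evidence_paths` is a hypothetical helper.

```python
from pathlib import PurePosixPath

PHASES = ("baseline", "post-attack", "post-fix")
FILES = (
    "connectivity.txt",
    "isis-adjacencies.txt",
    "bgp-summary.txt",
    "mpls-lfib.txt",
    "vrf-routes.txt",
)


def evidence_paths(round_number, phase):
    """List the expected evidence file paths for one round and phase."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    base = (
        PurePosixPath("rounds")
        / f"round-{round_number:02d}"
        / "evidence"
        / f"{phase}-state"
    )
    return [str(base / name) for name in FILES]


for path in evidence_paths(1, "post-fix"):
    print(path)
```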

RALPH Loop Execution

Each agent runs in autonomous loop mode: ralph --monitor --timeout 15

Loop Mechanics

1. Read PROMPT.md
2. Execute SSH commands
3. Log to @fix_plan.md
4. GAIT commit
5. Check EXIT_SIGNAL

Exit Conditions

• EXIT_SIGNAL: true
• 2+ completion indicators
• Timeout (10-15 min)
• Manual abort (Ctrl+C)

Status Tracking

• loop_count: iterations
• calls_made_this_hour
• last_action: graceful_exit
• status: completed
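
The loop mechanics and exit conditions above can be summarized in a few lines. This is a simplified model, not the RALPH implementation: `run_loop`, the shape of the `step` callback, and the `"timeout"` status value are assumptions, and manual abort is not modeled. The clock is injectable so the timeout path can be exercised without waiting.

```python
import time


def run_loop(step, timeout_s=15 * 60, clock=time.monotonic):
    """Call step(loop_count) until an exit condition is met.

    step returns a dict with optional keys 'exit_signal' (bool) and
    'completion_indicators' (int). Both keys are hypothetical.
    """
    start = clock()
    loop_count = 0
    while True:
        if clock() - start >= timeout_s:
            return {"loop_count": loop_count, "status": "timeout"}
        loop_count += 1
        result = step(loop_count)
        done = result.get("exit_signal") or result.get("completion_indicators", 0) >= 2
        if done:
            return {
                "loop_count": loop_count,
                "status": "completed",
                "last_action": "graceful_exit",
            }


def fake_step(n):
    """Stand-in for one agent iteration; signals exit on the third loop."""
    return {"exit_signal": n == 3, "completion_indicators": 0}


print(run_loop(fake_step))
```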

Baseline Restore Process

# Between each round, restore-baseline.sh:
DEVICES=("rr1" "rr2" "p1" "p2" "p3" "p4" "pe1" "pe2" "pe3" "pe4" "pe5" "pe6")

# 1. Copy golden baseline configs to remote server
# 2. Apply configs to all 12 cEOS devices via configure replace
# 3. Wait 15 seconds for protocol convergence
# 4. Verify all 3 VRFs pass connectivity tests
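
The four restore steps can be expressed as a small orchestration function. The sketch below only mirrors the order of operations. The transport is left to injected callables (`copy_config`, `apply_config`, `vrf_reachable`), all of which are hypothetical, and nothing here reflects how `restore-baseline.sh` actually talks to the devices.

```python
import time

DEVICES = ["rr1", "rr2", "p1", "p2", "p3", "p4",
           "pe1", "pe2", "pe3", "pe4", "pe5", "pe6"]
VRFS = ["ALPHA", "BETA", "GAMMA"]


def restore_baseline(copy_config, apply_config, vrf_reachable,
                     wait=time.sleep, settle_s=15):
    """Run the four restore steps; return the VRFs that still fail."""
    for device in DEVICES:      # 1. copy golden configs
        copy_config(device)
    for device in DEVICES:      # 2. apply them (configure replace in the lab)
        apply_config(device)
    wait(settle_s)              # 3. allow protocols to converge
    # 4. verify connectivity per VRF
    return [vrf for vrf in VRFS if not vrf_reachable(vrf)]


# Dry run with stand-in callables; no devices are contacted and no sleep occurs.
failed = restore_baseline(
    copy_config=lambda device: None,
    apply_config=lambda device: None,
    vrf_reachable=lambda vrf: vrf != "BETA",
    wait=lambda seconds: None,
)
print(failed)
```

An empty return value means all three VRFs passed, which is the condition for starting the next round.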
                

Expert Attack Catalog (expert-catalog.json)

Category	Count	Attack Types
Multi-Layer	3	Triple-Layer, Cross-VRF Route Leak, Policy+Decoy
Misdirection	3	One-Way ACL, Metric Maze, Community Strip
Policy	3	Route-Map Swap, RCF Inversion, BGP Policy Inversion
SR-MPLS	2	SRGB Mismatch, Anycast SID Conflict
Intermittent	2	MTU Black Hole, ISIS Metric Oscillation
VRF/RT	2	Multi-VRF RT Cascade, RT Import Removal

Forbidden Attacks (Too Easy)

interface shutdown on CE-facing interfaces
neighbor X.X.X.X shutdown (BGP session shutdown)
no redistribute connected
no isis enable CORE alone


📦 Git & GAIT Tracking

Files in Repo: 369
Evidence Files: 245
Lines of Code: 28K
GAIT Commits: ~80

Branch Information

* red-round-01-triple-layer  e23a59f  docs: Add extensive HTML report with full challenge results
  master                      c17f236  plan: Initialize Ralph Red vs Blue v4 - CCIE+ Expert Challenge
                

Red Team Actions

Round	Attack
15	ML-E4 RR Policy + Decoy (FINAL BOSS)
14	VRF-E4 RT Import Removal
13	SR-E6 Anycast SID Conflict
12	MIS-E3 Metric Maze
11	INT-E6 ISIS Metric Oscillation
10	POL-E3 BGP Policy Inversion
9	TE-E5 Next-Hop-Self on RR
8	ML-E5 Cross-VRF Route Leak
7	MIS-E5 Community Strip
6	POL-E2 Route-Map Sequence
5	XV-E6 Multi-VRF Cascade
4	INT-E4 MTU Black Hole
3	SR-E2 SRGB Mismatch
2	MIS-E4 One-Way ACL
1	ML-E6 Triple-Layer Attack

Blue Team Fixes

Round	Resolution
15	Removed RR-BLOCK-IN route-map
14	Restored RT import 65000:200
13	Removed duplicate SID from PE2
12	Fixed P3 max-metric (ignored decoys)
11	Normalized ISIS metrics on P1/P3
10	Removed inverted route-map
9	Removed next-hop-self from RRs
8	Removed leaked RT export
7	Restored community settings
6	Fixed route-map sequence
5	Restored all 3 VRF RTs
4	Restored MTU to 9214
3	DRAW - No connectivity break
2	Removed asymmetric ACL
1	Fixed ISIS + SID + BGP communities

Commit Types Distribution

Prefix	Count
gait:	~60
attack:	15
fix:	14
complete:	3
plan:	2
baseline:	1

GAIT (Git-Aware Iterative Tasking) Compliance

Every action by both Red and Blue teams was tracked with git commits following the GAIT protocol:

plan: - Initial planning and attack strategy documentation
baseline: - Pre-attack state verification
attack: - Red team attack execution
fix: - Blue team remediation actions
complete: - Round completion verification
gait: - Autonomous loop iteration tracking

Evidence Directory Structure

rounds/
├── round-01/
│   ├── evidence/
│   │   ├── baseline-state/
│   │   │   └── connectivity.txt
│   │   ├── post-attack-state/
│   │   │   └── connectivity.txt
│   │   └── post-fix-state/
│   │       └── connectivity.txt
│   ├── red-attack.log
│   ├── blue-diagnosis.log
│   └── summary.md
├── round-02/
│   └── ... (same structure)
...
└── round-15/
    └── ... (same structure)

ralph-red/
├── PROMPT.md          # Attack specifications
├── logs/              # Execution logs
│   └── claude_output_*.log
└── status.json        # RALPH state

ralph-blue/
├── PROMPT.md          # NOC tickets
├── logs/              # Diagnosis logs
│   └── claude_output_*.log
└── status.json        # RALPH state
                
