Key outcomes
- Implement agent routing and tool interfaces
- Instrument evaluation events and telemetry
- Package a reusable agent starter kit
Deliverables
- Agent starter repo
- Tool contract definitions
- Telemetry checklist
Prerequisites
- AE-101
- AE-102
Evaluation signals
- AGENT-ROUTER-001
- EVAL-HOOK-001
Persona fit
Assistant orchestration
Scout
Agent: Research horizons, regulation updates, and pattern watchlists.
- Refresh critical intel within 24 hours of change
- Maintain 95% citation accuracy
- Flag module freshness risks automatically
Coach
Agent: Pair with learners during modules, labs, and retrospectives.
- Median response under 2 seconds
- Satisfaction above 4.6/5
- Escalate risky experiments within 10 minutes
Critic
Agent: Guardrails, evaluations, and red-team simulations.
- Detect 98% of evaluation anomalies
- Zero unlogged high-severity incidents
- Attach control evidence to every flagged issue
Archivist
Agent: Evidence locker, credential manifests, and knowledge graph links.
- Tag 100% of deliverables with owners and signals
- Keep schema drift under 1%
- Generate credential payloads automatically
Companion
Agent: Health, pacing, and personalised nudges across squads.
- Deliver on-time nudges for 90% of milestones
- Keep burnout false positives below 5%
- Publish weekly sponsor-ready progress snapshots
Navigator
Agent: CTA instrumentation, sponsor digest composition, and mastery guardrails.
- Cover 95% of persona CTAs every sprint
- Generate sponsor digest drafts within 5 minutes of module completion
- Hold mastery drift within one tier
Micro lessons
Toolchain Blueprint
35 min
Objective: Design routing logic and tool adapters for the coding agent.
Activities
- List candidate tools
- Define interface contracts
- Sketch routing diagram
Knowledge checks
- Which tool handles evaluation resets?
- How are secrets managed?
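The interface-contract and routing activities above can be sketched in a few lines of Python. This is a minimal illustration only, not the module's reference design: `ToolContract`, `Router`, the keyword-matching strategy, and the tool names `eval_reset` and `code_search` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ToolContract:
    """Hypothetical interface contract for one tool adapter."""
    name: str
    description: str
    handler: Callable[[str], str]  # takes a task string, returns a result string


class Router:
    """Routes a task to a registered tool by simple keyword match (illustrative)."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolContract] = {}
        self._routes: Dict[str, str] = {}  # keyword -> tool name

    def register(self, tool: ToolContract, keywords: List[str]) -> None:
        self._tools[tool.name] = tool
        for keyword in keywords:
            self._routes[keyword.lower()] = tool.name

    def route(self, task: str) -> ToolContract:
        lowered = task.lower()
        # First registered keyword found in the task wins.
        for keyword, tool_name in self._routes.items():
            if keyword in lowered:
                return self._tools[tool_name]
        raise LookupError(f"no tool registered for task: {task!r}")


# Hypothetical tools, registered only to exercise the router.
router = Router()
router.register(
    ToolContract("eval_reset", "Resets evaluation state", lambda task: "eval state reset"),
    ["reset"],
)
router.register(
    ToolContract("code_search", "Searches the repository", lambda task: f"searched: {task}"),
    ["search", "find"],
)

chosen = router.route("Reset the evaluation run")
print(chosen.name, "->", chosen.handler("Reset the evaluation run"))
```

Keeping the contract as a small immutable record makes it easy to list candidate tools and review their descriptions before any routing logic is written. A production router would likely replace the keyword match with a classifier or model-driven selection.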
Telemetry Wiring
30 min
Objective: Emit Langfuse and OpenTelemetry traces for each agent step.
Activities
- Install tracing SDK
- Label spans with guardrail IDs
- Verify run export to Supabase stub
Knowledge checks
- Which span indicates a prompt rewrite?
- Where are traces stored?
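To make the idea of labelling each agent step with a guardrail ID concrete, here is a dependency-free stand-in written with the Python standard library. It is not the Langfuse or OpenTelemetry SDK: `agent_span`, `EXPORTED_SPANS`, the attribute key `guardrail.id`, and the ID `GR-EXAMPLE-1` are assumptions made for illustration. With the OpenTelemetry Python API, the rough equivalent would be `tracer.start_as_current_span(name)` followed by `span.set_attribute(key, value)`.

```python
import time
import uuid
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List

# In-memory sink standing in for a real trace exporter (assumption).
EXPORTED_SPANS: List[Dict[str, Any]] = []


@contextmanager
def agent_span(name: str, guardrail_id: str) -> Iterator[Dict[str, Any]]:
    """Record one agent step as a span labelled with a guardrail ID."""
    span: Dict[str, Any] = {
        "span_id": uuid.uuid4().hex,
        "name": name,
        "attributes": {"guardrail.id": guardrail_id},
    }
    start = time.perf_counter()
    try:
        yield span
    finally:
        # Always record duration and export, even if the step raised.
        span["duration_ms"] = (time.perf_counter() - start) * 1000.0
        EXPORTED_SPANS.append(span)


# Hypothetical agent step; the span name and guardrail ID are placeholders.
with agent_span("prompt_rewrite", guardrail_id="GR-EXAMPLE-1") as span:
    span["attributes"]["step.index"] = 1

print(EXPORTED_SPANS[0]["name"], EXPORTED_SPANS[0]["attributes"])
```

Exporting in the `finally` clause means a failed step still leaves a span behind, which is usually what you want when verifying that runs reach the storage backend.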
Evaluation Dry Run
40 min
Objective: Execute scripts/eval-harness locally and interpret initial metrics.
Activities
- Select eval config
- Run realtime and perception probes
- Record baseline metrics
Knowledge checks
- What is the realtime freshness score?
- Which preparedness gap was logged?
Knowledge points
Agent Skeleton Architecture
Define planner, executor, critic, and monitor components with clear data contracts.
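One way to picture the four components and their data contracts is the sketch below. It assumes a simple linear flow (plan, execute each step, review, log); the type names `Plan`, `StepResult`, and `Review`, the `run_agent` function, and the toy implementations are all hypothetical and exist only to show the shape of the contracts.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class Plan:
    steps: List[str]


@dataclass
class StepResult:
    step: str
    output: str
    ok: bool


@dataclass
class Review:
    approved: bool
    notes: List[str] = field(default_factory=list)


class Planner(Protocol):
    def plan(self, goal: str) -> Plan: ...


class Executor(Protocol):
    def execute(self, step: str) -> StepResult: ...


class Critic(Protocol):
    def review(self, results: List[StepResult]) -> Review: ...


class Monitor(Protocol):
    def record(self, event: str) -> None: ...


def run_agent(goal: str, planner: Planner, executor: Executor,
              critic: Critic, monitor: Monitor) -> Review:
    """Wire the four components together in a single linear pass."""
    plan = planner.plan(goal)
    monitor.record(f"planned {len(plan.steps)} steps")
    results = [executor.execute(step) for step in plan.steps]
    review = critic.review(results)
    monitor.record(f"approved={review.approved}")
    return review


# Toy implementations, used only to exercise the contracts.
class SplitPlanner:
    def plan(self, goal: str) -> Plan:
        return Plan(steps=[part.strip() for part in goal.split(";") if part.strip()])


class EchoExecutor:
    def execute(self, step: str) -> StepResult:
        return StepResult(step=step, output=f"done: {step}", ok=True)


class AllOkCritic:
    def review(self, results: List[StepResult]) -> Review:
        failed = [result.step for result in results if not result.ok]
        return Review(approved=not failed, notes=failed)


class ListMonitor:
    def __init__(self) -> None:
        self.events: List[str] = []

    def record(self, event: str) -> None:
        self.events.append(event)


monitor = ListMonitor()
review = run_agent("write test; open pull request",
                   SplitPlanner(), EchoExecutor(), AllOkCritic(), monitor)
print(review.approved, monitor.events)
```

Because the components are typed as protocols, each one can be swapped for a real implementation without changing `run_agent`, as long as the replacement honours the same data contracts.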
Evaluation Baseline Metrics
Capture pass rate, latency, and preparedness averages as the baseline before scaling runs.
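As a rough illustration of capturing that baseline, the snippet below aggregates per-run records into pass rate, mean latency, and mean preparedness. The `EvalRecord` shape, the 0 to 1 preparedness scale, and the sample values are assumptions; the actual output format of the eval harness is not specified here.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class EvalRecord:
    passed: bool
    latency_ms: float
    preparedness: float  # assumed to be a score between 0 and 1


def baseline(records: List[EvalRecord]) -> Dict[str, float]:
    """Summarise a batch of eval runs into baseline metrics."""
    if not records:
        raise ValueError("need at least one record to compute a baseline")
    return {
        "pass_rate": sum(record.passed for record in records) / len(records),
        "mean_latency_ms": mean(record.latency_ms for record in records),
        "mean_preparedness": mean(record.preparedness for record in records),
    }


# Made-up sample data for illustration only.
records = [
    EvalRecord(passed=True, latency_ms=100.0, preparedness=0.5),
    EvalRecord(passed=False, latency_ms=300.0, preparedness=1.0),
    EvalRecord(passed=True, latency_ms=200.0, preparedness=0.75),
    EvalRecord(passed=True, latency_ms=200.0, preparedness=0.75),
]
print(baseline(records))
# {'pass_rate': 0.75, 'mean_latency_ms': 200.0, 'mean_preparedness': 0.75}
```

Storing this summary alongside the eval config used to produce it gives later, larger runs a fixed point of comparison.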
Micro paths featuring this module
Stand up a dependable coding agent that ships quality pull requests.
Credential alignment
This module contributes evidence across multiple credentials. See the credential framework for details.