EvoClaw — Design Diagrams

1. Task lifecycle FSM

Every task must pass through REFLECTING. Direct OBSERVING to COMPLETED is forbidden. The terminal exit is always ARCHIVED.

stateDiagram-v2
    [*] --> RECEIVED
    RECEIVED --> PLANNING: parsed
    RECEIVED --> FAILED: parse failed or denied or budget out
    PLANNING --> TOOL_EXECUTING: plan ready
    PLANNING --> AWAITING_USER: missing param
    TOOL_EXECUTING --> OBSERVING: tool returned
    OBSERVING --> TOOL_EXECUTING: more steps
    OBSERVING --> AWAITING_USER: high risk or ambiguous
    OBSERVING --> REFLECTING: plan complete
    AWAITING_USER --> TOOL_EXECUTING: user approved
    AWAITING_USER --> FAILED: rejected or timeout 10min
    REFLECTING --> DISTILLING: success true
    REFLECTING --> FAILED: success false
    DISTILLING --> COMPLETED: write skill plus memory
    COMPLETED --> ARCHIVED: T plus 30 days
    FAILED --> ARCHIVED: T plus 30 days
    ARCHIVED --> [*]

Constraints: RECEIVED + PLANNING ≤ 60s; AWAITING_USER timeout 10min; REFLECTING is mandatory; no ARCHIVED bypass.
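The transition rules above can be checked mechanically. The following is a minimal sketch, assuming the FSM is held as a plain adjacency table; the names `TaskState`, `TRANSITIONS`, `can_transition` and `validate_path` are illustrative and not taken from the EvoClaw codebase.

```python
from enum import Enum


class TaskState(Enum):
    RECEIVED = "RECEIVED"
    PLANNING = "PLANNING"
    TOOL_EXECUTING = "TOOL_EXECUTING"
    OBSERVING = "OBSERVING"
    AWAITING_USER = "AWAITING_USER"
    REFLECTING = "REFLECTING"
    DISTILLING = "DISTILLING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    ARCHIVED = "ARCHIVED"


S = TaskState

# Allowed edges, copied one-for-one from the task lifecycle diagram.
TRANSITIONS = {
    S.RECEIVED: {S.PLANNING, S.FAILED},
    S.PLANNING: {S.TOOL_EXECUTING, S.AWAITING_USER},
    S.TOOL_EXECUTING: {S.OBSERVING},
    S.OBSERVING: {S.TOOL_EXECUTING, S.AWAITING_USER, S.REFLECTING},
    S.AWAITING_USER: {S.TOOL_EXECUTING, S.FAILED},
    S.REFLECTING: {S.DISTILLING, S.FAILED},
    S.DISTILLING: {S.COMPLETED},
    S.COMPLETED: {S.ARCHIVED},
    S.FAILED: {S.ARCHIVED},
    S.ARCHIVED: set(),
}


def can_transition(src, dst):
    """True if the diagram has an edge src --> dst."""
    return dst in TRANSITIONS[src]


def validate_path(path):
    """True if every consecutive pair of states in path is an allowed edge."""
    return all(can_transition(a, b) for a, b in zip(path, path[1:]))
```

With this table, `can_transition(S.OBSERVING, S.COMPLETED)` is false, which is the "no direct OBSERVING to COMPLETED" rule, and COMPLETED is reachable only through REFLECTING and DISTILLING.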

2. Skill lifecycle FSM

The score s is updated as an exponentially weighted moving average (EWMA). Below 0.3 the skill auto-deprecates. User correction carries the strongest negative weight.

2.1 State transitions

stateDiagram-v2
    [*] --> DRAFT
    DRAFT --> CANDIDATE: sandbox ok
    DRAFT --> DEPRECATED: sandbox fails three times
    CANDIDATE --> ACTIVE: real success three plus and score 0.7 plus
    CANDIDATE --> DEGRADED: score in 0.3 to 0.7
    ACTIVE --> DEGRADED: score drops to 0.3 to 0.7
    DEGRADED --> ACTIVE: score 0.7 plus and last 5 success rate 80 plus
    DEGRADED --> DEPRECATED: score under 0.3 or fails five times
    DEPRECATED --> ARCHIVED: T plus 30 days or merged
    ACTIVE --> ARCHIVED: merged
    CANDIDATE --> ARCHIVED: merged
    ARCHIVED --> [*]
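The score-driven edges of this diagram can be sketched as a pure function. This is an illustration under stated assumptions, not the project's implementation: "0.7 plus" is read as `score >= 0.7`, "0.3 to 0.7" as the half-open band `[0.3, 0.7)`, and "80 plus" as a rate of at least 0.8. The function name and its counter parameters are hypothetical. The DRAFT sandbox edges and the merge and time-based edges into ARCHIVED are left out.

```python
def next_skill_state(state, score, real_successes=0,
                     last5_success_rate=0.0, failures=0):
    """Return the state after applying the score-based edges of the diagram.

    Only edges present in the diagram are modeled; any other combination
    leaves the state unchanged.
    """
    in_band = 0.3 <= score < 0.7  # assumed half-open reading of "0.3 to 0.7"

    if state == "CANDIDATE":
        if real_successes >= 3 and score >= 0.7:
            return "ACTIVE"
        if in_band:
            return "DEGRADED"
    elif state == "ACTIVE":
        if in_band:
            return "DEGRADED"
    elif state == "DEGRADED":
        # Demotion is checked first so a failing skill cannot be re-promoted.
        if score < 0.3 or failures >= 5:
            return "DEPRECATED"
        if score >= 0.7 and last5_success_rate >= 0.8:
            return "ACTIVE"
    return state
```

Note that the diagram has no direct edge from ACTIVE or CANDIDATE to DEPRECATED, so in this sketch a skill reaches DEPRECATED by score only after passing through DEGRADED.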

2.2 Score updates

Event	Formula	Weight
success	`0.9 * s + 0.1 * 1.0`	normal
failure	`0.9 * s + 0.1 * 0.0`	normal
sandbox pass	`max(s, 0.6)`	promote
sandbox fail	`s * 0.5`	demote
thumbs up	`min(1, s + 0.1)`	strong
thumbs down	`s * 0.7`	strong
user correction	`s * 0.5` + revise	strongest
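The table translates directly into a score update function. The sketch below copies each formula as written; the event identifiers (`sandbox_pass`, `user_correction`, and so on) and the function names are assumptions made for illustration, and the "revise" side effect of a user correction is not modeled.

```python
def update_score(s, event):
    """Return the new score for one event; s is the current score in [0, 1]."""
    if event == "success":
        return 0.9 * s + 0.1 * 1.0
    if event == "failure":
        return 0.9 * s + 0.1 * 0.0
    if event == "sandbox_pass":
        return max(s, 0.6)       # promote: floor the score at 0.6
    if event == "sandbox_fail":
        return s * 0.5           # demote
    if event == "thumbs_up":
        return min(1.0, s + 0.1)
    if event == "thumbs_down":
        return s * 0.7
    if event == "user_correction":
        return s * 0.5           # strongest; the skill is also revised
    raise ValueError(f"unknown event: {event}")


def replay(s, events):
    """Apply a sequence of events to a starting score."""
    for event in events:
        s = update_score(s, event)
    return s
```

Starting from 0.6, three consecutive failures give 0.54, 0.486, then 0.4374, so ordinary failures decay the score gradually. Two user corrections from 0.8 land at 0.2, already below the 0.3 deprecation threshold.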

3. Memory layer FSM and GC

L0 / L1 are permanent. L2 / L3 age out. GC only collects low-confidence + long-unread items.

stateDiagram-v2
    [*] --> L3_RECENT: write at task end
    L3_RECENT --> L4_SKILL: distilled into skill
    L3_RECENT --> L5_ARCHIVE: 90 days unread no skill
    L3_RECENT --> GC: confidence under 0.2 and 60 days unread
    L4_SKILL --> L5_ARCHIVE: skill archived
    L4_SKILL --> GC: confidence under 0.2 and 60 days unread
    L5_ARCHIVE --> GC: 365 days and confidence under 0.2
    L0_PERM: L0 core rules permanent
    L1_PREF: L1 user prefs delete only on user revoke
    L2_ENV: L2 env facts 90 days or doctor refresh
    L2_ENV --> L1_PREF: user promotes to pref
    GC --> [*]

Limits: L3 default 30 days; L5 default 365 days; single record ≤ 64 KB; redaction is mandatory before write.
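The GC edges of the diagram reduce to a small predicate. This is a hedged sketch: the `MemoryRecord` fields and the `gc_eligible` name are hypothetical, thresholds are read as "at least N days", and "365 days" on the L5 edge is assumed to mean the record's age in days.

```python
from dataclasses import dataclass


@dataclass
class MemoryRecord:
    layer: str         # "L3_RECENT", "L4_SKILL", "L5_ARCHIVE", ...
    confidence: float  # 0.0 to 1.0
    days_unread: int   # days since the record was last read
    age_days: int      # days since the record was written (assumed meaning)


def gc_eligible(rec):
    """True if the diagram has a GC edge that this record satisfies."""
    low_confidence = rec.confidence < 0.2
    if rec.layer in ("L3_RECENT", "L4_SKILL"):
        return low_confidence and rec.days_unread >= 60
    if rec.layer == "L5_ARCHIVE":
        return low_confidence and rec.age_days >= 365
    # L0, L1 and L2 have no GC edge in the diagram.
    return False
```

Both conditions must hold: a stale but high-confidence record is kept, and so is a low-confidence record that was read recently.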

4. ER model

Three primary entities: Task, Skill, Memory. Auth and Channel join from the side. Time fields are RFC3339 (e.g. 2026-05-02T17:52:00Z).

erDiagram
    TASK ||--o{ TOOL_CALL : invokes
    TASK ||--|| REFLECTION : produces
    TASK ||--o{ AUDIT_EVENT : emits
    TASK ||--o{ COST_EVENT : accrues
    REFLECTION ||--o{ SKILL : creates_or_updates
    REFLECTION ||--|| MEMORY_L3 : writes
    SKILL ||--o{ SKILL_VERSION : versions
    SKILL ||--o{ FAILURE_PATTERN : records
    SKILL ||--o{ MEMORY_L4 : linked_to
    MEMORY_L3 ||--o| MEMORY_L4 : distill_into
    MEMORY_L3 ||--o| MEMORY_L5 : archive
    MEMORY_L4 ||--o| MEMORY_L5 : archive_when_skill_archived
    AUTH_RECORD ||--o{ TOOL_CALL : authorizes
    CHANNEL_SENDER ||--o{ TASK : originates
    POLICY_RULE ||--o{ AUDIT_EVENT : evaluates
    VAULT_ENTRY ||--o{ AUDIT_EVENT : substituted_in
    TASK {
        string task_id PK
        string user_input_safe
        string source
        string selected_model
        string state
        string started_at
        string finished_at
    }
    SKILL {
        string id PK
        string kind
        int version
        float score
        string state
        string created_from_task FK
    }
    MEMORY_L3 {
        string id PK
        string content
        float confidence
        string ts
        string source
    }
    VAULT_ENTRY {
        string name PK
        string value_redacted
        string kind
        string fingerprint
        string created_at
    }
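As an illustration of the TASK entity and its RFC3339 time fields, here is a minimal sketch using a dataclass. The field names come from the diagram; the `Task` class, the `parse_rfc3339` helper, the `duration_seconds` method, and treating `finished_at` as optional for a task that is still running are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_rfc3339(ts):
    """Parse a timestamp such as 2026-05-02T17:52:00Z into an aware datetime.

    datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    so it is normalized to an explicit UTC offset first.
    """
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


@dataclass
class Task:
    task_id: str
    user_input_safe: str
    source: str
    selected_model: str
    state: str
    started_at: str                    # RFC3339 string, as stored
    finished_at: Optional[str] = None  # assumed empty until the task ends

    def duration_seconds(self):
        """Elapsed seconds, or None while the task is unfinished."""
        if self.finished_at is None:
            return None
        delta = parse_rfc3339(self.finished_at) - parse_rfc3339(self.started_at)
        return delta.total_seconds()
```

Keeping the stored fields as strings matches the diagram's `string` types; parsing happens only where arithmetic on timestamps is needed.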

5. End-to-end sequence (single task)

From user input to ARCHIVED. The redactor scrubs at three boundaries: the user input, the assistant text plus tool arguments, and each tool result.

sequenceDiagram
    autonumber
    participant U as User
    participant CLI as evoclaw CLI
    participant RED as Redactor
    participant RT as ConversationRuntime
    participant PROV as Provider
    participant TOOL as Tool Registry
    participant SK as Skill Tree
    participant MEM as Memory
    participant LOG as JSONL Log
    U->>CLI: type a task
    CLI->>RED: scrub user_input
    RED-->>CLI: user_input_safe
    CLI->>RT: run(user_input_safe)
    RT->>LOG: append Task record
    loop until no more tool calls
        RT->>PROV: stream(messages, tools)
        PROV-->>RT: assistant_text + tool_calls
        RT->>RED: scrub assistant_text + args
        RED-->>RT: scrubbed
        alt tool calls present
            RT->>TOOL: invoke(call)
            TOOL-->>RT: result
            RT->>RED: scrub result
            RED-->>RT: result_safe
            RT->>LOG: append Turn record
        end
    end
    RT->>PROV: reflection round
    PROV-->>RT: ReflectionRecord
    RT->>SK: distill + upsert skill
    RT->>MEM: write L3 record
    RT->>LOG: append End record
    RT-->>CLI: final_text
    CLI-->>U: output
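A redactor of the kind used in this sequence could look roughly like the sketch below. It assumes scrubbing means replacing registered vault values and known-secret patterns with a fixed placeholder. The `Redactor` class, the `[REDACTED]` placeholder and the example pattern are hypothetical and not taken from EvoClaw.

```python
import re

PLACEHOLDER = "[REDACTED]"  # assumed placeholder text


class Redactor:
    def __init__(self, vault_values, patterns=()):
        # Longest values first, so a value containing another is replaced whole.
        self._values = sorted((v for v in vault_values if v),
                              key=len, reverse=True)
        self._patterns = [re.compile(p) for p in patterns]

    def scrub(self, text):
        """Replace every registered value and pattern match in text."""
        for value in self._values:
            text = text.replace(value, PLACEHOLDER)
        for pattern in self._patterns:
            text = pattern.sub(PLACEHOLDER, text)
        return text


# The same scrub call would sit at each boundary in the sequence:
# user input, assistant text plus tool arguments, and tool results.
redactor = Redactor(["hunter2"], patterns=[r"sk-[A-Za-z0-9]{8,}"])
user_input_safe = redactor.scrub("log in with hunter2")
result_safe = redactor.scrub("token sk-abcdef123456 accepted")
```

Scrubbing before anything is appended to the JSONL log is what lets a later audit assert that no record contains a vault value.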

6. ACP delegated loop

When the provider is acp:<id>, EvoClaw spawns the upstream CLI as a subprocess and routes prompts through ACP.

sequenceDiagram
    autonumber
    participant U as User
    participant RT as ConversationRuntime
    participant AP as AcpProvider
    participant CHILD as External CLI
    participant LOG as JSONL Log
    U->>RT: run(user_input_safe)
    RT->>AP: stream(req)
    AP->>CHILD: spawn(claude --acp)
    AP->>CHILD: initialize handshake
    CHILD-->>AP: serverInfo
    AP->>CHILD: session/new
    CHILD-->>AP: session_id
    AP->>CHILD: session/prompt(text)
    CHILD-->>AP: final text
    AP-->>RT: stream events
    RT->>LOG: append Turn + End
    Note over CHILD: kill_on_drop true reaps child cleanly
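The three requests in this exchange can be sketched as framed messages. This sketch assumes the exchange is JSON-RPC 2.0 carried as one JSON object per line over the child's stdin and stdout. The method names come from the diagram, while the `JsonRpcFramer` class and every `params` shape shown are placeholders rather than the actual protocol schema.

```python
import json


class JsonRpcFramer:
    """Builds JSON-RPC 2.0 requests with increasing ids and frames them as lines."""

    def __init__(self):
        self._next_id = 1

    def request(self, method, params):
        msg = {"jsonrpc": "2.0", "id": self._next_id,
               "method": method, "params": params}
        self._next_id += 1
        return msg

    @staticmethod
    def encode(msg):
        # One compact JSON object per line (assumed framing).
        return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")

    @staticmethod
    def decode(line):
        return json.loads(line.decode("utf-8"))


framer = JsonRpcFramer()
init_req = framer.request("initialize", {})
new_req = framer.request("session/new", {})
# The session id would come from the reply to session/new;
# "s-1" and the params keys are placeholder values.
prompt_req = framer.request("session/prompt",
                            {"session_id": "s-1", "text": "hello"})
```

Replies are matched to requests by `id`, which is why the framer hands out a fresh id for every request.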

7. MCP tool call

An MCP tool surfaces in the registry as mcp__<server>__<tool>. The model never sees the auth credentials passed to the server through its environment.

sequenceDiagram
    autonumber
    participant RT as ConversationRuntime
    participant TR as ToolRegistry
    participant W as McpToolWrapper
    participant SRV as MCP Server
    participant LOG as JSONL Log
    Note over TR: install_all spawns each server at startup
    TR->>W: register(mcp__github__list_issues)
    RT->>TR: invoke(mcp__github__list_issues, args)
    TR->>W: run(ctx, args)
    W->>SRV: tools call name args
    SRV-->>W: ToolCallResult content isError
    W-->>TR: rendered text
    TR-->>RT: scrubbed result
    RT->>LOG: append Turn
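The mcp__<server>__<tool> naming scheme can be built and parsed with a pair of helpers. This is a sketch; the function names are hypothetical, and it assumes server names never contain a double underscore, since such a name would make the split ambiguous.

```python
SEP = "__"
PREFIX = "mcp"


def registry_name(server, tool):
    """Build the registry name for a tool exposed by an MCP server."""
    if SEP in server:
        raise ValueError("server name must not contain '__'")
    return f"{PREFIX}{SEP}{server}{SEP}{tool}"


def parse_registry_name(name):
    """Split a registry name back into (server, tool)."""
    parts = name.split(SEP, 2)  # at most 3 parts; the tool may contain '__'
    if len(parts) != 3 or parts[0] != PREFIX or not parts[1] or not parts[2]:
        raise ValueError(f"not an MCP registry name: {name!r}")
    return parts[1], parts[2]
```

For example, `registry_name("github", "list_issues")` gives `mcp__github__list_issues`, and parsing that string returns the original pair.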

8. Permission ladder

Permission is a totally ordered ladder P0 ≤ P1 ≤ … ≤ P8. The default ceiling is P1; channels are hard-capped at P4. Tool descriptions are ≤ 80 chars.

flowchart LR
    classDef p0 fill:#0f3a2a,stroke:#34d399,color:#fff
    classDef p1 fill:#0f3a44,stroke:#22d3ee,color:#fff
    classDef p2 fill:#1a2f4a,stroke:#5fb3ff,color:#fff
    classDef p3 fill:#2a1f4a,stroke:#a78bfa,color:#fff
    classDef p4 fill:#3a2f0a,stroke:#facc15,color:#fff
    classDef p5 fill:#3a210a,stroke:#fb923c,color:#fff
    classDef p6 fill:#3a1414,stroke:#f87171,color:#fff
    classDef p7 fill:#3a1424,stroke:#ec4899,color:#fff
    classDef p8 fill:#1a0014,stroke:#ec4899,color:#fff
    P0[P0 read only read_file list_dir ask_user]:::p0
    P1[P1 workspace write write_file patch_file]:::p1
    P2[P2 local safe shell run_shell]:::p2
    P3[P3 network web_fetch and MCP]:::p3
    P4[P4 channel cap browser style ops]:::p4
    P5[P5 user dir write]:::p5
    P6[P6 system modify]:::p6
    P7[P7 credential ops]:::p7
    P8[P8 production ops]:::p8
    P0 --> P1 --> P2 --> P3 --> P4 --> P5 --> P6 --> P7 --> P8
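Because the ladder is totally ordered, a permission check is a single comparison. The sketch below assumes an integer-backed enum; the tool-to-level mapping is read off the node labels above, and the names `Permission`, `effective_ceiling` and `is_allowed` are illustrative.

```python
from enum import IntEnum


class Permission(IntEnum):
    P0 = 0  # read only
    P1 = 1  # workspace write
    P2 = 2  # local safe shell
    P3 = 3  # network
    P4 = 4  # channel cap
    P5 = 5  # user dir write
    P6 = 6  # system modify
    P7 = 7  # credential ops
    P8 = 8  # production ops


DEFAULT_CEILING = Permission.P1
CHANNEL_CAP = Permission.P4

# Levels taken from the node labels in the flowchart.
TOOL_LEVELS = {
    "read_file": Permission.P0, "list_dir": Permission.P0,
    "ask_user": Permission.P0,
    "write_file": Permission.P1, "patch_file": Permission.P1,
    "run_shell": Permission.P2,
    "web_fetch": Permission.P3,
}


def effective_ceiling(requested, via_channel):
    """Clamp the requested ceiling to P4 when the task arrives over a channel."""
    return min(requested, CHANNEL_CAP) if via_channel else requested


def is_allowed(tool, ceiling=DEFAULT_CEILING):
    """A tool may run only if its level does not exceed the ceiling."""
    return TOOL_LEVELS[tool] <= ceiling
```

Under the default ceiling, `write_file` is allowed and `run_shell` is not. A channel sender that requests P8 is clamped to P4.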

9. Closure matrix (auditor's checklist)

evoclaw doctor closure walks ~/.evoclaw/logs/*.jsonl and checks each row. A run is closed iff every row passes for every session.

#	Producer	Consumer	Invariant
1	Runtime	JSONL log	each task has exactly one Task record at the head
2	Runtime	JSONL log	each task has ≥ 1 Turn record
3	Redactor	Task / Turn / End	no field contains a registered vault value or known-secret pattern
4	Runtime	JSONL log	each task has exactly one End record
5	Runtime	End record	state is COMPLETED or FAILED, never empty
6	Reflection	Memory L3	completed task → one L3 write
7	Distillation	Skill YAML	completed task → zero or one DRAFT skill upsert
8	CostEngine	cost.jsonl	each Turn record → one CostEvent (best-effort)
9	Provider	Audit Bus	budget exceeded → HardStop event recorded
10	ACP child	OS	kill_on_drop reaps the subprocess; no zombie left after run
11	MCP child	OS	same as #10 — every spawned server reaped
12	Vault	filesystem	vault.json is chmod 600 on Unix
13	Skill EWMA	Skill state	score crossing thresholds triggers a state move within ≤1 turn
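Rows 1, 2, 4 and 5 of the matrix can be checked from the log alone. The following is a minimal sketch of such a check, assuming each JSONL line is an object with `kind` (`Task`, `Turn` or `End`), `task_id`, and, on End records, `state`. These field names and the `check_closure` function are hypothetical, not the actual log schema or the doctor implementation.

```python
import json
from collections import defaultdict


def check_closure(lines):
    """Return a list of (task_id, row_number) pairs for violated invariants."""
    by_task = defaultdict(list)
    for line in lines:
        if line.strip():
            rec = json.loads(line)
            by_task[rec["task_id"]].append(rec)

    failures = []
    for task_id, recs in by_task.items():
        kinds = [r["kind"] for r in recs]
        # Row 1: exactly one Task record, at the head.
        if kinds.count("Task") != 1 or kinds[0] != "Task":
            failures.append((task_id, 1))
        # Row 2: at least one Turn record.
        if kinds.count("Turn") < 1:
            failures.append((task_id, 2))
        # Row 4: exactly one End record.
        ends = [r for r in recs if r["kind"] == "End"]
        if len(ends) != 1:
            failures.append((task_id, 4))
        # Row 5: End state is COMPLETED or FAILED (checked only if row 4 holds).
        elif ends[0].get("state") not in ("COMPLETED", "FAILED"):
            failures.append((task_id, 5))
    return failures
```

An empty result means these four rows pass for every task in the input. The remaining rows need data outside the log (vault values, process state, file modes) and are not covered here.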