1. Task lifecycle FSM
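Every task that reaches COMPLETED must pass through REFLECTING; a direct transition from OBSERVING to COMPLETED is forbidden. Tasks that fail before reflection (parse failure, denial, budget exhaustion, user rejection, or timeout) go straight to FAILED. Both COMPLETED and FAILED always exit through ARCHIVED.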
stateDiagram-v2
[*] --> RECEIVED
RECEIVED --> PLANNING: parsed
RECEIVED --> FAILED: parse failed or denied or budget out
PLANNING --> TOOL_EXECUTING: plan ready
PLANNING --> AWAITING_USER: missing param
TOOL_EXECUTING --> OBSERVING: tool returned
OBSERVING --> TOOL_EXECUTING: more steps
OBSERVING --> AWAITING_USER: high risk or ambiguous
OBSERVING --> REFLECTING: plan complete
AWAITING_USER --> TOOL_EXECUTING: user approved
AWAITING_USER --> FAILED: rejected or timeout 10min
REFLECTING --> DISTILLING: success true
REFLECTING --> FAILED: success false
DISTILLING --> COMPLETED: write skill plus memory
COMPLETED --> ARCHIVED: T plus 30 days
FAILED --> ARCHIVED: T plus 30 days
ARCHIVED --> [*]
Constraints: RECEIVED and PLANNING together take at most 60 s; AWAITING_USER times out after 10 minutes; REFLECTING is mandatory before COMPLETED; no task bypasses ARCHIVED.
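The diagram above can be read as a transition table. The following is a minimal Python sketch of that table, not EvoClaw's actual implementation; the names `TaskState`, `TRANSITIONS`, and `advance` are hypothetical and chosen only for illustration.

```python
from enum import Enum

# State names copied from the diagram above.
TaskState = Enum(
    "TaskState",
    "RECEIVED PLANNING TOOL_EXECUTING OBSERVING AWAITING_USER "
    "REFLECTING DISTILLING COMPLETED FAILED ARCHIVED",
)
S = TaskState

# Allowed edges, one entry per source state.
TRANSITIONS = {
    S.RECEIVED: {S.PLANNING, S.FAILED},
    S.PLANNING: {S.TOOL_EXECUTING, S.AWAITING_USER},
    S.TOOL_EXECUTING: {S.OBSERVING},
    S.OBSERVING: {S.TOOL_EXECUTING, S.AWAITING_USER, S.REFLECTING},
    S.AWAITING_USER: {S.TOOL_EXECUTING, S.FAILED},
    S.REFLECTING: {S.DISTILLING, S.FAILED},
    S.DISTILLING: {S.COMPLETED},
    S.COMPLETED: {S.ARCHIVED},
    S.FAILED: {S.ARCHIVED},
    S.ARCHIVED: set(),  # terminal state
}


def advance(current, target):
    """Return target if the diagram has that edge, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Because COMPLETED is reachable only from DISTILLING, and DISTILLING only from REFLECTING, a table like this enforces the "no OBSERVING to COMPLETED" rule without any extra check.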
2. Skill lifecycle FSM
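The score s is updated as an exponentially weighted moving average (EWMA): each new outcome carries weight 0.1 and the previous score carries weight 0.9. A skill whose score falls below 0.3 is deprecated automatically. User correction has the strongest negative weight.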
2.1 State transitions
stateDiagram-v2
[*] --> DRAFT
DRAFT --> CANDIDATE: sandbox ok
DRAFT --> DEPRECATED: sandbox fails three times
CANDIDATE --> ACTIVE: real success three plus and score 0.7 plus
CANDIDATE --> DEGRADED: score in 0.3 to 0.7
ACTIVE --> DEGRADED: score drops to 0.3 to 0.7
DEGRADED --> ACTIVE: score 0.7 plus and last 5 success rate 80 plus
DEGRADED --> DEPRECATED: score under 0.3 or fails five times
DEPRECATED --> ARCHIVED: T plus 30 days or merged
ACTIVE --> ARCHIVED: merged
CANDIDATE --> ARCHIVED: merged
ARCHIVED --> [*]
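The score-driven edges of the diagram above can be sketched as a single function. This is an illustrative sketch only: `SkillStats` and `next_state` are hypothetical names, the band "0.3 to 0.7" is assumed to mean 0.3 ≤ s < 0.7, and "fails five times" is modeled as a plain failure counter. Sandbox and merge edges are event-driven and left out. The diagram has no direct edge from ACTIVE to DEPRECATED, so the sketch leaves an ACTIVE skill unchanged when its score is below 0.3.

```python
from dataclasses import dataclass


@dataclass
class SkillStats:
    score: float
    real_successes: int = 0          # successes outside the sandbox
    last5_success_rate: float = 0.0  # fraction in [0, 1]
    failures: int = 0                # assumption: counter maintained by the caller


def next_state(state: str, st: SkillStats) -> str:
    """Apply the score-based edges of the skill diagram; other states pass through."""
    in_band = 0.3 <= st.score < 0.7  # assumed boundary handling
    if state == "CANDIDATE":
        if st.real_successes >= 3 and st.score >= 0.7:
            return "ACTIVE"
        if in_band:
            return "DEGRADED"
    elif state == "ACTIVE":
        if in_band:
            return "DEGRADED"
    elif state == "DEGRADED":
        if st.score < 0.3 or st.failures >= 5:
            return "DEPRECATED"
        if st.score >= 0.7 and st.last5_success_rate >= 0.8:
            return "ACTIVE"
    return state
```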
2.2 Score updates
| Event | Formula | Weight |
|---|---|---|
| success | 0.9 * s + 0.1 * 1.0 | normal |
| failure | 0.9 * s + 0.1 * 0.0 | normal |
| sandbox pass | max(s, 0.6) | promote |
| sandbox fail | s * 0.5 | demote |
| thumbs up | min(1, s + 0.1) | strong |
| thumbs down | s * 0.7 | strong |
| user correction | s * 0.5 + revise | strongest |
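The formulas in the table translate directly into code. The sketch below is an assumption-laden illustration: the event keys (`"sandbox_pass"` and so on) and the function names are hypothetical, and the "revise" step of a user correction is not specified numerically in the table, so only its multiplicative part is modeled.

```python
def update_score(s: float, event: str) -> float:
    """Return the new score after one event, following the table above."""
    if event == "success":
        return 0.9 * s + 0.1 * 1.0
    if event == "failure":
        return 0.9 * s + 0.1 * 0.0
    if event == "sandbox_pass":
        return max(s, 0.6)
    if event == "sandbox_fail":
        return s * 0.5
    if event == "thumbs_up":
        return min(1.0, s + 0.1)
    if event == "thumbs_down":
        return s * 0.7
    if event == "user_correction":
        # The "revise" step from the table is not modeled here.
        return s * 0.5
    raise ValueError(f"unknown event: {event}")


def failures_until_below(s: float, threshold: float = 0.3) -> int:
    """Count consecutive failures needed to push the score under threshold."""
    n = 0
    while s >= threshold:
        s = update_score(s, "failure")
        n += 1
    return n
```

Under these formulas a skill sitting exactly at 0.7 needs nine consecutive failures before the EWMA alone drops below 0.3 (0.7 × 0.9⁹ ≈ 0.271), so in practice the "fails five times" edge would fire first.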
3. Memory layer FSM and GC
L0 and L1 are permanent (L1 is deleted only when the user revokes it). L2 and L3 age out. GC collects only items that are both low-confidence and long unread.
stateDiagram-v2
[*] --> L3_RECENT: write at task end
L3_RECENT --> L4_SKILL: distilled into skill
L3_RECENT --> L5_ARCHIVE: 90 days unread no skill
L3_RECENT --> GC: confidence under 0.2 and 60 days unread
L4_SKILL --> L5_ARCHIVE: skill archived
L4_SKILL --> GC: confidence under 0.2 and 60 days unread
L5_ARCHIVE --> GC: 365 days and confidence under 0.2
L0_PERM: L0 core rules permanent
L1_PREF: L1 user prefs delete only on user revoke
L2_ENV: L2 env facts 90 days or doctor refresh
L2_ENV --> L1_PREF: user promotes to pref
GC --> [*]
Defaults and limits: L3 defaults to 30 days and L5 to 365 days; a single record is at most 64 KB; redaction is mandatory before every write.
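The time- and confidence-based edges of the memory diagram can be sketched as one decision function. This is a hedged illustration, not the actual collector: `sweep` and its parameters are hypothetical, thresholds are assumed to be inclusive (for example, "60 days unread" means `days_unread >= 60`), and the 365-day rule for L5 is assumed to count days spent in the archive. Event-driven edges (distillation, skill archival, L2 refresh or promotion) are not modeled.

```python
def sweep(layer: str, confidence: float, days_unread: int,
          days_in_layer: int = 0, linked_skill: bool = False) -> str:
    """Return "keep", "archive", or "gc" for one memory record."""
    if layer in ("L0_PERM", "L1_PREF"):
        return "keep"  # permanent layers are never collected
    low = confidence < 0.2
    if layer in ("L3_RECENT", "L4_SKILL") and low and days_unread >= 60:
        return "gc"
    if layer == "L3_RECENT" and days_unread >= 90 and not linked_skill:
        return "archive"
    if layer == "L5_ARCHIVE" and low and days_in_layer >= 365:
        return "gc"
    return "keep"
```

Note that age alone never triggers collection in this sketch: a high-confidence record is at most archived, which matches the rule that GC only takes items that are both low-confidence and long unread.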
4. ER model
Three primary entities: Task, Skill, Memory. Auth and Channel join from the side. Time fields are RFC3339 (e.g. 2026-05-02T17:52:00Z).
erDiagram
TASK ||--o{ TOOL_CALL : invokes
TASK ||--|| REFLECTION : produces
TASK ||--o{ AUDIT_EVENT : emits
TASK ||--o{ COST_EVENT : accrues
REFLECTION ||--o{ SKILL : creates_or_updates
REFLECTION ||--|| MEMORY_L3 : writes
SKILL ||--o{ SKILL_VERSION : versions
SKILL ||--o{ FAILURE_PATTERN : records
SKILL ||--o{ MEMORY_L4 : linked_to
MEMORY_L3 ||--o| MEMORY_L4 : distill_into
MEMORY_L3 ||--o| MEMORY_L5 : archive
MEMORY_L4 ||--o| MEMORY_L5 : archive_when_skill_archived
AUTH_RECORD ||--o{ TOOL_CALL : authorizes
CHANNEL_SENDER ||--o{ TASK : originates
POLICY_RULE ||--o{ AUDIT_EVENT : evaluates
VAULT_ENTRY ||--o{ AUDIT_EVENT : substituted_in
TASK {
string task_id PK
string user_input_safe
string source
string selected_model
string state
string started_at
string finished_at
}
SKILL {
string id PK
string kind
int version
float score
string state
string created_from_task FK
}
MEMORY_L3 {
string id PK
string content
float confidence
string ts
string source
}
VAULT_ENTRY {
string name PK
string value_redacted
string kind
string fingerprint
string created_at
}
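As an illustration of the TASK entity and the RFC3339 time convention, here is a minimal Python sketch. The field names come from the ER diagram above; the `Task` dataclass, the `rfc3339` helper, and the choice to make `finished_at` optional until the task ends are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def rfc3339(dt: datetime) -> str:
    """Format a timezone-aware datetime as RFC3339 in UTC with a Z suffix."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


@dataclass
class Task:
    task_id: str
    user_input_safe: str  # already scrubbed by the redactor
    source: str
    selected_model: str
    state: str
    started_at: str                    # RFC3339 string
    finished_at: Optional[str] = None  # assumption: unset until the task ends


example = Task(
    task_id="t-1",
    user_input_safe="list open issues",
    source="cli",
    selected_model="example-model",
    state="RECEIVED",
    started_at=rfc3339(datetime(2026, 5, 2, 17, 52, 0, tzinfo=timezone.utc)),
)
```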
5. End-to-end sequence (single task)
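The path of a single task from user input to final output. The redactor scrubs at three boundary points: the user input, the assistant text plus tool-call arguments, and each tool result.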
sequenceDiagram
autonumber
participant U as User
participant CLI as evoclaw CLI
participant RED as Redactor
participant RT as ConversationRuntime
participant PROV as Provider
participant TOOL as Tool Registry
participant SK as Skill Tree
participant MEM as Memory
participant LOG as JSONL Log
U->>CLI: type a task
CLI->>RED: scrub user_input
RED-->>CLI: user_input_safe
CLI->>RT: run(user_input_safe)
RT->>LOG: append Task record
loop until no more tool calls
RT->>PROV: stream(messages, tools)
PROV-->>RT: assistant_text + tool_calls
RT->>RED: scrub assistant_text + args
RED-->>RT: scrubbed
alt tool calls present
RT->>TOOL: invoke(call)
TOOL-->>RT: result
RT->>RED: scrub result
RED-->>RT: result_safe
RT->>LOG: append Turn record
end
end
RT->>PROV: reflection round
PROV-->>RT: ReflectionRecord
RT->>SK: distill + upsert skill
RT->>MEM: write L3 record
RT->>LOG: append End record
RT-->>CLI: final_text
CLI-->>U: output
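The scrub step that appears at each boundary in the sequence above could look roughly like the following. This is a sketch under stated assumptions: the `scrub` function, the `vault` mapping of entry names to secret values, and the `[REDACTED:<name>]` placeholder format are all hypothetical, and pattern-based detection of known secret formats is omitted.

```python
def scrub(text: str, vault: dict) -> str:
    """Replace every registered vault value found in text with a placeholder."""
    # Longest values first, so a secret that contains another secret
    # is replaced as a whole rather than partially.
    entries = sorted(vault.items(), key=lambda kv: len(kv[1]), reverse=True)
    for name, value in entries:
        if value:  # skip empty values, which would match everywhere
            text = text.replace(value, f"[REDACTED:{name}]")
    return text
```

The same function would be applied to `user_input`, to the assistant text and tool-call arguments, and to each tool result before anything is appended to the log.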
6. ACP delegated loop
When the provider is acp:<id>, EvoClaw spawns the upstream CLI as a subprocess and routes prompts through ACP.
sequenceDiagram
autonumber
participant U as User
participant RT as ConversationRuntime
participant AP as AcpProvider
participant CHILD as External CLI
participant LOG as JSONL Log
U->>RT: run(user_input_safe)
RT->>AP: stream(req)
AP->>CHILD: spawn(claude --acp)
AP->>CHILD: initialize handshake
CHILD-->>AP: serverInfo
AP->>CHILD: session/new
CHILD-->>AP: session_id
AP->>CHILD: session/prompt(text)
CHILD-->>AP: final text
AP-->>RT: stream events
RT->>LOG: append Turn + End
Note over CHILD: kill_on_drop true reaps child cleanly
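The three requests in the sequence above (`initialize`, `session/new`, `session/prompt`) can be sketched as framed messages. This sketch assumes JSON-RPC 2.0 requests written one per line to the child's stdin; the `RpcFramer` class and every `params` field shown are placeholders, not the real ACP schema.

```python
import json
from itertools import count


class RpcFramer:
    """Build newline-delimited JSON-RPC 2.0 requests with increasing ids."""

    def __init__(self):
        self._ids = count(1)

    def request(self, method: str, params: dict) -> str:
        msg = {
            "jsonrpc": "2.0",
            "id": next(self._ids),
            "method": method,
            "params": params,
        }
        return json.dumps(msg) + "\n"


framer = RpcFramer()
handshake = [
    framer.request("initialize", {}),
    framer.request("session/new", {}),
    # Placeholder params: the session id would come from the session/new reply.
    framer.request("session/prompt", {"session_id": "<from session/new>", "text": "hello"}),
]
```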
7. MCP tool call
An MCP tool surfaces in the registry as mcp__<server>__<tool>. The model never sees the authentication values passed to the server through its environment.
sequenceDiagram
autonumber
participant RT as ConversationRuntime
participant TR as ToolRegistry
participant W as McpToolWrapper
participant SRV as MCP Server
participant LOG as JSONL Log
Note over TR: install_all spawns each server at startup
TR->>W: register(mcp__github__list_issues)
RT->>TR: invoke(mcp__github__list_issues, args)
TR->>W: run(ctx, args)
W->>SRV: tools call name args
SRV-->>W: ToolCallResult content isError
W-->>TR: rendered text
TR-->>RT: scrubbed result
RT->>LOG: append Turn
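The mcp__<server>__<tool> naming convention can be sketched as a pair of helpers. The function names are hypothetical, and the sketch assumes server names never contain a double underscore, since the convention would otherwise be ambiguous.

```python
def registry_name(server: str, tool: str) -> str:
    """Build the registry key for an MCP tool."""
    return f"mcp__{server}__{tool}"


def parse_registry_name(name: str) -> tuple:
    """Split a registry key back into (server, tool); raise ValueError if malformed."""
    parts = name.split("__", 2)
    if len(parts) != 3 or parts[0] != "mcp" or not parts[1] or not parts[2]:
        raise ValueError(f"not an MCP registry name: {name!r}")
    return parts[1], parts[2]
```

Splitting at most twice keeps any double underscore inside the tool name intact, so `mcp__github__list__all` would parse as server `github`, tool `list__all`.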
8. Permission ladder
Permissions form a totally ordered ladder P0 < P1 < … < P8. The default ceiling is P1; channels are hard-capped at P4. Tool descriptions are at most 80 characters.
flowchart LR
classDef p0 fill:#0f3a2a,stroke:#34d399,color:#fff
classDef p1 fill:#0f3a44,stroke:#22d3ee,color:#fff
classDef p2 fill:#1a2f4a,stroke:#5fb3ff,color:#fff
classDef p3 fill:#2a1f4a,stroke:#a78bfa,color:#fff
classDef p4 fill:#3a2f0a,stroke:#facc15,color:#fff
classDef p5 fill:#3a210a,stroke:#fb923c,color:#fff
classDef p6 fill:#3a1414,stroke:#f87171,color:#fff
classDef p7 fill:#3a1424,stroke:#ec4899,color:#fff
classDef p8 fill:#1a0014,stroke:#ec4899,color:#fff
P0[P0 read only read_file list_dir ask_user]:::p0
P1[P1 workspace write write_file patch_file]:::p1
P2[P2 local safe shell run_shell]:::p2
P3[P3 network web_fetch and MCP]:::p3
P4[P4 channel cap browser style ops]:::p4
P5[P5 user dir write]:::p5
P6[P6 system modify]:::p6
P7[P7 credential ops]:::p7
P8[P8 production ops]:::p8
P0 --> P1 --> P2 --> P3 --> P4 --> P5 --> P6 --> P7 --> P8
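Because the ladder is totally ordered, a permission check reduces to an integer comparison. The sketch below is illustrative: `Perm`, `TOOL_LEVEL`, and `allowed` are hypothetical names, and the tool-to-level mapping only lists the tools named in the diagram above.

```python
from enum import IntEnum


class Perm(IntEnum):
    P0 = 0  # read only
    P1 = 1  # workspace write
    P2 = 2  # local safe shell
    P3 = 3  # network
    P4 = 4  # channel cap
    P5 = 5  # user dir write
    P6 = 6  # system modify
    P7 = 7  # credential ops
    P8 = 8  # production ops


DEFAULT_CEILING = Perm.P1
CHANNEL_CAP = Perm.P4

# Tools named in the diagram, with the level they appear under.
TOOL_LEVEL = {
    "read_file": Perm.P0, "list_dir": Perm.P0, "ask_user": Perm.P0,
    "write_file": Perm.P1, "patch_file": Perm.P1,
    "run_shell": Perm.P2,
    "web_fetch": Perm.P3,
}


def allowed(required: Perm, ceiling: Perm = DEFAULT_CEILING,
            via_channel: bool = False) -> bool:
    """True if an operation at level `required` fits under the effective ceiling."""
    effective = min(ceiling, CHANNEL_CAP) if via_channel else ceiling
    return required <= effective
```

With the default ceiling, `allowed(TOOL_LEVEL["write_file"])` passes and `allowed(TOOL_LEVEL["run_shell"])` does not; a channel request is refused above P4 even if the configured ceiling is P8.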
9. Closure matrix (auditor's checklist)
evoclaw doctor closure walks ~/.evoclaw/logs/*.jsonl and checks each row. A run is closed iff every row passes for every session.
| # | Producer | Consumer | Invariant |
|---|---|---|---|
| 1 | Runtime | JSONL log | each task has exactly one Task record at the head |
| 2 | Runtime | JSONL log | each task has ≥ 1 Turn record |
| 3 | Redactor | Task / Turn / End | no field contains a registered vault value or known-secret pattern |
| 4 | Runtime | JSONL log | each task has exactly one End record |
| 5 | Runtime | End record | state is COMPLETED or FAILED, never empty |
| 6 | Reflection | Memory L3 | completed task → one L3 write |
| 7 | Distillation | Skill YAML | completed task → zero or one DRAFT skill upsert |
| 8 | CostEngine | cost.jsonl | each Turn record → one CostEvent (best-effort) |
| 9 | Provider | Audit Bus | budget exceeded → HardStop event recorded |
| 10 | ACP child | OS | kill_on_drop reaps the subprocess; no zombie left after run |
| 11 | MCP child | OS | same as #10: every spawned server is reaped |
| 12 | Vault | filesystem | vault.json is chmod 600 on Unix |
| 13 | Skill EWMA | Skill state | a score crossing a threshold triggers a state move within one turn |
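Rows 1, 2, 4, and 5 of the matrix are structural checks on the JSONL log and can be sketched in a few lines. This is not the actual `evoclaw doctor closure` implementation: the record shape (a `type` of `Task`, `Turn`, or `End`, a `task_id`, and a `state` on the End record) and the function name `check_closure` are assumptions made for illustration.

```python
import json
from collections import defaultdict


def check_closure(lines):
    """Return a list of (task_id, row_number) pairs for every failed invariant."""
    by_task = defaultdict(list)
    for line in lines:
        if line.strip():
            rec = json.loads(line)
            by_task[rec["task_id"]].append(rec)

    failures = []
    for task_id, recs in by_task.items():
        types = [r["type"] for r in recs]
        # Row 1: exactly one Task record, at the head.
        if types.count("Task") != 1 or types[0] != "Task":
            failures.append((task_id, 1))
        # Row 2: at least one Turn record.
        if types.count("Turn") < 1:
            failures.append((task_id, 2))
        # Row 4: exactly one End record.
        ends = [r for r in recs if r["type"] == "End"]
        if len(ends) != 1:
            failures.append((task_id, 4))
        # Row 5: End state is COMPLETED or FAILED.
        elif ends[0].get("state") not in ("COMPLETED", "FAILED"):
            failures.append((task_id, 5))
    return failures
```

An empty result means these four rows pass for every task in the input; the remaining rows (redaction, cost events, process reaping, file modes) need checks outside the log structure.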