Kleos is cognitive infrastructure for AI agents. 16 Rust crates: persistent memory with FSRS-6 decay, 4-channel hybrid search, knowledge graphs, 7 coordination services, and hardware-backed security. One binary. Zero cloud dependencies.
Modular capabilities that compose into a full cognitive memory layer. Built in Rust for speed, safety, and zero runtime dependencies.
Power-law forgetting that models human memory. Memories fade unless reinforced.
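To make "power-law forgetting" concrete, here is a minimal sketch of a forgetting curve in the style of FSRS-family schedulers. The function name, the -0.5 exponent, and the choice to anchor recall at 0.9 when elapsed time equals stability are illustrative assumptions, not Kleos's actual parameters.

```python
def retrievability(elapsed_days: float, stability_days: float, decay: float = -0.5) -> float:
    """Estimated probability of recall after `elapsed_days` (hypothetical helper)."""
    # Scale factor chosen so that recall is exactly 0.9 when elapsed == stability.
    factor = 0.9 ** (1.0 / decay) - 1.0
    # Power-law decay: the curve drops quickly at first, then flattens into a long tail.
    return (1.0 + factor * elapsed_days / stability_days) ** decay


if __name__ == "__main__":
    # A memory with 7-day stability, checked at increasing ages.
    for day in (0, 1, 7, 30, 90):
        print(day, round(retrievability(day, stability_days=7.0), 3))
```

Reinforcing a memory would correspond to raising its stability, which stretches the same curve over a longer time span.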
10 typed edge relations, Louvain community detection, PageRank with 180-day temporal decay. 2-hop graph traversal augments search.
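The two graph ideas above, temporal decay and bounded traversal, can be sketched as follows. This assumes the 180-day figure is a half-life and that the graph is a plain adjacency mapping; both are guesses for illustration, and `temporal_weight` and `neighbors_within` are hypothetical names.

```python
from collections import deque


def temporal_weight(age_days: float, half_life_days: float = 180.0) -> float:
    """Down-weight an edge by its age (assumes exponential decay with a half-life)."""
    return 0.5 ** (age_days / half_life_days)


def neighbors_within(graph: dict, start: str, max_hops: int = 2) -> dict:
    """Breadth-first search returning {node: hop_distance} for nodes within max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    distance.pop(start)  # the seed memory itself is not a "neighbor"
    return distance


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["c"], "c": ["d"]}
    print(neighbors_within(graph, "a"))
    print(temporal_weight(180))
```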
4-channel RRF: vector similarity, FTS5 BM25, graph traversal, personality signals. IBM Granite cross-encoder reranks top-K.
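Reciprocal rank fusion (RRF) merges several ranked lists by summing `1 / (k + rank)` per item. The sketch below shows the technique on four toy channels; the constant `k = 60` is the value commonly used in the RRF literature and is an assumption here, as are the memory IDs and the function name.

```python
def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Fuse ranked lists of IDs into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            # Items near the top of any channel contribute more to the fused score.
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))


if __name__ == "__main__":
    channels = [
        ["m1", "m2", "m3"],  # vector similarity
        ["m2", "m1", "m4"],  # full-text (BM25)
        ["m2", "m5"],        # graph traversal
        ["m3", "m2"],        # personality signals
    ]
    print(reciprocal_rank_fusion(channels))
```

Because RRF only uses ranks, the channels do not need comparable score scales, which is why it suits mixing vector, lexical, and graph results. A reranker would then rescore only the top of this fused list.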
SQLCipher database encryption. Keyfile, environment variable, or YubiKey HMAC-SHA1. Argon2id KDF.
Learned preferences, style guides, and user-specific behaviors that persist.
View memory state at any point in time. See what the agent knew and when.
Shard-per-tenant isolation. Deadpool reader/writer pools. Lazy loading with LRU eviction. Quota enforcement.
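Lazy loading with LRU eviction can be pictured as a small cache of open tenant shards. This is a sketch under stated assumptions: `ShardCache` and `open_shard` are hypothetical names, and a real implementation would also close evicted database handles.

```python
from collections import OrderedDict


class ShardCache:
    """Keeps at most `capacity` tenant shards open, evicting the least recently used."""

    def __init__(self, capacity: int, open_shard):
        self.capacity = capacity
        self.open_shard = open_shard  # callable: tenant_id -> shard handle
        self._shards = OrderedDict()

    def get(self, tenant_id: str):
        if tenant_id in self._shards:
            self._shards.move_to_end(tenant_id)  # mark as most recently used
            return self._shards[tenant_id]
        shard = self.open_shard(tenant_id)  # lazy load on first access
        self._shards[tenant_id] = shard
        if len(self._shards) > self.capacity:
            self._shards.popitem(last=False)  # evict the least recently used shard
        return shard

    def loaded(self) -> list:
        return list(self._shards)


if __name__ == "__main__":
    cache = ShardCache(capacity=2, open_shard=lambda tenant: f"db-{tenant}")
    for tenant in ("a", "b", "a", "c"):
        cache.get(tenant)
    print(cache.loaded())
```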
Break complex memories into atomic facts. Each scored and decays independently.
Token-aware assembly. Fill context windows optimally for any model.
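One simple way to do token-aware assembly is a greedy fill: take memories in descending score order and keep each one that still fits the budget. The sketch below is a heuristic, not a guaranteed-optimal packing, and it counts whitespace-separated words as a stand-in for a real tokenizer. All names are hypothetical.

```python
def assemble_context(memories: list, token_budget: int,
                     count_tokens=lambda text: len(text.split())) -> tuple:
    """Pick (score, text) memories by descending score until the budget is used up."""
    chosen, used = [], 0
    for score, text in sorted(memories, key=lambda m: m[0], reverse=True):
        cost = count_tokens(text)
        if used + cost <= token_budget:  # skip anything that would overflow the window
            chosen.append(text)
            used += cost
    return chosen, used


if __name__ == "__main__":
    memories = [
        (0.9, "user prefers tabs"),
        (0.5, "deploy target is a single binary on port 4200"),
        (0.7, "db is encrypted"),
    ]
    print(assemble_context(memories, token_budget=7))
```

Passing a model-specific `count_tokens` function is how the same routine would adapt to different context windows.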
Single POST /activity fans out to 6 services: Chiasm tasks, Axon events, Broca logs, Thymus metrics, Skills, Memory.
Secure credential daemon. Agent keys, secret resolution, proxy injection. Never hardcode a secret again.
Modern associative memory (Ramsauer et al., 2020). Pattern completion with exponential capacity.
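Pattern completion in a modern Hopfield network comes down to one update: weight every stored pattern by a softmax over its similarity to the cue, then return the weighted average. The sketch below uses plain Python lists; `beta`, the toy patterns, and the function names are illustrative assumptions rather than Kleos internals.

```python
import math


def softmax(values: list) -> list:
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def complete_pattern(stored: list, cue: list, beta: float = 8.0) -> list:
    """One retrieval step: softmax-weighted average of the stored patterns."""
    similarities = [beta * sum(s * c for s, c in zip(pattern, cue)) for pattern in stored]
    weights = softmax(similarities)
    return [sum(w * pattern[i] for w, pattern in zip(weights, stored))
            for i in range(len(cue))]


if __name__ == "__main__":
    stored = [[1, 1, -1, -1], [-1, 1, -1, 1], [1, -1, 1, -1]]
    partial_cue = [1, 1, 0, 0]  # the last two components are unknown
    print([round(x) for x in complete_pattern(stored, partial_cue)])
```

A larger `beta` makes retrieval snap harder onto the single closest pattern; a smaller one blends neighbors.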
6-stage autonomous consolidation: replay, merge, prune, discover, decorrelate, resolve. Runs on schedule like biological sleep.
Track cause/effect relationships. Build reasoning chains across memories with confidence scoring.
IBM Granite ONNX model. Semantic precision on top-K results after initial retrieval.
Secure tool execution. MCP, Shell, Web, GUI backends with sandboxing and security policies.
LanceDB vector index handles 1M+ memories per user. Single-user or production multi-tenant.
Structured reasoning protocol. spec_task, log_hypothesis, verify, challenge_code. Quality workflows baked in.
Session-scoped Guardian. Batched observation flushing with retry, file watcher, Prometheus metrics, Syntheos integration.
Every agent and model gets its own encrypted API key. Full audit trail of who stored what.
Drop-in hooks for Claude Code. Auto context loading, session tracking, memory storage. Zero manual prompting.
What it actually looks like to use persistent memory.
Your agent pulls context from Kleos. It knows the project state, your preferences, what you decided last time, and what's still unresolved.
New architecture choices, deployment configs, bug resolutions -- all stored with importance scoring and auto-linked to related memories.
Kleos catches contradictions between new information and existing knowledge. Your agent surfaces them instead of silently overwriting history.
Important memories grow stronger. Irrelevant details lose retrieval strength. The knowledge graph reorganizes around what actually matters.
Not a transcript dump. Not a vector search over flat files. Weighted, prioritized, personality-aware context -- assembled from a living knowledge graph.
Click "Recall" to see how spaced repetition builds lasting memory.
Kleos includes a built-in WebGL graph visualization. Explore connections between memories, search, create, edit, all from the browser.
Explore a live demo with sample memories and auto-generated links. Click nodes to inspect. Search, filter by category, create new memories. Keyboard shortcuts for power users.
Single activity endpoint that fans out to six downstream services. One POST triggers Chiasm task updates, Axon event publishing, Broca action logging, Thymus metrics, skill matching, and memory storage.
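The fan-out pattern can be sketched in-process as one function that hands the same activity to each downstream handler and isolates failures. The handler bodies, payload fields, and return values here are hypothetical placeholders; only the six service names come from the description above.

```python
def make_fanout(handlers: dict):
    """Build a post_activity function that forwards one activity to every handler."""
    def post_activity(activity: dict) -> dict:
        results = {}
        for name, handler in handlers.items():
            try:
                results[name] = handler(activity)
            except Exception as exc:  # one failing service must not block the others
                results[name] = f"error: {exc}"
        return results
    return post_activity


def failing_handler(activity: dict) -> str:
    raise RuntimeError("metrics backend unavailable")


if __name__ == "__main__":
    post_activity = make_fanout({
        "chiasm": lambda a: f"task updated for {a['agent']}",
        "axon": lambda a: "event published",
        "broca": lambda a: "action logged",
        "thymus": failing_handler,
        "skills": lambda a: "no skill matched",
        "memory": lambda a: "memory stored",
    })
    print(post_activity({"agent": "builder", "action": "deploy"}))
```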
Neuroscience-named coordination primitives that share the same substrate. No external message broker. No separate databases. Everything runs in-process, with no network hop between services.
Pub/sub with channels, retention windows, cursor-based consumption. Webhook push or pull polling. At-least-once delivery.
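Cursor-based consumption with at-least-once delivery means a consumer keeps seeing an event until it acknowledges it. A minimal in-memory sketch, with hypothetical class and method names and retention windows left out:

```python
class Channel:
    """Append-only event log with one read cursor per consumer."""

    def __init__(self):
        self._events = []
        self._cursors = {}  # consumer name -> offset of the next unacknowledged event

    def publish(self, payload) -> int:
        self._events.append(payload)
        return len(self._events) - 1  # the event's offset

    def poll(self, consumer: str, limit: int = 10) -> list:
        start = self._cursors.get(consumer, 0)
        # Polling does not advance the cursor, so unacked events are redelivered.
        return list(enumerate(self._events[start:start + limit], start=start))

    def ack(self, consumer: str, offset: int) -> None:
        # Acknowledging an offset moves the cursor just past it (never backwards).
        self._cursors[consumer] = max(self._cursors.get(consumer, 0), offset + 1)


if __name__ == "__main__":
    channel = Channel()
    for event in ("started", "progress", "done"):
        channel.publish(event)
    print(channel.poll("worker"))
    channel.ack("worker", 0)
    print(channel.poll("worker"))
```

Because delivery is at-least-once, consumers in this model need to tolerate seeing the same event twice.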
Permanent structured log of agent actions. Complex filtering by agent, service, action type. Causal links to Axon events.
Long-running task lifecycle management. State machine: active, paused, blocked, completed. Project-scoped for multi-agent work.
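The four task states can be modeled as a small transition table. Only the state names come from the description above; which transitions are legal is an assumption made for this sketch.

```python
# Assumed transition rules: paused and blocked tasks must return to active
# before completing, and completed is terminal.
ALLOWED = {
    "active": {"paused", "blocked", "completed"},
    "paused": {"active"},
    "blocked": {"active"},
    "completed": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"cannot move task from {current!r} to {target!r}")
    return target


if __name__ == "__main__":
    state = "active"
    for step in ("blocked", "active", "completed"):
        state = transition(state, step)
        print(state)
```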
Directory of active agents with heartbeat monitoring. Capability declarations, quality scores, drift flags.
DAG-based execution engine. Actions, decisions, LLM calls as nodes. State persistence across restarts.
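A DAG engine with resumable state can be sketched with the standard library's `graphlib`: run nodes in dependency order and skip any node whose result is already in the saved state. The node functions and the dict-based state are hypothetical; a real engine would persist the state to disk between steps.

```python
from graphlib import TopologicalSorter


def run_workflow(nodes: dict, depends_on: dict, state: dict = None) -> dict:
    """Execute node callables in dependency order, resuming from `state` if given."""
    state = dict(state or {})
    graph = {name: depends_on.get(name, ()) for name in nodes}
    for name in TopologicalSorter(graph).static_order():
        if name in state:
            continue  # finished before the restart, so do not run it again
        state[name] = nodes[name](state)  # each node can read upstream results
    return state


if __name__ == "__main__":
    nodes = {
        "fetch": lambda s: 2,                                  # an action
        "decide": lambda s: s["fetch"] > 1,                    # a decision
        "act": lambda s: "go" if s["decide"] else "stop",      # a follow-up action
    }
    depends_on = {"decide": ["fetch"], "act": ["decide"]}
    print(run_workflow(nodes, depends_on))
```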
Aggregates feedback signals into quality scores. Rubric-based evaluation. Reinforcement learning loop stored in Soma.
Hopfield network or external subprocess. Init, Query, Absorb, DreamCycle commands. JSON over STDIN/STDOUT protocol.
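To show what a JSON-over-stdin/stdout subprocess might look like, here is a toy responder for the four command names listed above. The line-delimited framing, every field name (`command`, `pattern`, `cue`, `match`), and the nearest-pattern lookup are assumptions for illustration; they are not the actual Kleos wire format.

```python
import json
import sys


def handle(request: dict, memory: list) -> dict:
    """Dispatch one request against an in-memory pattern list (hypothetical schema)."""
    command = request.get("command")
    if command == "Init":
        memory.clear()
        return {"ok": True}
    if command == "Absorb":
        memory.append(request["pattern"])
        return {"ok": True, "stored": len(memory)}
    if command == "Query":
        cue = request["cue"]
        # Return the stored pattern with the highest dot product against the cue.
        best = max(memory, key=lambda p: sum(a * b for a, b in zip(p, cue)), default=None)
        return {"ok": True, "match": best}
    if command == "DreamCycle":
        return {"ok": True, "note": "consolidation is not modeled in this sketch"}
    return {"ok": False, "error": f"unknown command: {command}"}


def serve(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read one JSON request per line and write one JSON reply per line."""
    memory = []
    for line in stdin:
        if line.strip():
            stdout.write(json.dumps(handle(json.loads(line), memory)) + "\n")
            stdout.flush()  # the parent process is waiting on each reply
```

Calling `serve()` at the bottom of a script would turn it into a subprocess a parent could drive over pipes.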
No more "search Kleos for..." in every prompt. Install the hooks once and memory works automatically: session context loads at start, memories are stored at session end, and relevant context surfaces on each turn.
Four hooks that wire up the entire system:
Structured reasoning workflow enforced via hooks:
Session-scoped Guardian:
Connect via MCP, HTTP API, or CLI. Drop Kleos into your existing workflow in minutes.
Not just authentication. Non-repudiable request signing, pre-action policy gates, layered middleware, and optional YubiKey integration. Security that does not depend on trust.
Non-repudiable request envelopes. ECDSA-P256 (YubiKey PIV) or Ed25519 (software). 60-second replay window with nonce protection.
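The replay-protection half of this can be sketched independently of the signature scheme: reject any request whose timestamp is outside the window, and reject any nonce already seen inside it. `ReplayGuard` is a hypothetical name, and signature verification (ECDSA or Ed25519) is deliberately left out of this sketch.

```python
class ReplayGuard:
    """Accepts each (nonce, timestamp) pair at most once within a time window."""

    def __init__(self, window_seconds: int = 60):
        self.window = window_seconds
        self._seen = {}  # nonce -> request timestamp

    def accept(self, nonce: str, timestamp: float, now: float) -> bool:
        # Forget nonces whose timestamps have aged out; a replay of those
        # requests fails the freshness check below anyway.
        self._seen = {n: t for n, t in self._seen.items() if now - t <= self.window}
        if abs(now - timestamp) > self.window:
            return False  # too old, or too far in the future
        if nonce in self._seen:
            return False  # replay inside the window
        self._seen[nonce] = timestamp
        return True


if __name__ == "__main__":
    guard = ReplayGuard()
    print(guard.accept("n1", timestamp=1000, now=1010))  # fresh request
    print(guard.accept("n1", timestamp=1000, now=1011))  # same nonce again
```

Bounding the window is what keeps the nonce store small: nothing older than the window ever needs to be remembered.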
Pre-action decision engine. Allow, warn, or block policies. Human-in-the-loop approval queue for sensitive operations.
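A pre-action gate can be sketched as a list of (predicate, outcome) rules where the most restrictive matching outcome wins. The rule format, the action fields, and the choice to queue blocked actions for human review are all assumptions made for this example.

```python
SEVERITY = {"allow": 0, "warn": 1, "block": 2}


def decide(action: dict, policies: list) -> str:
    """Return the most restrictive outcome among the policies that match."""
    verdict = "allow"
    for matches, outcome in policies:
        if matches(action) and SEVERITY[outcome] > SEVERITY[verdict]:
            verdict = outcome
    return verdict


def gate(action: dict, policies: list, approval_queue: list) -> str:
    """Evaluate an action before it runs; park blocked actions for a human."""
    verdict = decide(action, policies)
    if verdict == "block":
        approval_queue.append(action)  # assumed behavior: await manual approval
    return verdict


if __name__ == "__main__":
    policies = [
        (lambda a: a["kind"] == "delete", "block"),
        (lambda a: a.get("bulk", False), "warn"),
    ]
    queue = []
    print(gate({"kind": "write", "bulk": True}, policies, queue))
    print(gate({"kind": "delete", "bulk": True}, policies, queue))
    print(len(queue))
```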
6 layers: Tower-HTTP, safe mode (crash-loop detection), pre-auth rate limit, authentication, audit logging, tenant quotas.
PIV for remote API signing with public PEM registry. HMAC-SHA1 challenge-response for local credential unlocking.
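HMAC-SHA1 challenge-response works by sending the token a challenge and using its keyed response as proof that the token is present. The sketch below computes the response in software with Python's `hmac` module as a stand-in for the hardware key; the function names and the enrollment flow are illustrative assumptions.

```python
import hashlib
import hmac
import os


def respond(secret: bytes, challenge: bytes) -> bytes:
    """What the token computes: HMAC-SHA1 of the challenge under its secret."""
    return hmac.new(secret, challenge, hashlib.sha1).digest()


def verify(expected: bytes, response: bytes) -> bool:
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    secret = os.urandom(20)     # lives only on the token in a real setup
    challenge = os.urandom(32)  # stored next to the locked credentials
    enrolled = respond(secret, challenge)
    print(verify(enrolled, respond(secret, challenge)))          # right token
    print(verify(enrolled, respond(os.urandom(20), challenge)))  # wrong token
```

In a hardware setup the secret never leaves the device, so the host only ever sees challenges and responses.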
Per-tenant database encryption at rest. Key resolution from keyfile, environment variable, or YubiKey HMAC-SHA1.
AES-256-GCM credential daemon on port 4400. Keyless agent bootstrap. Secret resolution without plaintext exposure.
Rust-native performance with zero garbage collection pauses. Local ONNX inference means no API latency.
Internal benchmarks on synthetic workloads. Results vary with dataset size, embedding model, and usage patterns.
CLI playground, live memory graph, and store demo. All running in your browser.
One command. Full memory. Your hardware.
127.0.0.1:4200 · 16 crates · rusqlite + ONNX + LanceDB · Zero cloud dependencies