# 🧠 Cortex

The Second Brain for AI Agents — Local Knowledge Base · Single Binary · MCP Native

Cortex is a local knowledge base engine purpose-built for AI Agents. It ships as a single binary with native MCP protocol support, built-in hybrid search (vector + BM25), and an agent memory system. It runs 100% locally with zero external dependencies.

Give Claude Code, OpenCode, Cursor and other AI Agents a permanent memory 🧠

GitHub · 🎨 Landing Page


📊 Comparison · ⚡ Quick Start · ✨ Features · 🏗️ Architecture · 📡 API · 🔧 Config
🌐 Chinese Version · 🛠️ Dev · 📖 Docs

---

## 📊 Product Comparison

| Feature | 🚀 Cortex | Mem0 | AnythingLLM | ChromaDB | Qdrant | Dify |
|---------|-----------|------|-------------|----------|--------|------|
| **📦 Deployment** | | | | | | |
| Deployment | ✅ Single Binary (download & run) | ⚠️ pip/Docker (Python required) | ⚠️ Docker/Desktop (Node.js) | ⚠️ pip/Docker (Python) | ✅ Single Binary (download & run) | ⚠️ Docker Compose (multi-service) |
| Dependencies | ✅ Zero (Ollama optional) | ❌ LLM API | ❌ LLM API | ✅ None | ✅ None | ❌ Multiple |
| **🤖 AI Agent Integration** | | | | | | |
| MCP Protocol | ✅ Native (`cortex mcp`) | ⚠️ Plugin | ✅ Supported | ❌ No | ❌ No | ✅ Supported |
| MCP Tools | 🔧 5 tools (search/context/memory) | 🔧 1-2 | 🔧 1 | N/A | N/A | 🔧 1-2 |
| Agent Memory | ✅ Built-in (long-term + RAG) | ✅ Focused (multi-level) | ❌ Chat only | ❌ Vector DB | ❌ Vector DB | ⚠️ Basic |
| **🔍 Search** | | | | | | |
| Search Type | ✅ Hybrid (Vector + BM25 + RRF) | ✅ Hybrid | ✅ Vector | ✅ Vector/Hybrid | ✅ Hybrid | ⚠️ Backend-dependent |
| File Formats | 📄 MD/PDF/DOCX (+ code files) | N/A (memory only) | ✅ Multi | N/A (vectors only) | N/A (vectors only) | ✅ Multi |
| **📊 Operations** | | | | | | |
| Monitoring | ✅ Prometheus (39 metrics) | ⚠️ Dashboard | ⚠️ Basic | ❌ None | ❌ None | ✅ Grafana |
| Caching | ✅ L1+L2 (memory + SQLite) | ⚠️ Basic | ⚠️ Basic | ❌ None | ✅ Memory | ⚠️ Basic |
| Privacy | ✅ 100% Local (fully offline) | ⚠️ Local/Cloud | ✅ 100% Local | ✅ 100% Local | ✅ 100% Local | ⚠️ Local/Cloud |
| License | ✅ MIT (free to use, modify, commercialize) | ✅ Apache 2.0 | ✅ MIT | ✅ Apache 2.0 | ✅ Apache 2.0 | ⚠️ Restricted |

> **Cortex Differentiator**: Among the tools compared above, Cortex is the only one that combines single-binary deployment, native MCP protocol support, built-in agent memory, hybrid search, and Prometheus monitoring, all purpose-built for AI Agent scenarios.

---

## 🎯 Use Cases

| Use Case | Description |
|----------|-------------|
| 🤖 **AI Agent Memory** | Give Claude Code / OpenCode / Cursor persistent memory across sessions |
| 📚 **Team Knowledge Base** | Index team wikis, technical docs, and project specs into a searchable RAG knowledge base |
| 🔍 **Codebase Search** | Index Go/Python/JS code for natural-language semantic search |
| 🏢 **Enterprise Docs** | Private, local document retrieval for internal knowledge; data never leaves your network |
| 🧪 **RAG Backend** | Serve as the retrieval layer for RAG pipelines over both REST API and MCP |
| 🔐 **Privacy-First** | Finance, healthcare, legal: deploy 100% locally for sensitive data |

---

## ✨ Changelog

### 🧠 v2.2: MCP Memory Tools (2026-05-05)

- ✅ **5 MCP Tools**: `cortex_search` / `cortex_context` / `cortex_memory_write` / `cortex_memory_search` / `cortex_memory_delete`
- ✅ **Zero-Dependency Mode**: `embedding.provider: none`, FTS5-only, no Ollama required
- ✅ **Pure Go SQLite**: Switched to `modernc.org/sqlite`, no CGO/gcc needed
- ✅ **MCP Graceful Shutdown**: Signal handling, safe Ctrl+C exit
- ✅ **MCP Unit Tests**: 11 test cases covering all tool edge conditions

### 🔥 v2.1: Production Hardening (2026-04-25)

- ✅ **L1+L2 Two-Level Cache**: Memory + SQLite, 10x faster search
- ✅ **Graceful Shutdown**: 30s window for in-flight requests
- ✅ **Request Timeout**: Default 30s, search 60s, index 5min
- ✅ **Rate Limiting**: Token bucket, 100 req/s, burst 200
- ✅ **36 Test Cases**: Storage/Auth/Search core modules

### ✨ v2.0: Core Features

- ✅ **Memory System API**: Full CRUD for agent memory
- ✅ **Auth Persistence**: Users/Tokens/APIKeys in SQLite
- ✅ **Prometheus Monitoring**: 39 metrics on port 9090

---

## ⚡ Quick Start

```bash
# 1. Download
# macOS/Linux (the URL below is the Linux amd64 asset)
# unzip cannot read an archive from stdin, so save the file first
curl -fsSL -o cortex.zip https://github.com/lh123aa/cortex/releases/latest/download/cortex-linux-amd64.zip
unzip cortex.zip
chmod +x cortex

# Windows
# Invoke-WebRequest -Uri "..." -OutFile "cortex.zip"

# 2. Index your documents (run ./cortex if the binary is not on your PATH)
cortex index ~/my-docs

# 3. Start MCP server (for AI Agent integration)
cortex mcp

# 4. Search
cortex search "How to implement Go concurrency"
```

---

## ✨ Core Features

| Feature | Description |
|---------|-------------|
| 🚀 **Single Binary** | Download & run. No Python/Node/Docker: `curl` → `cortex mcp` |
| 🔌 **MCP Native** | 5 MCP tools for AI Agents, including `cortex_search` and `cortex_memory_write` |
| 🧠 **Memory System** | Long-term memory + RAG context; cross-session recall of user preferences |
| 🔍 **Hybrid Search** | Vector semantics + BM25 keywords, merged with RRF fusion for precise recall |
| ⚡ **L1+L2 Cache** | In-memory + SQLite two-level cache; 10x search speed improvement |
| 📊 **Prometheus** | 39 monitoring metrics at `:9090/metrics` |
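
The hybrid search feature listed above merges vector and BM25 results with RRF (Reciprocal Rank Fusion). The sketch below shows the general technique only, not Cortex's actual implementation: the function name `FuseRRF`, the document IDs, and the constant `k = 60` (the conventional default for RRF) are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfK is the RRF smoothing constant. 60 is the conventional default;
// whether Cortex uses this value is an assumption.
const rrfK = 60.0

// FuseRRF (hypothetical name) merges several ranked lists of document IDs.
// Each list adds 1/(rrfK + rank) to a document's score, with rank starting at 1,
// so documents ranked highly by more than one retriever rise to the top.
func FuseRRF(rankings ...[]string) []string {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (rrfK + float64(i+1))
		}
	}

	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}

	// Sort by descending fused score; break ties by ID for deterministic output.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	vectorHits := []string{"doc-a", "doc-b", "doc-c"} // e.g. from HNSW vector search
	bm25Hits := []string{"doc-b", "doc-d", "doc-a"}   // e.g. from FTS5 (BM25)
	fmt.Println(FuseRRF(vectorHits, bm25Hits))
	// prints: [doc-b doc-a doc-d doc-c]
}
```

Because RRF uses only rank positions, it needs no normalization between cosine similarities and BM25 scores, which is one common reason to choose it for hybrid retrieval.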

### 🔌 MCP Tools Reference

| Tool | Description | REST API Equivalent |
|------|-------------|--------------------|
| `cortex_search` | Hybrid search (vector + BM25) | `GET /v1/search` |
| `cortex_context` | RAG context assembly | `GET /v1/context` |
| `cortex_memory_write` | Write a memory entry | `POST /v1/memory` |
| `cortex_memory_search` | Search memory entries | `GET /v1/memory/search` |
| `cortex_memory_delete` | Delete a memory entry | `DELETE /v1/memory/:id` |

---

## 🏗️ Architecture

```
┌──────────────────────────────────────────────────────────┐
│                        Cortex CLI                        │
├──────────────────────────────────────────────────────────┤
│  index  │  search  │  context  │  serve  │  mcp          │
├──────────────────────────────────────────────────────────┤
│                   Hybrid Search Engine                   │
│            HNSW Vector Search  │  FTS5 (BM25)            │
├──────────────────────────────────────────────────────────┤
│                 L1+L2 Cache Layer (v2.1)                 │
│               go-cache (memory)  │  SQLite               │
├──────────────────────────────────────────────────────────┤
│                      SQLite Storage                      │
│       Documents │ Chunks │ Vectors │ Cache │ Users       │
├──────────────────────────────────────────────────────────┤
│                 MCP Protocol · REST API                  │
│      5 MCP Tools │ 15+ REST Endpoints │ Prometheus       │
└──────────────────────────────────────────────────────────┘
```

**Tech Stack:**

- **Language**: Go 1.21+, cross-platform single binary (pure Go, no CGO)
- **Storage**: SQLite + WAL, a zero-config embedded database
- **Vector**: HNSW, high-performance approximate nearest neighbor search
- **Embedding**: Ollama / ONNX / None (FTS5-only mode)
- **Protocol**: MCP SDK, native AI Agent communication
- **Monitoring**: Prometheus, 39 built-in metrics

---

## 📡 API Reference

| Category | Endpoint | Method | Description |
|----------|----------|--------|-------------|
| **Search** | `/v1/search` | GET | Hybrid search (vector + FTS) |
| | `/v1/context` | GET | RAG context assembly |
| **Memory** | `/v1/memory` | POST | Write memory entry |
| | `/v1/memory/batch` | POST | Batch write memories |
| | `/v1/memory/search` | GET | Search memories |
| | `/v1/memory/context` | GET | Memory RAG context |
| | `/v1/memory/:id` | DELETE | Delete memory |
| **Auth** | `/auth/register` | POST | Register user |
| | `/auth/login` | POST | Login |
| | `/auth/logout` | POST | Logout |
| **Monitor** | `/health` | GET | Health check |
| | `/metrics` | GET | Prometheus metrics (:9090) |

---

## 🔧 Configuration

```yaml
# ~/.cortex/config.yaml
cortex:
  db_path: ~/.cortex/cortex.db
  log_level: info
  auth_enabled: false

embedding:
  provider: ollama  # ollama | onnx | none (FTS5-only, no external service)
  ollama:
    base_url: http://localhost:11434
    model: nomic-embed-text

index:
  workers: 8
  max_tokens: 512

search:
  cache_ttl: 5m
  default_top_k: 10

prometheus:
  enabled: true
  port: 9090
```

---

## 📦 Supported File Formats

| Format | Support | Chunking Method |
|--------|---------|----------------|
| Markdown (.md) | ✅ | Hierarchical, heading-path traceable |
| PDF (.pdf) | ✅ | Text extraction, auto-chunking |
| Word (.docx) | ✅ | Paragraph parsing, structure preserved |
| Plain Text (.txt) | ✅ | Line/paragraph based |
| Code (.go/.py/.js/.ts/.java etc.) | ✅ | Function/class based |
| Config (.yaml/.json/.toml/.ini) | ✅ | Structured extraction |

---

## 🛠️ Development

```bash
git clone https://github.com/lh123aa/cortex.git
cd cortex
go build -o cortex ./cmd/cortex  # Pure Go, no CGO required
./cortex serve
go test ./...  # 114 tests
```

---

## 📊 Performance

| Metric | Value |
|--------|-------|
| Search Latency P50 | < 50ms (cache hit < 1ms) |
| Search Latency P95 | < 100ms |
| Cache Hit Rate | > 60% (L1+L2) |
| Index Throughput | > 100 files/min |
| Test Coverage | 114 unit tests |

---

## 🤝 Contributing

- 🐛 Found a bug? [Open an Issue](https://github.com/lh123aa/cortex/issues)
- 💡 Have an idea? [Start a Discussion](https://github.com/lh123aa/cortex/discussions)
- ⭐ Like the project? Give it a Star!

## 📄 License

**MIT License**: Free to use, modify, and commercialize.

---

## ⭐ Star History

[![Star History Chart](https://api.star-history.com/svg?repos=lh123aa/cortex&type=Date)](https://star-history.com/#lh123aa/cortex&Date)

---

## 💬 Community & Support

- ⭐ **Star the repo**: The best way to show support
- 🐛 **Report bugs**: [Open an Issue](https://github.com/lh123aa/cortex/issues)
- 💡 **Feature ideas**: [Start a Discussion](https://github.com/lh123aa/cortex/discussions)
- 📣 **Share it**: Recommend Cortex on HN, Reddit, Twitter
- 🤝 **Contribute**: Read [CONTRIBUTING.md](CONTRIBUTING.md)

## 🔗 Resources

- [MCP Protocol Specification](https://modelcontextprotocol.io): AI Agent communication standard
- [Ollama](https://ollama.ai): Local LLM & Embedding
- [OpenCode](https://opencode.ai): AI Agent Framework
- [Awesome MCP Servers](https://github.com/lh123aa/awesome-mcp-servers): MCP server list

---

🧠 Cortex v2.2 — Give AI Agents a Memory
Single Binary · Zero Config · MCP Native · 100% Local · MIT License