Most AI tools forget everything between sessions. MitrAI Memory Engine gives your AI a persistent, searchable knowledge base that automatically improves its own retrieval quality over time.

Upload documents, ask questions, and watch the system learn your domain — powered by DSPy optimization that compiles per-user query strategies from your actual data.

100+ LLM Providers
3 Interfaces
Self-Improving

Why it's different

🧠

Self-Improving Retrieval

DSPy autotuner generates training data from your uploads and compiles optimized query strategies per user. Your knowledge base gets smarter the more you use it.

🔌

Any LLM Provider

Ollama, OpenAI, Anthropic, Cohere, and 100+ more via LiteLLM. One config change to swap providers. No code changes, no lock-in.

🔒

Local-First Privacy

Run entirely on your machine with Ollama. Your documents, embeddings, and optimized models never leave your hardware.

🏗

Production Architecture

Qdrant vector store, per-user collection isolation, Bearer token auth, health probes, and graceful degradation built in.

Three ways to connect

REST API

15 endpoints for web apps, mobile, and custom integrations. Full OpenAPI docs included.

curl localhost:8100/api/query
MCP Server

9 tools for Claude Desktop, Cursor, OpenClaw, and any MCP-compatible AI agent.

memory-mcp
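For Claude Desktop, registering the server is a one-entry addition to `claude_desktop_config.json`. A minimal sketch, assuming the `memory-mcp` binary is on your PATH; the server name `"memory"` is arbitrary:

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp"
    }
  }
}
```

Cursor and other MCP-compatible clients use the same server entry shape in their own config files.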
CLI

Interactive setup wizard, health checks, ingest, query — all from your terminal.

memory-engine init

Works with your agent stack

One memory backend. Every agent framework. Pick your integration path.

REST

LangChain

Use as a custom retriever. Two HTTP calls: /api/ingest-batch to store, /api/query to retrieve. Fits into any LangChain chain.

from langchain_core.retrievers import BaseRetriever
from langchain_core.documents import Document
import httpx

class MemoryRetriever(BaseRetriever):
    def _get_relevant_documents(self, query, *, run_manager=None):
        r = httpx.post("http://localhost:8100/api/query",
            json={"query": query, "user_id": "my-agent"})
        return [Document(page_content=c["content"])
                for c in r.json()["results"]]
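The retriever covers the query side; storing documents goes through /api/ingest-batch. A minimal sketch of the ingest half — the payload field names (`documents`, `content`, `metadata`) are assumptions here, so check the OpenAPI docs for the exact schema:

```python
def build_ingest_payload(texts, user_id="my-agent"):
    # Assumed request shape for /api/ingest-batch -- verify against the OpenAPI spec.
    return {
        "user_id": user_id,
        "documents": [{"content": t, "metadata": {"source": "langchain"}}
                      for t in texts],
    }

def ingest(texts, user_id="my-agent"):
    import httpx  # lazy import: only needed when actually calling the API
    payload = build_ingest_payload(texts, user_id)
    r = httpx.post("http://localhost:8100/api/ingest-batch", json=payload)
    r.raise_for_status()
    return r.json()
```

Ingest once at startup, then let the retriever handle every query after that.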
REST

CrewAI

Give CrewAI agents long-term memory across tasks. Each crew member gets its own user namespace with isolated collections.

import httpx
from crewai_tools import tool

@tool("search_knowledge")
def search(query: str) -> str:
    """Search the team knowledge base."""
    r = httpx.post("http://localhost:8100/api/query",
        json={"query": query, "user_id": "crew-research"})
    return r.json()["results"][0]["content"]
REST

AutoGen / AG2

Register memory as a function tool. AutoGen agents call it automatically when they need context from uploaded documents.

import httpx

def query_memory(query: str, user_id: str = "autogen-team") -> list:
    """Search the knowledge base."""
    r = httpx.post("http://localhost:8100/api/query",
        json={"query": query, "user_id": user_id})
    return r.json()["results"]

# Register on both sides: the assistant proposes the call,
# the executor agent (e.g. a UserProxyAgent) actually runs it.
assistant.register_for_llm(
    name="query_memory",
    description="Search the knowledge base")(query_memory)
user_proxy.register_for_execution(name="query_memory")(query_memory)
REST

Any Agent / Custom

15 REST endpoints. If your framework can make HTTP calls, it can use the Memory Engine. Bearer token auth, JSON in/out, OpenAPI spec.

curl -X POST http://localhost:8100/api/query \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "quarterly revenue",
       "user_id": "my-agent",
       "top_k": 5}'

Built for real workflows

🤖

Agent Memory

Give any AI agent persistent memory across sessions. Ingest context once, recall it forever. Each agent gets its own namespace.

📚

Team Knowledge Base

Ingest company docs, wikis, and SOPs. Every team member's AI assistant shares the same searchable knowledge.

💻

Code Context

Index repos, ADRs, and design docs. Get contextual answers about your codebase inside Cursor or any MCP-enabled editor.

🎓

Research & Learning

Drop papers and notes. The engine builds connections across your reading and surfaces relevant context as you ask questions.

Provider presets

Start with a preset, customize with environment variables.

🦙 Ollama Free · Fully local · Best privacy
🟢 OpenAI Best quality · API key required
🟠 Anthropic Claude models · API key required
🔵 Cohere Great embeddings · Free tier available
⚡ Hybrid Local embed + cloud optimize · Coming soon
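As a sketch of what "customize with environment variables" looks like in practice — the MEMORY_ENGINE_* names below are hypothetical placeholders (check the project docs for the real keys), while OPENAI_API_KEY is the standard variable LiteLLM reads for OpenAI:

```shell
# Hypothetical config keys -- illustrative only; see the docs for exact names.
export MEMORY_ENGINE_PROVIDER=openai
export MEMORY_ENGINE_EMBED_MODEL=ollama/nomic-embed-text  # keep embeddings local
export OPENAI_API_KEY=sk-...                              # read by LiteLLM
```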
Experience It

Try the Memory Engine with Mitrai App

Mitrai is a private, local AI chat app with the Memory Engine built in. Upload documents, ask questions, and see self-improving retrieval in action — no integration required.

💬 Chat with Documents

Drop files into the conversation. Memory Engine indexes them and returns cited answers.

🧠 Self-Improving Answers

The more you use it, the better retrieval gets. DSPy optimization runs in the background.

🔒 Completely Local

Ollama models, local vector store, no data leaves your machine. One-click install.

Try the cloud version to see how the engine works, or download the Private Chat version if you'd rather keep everything local.

Feature demos (YouTube series)

Short, focused videos — coming soon. We'll add them here feature-by-feature.

Memory Engine · Coming soon
Docs Q&A · Coming soon
MCP Integration · Coming soon
Self-hosting · Coming soon