Last week I attended MongoDB Dev Day here in Chicago, a two-day deep dive into modern data modeling, search, and building AI applications on MongoDB. Going in, I was curious about one thing in particular: how well does MongoDB fit into agentic workflows?
I left convinced that the document model is one of the most underrated primitives for building AI agents — especially when it comes to agent memory. This post is a recap of what I learned, with code examples, and a focused look at why MongoDB is a strong backbone for agent memory.
Why MongoDB Clicks for AI Workflows
LLMs and agents naturally produce and consume JSON-like, semi-structured data. Tool calls, intermediate reasoning, plans, observations, and memory all tend to be hierarchical and shape-shifting. Traditional relational schemas force you to flatten, normalize, and translate that data on the way in and out.
MongoDB largely avoids that impedance mismatch:
- An LLM’s output ≈ a MongoDB document.
- Schemas can evolve as your agent evolves.
- Vector embeddings live alongside the data they describe — no separate vector DB to sync.
That last point is the unlock for agent memory, which I’ll get to. First, a quick walk through the foundations.
Day 1: Data Modeling & Search Foundations
Document Modeling: Embed vs. Reference
We built an online library app to explore the central tradeoff in document design: embedding vs. referencing.
Embedding keeps related data co-located so you avoid joins:
// Embedded: a book with its reviews inline
{
  "_id": "book_123",
  "title": "The Pragmatic Programmer",
  "authors": ["Andy Hunt", "Dave Thomas"],
  "reviews": [
    { "user": "alice", "rating": 5, "text": "Classic." },
    { "user": "bob", "rating": 4, "text": "Still relevant." }
  ]
}
This is great… until it isn’t. If reviews grow unbounded, or are updated independently of the book, referencing is better:
// Referenced
// books collection
{ "_id": "book_123", "title": "The Pragmatic Programmer" }
// reviews collection
{ "_id": "rev_1", "book_id": "book_123", "user": "alice", "rating": 5 }
{ "_id": "rev_2", "book_id": "book_123", "user": "bob", "rating": 4 }
Rule of thumb: embed when data is read together and bounded in size; reference when data grows unbounded or is written independently.
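Referencing moves the join to read time. Here is a minimal sketch of how a book and its reviews could be read back together with a `$lookup` stage, assuming the `books` and `reviews` collections and the `book_id` field from the example above. The helper name `book_with_reviews_pipeline` and the `max_reviews` cap are my own, not from the workshop. It only builds the pipeline, which you would then pass to `db.books.aggregate(...)`.

```python
from typing import Any


def book_with_reviews_pipeline(book_id: str, max_reviews: int = 20) -> list[dict[str, Any]]:
    """Build an aggregation pipeline that joins one book with its reviews."""
    return [
        # Select the single book we care about.
        {"$match": {"_id": book_id}},
        # Pull in every review whose book_id points at this book.
        {
            "$lookup": {
                "from": "reviews",
                "localField": "_id",
                "foreignField": "book_id",
                "as": "reviews",
            }
        },
        # Cap the joined array so a popular book cannot return an unbounded result.
        {"$set": {"reviews": {"$slice": ["$reviews", max_reviews]}}},
    ]


pipeline = book_with_reviews_pipeline("book_123", max_reviews=5)
print(pipeline[1]["$lookup"]["from"])
```

The cap mirrors the rule of thumb: even when the data is referenced, the read path should stay bounded.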
Search Fundamentals
MongoDB’s Atlas Search gives you full-text search built into the database — no separate Elasticsearch cluster required. Highlights:
- Autocomplete, fuzzy search, filters, facets
- Multi-language support (40+ languages)
- Custom scoring, synonyms, highlighting, more-like-this
A simple fuzzy search aggregation:
db.books.aggregate([
  {
    $search: {
      index: "default",
      text: {
        query: "pragmatik programer",
        path: ["title", "authors"],
        fuzzy: { maxEdits: 2 }
      }
    }
  },
  { $limit: 10 }
]);
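If you drive the same search from application code, it helps to build the pipeline in one place and validate the inputs first. This is a minimal sketch under my own assumptions: the function name `fuzzy_search_pipeline` is hypothetical, and the projected fields (`title`, `authors`) come from the library example. It only constructs the pipeline. Running it still needs an Atlas Search index named `default`.

```python
from typing import Any


def fuzzy_search_pipeline(
    query: str,
    paths: list[str],
    max_edits: int = 2,
    limit: int = 10,
    index: str = "default",
) -> list[dict[str, Any]]:
    """Build a $search pipeline that tolerates small typos in the query."""
    # Atlas Search accepts only 1 or 2 for fuzzy.maxEdits.
    if max_edits not in (1, 2):
        raise ValueError("max_edits must be 1 or 2")
    if not query.strip():
        raise ValueError("query must not be empty")
    return [
        {
            "$search": {
                "index": index,
                "text": {
                    "query": query,
                    "path": paths,
                    "fuzzy": {"maxEdits": max_edits},
                },
            }
        },
        {"$limit": limit},
        # Surface the relevance score next to each hit.
        {"$project": {"title": 1, "authors": 1, "score": {"$meta": "searchScore"}}},
    ]


search_pipeline = fuzzy_search_pipeline("pragmatik programer", ["title", "authors"])
```

With pymongo, the result would be passed straight to `db.books.aggregate(search_pipeline)`.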
We ended Day 1 by earning a skill badge — a nice way to validate everything hands-on.
Day 2: Building AI Applications
Vector Search + Auto Embeddings
The standout feature of Day 2 was MongoDB’s auto embedding capability: embeddings are generated automatically when a document is inserted or updated, and they’re stored inside the document itself.
That means no sync pipeline between your operational DB and your vector store. One source of truth.
A vector index definition:
db.memories.createSearchIndex({
  name: "memory_vector_index",
  type: "vectorSearch",
  definition: {
    fields: [
      {
        type: "vector",
        path: "embedding",
        numDimensions: 1536,
        similarity: "cosine"
      },
      { type: "filter", path: "agent_id" },
      { type: "filter", path: "type" }
    ]
  }
});
And a vector search query, combining semantic similarity with metadata filters:
db.memories.aggregate([
  {
    $vectorSearch: {
      index: "memory_vector_index",
      path: "embedding",
      queryVector: queryEmbedding, // produced by your embedding model
      numCandidates: 100,
      limit: 5,
      filter: {
        agent_id: "agent_42",
        type: "long_term"
      }
    }
  },
  {
    $project: {
      content: 1,
      type: 1,
      created_at: 1,
      score: { $meta: "vectorSearchScore" }
    }
  }
]);
This is the foundation for RAG — and, more interestingly, for agent memory.
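To make the `similarity: "cosine"` setting and the returned `score` less abstract, here is a small pure-Python sketch of the underlying math. One assumption to verify against the docs: as I understand it, Atlas normalizes the cosine score into the 0 to 1 range as `(1 + cosine) / 2`. The function names are my own.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def normalized_score(a: list[float], b: list[float]) -> float:
    """Map cosine similarity from [-1, 1] onto [0, 1] (assumed Atlas normalization)."""
    return (1 + cosine_similarity(a, b)) / 2


same_direction = normalized_score([1.0, 0.0], [1.0, 0.0])
orthogonal = normalized_score([1.0, 0.0], [0.0, 1.0])
opposite = normalized_score([1.0, 0.0], [-1.0, 0.0])
```

Under that assumption, identical directions score 1, unrelated (orthogonal) vectors score 0.5, and opposite vectors score 0, which is worth keeping in mind when picking a relevance cutoff.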
The Main Event: Agent Memory in MongoDB
Here’s where MongoDB really shines. Most agent frameworks need three kinds of memory:
- Short-term memory — the current conversation / scratchpad
- Long-term memory — semantic recall across sessions
- Structured memory — facts, preferences, entities
In a typical stack, that’s three systems (Redis, a vector DB, a relational DB) glued together. In MongoDB, it’s one collection with a flexible schema.
A Unified Memory Document
{
  "_id": "mem_9f2c",
  "agent_id": "agent_42",
  "session_id": "sess_1029",
  "type": "long_term",   // short_term | long_term | fact
  "role": "user",        // user | assistant | tool | system
  "content": "I prefer concise answers and code in TypeScript.",
  "metadata": {
    "source": "conversation",
    "importance": 0.82,
    "entities": ["user_preference", "language:typescript"]
  },
  "embedding": [0.0123, -0.0456, ...],   // auto-generated
  "created_at": "2025-01-15T14:32:11Z",
  "last_accessed": "2025-01-16T09:11:02Z"
}
One schema flexes to cover all three memory types. No migrations when the agent evolves.
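A flexible schema still benefits from a little validation at the application edge. Here is a minimal sketch of the document shape above as a Python dataclass, assuming the `type` and `role` values listed in the comments are the full set. The `Memory` class and its `to_document` method are my own illustration, not part of any MongoDB library.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

MEMORY_TYPES = {"short_term", "long_term", "fact"}
ROLES = {"user", "assistant", "tool", "system"}


@dataclass
class Memory:
    agent_id: str
    session_id: str
    type: str
    role: str
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        # Reject values the retrieval filters would silently never match.
        if self.type not in MEMORY_TYPES:
            raise ValueError(f"unknown memory type: {self.type!r}")
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role!r}")

    def to_document(self) -> dict[str, Any]:
        """Return a plain dict ready for insert_one()."""
        return {
            "agent_id": self.agent_id,
            "session_id": self.session_id,
            "type": self.type,
            "role": self.role,
            "content": self.content,
            "metadata": dict(self.metadata),
            "created_at": self.created_at,
            "last_accessed": self.created_at,
        }


fact = Memory(
    agent_id="agent_42",
    session_id="sess_1029",
    type="fact",
    role="user",
    content="I prefer concise answers and code in TypeScript.",
    metadata={"importance": 0.82},
)
document = fact.to_document()
```

The same class covers all three memory types. Only the `type` value and whatever goes into `metadata` change.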
Writing Memories
Here’s a simplified Python snippet for storing a memory. With auto embeddings configured, you don’t even need to compute the vector yourself:
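import os
from datetime import datetime, timezone

from pymongo import MongoClient

# Assumption: the connection string is provided in a MONGO_URI environment variable.
MONGO_URI = os.environ["MONGO_URI"]

client = MongoClient(MONGO_URI)
memories = client.agents.memories


def remember(agent_id: str, session_id: str, content: str, mem_type: str, role: str = "user"):
    # Timezone-aware timestamp; datetime.utcnow() is deprecated as of Python 3.12.
    now = datetime.now(timezone.utc)
    memories.insert_one({
        "agent_id": agent_id,
        "session_id": session_id,
        "type": mem_type,
        "role": role,        # needed later to rebuild the conversation transcript
        "content": content,  # embedding auto-generated
        "created_at": now,
        "last_accessed": now,
    })


remember(
    agent_id="agent_42",
    session_id="sess_1029",
    content="User prefers concise answers and TypeScript code samples.",
    mem_type="long_term",
)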
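If auto embeddings are not available on your cluster, the same write path works with a vector you compute yourself. This is a minimal sketch under stated assumptions: `embed` is any callable that turns text into a list of floats (your embedding model of choice), `collection` is anything with an `insert_one` method, and `InMemoryCollection` and `toy_embed` are hypothetical stand-ins so the example runs without a database or a model.

```python
from datetime import datetime, timezone
from typing import Any, Callable


def remember_with_embedding(
    collection: Any,
    embed: Callable[[str], list[float]],
    *,
    agent_id: str,
    session_id: str,
    content: str,
    mem_type: str,
    role: str = "user",
    expected_dims: int = 1536,
) -> dict[str, Any]:
    """Embed `content` client-side, then store text and vector in one document."""
    vector = embed(content)
    # Catch dimension mismatches before they reach the vector index.
    if len(vector) != expected_dims:
        raise ValueError(f"expected {expected_dims} dimensions, got {len(vector)}")
    now = datetime.now(timezone.utc)
    document = {
        "agent_id": agent_id,
        "session_id": session_id,
        "type": mem_type,
        "role": role,
        "content": content,
        "embedding": vector,
        "created_at": now,
        "last_accessed": now,
    }
    collection.insert_one(document)
    return document


class InMemoryCollection:
    """Hypothetical stand-in for a pymongo collection."""

    def __init__(self) -> None:
        self.documents: list[dict[str, Any]] = []

    def insert_one(self, document: dict[str, Any]) -> None:
        self.documents.append(document)


def toy_embed(text: str) -> list[float]:
    """Hypothetical embedder: four simple text statistics, not a real model."""
    return [float(len(text)), float(text.count(" ")), float(text.count("e")), 1.0]


store = InMemoryCollection()
saved = remember_with_embedding(
    store,
    toy_embed,
    agent_id="agent_42",
    session_id="sess_1029",
    content="User prefers concise answers.",
    mem_type="long_term",
    expected_dims=4,
)
```

In real use, `expected_dims` should match the `numDimensions` value in the vector index definition.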
Retrieving Memories
When the agent needs context, do a hybrid retrieval — recent short-term messages plus semantically relevant long-term memories:
def recall(agent_id: str, session_id: str, query_embedding: list[float]):
    # Recent short-term context (last 10 turns)
    short_term = list(memories.find(
        {"agent_id": agent_id, "session_id": session_id, "type": "short_term"}
    ).sort("created_at", -1).limit(10))

    # Semantically relevant long-term memories
    long_term = list(memories.aggregate([
        {
            "$vectorSearch": {
                "index": "memory_vector_index",
                "path": "embedding",
                "queryVector": query_embedding,
                "numCandidates": 100,
                "limit": 5,
                "filter": {"agent_id": agent_id, "type": "long_term"}
            }
        },
        {"$project": {"content": 1, "score": {"$meta": "vectorSearchScore"}}}
    ]))

    return {"short_term": short_term, "long_term": long_term}
Building the Prompt
Then fold both into your prompt:
def build_prompt(user_msg: str, memory: dict) -> str:
    long_term = "\n".join(f"- {m['content']}" for m in memory["long_term"])
    # Short-term results arrive newest first, so reverse them into chronological order.
    # Fall back to "user" for documents stored without a role.
    short_term = "\n".join(
        f"{m.get('role', 'user')}: {m['content']}" for m in reversed(memory["short_term"])
    )
    return f"""You are a helpful assistant.

Relevant long-term memory about the user:
{long_term}

Recent conversation:
{short_term}

User: {user_msg}
Assistant:"""
That’s a fully functional agent memory layer in one collection, one index, and a handful of queries.
Bonus: Memory Decay & Reinforcement
Because each memory is just a document, you can model memory dynamics with ordinary updates. Record each access so that frequently used memories can be reinforced:
db.memories.updateOne(
  { _id: "mem_9f2c" },
  {
    $set: { last_accessed: new Date() },
    $inc: { "metadata.access_count": 1 }
  }
);
Or prune stale, low-importance short-term memories on a schedule (here, anything older than seven days with importance below 0.3):
db.memories.deleteMany({
  type: "short_term",
  created_at: { $lt: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000) },
  "metadata.importance": { $lt: 0.3 }
});
Try doing that cleanly across three separate systems.
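For a Python-side version of the same idea, here is a minimal sketch of a prune filter plus a simple decay function. The exponential half-life model is my own assumption for illustration and was not prescribed at the event. The function names and default values are hypothetical too.

```python
from datetime import datetime, timedelta, timezone
from typing import Any


def stale_memory_filter(
    now: datetime, max_age_days: int = 7, min_importance: float = 0.3
) -> dict[str, Any]:
    """Filter for delete_many(): old short-term memories below an importance floor."""
    return {
        "type": "short_term",
        "created_at": {"$lt": now - timedelta(days=max_age_days)},
        "metadata.importance": {"$lt": min_importance},
    }


def decayed_importance(
    importance: float,
    last_accessed: datetime,
    now: datetime,
    half_life_days: float = 7.0,
) -> float:
    """Halve the stored importance for every half-life since the last access."""
    age_days = (now - last_accessed).total_seconds() / 86400
    # Guard against clock skew producing a negative age.
    age_days = max(age_days, 0.0)
    return importance * 0.5 ** (age_days / half_life_days)


now = datetime(2025, 1, 22, tzinfo=timezone.utc)
prune_filter = stale_memory_filter(now)
week_old = decayed_importance(0.8, now - timedelta(days=7), now)
```

One way to use it: rank recalled memories by `decayed_importance` at read time, and call `memories.delete_many(stale_memory_filter(now))` from a scheduled job. Documents that have no `metadata.importance` field will not match the filter, so they are never pruned by it.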
Key Takeaways
- The document model matches how agents think. JSON in, JSON out, no translation layer.
- Auto embeddings collapse the stack. Your operational data and its vector representation live in the same document, so there is no second store to keep in sync.
- Agent memory becomes a single collection. Short-term, long-term, and structured memory all fit one flexible schema, with hybrid retrieval handled by one find() and one $vectorSearch aggregation.
- You can model real memory dynamics (decay, reinforcement, importance) with plain MongoDB updates — no custom infrastructure.
If you’re building agentic workflows, RAG systems, or anything where the data is fluid and the access patterns mix semantic + structured queries, MongoDB deserves a serious look. The fewer moving parts in your agent infrastructure, the more time you spend on the agent itself.
Thanks to the MongoDB team for a great two days. If you want to dig in further, check out the Data Modeling slides, Search Fundamentals, and the Vector Search lab.