AI Agent Memory Management: Complete Solution from Short-term to Persistent Memory

Complete guide to implementing context memory for AI Agents - covers vector database storage, summarization techniques, and persistent storage solutions.

A common pain point in conversational AI: every new conversation starts from scratch, because the model carries nothing over from previous sessions. This guide walks through a three-layer memory architecture that addresses this problem.

Three-Layer Memory Architecture

The solution uses three distinct memory layers:

  • Long-term Memory - stores preferences and key decisions (weeks to years retention)
  • Mid-term Memory - stores current project context (days to weeks retention)
  • Short-term Memory - stores immediate conversation (current session only)

Implementation

Short-term Memory: Recent Message Buffer

from collections import deque

class ShortTermMemory:
    def __init__(self, max_items=10):
        self.buffer = deque(maxlen=max_items)

    def add(self, role: str, content: str):
        self.buffer.append({"role": role, "content": content})

    def get_context(self, system_prompt: str) -> list:
        messages = [{"role": "system", "content": system_prompt}]
        messages.extend(self.buffer)
        return messages
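
To make the eviction behavior concrete, here is a small usage sketch. It repeats a compact copy of the class above so that it runs on its own; the `max_items=3` limit and the sample messages are illustrative assumptions, not values from a real deployment.

```python
from collections import deque


class ShortTermMemory:
    """Compact copy of the buffer above, repeated so this sketch runs standalone."""

    def __init__(self, max_items=10):
        self.buffer = deque(maxlen=max_items)

    def add(self, role: str, content: str):
        self.buffer.append({"role": role, "content": content})

    def get_context(self, system_prompt: str) -> list:
        return [{"role": "system", "content": system_prompt}, *self.buffer]


memory = ShortTermMemory(max_items=3)  # small limit chosen for illustration
for i in range(5):
    memory.add("user", f"message {i}")

context = memory.get_context("You are a helpful assistant.")
# The deque silently dropped "message 0" and "message 1";
# only the system prompt plus the 3 newest messages remain.
print([m["content"] for m in context])
```

Because `deque(maxlen=...)` discards the oldest entry on overflow, anything worth keeping has to be copied to another layer before it falls out of the buffer.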

Mid-term Memory: Vector Database Storage

import chromadb
from chromadb.config import Settings

class MidTermMemory:
    def __init__(self, collection_name="memories"):
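        # Note: chromadb.Client() keeps data in memory only, so memories are
        # lost when the process exits. For retention across restarts, consider
        # chromadb.PersistentClient(path=...) instead.
        self.client = chromadb.Client(Settings(
            anonymized_telemetry=False,
            allow_reset=True
        ))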
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            metadata={"hnsw:space": "cosine"}
        )

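    def add_memory(self, text: str, metadata: dict):
        import uuid
        # Use a random UUID so every memory gets a unique ID, even when the
        # metadata carries no timestamp.
        self.collection.add(
            documents=[text],
            metadatas=[metadata],
            ids=[f"mem_{uuid.uuid4().hex}"]
        )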

    def search(self, query: str, top_k=3) -> list:
        results = self.collection.query(
            query_texts=[query],
            n_results=top_k
        )
        return results.get("documents", [[]])[0]
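
The `"hnsw:space": "cosine"` setting asks Chroma to rank stored memories by cosine distance (one minus cosine similarity) between embedding vectors. The pure-Python sketch below illustrates that ranking idea only. The `embed` function is a hypothetical bag-of-words stand-in for a real embedding model, and the brute-force scan stands in for an approximate HNSW index; neither reflects Chroma's internals.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical toy embedding: word counts (a real system uses a neural model)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(memories: list[str], query: str, top_k: int = 3) -> list[str]:
    """Brute-force search: score every memory against the query, keep the best top_k."""
    query_vec = embed(query)
    ranked = sorted(
        memories,
        key=lambda text: cosine_similarity(embed(text), query_vec),
        reverse=True,
    )
    return ranked[:top_k]


memories = [
    "user prefers dark mode in the editor",
    "project deadline moved to friday",
    "deploy script lives in the tools folder",
]
best = search(memories, "when is the project deadline", top_k=1)
print(best)
```

With real embeddings the match does not depend on shared words, which is the main reason to use a vector store rather than keyword lookup.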

Long-term Memory: Key Information Summary

class LongTermMemory:
    def __init__(self, db_path="./memory.db"):
        import sqlite3
        self.conn = sqlite3.connect(db_path)
        self._init_db()

    def _init_db(self):
        self.conn.execute("""
            CREATE TABLE IF NOT EXISTS memories (
                id INTEGER PRIMARY KEY,
                content TEXT,
                importance TEXT,
                created_at INTEGER
            )
        """)
        self.conn.commit()

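    def save_key_memory(self, content: str, importance: str = "medium"):
        import time
        self.conn.execute(
            "INSERT INTO memories (content, importance, created_at) VALUES (?, ?, ?)",
            (content, importance, int(time.time()))
        )
        self.conn.commit()

    def get_preferences(self, limit: int = 20) -> list:
        # Read path used by AgentMemory.get_full_context. Returning the newest
        # stored memories is an assumption; adapt the query to your needs.
        rows = self.conn.execute(
            "SELECT content FROM memories ORDER BY created_at DESC, id DESC LIMIT ?",
            (limit,)
        ).fetchall()
        return [row[0] for row in rows]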
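
Because `importance` is stored as text, sorting on that column alphabetically would not give a meaningful order. The sketch below shows one way to rank memories on read. It assumes the levels are `high`, `medium`, and `low` (only `medium` appears above), uses an in-memory SQLite database for illustration, and introduces a hypothetical helper named `top_memories`.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("""
    CREATE TABLE IF NOT EXISTS memories (
        id INTEGER PRIMARY KEY,
        content TEXT,
        importance TEXT,
        created_at INTEGER
    )
""")

sample_rows = [
    ("Prefers concise answers", "medium"),
    ("Allergic to peanuts", "high"),
    ("Likes the color green", "low"),
]
for content, importance in sample_rows:
    conn.execute(
        "INSERT INTO memories (content, importance, created_at) VALUES (?, ?, ?)",
        (content, importance, int(time.time())),
    )
conn.commit()


def top_memories(conn: sqlite3.Connection, limit: int = 5) -> list[str]:
    """Hypothetical helper: most important first, newest first within a level."""
    cursor = conn.execute(
        """
        SELECT content FROM memories
        ORDER BY CASE importance
                     WHEN 'high' THEN 0
                     WHEN 'medium' THEN 1
                     ELSE 2
                 END,
                 created_at DESC,
                 id DESC
        LIMIT ?
        """,
        (limit,),
    )
    return [row[0] for row in cursor.fetchall()]


result = top_memories(conn, limit=2)
print(result)
```

Storing importance as an integer column would make the `CASE` expression unnecessary; the text column is kept here to match the schema above.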

Complete Agent Memory System

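class AgentMemory:
    def __init__(self, max_items: int = 10):
        self.max_items = max_items
        self.short = ShortTermMemory(max_items=max_items)
        self.mid = MidTermMemory()
        self.long = LongTermMemory()
        self._unsynced = 0  # messages added since the last mid-term sync

    def remember(self, role: str, content: str):
        import time
        self.short.add(role, content)
        self._unsynced += 1

        # Sync to mid-term once a full buffer of new messages has accumulated.
        # Counting new messages avoids re-summarizing on every call after the
        # deque reaches its maximum length.
        if self._unsynced >= self.max_items:
            summary = self._summarize(self.short.buffer)
            self.mid.add_memory(summary, {
                "source": "conversation",
                "timestamp": int(time.time() * 1000),
            })
            self._unsynced = 0

    def _summarize(self, messages) -> str:
        # Placeholder (assumption): joins the messages verbatim. Swap in an
        # LLM summarization call for real use.
        return "\n".join(f"{m['role']}: {m['content']}" for m in messages)

    def get_full_context(self, query: str) -> str:
        # Requires LongTermMemory to expose a get_preferences() method.
        preferences = self.long.get_preferences()
        related = self.mid.search(query)
        recent = list(self.short.buffer)

        return f"Preferences: {preferences}\nRelated: {related}\nRecent: {recent}"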
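
`get_full_context` returns a single string, while chat-style model APIs usually expect a list of role-tagged messages. The sketch below shows one possible way to fold the three layers into that shape. The helper name `build_messages` and the prompt layout are assumptions for illustration, not part of the system above.

```python
def build_messages(
    system_prompt: str,
    preferences: list[str],
    related: list[str],
    recent: list[dict],
) -> list[dict]:
    """Hypothetical helper: put long- and mid-term memory into the system
    message, then append the short-term buffer as ordinary chat turns."""
    sections = [system_prompt]
    if preferences:
        sections.append(
            "Known preferences:\n" + "\n".join(f"- {p}" for p in preferences)
        )
    if related:
        sections.append(
            "Related memories:\n" + "\n".join(f"- {r}" for r in related)
        )
    system_message = {"role": "system", "content": "\n\n".join(sections)}
    return [system_message, *recent]


messages = build_messages(
    system_prompt="You are a helpful assistant.",
    preferences=["Prefers concise answers"],
    related=["Project deadline moved to Friday"],
    recent=[{"role": "user", "content": "What is left to do?"}],
)
for message in messages:
    print(message["role"], "->", message["content"])
```

Keeping recent turns as separate messages preserves who said what, which a flattened string loses.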

Tech Stack Recommendations

Scenario                     Recommended Solution
---------------------------  ----------------------------------------
Simple chatbot               Short + Mid memory only
Personal AI assistant        All three layers
Enterprise customer service  Mid + Summary (no long-term for privacy)
Multi-user system            Per-user vector indexes
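
For the multi-user row, the key requirement is that one user's memories never surface in another user's search. A minimal sketch of that isolation follows. The naming scheme in `collection_name_for` is a hypothetical choice, and a plain list stands in for each per-user vector collection so the example runs without a database.

```python
import hashlib


def collection_name_for(user_id: str, prefix: str = "mem") -> str:
    """Hypothetical naming scheme: a stable, fixed-length collection name per user."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}-{digest}"


class PerUserMemories:
    """Keeps one store per user; a plain list stands in for a vector collection."""

    def __init__(self):
        self._stores: dict[str, list[str]] = {}

    def add(self, user_id: str, text: str) -> None:
        self._stores.setdefault(collection_name_for(user_id), []).append(text)

    def all_for(self, user_id: str) -> list[str]:
        return list(self._stores.get(collection_name_for(user_id), []))


store = PerUserMemories()
store.add("alice@example.com", "Prefers metric units")
store.add("bob@example.com", "Works in the Berlin office")

print(store.all_for("alice@example.com"))
print(store.all_for("carol@example.com"))  # unknown user: empty list
```

With the `MidTermMemory` class above, the same idea could be applied by passing `collection_name=collection_name_for(user_id)` when constructing each user's instance. Hashing keeps the name a fixed length and free of characters such as `@` that a vector store may reject.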

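Start simple and add layers as needed. Many projects can begin with short-term memory plus a basic keyword search over past messages, and add vector search or persistent storage only when that stops being enough.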