trustmem

provenance:github:jupiturliu/trustmem

WHAT THIS AGENT DOES

TrustMem helps AI assistants remember important information and learn from their experiences over time. It addresses the problem of AI forgetting things or confidently stating incorrect information, which can lead to bad decisions. Business analysts, researchers, and software engineers may find it useful because it makes AI more reliable and helps them build more effective AI-powered tools.


README

# TrustMem

> **AI agents have the world's most powerful brains. TrustMem gives them a hippocampus.**

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://python.org)
[![Node.js 20+](https://img.shields.io/badge/node-20+-green.svg)](https://nodejs.org)
[![arXiv](https://img.shields.io/badge/arXiv-coming_soon-red.svg)](paper/)

---

## Why TrustMem Exists

TrustMem is infrastructure for a different kind of AI future.

Not AI that replaces human judgment. Not AI that simply serves requests.

AI and humans evolving together — each making the other sharper.

The problem isn't that agents forget. It's that current systems create dependency without growth: humans offload thinking to AI, AI has no memory of what it learned, and neither party gets wiser.

TrustMem is built on a different premise:

> **Agents should remember and verify what they know. Humans should be challenged, not just served. The loop between them should make both sides stronger over time.**

This means agents that hold knowledge accountable — with confidence scores, decay, and cross-verification. It means humans whose first-hand insights carry the highest weight in the system. And it means AI that doesn't just amplify what you already think, but provides the friction that sharpens how you think.

We call this co-evolution. TrustMem is the memory layer that makes it possible.

---

## Results — 11 Days Production Deployment

| Metric | Baseline | TrustMem | Improvement |
|--------|----------|----------|-------------|
| **Hit@5 Retrieval Accuracy** | 62% | **100%** | +38pp |
| **MRR (Mean Reciprocal Rank)** | 0.518 | **0.944** | +82% |
| **Knowledge Verification Coverage** | 0% | **100%** | +100pp |
| **Citation Coverage** | — | **88%** (56/64 files) | — |
| **Agents Collaborating** | 1 | **5** | 5× |
| **Knowledge Artifacts Produced** | — | **68** | — |

> Deployed across AI infrastructure research, investment analysis, and software engineering domains.
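
For readers unfamiliar with the two retrieval metrics in the table, the following is a minimal sketch of how Hit@k and MRR are conventionally computed. It is not taken from the TrustMem test harness; the function names, the toy queries, and the data layout are all assumptions made for illustration.

```python
def hit_at_k(rankings, relevant, k=5):
    """Fraction of queries with at least one relevant item in the top k results."""
    hits = 0
    for query, ranking in rankings.items():
        if any(doc in relevant[query] for doc in ranking[:k]):
            hits += 1
    return hits / len(rankings)


def mean_reciprocal_rank(rankings, relevant):
    """Average of 1/rank of the first relevant item (0 if none is returned)."""
    total = 0.0
    for query, ranking in rankings.items():
        for rank, doc in enumerate(ranking, start=1):
            if doc in relevant[query]:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Toy data (hypothetical): ranked results per query and the relevant set per query.
rankings = {
    "q1": ["a", "b", "c"],  # relevant item at rank 1
    "q2": ["x", "y", "z"],  # relevant item at rank 2
}
relevant = {"q1": {"a"}, "q2": {"y"}}

hit5 = hit_at_k(rankings, relevant, k=5)      # both queries hit within top 5
mrr = mean_reciprocal_rank(rankings, relevant)  # (1/1 + 1/2) / 2
```

Hit@5 only asks whether a relevant result appears anywhere in the top five, while MRR also rewards placing it near the top, which is why the two numbers in the table move differently.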

---

## What is TrustMem?

Modern AI agents suffer from three fundamental memory failures:

| Problem | Symptom | TrustMem Solution |
|---------|---------|-------------------|
| **Forgetting** | Agents lose context across sessions; important episodes vanish | Episodic Memory with persistent storage and hippocampal indexing |
| **Untrustworthiness** | Agents confidently recall hallucinated or stale facts | Knowledge Trust Layer with confidence scores, decay, and cross-verification |
| **Stagnation** | Agents repeat the same mistakes; no learning loop | Agent Bus + Research Lab for continuous improvement and longitudinal tracking |

TrustMem is a **monorepo** that unifies all the components needed to give an AI agent a reliable, trustworthy, self-improving memory.

---

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        AGENT LAYER                              │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐         │
│  │ Coordinator  │  │   Research   │  │ Intelligence │  ...    │
│  │   (Annie)    │  │    Agent     │  │  (Briefing)  │         │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘         │
│         │                 │                  │                  │
│         └─────────────────┴──────────────────┘                 │
│                           │                                     │
│              ┌────────────▼────────────┐                       │
│              │       AGENT BUS         │                       │
│              │  learning_queue.json    │                       │
│              │  implementation_queue   │                       │
│              │  verification_queue     │                       │
│              │  alerts.json            │                       │
│              └────────────┬────────────┘                       │
└───────────────────────────┼─────────────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────────────┐
│                       TRUST LAYER                               │
│                                                                 │
│  Every knowledge artifact carries:                             │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │ confidence: 0.85  │  verified_by: [research]            │   │
│  │ decay_class: fast │  data_freshness: 2026-03-30         │   │
│  │ domain: ai-infra  │  effective_confidence: 0.72         │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  knowledge/          tools/                                     │
│  ├── ai-infra/       ├── knowledge_search.py                   │
│  ├── trust-config    ├── knowledge_decay_scan.py               │
│  └── KNOWLEDGE-INDEX └── knowledge_verify_request.py  ...     │
└───────────────────────────┬─────────────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────────────┐
│                   OPTIMIZATION LAYER                            │
│                                                                 │
│   Fixed Test Suite          Autoresearch Loop                  │
│   (50 queries)     ──────►  measure → diagnose → improve       │
│                             retain winners, discard regressions │
│                                                                 │
│   research/experiments/     research/metrics/                  │
│   research/reports/         research/baselines/                │
└─────────────────────────────────────────────────────────────────┘
```
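
The optimization layer in the diagram describes a loop that measures against a fixed test suite, keeps changes that improve the score, and discards regressions. A minimal sketch of that keep-or-discard logic is shown below. The function `autoresearch_loop`, the config keys, and the toy scoring function are hypothetical and stand in for whatever the repository actually uses.

```python
from typing import Callable, Iterable


def autoresearch_loop(
    baseline: dict,
    candidates: Iterable[dict],
    evaluate: Callable[[dict], float],
):
    """Try each candidate change on top of the best config so far.

    A candidate is retained only if it beats the current best score on the
    fixed test suite; otherwise it is discarded as a regression.
    """
    best_config = dict(baseline)
    best_score = evaluate(best_config)
    history = []
    for change in candidates:
        trial = {**best_config, **change}
        score = evaluate(trial)
        kept = score > best_score
        history.append((change, score, kept))
        if kept:
            best_config, best_score = trial, score
    return best_config, best_score, history


def toy_evaluate(config: dict) -> float:
    """Hypothetical stand-in for running the fixed query suite."""
    score = 0.60
    if config.get("rerank"):
        score += 0.20
    if config.get("chunk_size", 500) > 1000:
        score -= 0.10
    return score


best_config, best_score, history = autoresearch_loop(
    baseline={"rerank": False, "chunk_size": 500},
    candidates=[{"rerank": True}, {"chunk_size": 2000}],
    evaluate=toy_evaluate,
)
```

In this toy run the reranking change is kept and the larger chunk size is discarded, because it scores below the best configuration found so far. Holding the test suite fixed is what makes scores comparable from one iteration to the next.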

### Episodic Memory — Three-Layer Memory Architecture

```
  Semantic Memory  (MEMORY.md / knowledge/)
       ↕  Sleep Replay consolidation
  Episodic Memory  (episodes.db)          ← packages/episodic-memory
       ↕  Episode Encoder
  Working Memory   (Agent Context Window)
```

Inspired by hippocampal-neocortical memory consolidation: episodes recorded during active sessions are replayed and compressed into long-term semantic knowledge, while working memory provides the immediate context window for the running agent.
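
As a rough illustration of the replay step described above, the sketch below compresses a list of episodes into one semantic note per topic. It is written in Python for brevity even though the package itself is TypeScript, and the `Episode` fields, the `sleep_replay` function, and the importance threshold are assumptions rather than the package's API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Episode:
    topic: str
    text: str
    importance: float  # hypothetical score in [0, 1]


def sleep_replay(episodes, min_importance=0.5):
    """Replay important episodes and compress them into one note per topic.

    "Compression" here is only order-preserving de-duplication plus joining;
    a real system would likely summarize with a model instead.
    """
    by_topic = defaultdict(list)
    for ep in episodes:
        if ep.importance >= min_importance:
            by_topic[ep.topic].append(ep.text)
    return {
        topic: "; ".join(dict.fromkeys(texts))
        for topic, texts in by_topic.items()
    }


episodes = [
    Episode("deploy", "deploy failed at step 3", 0.9),
    Episode("deploy", "deploy failed at step 3", 0.7),  # duplicate fragment
    Episode("deploy", "retry fixed the deploy", 0.6),
    Episode("smalltalk", "greeting exchanged", 0.1),    # below threshold
]
semantic = sleep_replay(episodes)
```

The low-importance episode never reaches semantic memory, and the duplicated fragment is stored once.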

---

## Modules

### 📦 `packages/episodic-memory` — Core Memory Engine

The TypeScript implementation of the episodic memory system, inspired by hippocampal memory consolidation. It covers four stages:

- **Encoding**: capture raw agent episodes with full context
- **Consolidation**: merge and deduplicate episodic fragments
- **Retrieval**: semantic search over the memory store
- **Forgetting**: importance-weighted decay to prevent memory bloat

```bash
cd packages/episodic-memory
npm install
npm run build
```
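
The "importance-weighted decay" used for forgetting is not specified in this README. One plausible reading, sketched here in Python rather than the package's TypeScript, is an exponential decay whose half-life grows with importance. The formula, the constants, and the field names below are all assumptions.

```python
import math


def retention(importance, age_days, base_half_life_days=7.0):
    """Exponential decay whose half-life stretches with importance.

    importance = 0 gives the base half-life; importance = 1 gives five times
    the base half-life (an arbitrary choice for this sketch).
    """
    half_life = base_half_life_days * (1.0 + 4.0 * importance)
    return math.exp(-math.log(2) * age_days / half_life)


def forget(memories, threshold=0.2):
    """Keep only memories whose retention is still at or above the threshold."""
    return [
        m for m in memories
        if retention(m["importance"], m["age_days"]) >= threshold
    ]


memories = [
    {"id": "a", "importance": 0.0, "age_days": 30},  # trivial and old
    {"id": "b", "importance": 1.0, "age_days": 30},  # important, same age
]
kept_ids = [m["id"] for m in forget(memories)]
```

Two memories of the same age end up with different fates: the unimportant one has passed more than four half-lives and is dropped, while the important one has not yet reached its first.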

### 🚌 `agent-bus` — Inter-Agent Communication

Lightweight JSON-based message bus for multi-agent coordination:

- `alerts.json` — real-time notifications between agents
- `learning_queue.json` — knowledge items pending learning
- `implementation_queue.json` — coding tasks pending execution
- `verification_queue.json` — facts pending cross-agent verification

The bus uses a simple file-polling pattern: agents write to queues, consumers process and acknowledge.
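
A minimal sketch of the file-polling pattern described above follows. The queue file name comes from the list above, but the item schema (`status`, `claim`, `from`) and the `enqueue` and `poll_once` helpers are hypothetical. The sketch also ignores file locking, which a real multi-agent deployment would need to address.

```python
import json
import tempfile
from pathlib import Path


def enqueue(queue_path: Path, item: dict) -> None:
    """Producer side: append an item to the JSON queue as 'pending'."""
    items = json.loads(queue_path.read_text()) if queue_path.exists() else []
    items.append({**item, "status": "pending"})
    queue_path.write_text(json.dumps(items, indent=2))


def poll_once(queue_path: Path, handler) -> int:
    """Consumer side: process every pending item, then acknowledge it."""
    if not queue_path.exists():
        return 0
    items = json.loads(queue_path.read_text())
    processed = 0
    for item in items:
        if item.get("status") == "pending":
            handler(item)
            item["status"] = "acknowledged"
            processed += 1
    queue_path.write_text(json.dumps(items, indent=2))
    return processed


# Demo in a temporary directory so no real queue files are touched.
with tempfile.TemporaryDirectory() as tmp:
    queue = Path(tmp) / "verification_queue.json"
    enqueue(queue, {"claim": "example claim", "from": "research"})
    seen = []
    first = poll_once(queue, seen.append)   # handles the pending item
    second = poll_once(queue, seen.append)  # nothing left to process
```

Marking items as acknowledged instead of deleting them keeps an audit trail in the file, at the cost of the queue growing until something prunes it.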

### 🧠 `knowledge` — Knowledge Trust Layer

Every piece of knowledge has a **trust score** based on:
- Source credibility (agent weights defined in `trust-config.json`)
- Verification status (unverified → single-verified → cross-verified)
- Data freshness (decay_class: stable | normal | volatile)
- Citation count (how many other knowledge nodes reference this)

```bash
# Search knowledge
python3 tools/knowledge_search.py "memory verification" --top 5

# Check for stale entries
python3 tools/knowledge_decay_scan.py
```
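
The README does not give the formula that combines these factors. As a sketch under stated assumptions, the code below multiplies a base confidence by a source weight, a verification factor, and a freshness term that halves at a rate set by `decay_class`. The half-lives, the verification factors, and the function `effective_confidence` are hypothetical, and citation count is left out for brevity.

```python
from datetime import date

# Hypothetical half-lives per decay class named in this section.
HALF_LIFE_DAYS = {"stable": 365.0, "normal": 90.0, "volatile": 14.0}

# Hypothetical multipliers for the three verification states.
VERIFICATION_FACTOR = {
    "unverified": 0.6,
    "single-verified": 0.85,
    "cross-verified": 1.0,
}


def effective_confidence(
    confidence: float,
    decay_class: str,
    data_freshness: date,
    verification: str,
    today: date,
    source_weight: float = 1.0,
) -> float:
    """Discount a stored confidence by age, verification state, and source."""
    age_days = max((today - data_freshness).days, 0)
    freshness = 0.5 ** (age_days / HALF_LIFE_DAYS[decay_class])
    score = (
        confidence
        * source_weight
        * VERIFICATION_FACTOR[verification]
        * freshness
    )
    return round(min(max(score, 0.0), 1.0), 3)


# A volatile, cross-verified fact that is exactly one half-life old.
aged = effective_confidence(
    0.8, "volatile", date(2026, 3, 16), "cross-verified", today=date(2026, 3, 30)
)

# A fresh but unverified fact.
fresh = effective_confidence(
    0.8, "stable", date(2026, 3, 30), "unverified", today=date(2026, 3, 30)
)
```

A multiplicative form means any single weak factor (stale data, no verification, a low-weight source) pulls the whole score down, which matches the intent of holding knowledge accountable. The repository's actual scoring may differ.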

### 🔧 `tools` — Knowledge Management CLI

Python tools for maintaining the knowledge base:

| Tool | Purpose |
|------|---------|
| `knowledge_search.py` | Semantic + keyword 

[truncated…]

PUBLIC HISTORY

First discovered: Mar 31, 2026

IDENTITY

inferred

Identity inferred from code signals. No PROVENANCE.yml found.


METADATA

platform: github

first seen: Mar 25, 2026

last updated: Mar 30, 2026

last crawled: 3 months ago

version: (none listed)
