# Night-Shift Preference Optimizer
**An automated data pipeline that captures user corrections to AI outputs and transforms them into reusable training data — so your AI learns from every edit.**
Night-Shift operates on a **dual-loop architecture**:
- **Fast Loop** — Extracted preferences are immediately available to any AI agent via an MCP server. No retraining required.
- **Slow Loop** — Once enough high-quality training pairs accumulate, the system exports JSONL for permanent model fine-tuning.
## Overview
---
<video src="https://github.com/MentailityAI/Night-Shift/raw/main/NightShiftPromo.mp4" controls autoplay muted loop playsinline></video>
---
## Table of Contents
- [How It Works](#how-it-works)
- [Architecture](#architecture)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Configuration Reference](#configuration-reference)
- [Running the System](#running-the-system)
- [API Reference](#api-reference)
- [Integrating into an Agentic Workflow](#integrating-into-an-agentic-workflow)
- [Project Structure](#project-structure)
- [Testing](#testing)
- [License](#license)
---
## How It Works
1. A user edits an AI-generated output (e.g., corrects a legal clause, adjusts formatting, fixes a tone issue).
2. Your application captures the original AI output and the human correction and sends them to the Night-Shift API.
3. A background worker (the "Night-Shift Agent") analyses the delta between the two versions using an LLM to extract the underlying preference rule.
4. The extracted rule is embedded and stored in a vector database for **immediate semantic retrieval** by any AI agent (Fast Loop).
5. A cleaned prompt-response training pair is staged for **future model fine-tuning** (Slow Loop).
```
┌────────────────────────────────────────────────────────────────────┐
│ YOUR APPLICATION │
│ │
│ User edits AI output ──► POST /api/logs ──► Night-Shift │
└────────────────────────────────────────────────────────────────────┘
│
┌─────────────┴─────────────┐
Cron Trigger Batch Trigger
└─────────────┬─────────────┘
▼
┌───────────────────────┐
│ Night-Shift Agent │
│ (LLM Analysis) │
│ │
│ • Analyse the delta │
│ • Extract the rule │
│ • Clean training data │
└───────────┬───────────┘
│
┌─────────────┴─────────────┐
▼ ▼
┌─────────────────────┐ ┌─────────────────────┐
│ FAST LOOP │ │ SLOW LOOP │
│ │ │ │
│ Extracted Rules │ │ Training Pairs │
│ ──────────────── │ │ ──────────────── │
│ Stored as vector │ │ Staged in DB │
│ embeddings. │ │ until threshold │
│ │ │ is met. │
│ Queried via MCP │ │ Exported as JSONL │
│ by any AI agent │ │ for fine-tuning │
│ before generation.│ │ your model. │
└─────────────────────┘ └─────────────────────┘
```
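The capture step (2) above amounts to a single POST from your application. A minimal sketch of building that request body — the `/api/logs` path comes from the diagram, but the field names here are illustrative assumptions, not the documented schema:

```python
import json

def build_log_payload(original: str, corrected: str, context: str = "") -> str:
    """Build the JSON body for POST /api/logs.

    Field names are illustrative assumptions; consult the actual
    API reference for the real schema.
    """
    payload = {
        "original_output": original,    # what the AI produced
        "corrected_output": corrected,  # what the human changed it to
        "context": context,             # optional surrounding prompt/context
    }
    return json.dumps(payload)

body = build_log_payload(
    original="The party shall indemnify...",
    corrected="The Client shall indemnify...",
)
```

From here the background worker takes over; your application never needs to know about the two loops.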
---
## Architecture
| Component | Technology | Purpose |
|---|---|---|
| **API Server** | FastAPI + Uvicorn | Captures interaction payloads from your application |
| **Database** | PostgreSQL + pgvector | Stores logs, rules (with embeddings), and training pairs |
| **Task Queue** | Celery + Redis | Schedules and runs background processing (cron + batch trigger) |
| **LLM Client** | vLLM / OpenAI / Anthropic | Analyses the human correction delta and extracts rules |
| **Embedding Model** | BAAI/bge-m3 (1024-dim) | Generates vector embeddings for semantic rule search |
| **MCP Server** | Model Context Protocol | Exposes preference rules as a tool for AI agents |
| **Fine-Tune Exporter** | JSONL + replay buffer | Exports training data when the configurable threshold is met |
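The Fast Loop's retrieval step boils down to nearest-neighbor search over the 1024-dim rule embeddings. In Night-Shift that query runs inside PostgreSQL via pgvector; the core operation is cosine similarity, sketched here in plain Python for illustration:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors (pgvector exposes
    the corresponding distance via its <=> operator)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_rules(query_vec: list[float],
                rules: list[tuple[str, list[float]]],
                k: int = 5) -> list[str]:
    """Return the k stored rules most similar to the query embedding.

    `rules` is a list of (rule_text, embedding) pairs; this in-memory
    version is a sketch of what the pgvector query does server-side.
    """
    scored = sorted(rules,
                    key=lambda r: cosine_similarity(query_vec, r[1]),
                    reverse=True)
    return [text for text, _ in scored[:k]]
```

The `k` here corresponds to the `MCP_SEARCH_TOP_K` setting described in the Configuration Reference.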
---
## Prerequisites
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.11+ | Required for type hints and async features |
| Docker & Docker Compose | Latest | For PostgreSQL + pgvector and Redis |
| A running LLM endpoint | — | Local vLLM instance, OpenAI API key, or Anthropic API key |
---
## Installation
### 1. Clone the repository
```bash
git clone https://github.com/your-org/NightShift.git
cd NightShift
```
### 2. Create and activate a virtual environment
```bash
python -m venv venv
# Windows
venv\Scripts\activate
# macOS / Linux
source venv/bin/activate
```
### 3. Install dependencies
```bash
pip install -r requirements.txt
```
### 4. Configure the environment
```bash
cp .env.example .env
```
Open `.env` and update the following at minimum:
| Variable | What to set |
|---|---|
| `LLM_PROVIDER` | `vllm`, `openai`, or `anthropic` |
| `VLLM_BASE_URL` | Your vLLM server URL (e.g., `http://localhost:8000/v1`) |
| `OPENAI_API_KEY` | Your OpenAI key (if using OpenAI) |
| `ANTHROPIC_API_KEY` | Your Anthropic key (if using Anthropic) |
All other settings have sensible defaults. See the full [Configuration Reference](#configuration-reference) below.
### 5. Start infrastructure services
```bash
docker-compose up -d
```
This starts:
- **PostgreSQL** (port 5432) with the pgvector extension pre-installed
- **Redis** (port 6379) as the Celery message broker
### 6. Run database migrations
```bash
alembic upgrade head
```
---
## Configuration Reference
All settings live in the `.env` file. Every parameter has a sensible default so the system works out of the box for local development.
### LLM Provider
| Variable | Default | Description |
|---|---|---|
| `LLM_PROVIDER` | `vllm` | Active provider: `vllm`, `openai`, or `anthropic` |
| `VLLM_BASE_URL` | `http://localhost:8000/v1` | vLLM OpenAI-compatible endpoint |
| `VLLM_MODEL_NAME` | `nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4` | Model served by vLLM |
| `VLLM_API_KEY` | `token-placeholder` | vLLM API key (if authentication is enabled) |
| `OPENAI_API_KEY` | — | OpenAI API key |
| `OPENAI_MODEL_NAME` | `gpt-4o` | OpenAI model to use |
| `ANTHROPIC_API_KEY` | — | Anthropic API key |
| `ANTHROPIC_MODEL_NAME` | `claude-sonnet-4-20250514` | Anthropic model to use |
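The provider variables above are mutually exclusive in practice: only the group matching `LLM_PROVIDER` is read. A sketch of that dispatch logic — the variable names and defaults mirror the table, but the factory function itself is a hypothetical illustration, not Night-Shift's actual client code:

```python
def resolve_llm_config(env: dict[str, str]) -> dict:
    """Pick endpoint/model settings based on LLM_PROVIDER (illustrative)."""
    provider = env.get("LLM_PROVIDER", "vllm")
    if provider == "vllm":
        return {
            "base_url": env.get("VLLM_BASE_URL", "http://localhost:8000/v1"),
            "model": env.get("VLLM_MODEL_NAME",
                             "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4"),
            "api_key": env.get("VLLM_API_KEY", "token-placeholder"),
        }
    if provider == "openai":
        return {"model": env.get("OPENAI_MODEL_NAME", "gpt-4o"),
                "api_key": env["OPENAI_API_KEY"]}  # required, no default
    if provider == "anthropic":
        return {"model": env.get("ANTHROPIC_MODEL_NAME", "claude-sonnet-4-20250514"),
                "api_key": env["ANTHROPIC_API_KEY"]}  # required, no default
    raise ValueError(f"Unknown LLM_PROVIDER: {provider}")
```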
### Worker Scheduling
| Variable | Default | Description |
|---|---|---|
| `WORKER_CRON_HOUR` | `2` | Hour (UTC) for the nightly processing run |
| `WORKER_CRON_MINUTE` | `0` | Minute for the nightly processing run |
| `WORKER_BATCH_TRIGGER_SIZE` | `50` | Pending log count to trigger immediate processing |
| `WORKER_BATCH_PROCESS_LIMIT` | `100` | Max logs to process in a single run |
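The interplay of the two triggers can be sketched as follows — the defaults mirror the table above, but these helpers are hypothetical, not the actual Celery task code:

```python
def should_process_now(pending_count: int, batch_trigger_size: int = 50) -> bool:
    """Batch trigger: fire as soon as enough logs are waiting,
    rather than waiting for the nightly cron run."""
    return pending_count >= batch_trigger_size

def take_batch(pending_ids: list[int], process_limit: int = 100) -> list[int]:
    """Cap a single run at WORKER_BATCH_PROCESS_LIMIT logs; the
    remainder waits for the next trigger."""
    return pending_ids[:process_limit]
```

Either way a run fires, the processing itself is identical; the limit simply bounds the LLM cost of any single run.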
### Fine-Tuning (Slow Loop)
| Variable | Default | Description |
|---|---|---|
| `FINE_TUNING_THRESHOLD` | `500` | Number of staged training pairs to trigger JSONL export |
| `FINE_TUNING_EXPORT_DIR` | `./exports` | Directory for exported `.jsonl` files |
| `FINE_TUNING_PROVIDER` | `local` | `local` (file only), `openai`, or `vllm` |
| `FINE_TUNING_REPLAY_BUFFER_RATIO` | `0.15` | Fraction of older data mixed in to prevent catastrophic forgetting |
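The threshold and replay-buffer settings work together: once enough new pairs are staged, a fraction of previously exported pairs is mixed back in so the fine-tuned model does not drift away from earlier preferences. A sketch under the defaults above — the exact mixing strategy (ratio applied to the new-pair count, random sampling) is an assumption:

```python
import json
import random

def build_export_batch(new_pairs: list[dict], old_pairs: list[dict],
                       threshold: int = 500, replay_ratio: float = 0.15,
                       seed: int = 0):
    """Mix older pairs into a new export to mitigate catastrophic
    forgetting. Defaults mirror FINE_TUNING_THRESHOLD and
    FINE_TUNING_REPLAY_BUFFER_RATIO; the strategy itself is illustrative."""
    if len(new_pairs) < threshold:
        return None  # not enough staged pairs yet; keep accumulating
    rng = random.Random(seed)
    n_replay = int(len(new_pairs) * replay_ratio)
    replay = rng.sample(old_pairs, min(n_replay, len(old_pairs)))
    return new_pairs + replay

def to_jsonl(pairs: list[dict]) -> str:
    """Serialize training pairs as one JSON object per line."""
    return "\n".join(json.dumps(p) for p in pairs)
```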
### MCP Server (Fast Loop)
| Variable | Default | Description |
|---|---|---|
| `MCP_TRANSPORT` | `stdio` | Transport layer: `stdio` or `sse` |
| `MCP_SSE_HOST` | `0.0.0.0` | SSE server bind address (when `MCP_TRANSPORT=sse`) |
| `MCP_SSE_PORT` | `8080` | SSE server port |
| `MCP_SEARCH_TOP_K` | `5` | Max number of rules returned per search |
| `MCP_SE
[truncated…]