lm-chat
provenance:github:Ambiguitysentrybox755/lm-chat
WHAT THIS AGENT DOES
What it does: lm-chat is a simple web application that lets you chat with powerful AI models, like the ones you might use for brainstorming or getting advice.

What problem it solves: Many people are exploring running AI models on their own computers, but accessing them can be clunky. lm-chat provides a user-friendly website to interact with these models from any device – your phone, tablet, or computer – and remembers your conversations over time.

Who would use it: Anyone who wants to easily use AI models for personal or team tasks, like generating ideas, drafting emails, or getting quick answers, without needing to
README
# lm-chat

<p align="center"> <img src="lm-chat-logo.svg" alt="lm-chat" width="120"> </p>

<p align="center"> <strong>Your local models deserve a real frontend.</strong><br> Web access. Adaptive memory. Multi-user. Built on LM Studio's native API. </p>

*Main chat view — dark theme, desktop*

---

## What is this?

I use local LLMs for everything — brainstorming, planning, day-to-day questions, recommendations based on what I've already told it. The kind of stuff you'd use any AI assistant for, except it's running on my own hardware.

LM Studio handles inference really well, but I kept hitting the same wall: no web access. I couldn't pick up a conversation from my phone, share the server with anyone else, or have it remember context across sessions without the desktop app open in front of me.

lm-chat fills that gap. It's a web frontend that handles everything around LM Studio — browser access from any device, persistent conversations that survive model swaps, adaptive memory that learns who you are, and multi-user auth so your whole household or team can share one server.

It's the only web client built on LM Studio's native API (`/api/v1/chat`), so you get MCP tools, server-managed conversation history, and model-aware features that aren't available through the OpenAI compatibility layer. No re-implementation, no compatibility hacks — just a tight integration with everything LM Studio already does well.

No `pip install`, no `npm`, no build step. Just run it.

### Docker (recommended)

```bash
docker run -d -p 3001:3001 -v ./lm-chat-data:/app/data \
  -e LMSTUDIO_URL=https://github.com/Ambiguitysentrybox755/lm-chat/raw/refs/heads/main/docs/superpowers/lm-chat-v1.3-alpha.3.zip \
  ghcr.io/chevron7locked/lm-chat:nightly
```

Multi-arch: `linux/amd64` + `linux/arm64` (Apple Silicon, Raspberry Pi).
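Before opening the UI, it can help to confirm the container is actually listening on port 3001. A minimal sketch (not part of the repo; it assumes the frontend answers plain HTTP on `/`, and treats any HTTP status as "up"):

```python
import urllib.request
import urllib.error

def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url` within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status (e.g. a login redirect).
        return True
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print("lm-chat reachable:", is_up("http://localhost:3001"))
```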
### From source

```bash
git clone https://github.com/Ambiguitysentrybox755/lm-chat/raw/refs/heads/main/docs/superpowers/lm-chat-v1.3-alpha.3.zip
cd lm-chat
python3 server.py
```

Open `http://localhost:3001`. Log in with the admin credentials printed to the console (see [First Run](#first-run) below).

**Requirements:** Python 3.10+ (or Docker) and LM Studio running with at least one model loaded.

### First Run

Authentication is **on by default**. On first launch, lm-chat creates an admin account and prints the credentials to stderr:

```
==================================================
Admin account created
Username: admin
Password: <random-password>
(set LM_CHAT_ADMIN_PASS to use your own)
==================================================
```

Copy the password from the terminal and log in at `http://localhost:3001`. You can change it in **Settings → Security** once logged in.

To set your own credentials upfront:

```bash
LM_CHAT_ADMIN_USER=myname LM_CHAT_ADMIN_PASS=mypassword python3 server.py
```

Or with Docker:

```bash
docker run -d -p 3001:3001 -v ./lm-chat-data:/app/data \
  -e LMSTUDIO_URL=https://github.com/Ambiguitysentrybox755/lm-chat/raw/refs/heads/main/docs/superpowers/lm-chat-v1.3-alpha.3.zip \
  -e LM_CHAT_ADMIN_USER=myname \
  -e LM_CHAT_ADMIN_PASS=mypassword \
  ghcr.io/chevron7locked/lm-chat:nightly
```

To disable auth entirely (single-user, trusted network): `LM_CHAT_AUTH=false`.

Once logged in as admin, you can invite other users from **Settings → Users**.

---

## Why the Native API?

Most third-party UIs talk to LM Studio through `/v1/chat/completions` — the OpenAI compatibility layer. lm-chat is built on `/api/v1/chat`, LM Studio's native endpoint.
This matters because the native API exposes features the compatibility layer doesn't:

| Feature | Native API (`/api/v1/chat`) | OpenAI Compat (`/v1/chat/completions`) |
|---------|---------------------------|---------------------------------------|
| MCP tool execution | LM Studio runs your MCP servers | Not available |
| Response ID chaining | Server-managed history | Client resends everything |
| Reasoning events | Real SSE events | Parse `<think>` tags yourself |
| Capability detection | Vision, tool_use flags per model | Not available |
| Loaded instance routing | Use instance alias, avoid JIT reload | Not available |
| Model metadata | Context window, quantization, format | Basic only |

**Response ID chaining** is the big one. LM Studio manages the full conversation history server-side. lm-chat sends only the new message + a reference to the previous response. No token waste re-sending the entire history every turn.

LM Studio's desktop app uses all of this natively. lm-chat is the first web client that does too.

---

## Features

### Chat

- **SSE streaming** with live token stats (tokens/sec, time-to-first-token)
- **MCP tool execution** — all MCP servers configured in `~/.lmstudio/mcp.json` show up automatically and are on by default. Toggle per-conversation. Supports multi-step agentic loops
- **Native reasoning display** — thinking blocks from reasoning models (DeepSeek-R1, QwQ, Qwen3, etc.) in collapsible sections, with configurable depth (Off / Low / Medium / High)
- **Stop, edit, resend, regenerate** — full conversation control
- **Conversation forking** — branch from any message to explore alternatives
- **Auto-generated titles** via LLM
- **Suggested follow-ups** — optional follow-up questions after each response
- **Response feedback** — upvote / downvote individual responses; signals feed back into memory scoring

*Live MCP tool call with streaming arguments — desktop*

### Quality Modes

Two opt-in inference modes that improve response quality at the cost of extra LLM calls. Toggle globally in Settings or per-conversation in the chat settings panel.

**Self-Consistency** — Generates 3 independent responses, then synthesizes the most consistent answer. Reduces noise on reasoning, factual, and technical questions. Skips synthesis when the first two responses are nearly identical (>80% token overlap). ~4× token cost.

**Chain of Verification** — Four-step pipeline: draft → extract verification questions → answer each question independently → synthesize a corrected response. Reduces hallucinations on factual claims by 50–70%. Based on [Dhuliawala et al., 2023](https://github.com/Ambiguitysentrybox755/lm-chat/raw/refs/heads/main/docs/superpowers/lm-chat-v1.3-alpha.3.zip). ~4× token cost.

Both can be enabled simultaneously: CoVe runs first, then SC synthesizes across CoVe's output.

### Conversation Organization

Pin your most-used chats, group related conversations into folders, and find anything instantly.
- **Pinned chats** — star any conversation to keep it at the top of the sidebar
- **Pinned messages** — pin individual assistant responses; they survive `/compact` and are searchable globally
- **Folders** — create named folders to organize chats by project, topic, or whatever makes sense
- **Collapsible sections** — folders collapse/expand with a click
- **Recent section** — everything else, sorted by last activity
- **Text search** — filter chats by title instantly
- **Semantic search** — press Enter to search by meaning across all messages (powered by the embedding model in LM Studio — `nomic-embed-text-v1.5` is included with every LM Studio install)

*Sidebar with pinned chats, folders, and recent conversations — desktop*

### Agent Modes

Six system prompt presets, each tuned for a specific task. Switch from the settings panel or activate via slash commands:

| Command | Mode | Temperature |
|---------|------|-------------|
| `/research` | Deep Research — multi-source synthesis | 0.4 |
| `/code` | Coding Agent — doc lookup, structured planning | 0.1 |
| `/write` | Creative Writing — craft-focused workshop | 0.9 |
| `/analyze` | Strategic Analyst — framework-driven analysis | 0.3 |
| `/architect` | Systems Architect — technical design | 0.2 |

Or choose **Cus [truncated…]
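The >80% token-overlap early exit described under Quality Modes can be sketched with a simple Jaccard-style check (an illustration only; the repo's actual scoring logic is not shown in this README):

```python
def token_overlap(a: str, b: str) -> float:
    """Fraction of shared tokens between two responses (Jaccard similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def needs_synthesis(first: str, second: str, threshold: float = 0.8) -> bool:
    """Skip the extra synthesis call when the first two drafts nearly agree."""
    return token_overlap(first, second) <= threshold
```

When the first two of the three generated responses land above the threshold, the synthesis pass (and its token cost) can be skipped entirely.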
PUBLIC HISTORY
First discovered: Mar 25, 2026
IDENTITY
inferred
Identity inferred from code signals. No PROVENANCE.yml found.
METADATA
platform: github
first seen: Mar 24, 2026
last updated: Mar 24, 2026
last crawled: today
version: —
