# LangSight
**Your agent failed. Which tool broke — and how do we stop it next time?**
Detect loops. Enforce budgets. Break failing tools. Map blast radius.
For MCP servers: health checks, security scanning, schema drift detection.
[Website](https://www.langsight.dev) · [PyPI](https://pypi.org/project/langsight/) · [License](LICENSE) · [Python 3.11+](https://www.python.org/downloads/) · [CI](https://github.com/LangSight/langsight/actions/workflows/ci.yml) · [Docs](https://docs.langsight.dev)
> **Not another prompt, eval, or simulation platform.**
> LangSight is the runtime reliability layer for AI agent toolchains.
---
## Where LangSight fits
Langfuse watches the **brain** (model outputs, token costs, evals).
LangWatch tests the **brain** (simulations, prompt optimization).
Datadog watches the **body** (CPU, memory, HTTP codes).
**LangSight watches the hands** (tools the agent calls, their health, safety, and cost).
| Question | Best tool |
|----------|-----------|
| Did the prompt/model perform well? | LangWatch / Langfuse / LangSmith |
| Should I change prompts or eval policy? | LangWatch / Langfuse / LangSmith |
| Is my server CPU/memory healthy? | Datadog / New Relic |
| **Which tool call failed in production?** | **LangSight** |
| **Is my agent stuck in a loop?** | **LangSight** |
| **Is an MCP server unhealthy or drifting?** | **LangSight** |
| **Is an MCP server exposed or risky?** | **LangSight** |
| **Why did this session cost $47 instead of $3?** | **LangSight** |
| **If this tool goes down, which agents break?** | **LangSight** |
Use LangSight alongside Langfuse and LangWatch — not instead of them.
---
## The problem
LLM quality is only half the problem. Teams already have ways to inspect prompts and eval scores. What they still cannot answer fast enough:
- **Agent stuck in a loop** — retries the same tool 47 times, burns $200, produces nothing
- **MCP server degraded silently** — schema changed, latency spiked, auth expired. Agent keeps calling, gets bad data
- **Cost explosion** — sub-agent retries geocoding-mcp endlessly. Nobody knows until the invoice arrives
- **Cascading failure** — postgres-mcp goes down. 3 agents depend on it. All sessions fail. No blast radius visibility
- **Unsafe MCP server** — 66% of community MCP servers have critical code smells. No automated scanning
---
## What LangSight does
### 1. Prevent — stop failures before users notice
```python
from langsight.sdk import LangSightClient
client = LangSightClient(
url="http://localhost:8000",
loop_detection=True, # detect same tool+args called 3x → auto-stop
max_cost_usd=1.00, # hard budget limit per session
max_steps=25, # hard step limit
circuit_breaker=True, # auto-disable tools after 5 consecutive failures
)
```
- **Loop detection** — same tool called with same args 3x → session terminated, alert fired
- **Budget guardrails** — max cost / max steps per session → hard stop before bill shock
- **Circuit breaker** — tool fails 5x → auto-disabled for cooldown → alert → auto-recovery test
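The loop-detection rule above (same tool called with the same args three times) can be sketched as a fingerprint counter. This is an illustrative standalone sketch, not LangSight's internal implementation; `LoopDetector` and `fingerprint` are hypothetical names:

```python
import hashlib
import json

def fingerprint(tool: str, args: dict) -> str:
    # Stable fingerprint for a call: tool name + canonicalized (sorted-key) args.
    payload = json.dumps({"tool": tool, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

class LoopDetector:
    """Flag a session once an identical tool+args fingerprint repeats `threshold` times."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.counts: dict[str, int] = {}

    def record(self, tool: str, args: dict) -> bool:
        # Returns True when this call trips the loop threshold.
        fp = fingerprint(tool, args)
        self.counts[fp] = self.counts.get(fp, 0) + 1
        return self.counts[fp] >= self.threshold

detector = LoopDetector(threshold=3)
assert not detector.record("slack-mcp/notify", {"channel": "#ops"})
assert not detector.record("slack-mcp/notify", {"channel": "#ops"})
assert detector.record("slack-mcp/notify", {"channel": "#ops"})  # third identical call → stop
```

Canonicalizing args with `sort_keys=True` matters: the same logical call must hash identically regardless of key order.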
### 2. Detect — see what broke and why
```
$ langsight sessions --id sess-f2a9b1
Trace: sess-f2a9b1 (support-agent) [LOOP_DETECTED]
5 tool calls · 1 failed · 2,134ms · $0.023
sess-f2a9b1
├── jira-mcp/get_issue 89ms ✓
├── postgres-mcp/query 42ms ✓
├── → billing-agent handoff
│ ├── crm-mcp/update 120ms ✓
│ └── slack-mcp/notify — ✗ timeout
Root cause: slack-mcp timed out at 14:32 UTC
```
- **Action traces** — every tool call in every session, with latency, status, cost
- **Multi-agent trees** — full call tree across agent handoffs via `parent_span_id`
- **Run health tags** — every session auto-classified: `success`, `loop_detected`, `budget_exceeded`, `tool_failure`
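The health-tag precedence implied by the bullets above (loop beats budget beats tool failure beats success) could be classified roughly like this. A minimal sketch over an assumed event shape, not LangSight's actual classifier:

```python
def classify_session(events: list[dict], max_cost_usd: float = 1.00) -> str:
    # Assumed event shape: {"status": "ok"|"error", "cost_usd": float, "loop_detected": bool}
    if any(e.get("loop_detected") for e in events):
        return "loop_detected"
    if sum(e.get("cost_usd", 0.0) for e in events) > max_cost_usd:
        return "budget_exceeded"
    if any(e.get("status") == "error" for e in events):
        return "tool_failure"
    return "success"

tag = classify_session([{"status": "error", "cost_usd": 0.02}])  # → "tool_failure"
```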
### 3. Monitor — MCP health + security
```
$ langsight mcp-health
Server Status Latency Schema Circuit
snowflake-mcp ✅ UP 142ms Stable closed
slack-mcp ⚠️ DEG 1,240ms Stable closed
jira-mcp ❌ DOWN — — open (5 failures)
postgres-mcp ✅ UP 31ms Changed closed
```
```
$ langsight security-scan
CRITICAL jira-mcp CVE-2025-6514 Remote code execution in mcp-remote
HIGH slack-mcp OWASP-MCP-01 Tool description contains injection pattern
HIGH postgres-mcp OWASP-MCP-04 No authentication configured
```
- **MCP health checks** — continuous ping, latency, uptime tracking
- **Schema drift detection** — tool schemas change → alert fires before agents hallucinate
- **Security scanning** — CVE (OSV), OWASP MCP Top 10, tool poisoning detection, auth audit
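Schema drift detection can be reduced to comparing a canonical hash of the tool list between polls. A sketch of the idea, with a made-up tool schema; LangSight's real comparison may be field-level rather than whole-schema:

```python
import hashlib
import json

def schema_hash(tools: list[dict]) -> str:
    # Canonical JSON (sorted keys) so semantically identical schemas hash identically.
    return hashlib.sha256(json.dumps(tools, sort_keys=True).encode()).hexdigest()

baseline = schema_hash([{"name": "get_issue", "fields": ["id", "priority"]}])
current = schema_hash([{"name": "get_issue", "fields": ["id"]}])  # 'priority' dropped
drifted = baseline != current  # → True: fire a drift alert before agents hallucinate
```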
### 4. Attribute — cost at the tool level
```
$ langsight costs --hours 24
Tool Calls Failed Cost % of Total
geocoding-mcp 2,340 12 $1,872 44.6%
postgres-mcp/query 890 3 $445 10.6%
claude-3.5 (LLM) 156 0 $312 7.4%
```
Not model-level costs (Langfuse does that). **Tool-level costs.** Which MCP server is burning your budget?
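Tool-level attribution like the table above amounts to grouping per-call cost records by tool. A minimal aggregation sketch over an assumed record shape, not the SDK's API:

```python
from collections import defaultdict

def cost_by_tool(calls: list[dict]) -> dict[str, tuple[float, float]]:
    # Assumed record shape: {"tool": str, "cost_usd": float}.
    # Returns {tool: (total_cost, percent_of_total)}.
    totals: dict[str, float] = defaultdict(float)
    for call in calls:
        totals[call["tool"]] += call["cost_usd"]
    grand = sum(totals.values()) or 1.0  # avoid div-by-zero on empty input
    return {tool: (cost, 100 * cost / grand) for tool, cost in totals.items()}

calls = [
    {"tool": "geocoding-mcp", "cost_usd": 2.0},
    {"tool": "geocoding-mcp", "cost_usd": 1.0},
    {"tool": "postgres-mcp/query", "cost_usd": 1.0},
]
report = cost_by_tool(calls)  # geocoding-mcp → (3.0, 75.0)
```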
### 5. Map — blast radius via lineage
```
postgres-mcp ❌ DOWN
Impact:
- support-agent: 200 sessions/day (HIGH)
- billing-agent: 50 sessions/day (MEDIUM)
- data-agent: 10 sessions/day (LOW)
Total: ~260 sessions/day affected
Circuit breaker: active (auto-disabled 3 minutes ago)
```
- **Lineage DAG** — which agents call which tools
- **Blast radius** — if this tool goes down, what else breaks?
- **Impact alerts** — "postgres-mcp is DOWN — 3 agents affected, 260 sessions/day"
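Given a lineage map of which agents call which tools, blast radius is a reverse lookup. An illustrative sketch with made-up lineage data, assuming a flat agent → tools mapping:

```python
def blast_radius(lineage: dict[str, list[str]], tool: str) -> list[str]:
    # lineage: {agent_name: [tools it calls]}. Returns agents impacted if `tool` goes down.
    return sorted(agent for agent, tools in lineage.items() if tool in tools)

lineage = {
    "support-agent": ["postgres-mcp", "jira-mcp"],
    "billing-agent": ["postgres-mcp", "crm-mcp"],
    "data-agent": ["postgres-mcp"],
    "chat-agent": ["slack-mcp"],
}
impacted = blast_radius(lineage, "postgres-mcp")
# → ["billing-agent", "data-agent", "support-agent"]
```

With per-agent session volumes attached to each edge, the same lookup yields the "~260 sessions/day affected" figure shown above.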
### 6. Investigate — AI-assisted root cause
```
$ langsight investigate jira-mcp
Investigation: jira-mcp
├── Health: DOWN since 14:32 UTC (3 consecutive failures)
├── Schema: 2 tools changed (get_issue dropped 'priority' field)
├── Recent errors: 429 Too Many Requests (rate limit)
└── Recommendation: check API rate limits, restore 'priority' field
```
---
## Quick start
### Prerequisites
- Docker and Docker Compose
- Python 3.11+ and [uv](https://docs.astral.sh/uv/)
### 1. Clone and start
```bash
git clone https://github.com/LangSight/langsight.git
cd langsight
./scripts/quickstart.sh
```
Takes ~2 minutes. Generates secrets, starts 5 containers, seeds demo data.
### 2. Open the dashboard
**http://localhost:3003** — log in with the admin email and password written to `.env` by `quickstart.sh` (randomly generated — check the file).
### 3. Instrument your agent
```python
from langsight.sdk import LangSightClient
client = LangSightClient(url="http://localhost:8000", api_key="<from quickstart>")
traced = client.wrap(mcp_session, server_name="postgres-mcp", agent_name="my-agent")
result = await traced.call_tool("query", {"sql": "SELECT * FROM orders"})
```
Two lines. Every tool call is now traced, guarded, and cost-attributed.
---
## Alerting
| Channel | Status |
|---|---|
| Slack (Block Kit) | Shipped |
| Generic webhook | Shipped |
| OpsGenie (native Events API) | v0.3 |
| PagerDuty (Events API v2) | v0.3 |
Alert types: server down/recovered, schema drift, latency spike, SLO breach, anomaly, loop detected, budget exceeded, circuit breaker open, failure rate spike, blast radius impact.
---
## Horizontal scaling with Redis
For multi-worker deployments, add Redis for shared rate limiting, SSE broadcasting, and circuit breaker state:
```bash
# Install Redis support
pip install "langsight[redis]"
# Add to .env
REDIS_PASSWORD=$(openssl rand -hex 24)
LANGSIGHT_REDIS_URL=redis://:${REDIS_PASSWORD}@redis:6379
LANGSIGHT_WORKERS=4
# Start with Redis
docker compose --pro
```