provenance:github:FilHouston/ai-phone-negotiator
# 📞 AI Phone Negotiator
> An autonomous AI voice agent that calls customer hotlines and negotiates better contract terms — on your behalf.
---
## The Problem
Calling your ISP to negotiate a better deal means:
- ⏳ 20–45 minutes in hold queues
- 🤖 Navigating confusing IVR phone menus
- 🎭 Arguing with trained retention agents
- 😤 Repeating the same conversation every 12 months
**What if an AI agent could do all of that for you?**
## The Solution
AI Phone Negotiator is a fully autonomous voice agent that:
1. **Dials the hotline** — initiates an outbound call via Twilio Voice
2. **Navigates the IVR** — listens to menu options and sends DTMF tones or voice commands
3. **Waits on hold** — detects hold music via energy-based VAD and **disconnects the AI** to save cost (~$9/30 min saved)
4. **Reconnects instantly** — when a human agent picks up, the AI reconnects in <2 seconds
5. **Negotiates** — engages with the human agent using a multi-phase strategy (friendly → competitor leverage → cancellation threat → retention)
6. **Logs offers** — extracts and records any counter-offers via function calling
7. **Never commits** — always defers final decisions to the human ("I'll discuss this with my client")
The agent handles the call. You make the decision.
---
## Architecture
```
┌──────────────┐     ┌───────────────┐     ┌──────────────────┐
│  Dashboard   │────▶│    Backend    │────▶│   Twilio Voice   │
│   (Web UI)   │◀────│   (Express)   │◀────│ (Outbound Call)  │
└──────────────┘     └───────┬───────┘     └────────┬─────────┘
       ▲ WS                  │                      │
       │              ┌──────▼───────┐     ┌────────▼─────────┐
       │              │ Agent Brain  │◀───▶│  Media Streams   │
       │              │ (OpenAI      │     │  (WebSocket)     │
       │              │  Realtime)   │     │  µ-law 8kHz      │
       │              └──────┬───────┘     └──────────────────┘
       │                     │
       │              ┌──────▼───────┐     ┌──────────────────┐
       │              │ Tool Layer   │     │  Hold Detector   │
       └──────────────│ ├ DTMF       │     │  Energy-based    │
                      │ ├ Call Ctrl  │     │  VAD on µ-law    │
                      │ ├ Offers     │     │  ┌────────────┐  │
                      │ └ Transcript │     │  │IVR→HOLD→   │  │
                      └──────────────┘     │  │  HUMAN     │  │
                                           │  └────────────┘  │
                                           └──────────────────┘
```
### Data Flow
1. **Twilio** places an outbound call and opens a bidirectional **Media Stream** (WebSocket, µ-law 8kHz)
2. The **Hold Detector** analyzes incoming audio frames in real time — classifying the call phase as `IVR`, `HOLD`, or `HUMAN`
3. During the **IVR** and **HUMAN** phases, audio is forwarded to the **OpenAI Realtime API** for speech processing and response generation
4. During the **HOLD** phase, the OpenAI WebSocket is **disconnected**, so no AI cost accrues while waiting
5. When the Hold Detector detects a human voice after a hold, the OpenAI connection is re-established in <2 seconds with context ("You were just on hold, a human agent is now speaking")
6. **Function calling** enables the agent to send DTMF tones, log offers, and control the call
7. The **Dashboard** provides real-time monitoring via WebSocket — live transcript, phase indicator, hold timer, and manual override controls
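
The phase gating in steps 3 to 5 could be sketched roughly as follows. This is a minimal illustration, not the code in `media-stream.ts`: the `AiSession` interface and the `PhaseGate` class are hypothetical names, and the real bridge may handle reconnection and buffering differently.

```typescript
// Hypothetical sketch of phase-gated audio forwarding.
// None of these names are taken from the repository.
type Phase = "IVR" | "HOLD" | "HUMAN";

// Assumed minimal wrapper around the OpenAI Realtime WebSocket.
interface AiSession {
  connect(context?: string): void;
  disconnect(): void;
  sendAudio(base64Mulaw: string): void;
}

class PhaseGate {
  private ai: AiSession;
  private phase: Phase = "IVR";
  private connected = false;

  constructor(ai: AiSession) {
    this.ai = ai;
  }

  // Called whenever the hold detector reports a phase (possibly unchanged).
  setPhase(next: Phase): void {
    if (next === this.phase) return;
    const previous = this.phase;
    this.phase = next;

    if (next === "HOLD" && this.connected) {
      this.ai.disconnect(); // no AI connection while waiting on hold
      this.connected = false;
    } else if (next !== "HOLD" && !this.connected) {
      // Give the model context only when a human picks up after a hold.
      const context =
        previous === "HOLD" && next === "HUMAN"
          ? "You were just on hold, a human agent is now speaking"
          : undefined;
      this.ai.connect(context);
      this.connected = true;
    }
  }

  // Called for each base64 µ-law payload from the Media Stream.
  onAudio(base64Mulaw: string): void {
    if (this.phase === "HOLD") return; // dropped during hold
    if (!this.connected) {
      this.ai.connect();
      this.connected = true;
    }
    this.ai.sendAudio(base64Mulaw);
  }
}
```

Keeping the gate separate from the socket code makes the connect/disconnect behaviour easy to exercise with a fake `AiSession` that records calls.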
---
## Features
### 🎯 Strategy Engine
Pluggable JSON profiles per provider with customer data, leverage points, competitor offers, escalation paths, and hard safety rules. Provider-specific verification fields (PIN, IBAN, DOB) are disclosed only when the hotline agent explicitly asks for them.
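
As an illustration of the "only when asked" rule, the sketch below gates verification data behind an explicit request. The `StrategyProfile` shape is an assumption: only `provider` and `hotline` appear in the example profile shown later in this README, and the `verification` map and `discloseIfAsked` helper are hypothetical.

```typescript
// Hypothetical profile shape. Only `provider` and `hotline` are taken from
// the README's example; `verification` is assumed for illustration.
interface StrategyProfile {
  provider: string;
  hotline: string;
  verification: Record<string, string>; // keys such as "pin", "iban", "dob"
}

// Returns a verification value only if that exact field was requested.
function discloseIfAsked(
  profile: StrategyProfile,
  requestedField: string | null,
): string | null {
  if (requestedField === null) return null; // nothing asked, nothing shared
  const key = requestedField.toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(profile.verification, key)) {
    return null; // field is not pre-approved in this profile
  }
  return profile.verification[key] ?? null;
}
```

A call such as `discloseIfAsked(profile, "PIN")` would return the stored PIN, while a `null` request or an unlisted field returns `null`.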
### 📊 Dashboard
Dark-themed web UI with three views:
- **Start** — select a strategy profile, view call history, initiate calls
- **Live** — real-time transcript, offer cards, phase indicator, hold timer, emergency disconnect
- **Result** — call summary with all logged offers
### ✏️ Strategy Editor
Create and manage negotiation profiles directly in the dashboard — no JSON editing required. Dynamic forms adapt to the selected provider (Vodafone PIN fields, Telekom IBAN/DOB, etc.). Sensitive fields are masked in the UI.
### 💤 Hold Gating
Energy-based Voice Activity Detection on raw µ-law audio frames. When hold music is detected, the OpenAI Realtime API WebSocket is disconnected entirely. A 30-minute hold queue costs ~$0.06 instead of ~$9. Auto-timeout at 45 minutes.
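
To make "energy-based" concrete, here is a small sketch that decodes G.711 µ-law bytes to linear PCM and computes per-frame RMS energy. The decoding follows the standard G.711 expansion; the `activeRatio` helper and any threshold passed to it are assumptions, since this README does not say how the project maps energy to the `IVR`, `HOLD`, and `HUMAN` phases.

```typescript
// Decode one G.711 µ-law byte to a signed linear PCM sample.
function decodeMulaw(byte: number): number {
  const u = ~byte & 0xff; // µ-law bytes are stored bit-inverted
  const sign = u & 0x80;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Root-mean-square energy of one frame of µ-law bytes
// (160 bytes corresponds to 20 ms at 8 kHz).
function frameRms(frame: Uint8Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    const sample = decodeMulaw(frame[i] ?? 0xff);
    sumSquares += sample * sample;
  }
  return Math.sqrt(sumSquares / frame.length);
}

// Hypothetical helper: fraction of frames whose RMS exceeds a threshold.
// A detector might compare this ratio over a sliding window against tuned
// limits; the threshold value is an assumption, not a project setting.
function activeRatio(frames: Uint8Array[], rmsThreshold: number): number {
  if (frames.length === 0) return 0;
  let active = 0;
  for (const frame of frames) {
    if (frameRms(frame) > rmsThreshold) active++;
  }
  return active / frames.length;
}
```

Working on the raw bytes avoids sending any audio to the AI while the call is on hold, which is what keeps the hold phase cheap.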
### 🔒 Safety
- Agent **never** accepts, changes, or confirms any contract
- Only shares pre-approved customer data, only when explicitly asked
- All offers are logged via function calling — nothing is agreed to
- Emergency kill switch in dashboard and via API
- Configurable hard rules per strategy profile
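
One way to enforce "nothing is agreed to" at the tool layer is to give the model no tool that can accept an offer at all. The sketch below assumes a function call has already been reduced to a name plus a JSON-encoded argument string; the tool names (`send_dtmf`, `log_offer`, `end_call`) and the `ToolLayer` class are hypothetical and may not match `src/tools/`.

```typescript
// Hypothetical tool dispatcher. Tool names are illustrative only.
interface Offer {
  description: string;
}

interface ToolCall {
  name: string;
  arguments: string; // JSON-encoded arguments
}

interface ToolResult {
  ok: boolean;
  error?: string;
}

class ToolLayer {
  readonly offers: Offer[] = [];
  readonly dtmfSent: string[] = [];
  ended = false;

  handle(call: ToolCall): ToolResult {
    let parsed: unknown;
    try {
      parsed = JSON.parse(call.arguments);
    } catch {
      return { ok: false, error: "invalid JSON arguments" };
    }
    if (typeof parsed !== "object" || parsed === null) {
      return { ok: false, error: "arguments must be an object" };
    }
    const args = parsed as Record<string, unknown>;

    switch (call.name) {
      case "send_dtmf": {
        const digits = args["digits"];
        if (typeof digits !== "string" || !/^[0-9*#]+$/.test(digits)) {
          return { ok: false, error: "invalid digits" };
        }
        this.dtmfSent.push(digits); // a real handler would ask the telephony layer to play them
        return { ok: true };
      }
      case "log_offer": {
        const description = args["description"];
        if (typeof description !== "string" || description.length === 0) {
          return { ok: false, error: "missing description" };
        }
        this.offers.push({ description }); // recorded only, never accepted
        return { ok: true };
      }
      case "end_call":
        this.ended = true;
        return { ok: true };
      default:
        // No tool exists for accepting or changing a contract.
        return { ok: false, error: `unknown tool: ${call.name}` };
    }
  }
}
```

Because unknown tool names are rejected, a model that tries to call something like `accept_offer` gets an error rather than a side effect.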
---
## Tech Stack
| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Telephony** | [Twilio Voice](https://www.twilio.com/docs/voice) + [Media Streams](https://www.twilio.com/docs/voice/media-streams) | Outbound calls, DTMF, bidirectional audio |
| **Voice AI** | [OpenAI Realtime API](https://platform.openai.com/docs/api-reference/realtime) | Real-time speech processing, function calling |
| **Backend** | Express + WebSocket (Node.js) | Orchestration, audio bridge, API endpoints |
| **Frontend** | Vanilla HTML/CSS/JS | Dashboard — no build step, no framework |
| **Language** | TypeScript 5.9 | Type safety, better DX |
| **Strategy** | JSON profiles | Provider-specific negotiation playbooks |
### Why Twilio + OpenAI Realtime (not Bland.ai / Vapi)?
Managed platforms abstract away the telephony layer — convenient, but limiting. Building directly on Twilio + OpenAI Realtime gives full control over:
- **Audio pipeline** — direct Media Stream ↔ Realtime API bridge, no middleman latency
- **Hold Gating** — disconnect AI during hold queues (impossible with managed platforms)
- **Tool layer** — custom function calling for DTMF, call control, offer extraction
- **Strategy engine** — pluggable negotiation profiles per provider
- **Cost** — ~$0.45/min active negotiation, ~$0.002/min on hold
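
The per-minute figures above can be turned into a quick estimate. The sketch below uses only the two approximate rates quoted in this README; the function name and the split into active and hold minutes are illustrative.

```typescript
// Approximate rates quoted in this README (USD per minute).
const ACTIVE_RATE_USD_PER_MIN = 0.45; // active negotiation
const HOLD_RATE_USD_PER_MIN = 0.002; // on hold, AI disconnected

// Rough cost estimate for a call with the given active and hold durations.
function estimateCallCostUsd(activeMinutes: number, holdMinutes: number): number {
  return (
    activeMinutes * ACTIVE_RATE_USD_PER_MIN +
    holdMinutes * HOLD_RATE_USD_PER_MIN
  );
}

// Example: 10 minutes of negotiation after 30 minutes on hold.
console.log(estimateCallCostUsd(10, 30).toFixed(2));
```

At these rates, 30 minutes on hold contributes about $0.06, which matches the Hold Gating figure. The ~$9 quoted for the same hold with the AI attached works out to roughly $0.30/min, so the $0.45/min active rate presumably covers more than the AI connection alone; that reading is an inference, not something the README states.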
---
## Project Structure
```
ai-phone-negotiator/
├── src/
│   ├── server.ts              # Express + WebSocket server, static files, API routes
│   ├── config.ts              # Environment config + validation
│   ├── check.ts               # Connectivity check (Twilio + OpenAI)
│   ├── call.ts                # Outbound call initiation via Twilio REST
│   ├── media-stream.ts        # Twilio Media Stream ↔ OpenAI audio bridge + hold gating
│   ├── openai-realtime.ts     # OpenAI Realtime API WebSocket client
│   ├── strategy.ts            # Strategy engine + system prompt builder
│   └── tools/
│       ├── dtmf.ts            # DTMF tone sending
│       ├── call-control.ts    # Call lifecycle (end, hold-for-human)
│       ├── offer-logger.ts    # Offer extraction + logging via function calling
│       └── transcript.ts      # TranscriptLogger (JSON) + EventEmitter for dashboard WS
├── strategies/
│   └── example.json           # Example strategy profile (safe to commit)
├── transcripts/               # Call transcripts (gitignored)
├── .env.example               # Environment template
├── TWILIO-SETUP.md            # Twilio account + phone number setup guide
├── tsconfig.json
└── package.json
```
---
## Strategy Profiles
Negotiations follow pluggable JSON profiles:
```json
{
  "provider": "Example ISP",
  "hotline": "+49XXXXXXXXXXX",
  [truncated…]
```
