ai-phone-negotiator

provenance:github:FilHouston/ai-phone-negotiator

WHAT THIS AGENT DOES

The AI Phone Negotiator is an autonomous voice agent that calls customer service hotlines and negotiates better contract terms on your behalf. It handles the whole call: dialing the number, navigating automated menus, talking to human agents, and logging any offers. It is aimed at individuals and businesses who find contract negotiation time-consuming and frustrating. While the call sits in a hold queue, the agent detects the hold music and disconnects the AI session to save cost, then reconnects in under 2 seconds when a human picks up. The agent never accepts an offer itself; it logs offers so the user can make the final decision without sitting through the call.

PROBLEM IT SOLVES

Negotiating contracts with service providers like ISPs can be a frustrating and time-consuming process, often involving long hold times and repetitive conversations. This agent solves that problem by automating the entire negotiation process, freeing up users' time and reducing the stress associated with these interactions.


TECH & STACK

typescript, openai, twilio, voice, negotiation, automation, realtime, node.js

README

# 📞 AI Phone Negotiator

> An autonomous AI voice agent that calls customer hotlines and negotiates better contract terms — on your behalf.

[![TypeScript](https://img.shields.io/badge/TypeScript-5.9-blue?logo=typescript)](https://www.typescriptlang.org/)
[![Node.js](https://img.shields.io/badge/Node.js-≥20-green?logo=node.js)](https://nodejs.org/)
[![OpenAI Realtime](https://img.shields.io/badge/OpenAI-Realtime_API-412991?logo=openai)](https://platform.openai.com/docs/api-reference/realtime)
[![Twilio](https://img.shields.io/badge/Twilio-Voice_API-F22F46?logo=twilio)](https://www.twilio.com/docs/voice)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

---

## The Problem

Calling your ISP to negotiate a better deal means:

- ⏳ 20–45 minutes in hold queues
- 🤖 Navigating confusing IVR phone menus
- 🎭 Arguing with trained retention agents
- 😤 Repeating the same conversation every 12 months

**What if an AI agent could do all of that for you?**

## The Solution

AI Phone Negotiator is a fully autonomous voice agent that:

1. **Dials the hotline** — initiates an outbound call via Twilio Voice
2. **Navigates the IVR** — listens to menu options and sends DTMF tones or voice commands
3. **Waits on hold** — detects hold music via energy-based VAD and **disconnects the AI** to save cost (~$9/30 min saved)
4. **Reconnects instantly** — when a human agent picks up, the AI reconnects in <2 seconds
5. **Negotiates** — engages with the human agent using a multi-phase strategy (friendly → competitor leverage → cancellation threat → retention)
6. **Logs offers** — extracts and records any counter-offers via function calling
7. **Never commits** — always defers final decisions to the human ("I'll discuss this with my client")

The agent handles the call. You make the decision.
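
Step 1 above can be sketched as a small TwiML builder. This is a minimal illustration, not the project's actual `call.ts`: the helper names and the example URL are assumptions, and the `twilio` client call appears only in a comment so the snippet runs without credentials.

```typescript
// Hypothetical helpers; names are illustrative and not taken from the repository.

// Escape characters that are not allowed inside an XML attribute value.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build TwiML that connects the call to a bidirectional Media Stream.
function buildStreamTwiml(streamUrl: string): string {
  if (streamUrl.indexOf("wss://") !== 0) {
    throw new Error("Media Stream URL must use wss://");
  }
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    "<Response><Connect>" +
    `<Stream url="${escapeXml(streamUrl)}" />` +
    "</Connect></Response>"
  );
}

// With the official `twilio` Node package the call would then be placed
// roughly like this (not executed here; all values are placeholders):
//   const client = twilio(accountSid, authToken);
//   await client.calls.create({ to, from, twiml: buildStreamTwiml(url) });
const exampleTwiml = buildStreamTwiml("wss://example.invalid/media-stream");
console.log(exampleTwiml);
```

`<Connect><Stream>` is the TwiML form for bidirectional audio, which the agent needs in order to speak back into the call.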

---

## Architecture

```
┌──────────────┐     ┌───────────────┐     ┌──────────────────┐
│  Dashboard   │────▶│   Backend     │────▶│  Twilio Voice    │
│  (Web UI)    │◀────│   (Express)   │◀────│  (Outbound Call) │
└──────────────┘     └───────┬───────┘     └────────┬─────────┘
       ▲ WS                  │                      │
       │              ┌──────▼───────┐     ┌────────▼─────────┐
       │              │  Agent Brain │◀───▶│  Media Streams    │
       │              │  (OpenAI     │     │  (WebSocket)      │
       │              │   Realtime)  │     │  µ-law 8kHz       │
       │              └──────┬───────┘     └──────────────────┘
       │                     │
       │              ┌──────▼───────┐     ┌──────────────────┐
       │              │  Tool Layer  │     │  Hold Detector   │
       └──────────────│  ├ DTMF      │     │  Energy-based    │
                      │  ├ Call Ctrl │     │  VAD on µ-law    │
                      │  ├ Offers    │     │  ┌────────────┐  │
                      │  └ Transcript│     │  │IVR→HOLD→   │  │
                      └──────────────┘     │  │   HUMAN    │  │
                                           │  └────────────┘  │
                                           └──────────────────┘
```

### Data Flow

1. **Twilio** places an outbound call and opens a bidirectional **Media Stream** (WebSocket, µ-law 8kHz)
2. The **Hold Detector** analyzes incoming audio frames in real time — classifying the call phase as `IVR`, `HOLD`, or `HUMAN`
3. During **IVR** and **HUMAN** phases: audio is forwarded to the **OpenAI Realtime API** for speech processing and response generation
4. During **HOLD** phase: the OpenAI WebSocket is **disconnected** — no AI cost while waiting
5. When the Hold Detector detects a human voice after hold: OpenAI reconnects in <2 seconds with context ("You were just on hold, a human agent is now speaking")
6. **Function calling** enables the agent to send DTMF tones, log offers, and control the call
7. The **Dashboard** provides real-time monitoring via WebSocket — live transcript, phase indicator, hold timer, and manual override controls
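
Steps 3 to 5 amount to a small gate in front of the AI connection. The sketch below shows one way to express that gating. The `AiSession` interface and the `PhaseGate` class are hypothetical stand-ins rather than the project's real `media-stream.ts` code; only the three phase names and the reconnect context string come from the README.

```typescript
type CallPhase = "IVR" | "HOLD" | "HUMAN";

// Hypothetical stand-in for the OpenAI Realtime WebSocket client.
interface AiSession {
  connect(contextNote?: string): void;
  disconnect(): void;
  sendAudio(frame: Uint8Array): void;
}

class PhaseGate {
  private phase: CallPhase = "IVR";
  private connected = false;
  private readonly ai: AiSession;

  constructor(ai: AiSession) {
    this.ai = ai;
    this.ai.connect(); // the call starts in the IVR phase with the AI attached
    this.connected = true;
  }

  // Called whenever the hold detector reports a new classification.
  setPhase(next: CallPhase): void {
    const prev = this.phase;
    this.phase = next;
    if (next === "HOLD" && this.connected) {
      this.ai.disconnect(); // no AI cost while waiting
      this.connected = false;
    } else if (next !== "HOLD" && !this.connected) {
      const note =
        prev === "HOLD" && next === "HUMAN"
          ? "You were just on hold, a human agent is now speaking"
          : undefined;
      this.ai.connect(note);
      this.connected = true;
    }
  }

  // Called for every inbound audio frame; returns true if it was forwarded.
  onAudioFrame(frame: Uint8Array): boolean {
    if (this.phase === "HOLD" || !this.connected) return false;
    this.ai.sendAudio(frame);
    return true;
  }
}

// Usage with a fake session that records what happened.
const events: string[] = [];
const fakeAi: AiSession = {
  connect: (note) => { events.push(note ? `connect:${note}` : "connect"); },
  disconnect: () => { events.push("disconnect"); },
  sendAudio: () => { events.push("audio"); },
};
const gate = new PhaseGate(fakeAi);
const frame = new Uint8Array(160); // 160 bytes = 20 ms of 8 kHz µ-law audio
gate.onAudioFrame(frame);  // IVR: forwarded
gate.setPhase("HOLD");
gate.onAudioFrame(frame);  // HOLD: dropped
gate.setPhase("HUMAN");
gate.onAudioFrame(frame);  // HUMAN: forwarded
console.log(events);
```

Keeping the gate separate from the detector means the same routing logic works regardless of how the phase is classified.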

---

## Features

### 🎯 Strategy Engine
Pluggable JSON profiles per provider with customer data, leverage points, competitor offers, escalation paths, and hard safety rules. Provider-specific verification fields (PIN, IBAN, DOB) are disclosed only when the hotline agent explicitly asks for them.
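
The disclose-only-when-asked rule could be enforced with a lookup like the one below. This is a sketch under assumptions: only `provider` and `hotline` appear in the README's example profile, so the `verification` field and the function name are hypothetical, and the PIN is a dummy value.

```typescript
// Hypothetical profile shape; every field except `provider` and `hotline`
// is an assumption made for this illustration.
interface StrategyProfile {
  provider: string;
  hotline: string;
  verification: Record<string, string>;
}

// Return a verification value only when a specific field was explicitly
// requested and that field is present in the pre-approved profile data.
function discloseVerification(
  profile: StrategyProfile,
  requestedField: string | null
): string | null {
  if (requestedField === null) return null; // nothing was asked for
  const key = requestedField.trim().toLowerCase();
  const value = Object.prototype.hasOwnProperty.call(profile.verification, key)
    ? profile.verification[key]
    : undefined;
  return value ?? null;
}

const demoProfile: StrategyProfile = {
  provider: "Example ISP",
  hotline: "+49XXXXXXXXXXX",
  verification: { pin: "0000" }, // dummy value
};

console.log(discloseVerification(demoProfile, "PIN"));  // field was asked for
console.log(discloseVerification(demoProfile, null));   // not asked: nothing shared
console.log(discloseVerification(demoProfile, "iban")); // not in the profile
```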

### 📊 Dashboard
Dark-themed web UI with three views:
- **Start** — select a strategy profile, view call history, initiate calls
- **Live** — real-time transcript, offer cards, phase indicator, hold timer, emergency disconnect
- **Result** — call summary with all logged offers

### ✏️ Strategy Editor
Create and manage negotiation profiles directly in the dashboard — no JSON editing required. Dynamic forms adapt to the selected provider (Vodafone PIN fields, Telekom IBAN/DOB, etc.). Sensitive fields are masked in the UI.

### 💤 Hold Gating
Energy-based Voice Activity Detection on raw µ-law audio frames. When hold music is detected, the OpenAI Realtime API WebSocket is disconnected entirely. A 30-minute hold queue costs ~$0.06 instead of ~$9. Auto-timeout at 45 minutes.
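
To make "energy on raw µ-law frames" concrete, the sketch below decodes G.711 µ-law bytes to 16-bit PCM and computes a per-frame RMS level. The README does not say how energy is mapped to `IVR`, `HOLD`, or `HUMAN`, so the `activityRatio` helper and the threshold value are assumptions shown only as one possible building block.

```typescript
// Decode one G.711 µ-law byte to a signed 16-bit PCM sample.
function muLawToPcm16(byte: number): number {
  const u = ~byte & 0xff;              // µ-law bytes are stored inverted
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return (u & 0x80) !== 0 ? -magnitude : magnitude;
}

// Root-mean-square level of one µ-law frame (0 for an empty frame).
function frameRms(frame: Uint8Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  frame.forEach((byte) => {
    const sample = muLawToPcm16(byte);
    sumSquares += sample * sample;
  });
  return Math.sqrt(sumSquares / frame.length);
}

// Assumed helper: share of frames in a window whose RMS reaches a threshold.
// A detector could compare this ratio over time, but that policy is a guess.
function activityRatio(frames: Uint8Array[], rmsThreshold: number): number {
  if (frames.length === 0) return 0;
  const active = frames.filter((f) => frameRms(f) >= rmsThreshold).length;
  return active / frames.length;
}

const silentFrame = new Uint8Array(160).fill(0xff); // 0xFF decodes to 0
const loudFrame = new Uint8Array(160).fill(0x00);   // 0x00 is full negative scale
console.log(frameRms(silentFrame), frameRms(loudFrame));
console.log(activityRatio([silentFrame, loudFrame], 500)); // 500 is an arbitrary threshold
```

Working directly on the µ-law bytes avoids any resampling step, since each byte is one 8 kHz sample.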

### 🔒 Safety
- Agent **never** accepts, changes, or confirms any contract
- Only shares pre-approved customer data, only when explicitly asked
- All offers are logged via function calling — nothing is agreed to
- Emergency kill switch in dashboard and via API
- Configurable hard rules per strategy profile
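
One way to back these rules in code is to expose only logging tools to the model and reject everything else. In the sketch below, the tool name `log_offer`, its fields, and the handler are hypothetical, and the sample offer is made up. The tool object uses the flat `type`/`name`/`description`/`parameters` layout that Realtime function tools are generally declared with, which should be checked against the current API reference.

```typescript
// Hypothetical tool definition; name and fields are illustrative only.
const logOfferTool = {
  type: "function",
  name: "log_offer",
  description: "Record a counter-offer made by the hotline agent. Never accepts it.",
  parameters: {
    type: "object",
    properties: {
      description: { type: "string" },
      monthlyPrice: { type: "number" },
      currency: { type: "string" },
    },
    required: ["description", "monthlyPrice", "currency"],
  },
} as const;

interface Offer {
  description: string;
  monthlyPrice: number;
  currency: string;
  loggedAt: string;
}

const offers: Offer[] = [];
const ALLOWED_TOOLS = new Set<string>([logOfferTool.name]);

// Dispatch a function call from the model. Arguments arrive as a JSON string.
function handleToolCall(name: string, argsJson: string): { ok: boolean; message: string } {
  if (!ALLOWED_TOOLS.has(name)) {
    return { ok: false, message: `Tool "${name}" is not available` };
  }
  let args: unknown;
  try {
    args = JSON.parse(argsJson);
  } catch (_err) {
    return { ok: false, message: "Arguments were not valid JSON" };
  }
  if (typeof args !== "object" || args === null) {
    return { ok: false, message: "Arguments must be an object" };
  }
  const { description, monthlyPrice, currency } = args as Record<string, unknown>;
  if (typeof description !== "string" || typeof monthlyPrice !== "number" || typeof currency !== "string") {
    return { ok: false, message: "Missing or mistyped offer fields" };
  }
  offers.push({ description, monthlyPrice, currency, loggedAt: new Date().toISOString() });
  return { ok: true, message: "Offer logged. Do not accept; defer to the client." };
}

// Made-up example: one offer is logged, an attempt to accept is rejected.
const logged = handleToolCall(
  "log_offer",
  '{"description":"12 months at a reduced rate","monthlyPrice":29.99,"currency":"EUR"}'
);
const rejected = handleToolCall("accept_offer", "{}");
console.log(logged, rejected, offers.length);
```

Because no accepting tool exists in the allow-list, the safety rule does not depend on the prompt alone.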

---

## Tech Stack

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Telephony** | [Twilio Voice](https://www.twilio.com/docs/voice) + [Media Streams](https://www.twilio.com/docs/voice/media-streams) | Outbound calls, DTMF, bidirectional audio |
| **Voice AI** | [OpenAI Realtime API](https://platform.openai.com/docs/api-reference/realtime) | Real-time speech processing, function calling |
| **Backend** | Express + WebSocket (Node.js) | Orchestration, audio bridge, API endpoints |
| **Frontend** | Vanilla HTML/CSS/JS | Dashboard — no build step, no framework |
| **Language** | TypeScript 5.9 | Type safety, better DX |
| **Strategy** | JSON profiles | Provider-specific negotiation playbooks |

### Why Twilio + OpenAI Realtime (not Bland.ai / Vapi)?

Managed platforms abstract away the telephony layer — convenient, but limiting. Building directly on Twilio + OpenAI Realtime gives full control over:

- **Audio pipeline** — direct Media Stream ↔ Realtime API bridge, no middleman latency
- **Hold Gating** — disconnect AI during hold queues (impossible with managed platforms)
- **Tool layer** — custom function calling for DTMF, call control, offer extraction
- **Strategy engine** — pluggable negotiation profiles per provider
- **Cost** — ~$0.45/min active negotiation, ~$0.002/min on hold
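
The per-minute rates in the list above make call cost easy to estimate. The helper below is a small worked calculation using only those two figures; the function name and the example durations are illustrative.

```typescript
// Rates quoted in this README (approximate, USD per minute).
const ACTIVE_USD_PER_MIN = 0.45;
const HOLD_USD_PER_MIN = 0.002;

// Estimate the AI-related cost of a call, rounded to whole cents.
function estimateCallCostUsd(activeMinutes: number, holdMinutes: number): number {
  const cost = activeMinutes * ACTIVE_USD_PER_MIN + holdMinutes * HOLD_USD_PER_MIN;
  return Math.round(cost * 100) / 100;
}

console.log(estimateCallCostUsd(0, 30));  // hold only: 30 × 0.002 = 0.06
console.log(estimateCallCostUsd(10, 30)); // 10 × 0.45 + 30 × 0.002 = 4.56
```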

---

## Project Structure

```
ai-phone-negotiator/
├── src/
│   ├── server.ts            # Express + WebSocket server, static files, API routes
│   ├── config.ts            # Environment config + validation
│   ├── check.ts             # Connectivity check (Twilio + OpenAI)
│   ├── call.ts              # Outbound call initiation via Twilio REST
│   ├── media-stream.ts      # Twilio Media Stream ↔ OpenAI audio bridge + hold gating
│   ├── openai-realtime.ts   # OpenAI Realtime API WebSocket client
│   ├── strategy.ts          # Strategy engine + system prompt builder
│   └── tools/
│       ├── dtmf.ts          # DTMF tone sending
│       ├── call-control.ts  # Call lifecycle (end, hold-for-human)
│       ├── offer-logger.ts  # Offer extraction + logging via function calling
│       └── transcript.ts    # TranscriptLogger (JSON) + EventEmitter for dashboard WS
├── strategies/
│   └── example.json         # Example strategy profile (safe to commit)
├── transcripts/             # Call transcripts (gitignored)
├── .env.example             # Environment template
├── TWILIO-SETUP.md          # Twilio account + phone number setup guide
├── tsconfig.json
└── package.json
```

---

## Strategy Profiles

Negotiations follow pluggable JSON profiles:

```json
{
  "provider": "Example ISP",
  "hotline": "+49XXXXXXXXXXX",

```

[truncated…]

PUBLIC HISTORY

First discovered: Mar 21, 2026

IDENTITY

inferred

Identity inferred from code signals. No PROVENANCE.yml found.


METADATA

platform: github

first seen: Mar 3, 2026

last updated: Mar 6, 2026

last crawled: 1 month ago

version: n/a
