jarvis
Jarvis is an AI assistant that can understand and respond to your requests, even complex ones involving multiple steps. It helps you manage information and tasks by processing documents like PDFs and images, searching the web, and even controlling your computer. Business professionals, researchers, or anyone dealing with a lot of information could find it helpful for automating workflows and quickly finding answers. What sets Jarvis apart is its ability to plan and execute tasks intelligently, adapting to your needs with a visually engaging interface and a natural-sounding voice. It’s designed to be reliable and efficient, making it a powerful tool for boosting productivity.
<div align="center">
<br/>
> **A production-grade, reliability-first autonomous AI assistant** combining a priority intent router,
> a full Planner→Validator→Executor→Synthesizer agent loop, multimodal document intelligence,
> real-time voice output, OS-level system control, and a stunning Three.js adaptive plasma core UI.
<br/>
</div>
---
<p align="center">
<img src="assets/jarvis_ui.gif" width="700"/>
</p>
---
## 📌 Table of Contents
<details>
<summary>Expand Navigation</summary>
- [✨ What is JARVIS?](#-what-is-jarvis)
- [🚀 Feature Highlights](#-feature-highlights)
- [🏗️ Architecture](#️-architecture)
- [🛠️ Tech Stack](#️-tech-stack)
- [⚡ Quick Start](#-quick-start)
- [⚙️ Configuration](#️-configuration)
- [🖥️ Run Modes](#️-run-modes)
- [📁 Project Structure](#-project-structure)
- [🗺️ Roadmap](#️-roadmap)
- [🤝 Contributing](#-contributing)
- [🔐 Security](#-security)
- [📜 License](#-license)
- [👤 Author](#-author)
</details>
---
## ✨ What is JARVIS?
**JARVIS** is not a chatbot. It is a full-stack, autonomous AI assistant runtime built around a strict **reliability-first principle** — meaning every answer that claims to be real-time actually is, every tool call is validated before synthesis, and every system action is OS-verified before being reported as successful.
At its core, JARVIS combines:
- **⚡ Sub-millisecond local routing** for greetings, identity, and conversational turns
- **🧠 A multi-step agent loop** (Plan → Validate → Execute → Synthesize) for tool-backed queries
- **📄 A hybrid document intelligence pipeline** fusing text extraction, OCR, and LLM vision
- **🎤 Real-time, streaming voice synthesis** via Piper TTS with chunk-level playback
- **🖥️ A pywebview desktop GUI** built around a Three.js adaptive plasma sphere with live telemetry
Every module enforces its own reliability contract. No hallucinated real-time data. No fake success confirmations. No persona drift.
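
The Plan → Validate → Execute → Synthesize loop listed above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tool names, the `Step` type, and the hard-coded planner are hypothetical stand-ins, not the project's actual modules. The point is only the ordering: no tool runs before validation, and the answer is built solely from tool output.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical tools: stand-ins for live API calls, not the project's real ones.
async def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21}

async def web_search(query: str) -> dict:
    return {"query": query, "hits": 3}

TOOLS: Dict[str, Callable[..., Awaitable[dict]]] = {
    "get_weather": get_weather,
    "web_search": web_search,
}

@dataclass
class Step:
    tool: str
    args: Dict[str, Any]

def plan(query: str) -> List[Step]:
    """Stand-in planner with a fixed plan; a real one would ask an LLM for JSON."""
    return [
        Step("get_weather", {"city": "Paris"}),
        Step("web_search", {"query": query}),
    ]

def validate(steps: List[Step]) -> List[Step]:
    """Reject the whole plan if any step names an unknown tool."""
    for step in steps:
        if step.tool not in TOOLS:
            raise ValueError(f"unknown tool: {step.tool}")
    return steps

async def execute(steps: List[Step]) -> List[dict]:
    """Run all validated steps concurrently; results keep plan order."""
    return list(await asyncio.gather(*(TOOLS[s.tool](**s.args) for s in steps)))

def synthesize(results: List[dict]) -> str:
    """Build the answer only from what the tools returned."""
    return "; ".join(str(r) for r in results)

async def run_agent(query: str) -> str:
    return synthesize(await execute(validate(plan(query))))

if __name__ == "__main__":
    print(asyncio.run(run_agent("paris news")))
```
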
---
## 🚀 Feature Highlights
| Category | Capability |
|---|---|
| 🧭 **Smart Routing** | Priority intent router with 30+ local fast-paths before agent loop fallback |
| 🌐 **Live Web Search** | Real-time web + news evidence via Serper — factual queries always use live sources |
| 🌦️ **Weather + Forecast** | Current conditions, daily forecasts, and rain probability via Open-Meteo |
| 📄 **Document Intelligence** | PDF · DOCX · Image — text extraction, PaddleOCR, Groq Vision, SQLite caching |
| 💬 **Document Q&A** | Follow-up Q&A over analyzed documents without re-processing |
| ⚖️ **Multi-Doc Compare** | Pricing, risk, and feature comparison across multiple documents simultaneously |
| 🎤 **Real-time TTS** | Streaming Piper voice with first-chunk latency optimization |
| 🖥️ **App Control** | Open/close desktop apps with Start Menu indexing, fuzzy resolution, OS verification |
| 🔊 **System Control** | Volume · Brightness · Window management · Desktop control · Screen lock |
| 🌍 **Network Diagnostics** | Public IP · IP-based location · Connectivity probes · Speedtest |
| 🕒 **Temporal Awareness** | Precise time/date/day/month/year responses |
| 💾 **Persistent Memory** | JSON-backed user profile with session location and search context |
| 🎭 **Personality Engine** | Contextual humor system with anti-repetition guards and tone adaptation |
| ⏭️ **Skip Control** | UI button to safely interrupt active TTS mid-stream |
| 📊 **Live Telemetry** | CPU · RAM · Disk · Battery · Network · Uptime — all live in the HUD |
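
As an illustration of the "fuzzy resolution" mentioned in the App Control row, here is a minimal sketch using Python's standard `difflib`. The `APP_INDEX` contents, the `resolve_app` name, and the 0.6 cutoff are assumptions; a real index would presumably be built by scanning Start Menu shortcuts, and the OS verification step is not shown.

```python
import difflib
from typing import Optional

# Hypothetical index: spoken/display name -> launch target.
APP_INDEX = {
    "google chrome": "chrome.exe",
    "visual studio code": "code.exe",
    "notepad": "notepad.exe",
    "spotify": "spotify.exe",
}

def resolve_app(spoken_name: str, cutoff: float = 0.6) -> Optional[str]:
    """Map a loosely spoken app name to a launch target, or None if unsure."""
    name = spoken_name.strip().lower()
    # 1. Exact match on the indexed name.
    if name in APP_INDEX:
        return APP_INDEX[name]
    # 2. Unambiguous substring match, e.g. "chrome" inside "google chrome".
    partial = [key for key in APP_INDEX if name in key]
    if len(partial) == 1:
        return APP_INDEX[partial[0]]
    # 3. Closest fuzzy match above the cutoff, to absorb typos and mishearings.
    close = difflib.get_close_matches(name, list(APP_INDEX), n=1, cutoff=cutoff)
    return APP_INDEX[close[0]] if close else None

if __name__ == "__main__":
    for spoken in ("Notepad", "chrome", "spotfy", "zzzz"):
        print(spoken, "->", resolve_app(spoken))
```
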
---
## 🏗️ Architecture
### Main Request Flow
```mermaid
flowchart TD
A(["🎙️ User Input"]) --> B{"⚡ Priority\nIntent Router"}
B -->|"Greeting / Wellbeing\nName / Correction\nLocation / Help"| C(["✅ Local Handler\n~0ms"])
B -->|"Tool-capable query"| D["🧠 Agent Loop"]
D --> E["📋 Planner\nGroq JSON"]
E --> F["🛡️ Validator\nSchema + Safety"]
F --> G["⚙️ Executor\nAsync / Parallel"]
G --> H[("🔧 Tools\nWeather · Search\nSystem · Document")]
H --> I["🔬 Synthesizer\nRelevance Filter"]
B -->|"General LLM query"| J["💬 Groq Stream\nllama-3.1-8b"]
I --> K["🎭 Personality +\nIdentity Guardrails"]
J --> K
C --> K
K --> L(["🔊 Response + TTS"])
style A fill:#0066ff,color:#fff,stroke:#00e1ff
style L fill:#0066ff,color:#fff,stroke:#00e1ff
style C fill:#00C853,color:#fff,stroke:none
style K fill:#7C3AED,color:#fff,stroke:none
```
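
The routing step at the top of this flow can be sketched as an ordered rule list that is checked before any fallback. This is a minimal sketch under stated assumptions: the regular expressions, the `Rule` type, the `TOOL_HINTS` keyword list, and the canned replies are all hypothetical, and the actual router may classify queries quite differently.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass(frozen=True)
class Rule:
    priority: int                    # lower number is checked first
    name: str
    pattern: "re.Pattern[str]"
    handler: Callable[[str], str]    # local handler, no network call

# Hypothetical fast-path rules, kept sorted by priority.
RULES = sorted(
    [
        Rule(1, "correction", re.compile(r"^(no|actually),?\s", re.I),
             lambda q: "Understood, correcting that."),
        Rule(2, "name", re.compile(r"\bmy name is (\w+)", re.I),
             lambda q: "Noted."),
        Rule(3, "greeting", re.compile(r"^(hi|hello|hey)\b", re.I),
             lambda q: "Hello."),
    ],
    key=lambda rule: rule.priority,
)

# Hypothetical keywords that suggest a tool-backed query.
TOOL_HINTS = re.compile(r"\b(weather|forecast|search|news|volume|open|close)\b", re.I)

def route(query: str) -> Tuple[str, Optional[str]]:
    """Return (destination, local_reply). local_reply is None unless handled locally."""
    text = query.strip()
    for rule in RULES:               # first matching rule wins
        if rule.pattern.search(text):
            return ("local:" + rule.name, rule.handler(text))
    if TOOL_HINTS.search(text):
        return ("agent_loop", None)
    return ("llm_stream", None)
```

With these assumed rules, `route("no, my name is Sam")` is handled by the correction rule rather than the name rule, because the lower priority number is checked first. That is the property a priority router is meant to guarantee.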
### Document Intelligence Pipeline
```mermaid
flowchart LR
A(["📄 Document\nIntent"]) --> B["📁 File Selector\n+ Path Validation"]
B --> C{"File Type"}
C -->|"PDF"| D["PyMuPDF\n+ pdfplumber"]
C -->|"DOCX"| E["python-docx"]
C -->|"Image"| F["OcrParser"]
D & E --> G{"Content\nAnalysis"}
G -->|"Text-Rich"| H["📝 Text Primary\nLLM Pass"]
G -->|"Has Images\nor Scanned"| I["👁️ Groq Vision\nLlama 4 Scout"]
G -->|"Low Confidence"| J["🔠 PaddleOCR"]
H & I & J --> K["🔀 Fusion\nProcessor"]
K --> L["🧠 Reasoning\nllama-3.3-70b"]
L --> M["🗂️ Active Document\nIndex + SQLite Cache"]
M --> N(["💬 Q&A Engine\n+ Multi-Doc Compare"])
style A fill:#0066ff,color:#fff,stroke:#00e1ff
style N fill:#0066ff,color:#fff,stroke:#00e1ff
style L fill:#7C3AED,color:#fff,stroke:none
```
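
The two decision points in this pipeline (file type, then content analysis) might look roughly like the sketch below. It is only an illustration: the `Extraction` fields, the thresholds, and the pass names are hypothetical, and the diagram does not say whether the three analysis branches are mutually exclusive. The sketch assumes several can run and be fused.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List

# File suffix -> extractor label, following the diagram's three file types.
EXTRACTORS = {
    ".pdf": "pdf_text",    # PyMuPDF + pdfplumber in the diagram
    ".docx": "docx_text",  # python-docx in the diagram
    ".png": "ocr",
    ".jpg": "ocr",
    ".jpeg": "ocr",
}

@dataclass
class Extraction:
    text: str
    image_count: int
    confidence: float  # hypothetical 0.0-1.0 extraction-quality score

def pick_extractor(path: str) -> str:
    """Choose an extractor from the file suffix, rejecting unknown types."""
    suffix = Path(path).suffix.lower()
    if suffix not in EXTRACTORS:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    return EXTRACTORS[suffix]

def pick_passes(doc: Extraction, min_chars: int = 200, min_conf: float = 0.8) -> List[str]:
    """Decide which analysis passes feed the fusion step (assumed thresholds)."""
    passes = []
    text_rich = len(doc.text) >= min_chars
    if text_rich:
        passes.append("text_llm")
    if doc.image_count > 0 or not text_rich:  # embedded images, or likely a scan
        passes.append("vision")
    if doc.confidence < min_conf:
        passes.append("ocr")
    return passes
```

Under these assumptions a clean text PDF yields only the text pass, while a scanned page with low confidence yields both the vision and OCR passes, whose outputs would then be fused.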
### Intent Routing Precedence
```mermaid
flowchart TD
A(["Query"]) --> B{"Priority 1–17\nCorrection · Name\nGreeting · Location\nWellbeing · Help"}
B -->|"Matched"| C(["Local Response"])
B -->|"No match"| D{"Priority 18–27\nSpeedtest · Connectivity\nIP · Weather · Stat
[truncated…]