
jarvis

provenance:github:dev-core-busy/jarvis

Autonomous AI Desktop Agent for Linux – Multi-LLM, Desktop Control via VNC, WhatsApp Integration, RAG Knowledge Base, OpenClaw Skill Ecosystem

<div align="center">

# 🤖 Jarvis AI Desktop Agent

**An autonomous AI agent with web frontend, desktop control, and multi-LLM support**

[![Python](https://img.shields.io/badge/Python-3.13-blue?logo=python&logoColor=white)](https://www.python.org/)
[![License](https://img.shields.io/badge/License-AGPL--3.0-green?logo=gnu)](LICENSE)
[![Version](https://img.shields.io/badge/Version-0.8-orange)](https://github.com/dev-core-busy/jarvis/releases)
[![Platform](https://img.shields.io/badge/Platform-Linux-lightgrey?logo=linux)](https://www.linux.org/)
[![PRs Welcome](https://img.shields.io/badge/PRs-Welcome-brightgreen)](https://github.com/dev-core-busy/jarvis/pulls)
[![OpenClaw Compatible](https://img.shields.io/badge/OpenClaw-Compatible-6366f1)](https://github.com/dev-core-busy/jarvis#openclaw-skill-ecosystem)

*Control your Linux desktop with natural language. Receive tasks via WhatsApp. Search your knowledge base. Automate everything.*

[**Live Demo**](https://jarvis-ai.info) · [**Report Bug**](https://github.com/dev-core-busy/jarvis/issues) · [**Request Feature**](https://github.com/dev-core-busy/jarvis/issues) · [**Contribute**](#contributing)

---

![Jarvis Split View](https://jarvis-ai.info/img/split_view.png)

</div>

---

## 📋 Table of Contents

- [Overview](#overview)
- [Key Features](#key-features)
- [Architecture](#architecture)
- [Tech Stack](#tech-stack)
- [Screenshots](#screenshots)
- [Installation](#installation)
- [Configuration](#configuration)
- [Skill System](#skill-system)
- [WhatsApp Integration](#whatsapp-integration)
- [Knowledge Base](#knowledge-base)
- [API Reference](#api-reference)
- [Contributing](#contributing)
- [Third-Party Licenses](#third-party-licenses)
- [License](#license)

---

## Overview

Jarvis is a **self-hosted, autonomous AI desktop agent** that runs on a Linux server. It combines a polished web frontend with real desktop control — you can watch and direct the agent as it works, right in your browser.

The core idea: give Jarvis a task (via chat, WhatsApp, or the web UI), and it figures out how to complete it — browsing the web, reading files, writing code, sending emails, managing your calendar — all while you observe through a live VNC split-screen view.

```
"Find all emails from last week about Project Alpha, summarize them,
 and create a calendar event for the follow-up meeting."
```

Jarvis handles it. You watch it happen.

---

## Key Features

### 🖥️ VNC Split View
The web interface shows your LLM chat **and a live desktop feed side by side**. The agent can see exactly what it's doing — screenshots feed back into the LLM context automatically. No more blind automation.

### 🧩 Modular Skill System
Skills are self-contained Python packages that extend Jarvis with new capabilities. Install, enable, disable, and configure them through the UI without touching config files. Compatible with [OpenClaw](https://github.com/steipete/gogcli) skills.

### 🔀 Multi-LLM Support
Switch between AI providers without restarting anything:
- **Google Gemini** (gemini-2.0-flash, gemini-1.5-pro, ...)
- **Anthropic Claude** (claude-opus-4, claude-sonnet-4, ...)
- **OpenRouter** (hundreds of models)
- **Local Ollama** (llama3, mistral, qwen2.5, ... — fully offline)
- Any **OpenAI-compatible** endpoint

Both native tool/function calling **and** prompt-based tool calling are supported — so even models without native tool support can use all of Jarvis's capabilities.
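
To illustrate what prompt-based tool calling can look like, here is a minimal sketch. It assumes a hypothetical convention in which the system prompt asks the model to wrap a JSON object in `<tool_call>` tags; the tag name, the `parse_tool_call` helper, and the `name`/`arguments` keys are illustrative assumptions, not Jarvis's actual format.

```python
import json
import re
from typing import Any, Optional

# Hypothetical convention: the model wraps a JSON object in
# <tool_call>...</tool_call> whenever it wants to run a tool.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def parse_tool_call(reply: str) -> Optional[dict[str, Any]]:
    """Return {"name": ..., "arguments": ...}, or None for a plain-text reply."""
    match = TOOL_CALL_RE.search(reply)
    if match is None:
        return None
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        # Malformed JSON is treated like a normal chat message.
        return None
    if not isinstance(call, dict) or "name" not in call:
        return None
    call.setdefault("arguments", {})
    return call


reply = (
    "I will list the files first.\n"
    '<tool_call>{"name": "shell", "arguments": {"command": "ls"}}</tool_call>'
)
print(parse_tool_call(reply))
```

With a scheme like this, a model without native function calling only needs to follow a text format, and the agent can route the parsed call to the same tools that native tool calls use.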

### 📱 WhatsApp Agent
Send Jarvis a voice note or text message on WhatsApp and get a response back. Voice messages are transcribed via faster-whisper (runs locally, no cloud). Perfect for mobile task delegation.

### 📚 Knowledge Base
Drop PDFs, DOCX files, or plain text into watched folders. Jarvis indexes them with TF-IDF and can search them during tasks. Multiple folders are supported, and files are re-indexed automatically when they change.
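
For readers unfamiliar with TF-IDF, the following is a minimal standard-library sketch of the ranking idea. It is not the Jarvis indexer: the function names, the tokenizer, and the smoothed IDF formula are assumptions chosen for illustration, and text extraction from PDF or DOCX is left out.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict[str, str]) -> tuple[dict[str, Counter], dict[str, float]]:
    """Return per-document term counts and smoothed IDF weights."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for c in counts.values() for term in c)
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return counts, idf


def search(query: str, counts: dict[str, Counter], idf: dict[str, float],
           top_k: int = 3) -> list[tuple[str, float]]:
    """Rank documents by the summed TF-IDF weight of the query terms."""
    scores = {}
    for name, c in counts.items():
        total = sum(c.values()) or 1
        score = sum((c[t] / total) * idf.get(t, 0.0) for t in tokenize(query))
        if score > 0:
            scores[name] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


docs = {
    "alpha.txt": "Project Alpha kickoff meeting notes",
    "beta.txt": "Grocery list: milk, eggs, bread",
}
counts, idf = build_index(docs)
print(search("alpha meeting", counts, idf))
```

Re-indexing on file changes would amount to rebuilding the entries for the changed document and refreshing the IDF table.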

### 🌐 Google Workspace Integration
Manage Gmail, Google Calendar, and Google Drive through natural language commands — powered by the openclaw/gog CLI.

### 🤖 Browser Automation
Full browser control via CDP (Chrome DevTools Protocol) and xdotool. The agent can navigate websites, fill forms, click elements, and extract information.

### 🔐 Secure by Default
- HTTPS with self-signed certificates (auto-generated)
- Session-based authentication
- All external services proxied through the FastAPI backend

---

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                        Browser Client                        │
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│   │  LLM Chat UI │  │  noVNC :6080 │  │  Settings / Skills│  │
│   │  (WebSocket) │  │  (Live VNC)  │  │  WhatsApp Logs   │  │
│   └──────┬───────┘  └──────┬───────┘  └────────┬─────────┘  │
└──────────┼────────────────┼────────────────────┼────────────┘
           │ WSS/HTTPS       │ WSS                │ HTTPS
           ▼                 ▼                    ▼
┌─────────────────────────────────────────────────────────────┐
│                   FastAPI Backend :443                        │
│   ┌─────────────┐  ┌────────────┐  ┌──────────────────────┐  │
│   │ JarvisAgent │  │ Skills API │  │  WhatsApp Proxy      │  │
│   │  (agent.py) │  │ /api/skills│  │  _wa_bridge_async()  │  │
│   └──────┬──────┘  └─────┬──────┘  └──────────┬───────────┘  │
│          │               │                     │              │
│   ┌──────▼──────────┐    │              ┌──────▼───────────┐  │
│   │   SkillManager  │◄───┘              │  Baileys Bridge  │  │
│   │  (skills/*.py)  │                   │  Node.js :3001   │  │
│   └──────┬──────────┘                   │  (localhost only)│  │
│          │                              └──────────────────┘  │
│   ┌──────▼──────────────────────────────────────────────────┐ │
│   │                      Tool Layer                          │ │
│   │  shell · desktop · filesystem · screenshot · memory     │ │
│   │  knowledge · browser_control · whatsapp · google_apps   │ │
│   └──────────────────────────────────────────────────────────┘│
│                                                               │
│   ┌──────────────┐    ┌──────────────┐    ┌───────────────┐  │
│   │  LLM Client  │    │  x11vnc :5900│    │  Xvfb/X11 :1  │  │
│   │  (llm.py)    │    │  (→ noVNC)   │    │  Openbox WM   │  │
│   │  Multi-Provider│  └──────────────┘    └───────────────┘  │
│   └──────────────┘                                            │
└─────────────────────────────────────────────────────────────┘
```

### Component Overview

| Component | File | Description |
|-----------|------|-------------|
| FastAPI Server | `backend/main.py` | HTTP/WebSocket endpoints, auth, WhatsApp proxy |
| Agent Loop | `backend/agent.py` | Task execution, tool calling, LLM orchestration |
| LLM Client | `backend/llm.py` | Multi-provider abstraction (Gemini, Claude, OpenRouter, Ollama) |
| Config | `backend/config.py` | Environment + settings.json management |
| Skill Manager | `backend/skills/manager.py` | Load, enable, disable, configure skills |
| Tool Base | `backend/tools/base.py` | `BaseTool` class all tools inherit from |
| WhatsApp Bridge | `services/whatsapp-bridge/index.js` | Baileys v7 + Express API |
| Frontend | `frontend/index.html` + `js/` | Single-page app, no build system required |
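
As a rough illustration of how a tool base class and the agent's dispatch step might fit together, here is a sketch. The real `BaseTool` in `backend/tools/base.py` is not shown in this README, so the `run` method, the `name`/`description` attributes, `EchoTool`, and `ToolRegistry` are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseTool(ABC):
    """Hypothetical shape; the real class lives in backend/tools/base.py."""

    name: str = ""
    description: str = ""

    @abstractmethod
    def run(self, **kwargs: Any) -> str:
        """Execute the tool and return text that is fed back to the LLM."""


class EchoTool(BaseTool):
    """Toy tool used only to demonstrate the interface."""

    name = "echo"
    description = "Return the given text unchanged."

    def run(self, **kwargs: Any) -> str:
        return str(kwargs.get("text", ""))


class ToolRegistry:
    """Maps tool names to instances so the agent loop can dispatch calls."""

    def __init__(self) -> None:
        self._tools: dict[str, BaseTool] = {}

    def register(self, tool: BaseTool) -> None:
        self._tools[tool.name] = tool

    def dispatch(self, name: str, arguments: dict[str, Any]) -> str:
        tool = self._tools.get(name)
        if tool is None:
            # Report the problem to the model instead of raising.
            return f"Unknown tool: {name}"
        return tool.run(**arguments)


registry = ToolRegistry()
registry.register(EchoTool())
print(registry.dispatch("echo", {"text": "hello"}))
```

Returning an error string for unknown tools, rather than raising, lets the LLM see the failure and try a different tool on its next step.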

---

## Tech Stack

### Backend
| Technology | Version | Purpose |
|-----------|---------|---------|
| Python | 3.13 | Core runtime |
| FastAPI | latest | REST API + WebSocket server |
| uvicorn | latest | ASGI server |
| faster-whisper | latest | Voice transcription (CPU, int8) |
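
The "CPU, int8" setting above corresponds to how faster-whisper is typically loaded. The sketch below shows one way a voice note could be transcribed under that configuration; the `transcribe` and `join_segments` helpers, the `"small"` model size, and the file name are assumptions, not taken from the Jarvis codebase.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Concatenate the .text of each transcription segment into one string."""
    return " ".join(seg.text.strip() for seg in segments).strip()


def transcribe(audio_path: str, model_size: str = "small") -> str:
    """Transcribe an audio file on the CPU with int8 quantization."""
    # Imported lazily so join_segments works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus metadata.
    segments, _info = model.transcribe(audio_path)
    return join_segments(segments)


# Example call (requires faster-whisper and an audio file on disk):
# print(transcribe("voice_note.ogg"))
```

In a long-running service the model would normally be loaded once and reused, since constructing `WhisperModel` is the slow part.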

### Frontend
| Technology | Purpose |
|-----------|---------|
| Vanilla JS | Zero-dependency UI |
| CSS Custom Properties | Dark Glassmorphism theme |
| WebSocket API | Real-time agent communication |
| noVNC | In-browser VNC client |

### Desktop / System
| Technology | Purpose |
|-----------|---------|
| Xvfb | Virtual framebuffer (headless X11) |
| Openbox | Lightweight window ma

[truncated…]

PUBLIC HISTORY

First discovered: Mar 21, 2026

IDENTITY

inferred

Identity inferred from code signals. No PROVENANCE.yml found.


METADATA

platform: github
first seen: Mar 3, 2026
last updated: Mar 20, 2026
last crawled: today
version
