nano-ai
provenance:github:axelmitschidev/nano-ai
WHAT THIS AGENT DOES
nano-ai is an autonomous AI assistant that handles complex tasks by breaking them into smaller steps and acting on them. It can search the web, write and execute code, and manage files, remembering what it has done across sessions. It suits anyone who needs help with research, data analysis, or automating repetitive work. What sets it apart is that it runs entirely on your own machine on a 4B-parameter model, sending no data to the cloud, which keeps your work private.
README
<p align="center">
<img src="assets/banner.png" alt="nano-ai" width="600">
</p>
<h3 align="center">An autonomous AI agent that runs on your laptop.<br>4B parameters. 13/14 autonomous tasks passed.</h3>
<p align="center">
<a href="https://github.com/axelmitschidev/nano-ai/stargazers"><img src="https://img.shields.io/github/stars/axelmitschidev/nano-ai?style=social" alt="Stars"></a>
<a href="https://github.com/axelmitschidev/nano-ai/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="License"></a>
<img src="https://img.shields.io/badge/python-3.10+-blue.svg" alt="Python">
<img src="https://img.shields.io/badge/model-4B%20params-green.svg" alt="Model">
<img src="https://img.shields.io/badge/E2E-13%2F14%20tasks%20passed-brightgreen.svg" alt="Benchmark">
<img src="https://img.shields.io/badge/runs%20on-your%20machine-black.svg" alt="Local">
</p>
---
> A fully autonomous AI agent — plans tasks, browses the web, writes and runs code, remembers across sessions — powered by a **4B parameter model** running 100% locally. No cloud. No API key. No telemetry. **One script to install.**
---
## Quickstart
```bash
git clone https://github.com/axelmitschidev/nano-ai.git
cd nano-ai
./setup.sh # checks Ollama, pulls model, installs deps + browser
.venv/bin/python -m app.main
```
Requires [Ollama](https://ollama.com) (local LLM runtime). `setup.sh` checks for it, pulls the model (~3 GB), creates the venv, and installs dependencies + Chromium.
<details>
<summary>Docker alternative</summary>

```bash
# Mac (Ollama must be running natively for Metal GPU)
docker compose --profile mac up -d --build
# Linux (Ollama included in container)
docker compose --profile linux up -d --build
```
</details>
<details>
<summary>API mode</summary>

```bash
.venv/bin/uvicorn app.server:app --host 0.0.0.0 --port 8000
```
```bash
curl -X POST http://localhost:8000/chat \
-H "Content-Type: application/json" \
-d '{"message": "search the web for the latest AI news and save a summary"}'
```
</details>
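The same call can be made from Python using only the standard library. This is a minimal sketch mirroring the curl example above; the shape of the response body is not specified in this README, so it is returned as raw text:

```python
import json
import urllib.request

def build_chat_request(message: str, base_url: str = "http://localhost:8000"):
    """Build the POST /chat request shown in the curl example."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(message: str, base_url: str = "http://localhost:8000") -> str:
    """Send a message; requires the uvicorn server above to be running."""
    with urllib.request.urlopen(build_chat_request(message, base_url)) as resp:
        return resp.read().decode("utf-8")
```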
---
## What can it do?
| Capability | How |
|-----------|-----|
| **Plan & execute** | Breaks complex tasks into steps before acting (Plan-then-Execute) |
| **Browse the web** | Stealth Chromium browser. Search, read pages, fill forms, click buttons |
| **Write & run code** | Generates Python/JS, executes in sandbox, reads output, iterates |
| **Run shell commands** | Controlled shell access: pip, curl, grep, jq, etc. (allowlist-based) |
| **Manage files** | Sandboxed workspace — create, read, delete, persist across sessions |
| **Remember** | Persistent memory across sessions — saves and recalls facts automatically |
| **Track progress** | Agent scratchpad for long tasks — notes what's done and what's next |
| **Self-correct** | Error-guided retry with strategy hints, loop detection, context compaction |
| **Stream in real-time** | SSE endpoint streams thinking, tool calls, and responses live |
| **Adapt to any model** | Auto-detects model family (Qwen, Llama, Mistral, Phi, Gemma, DeepSeek) |
---
## Benchmark
We built an [end-to-end benchmark](e2e_bench.py) that tests the agent on real autonomous tasks across 6 difficulty levels. Results on a MacBook with **Qwen 3.5 4B**:
| Level | Task | Tools used | Result |
|-------|------|-----------|--------|
| L1 | Answer a factual question | — | **PASS** |
| L1 | Use `get_date` tool | `get_date` | **PASS** |
| L2 | Write a file to workspace | `write_file` | **PASS** |
| L2 | Read back the file | `read_file` | **PASS** |
| L2 | List workspace contents | `list_files` | **PASS** |
| L3 | Write Python + execute it | `write_file → run_file` | **PASS** |
| L3 | Debug a buggy script | `read → write → run` | **PASS** |
| L4 | Search the web | `web_search` | **PASS** |
| L4 | Read a web page + summarize | `web_read` | **PASS** |
| L5 | Compute primes + save results | `write → run → read` | **PASS** |
| L5 | Research + synthesize + save | `search → read → write` | **PASS** |
| L6 | Plan a multi-step research task | `search → read → write → run` | **PASS** |
| L6 | Remember facts across conversation | `remember → recall` | **PASS** |
| L6 | Use shell commands for data tasks | `run_command` | FAIL |
```
13/14 tasks passed — 7.1 tok/s — peak context 52%
```
Run it yourself (start the server first): `.venv/bin/python e2e_bench.py`
---
## How it works
```
User → "find Python news and save a summary"
│
▼
┌─────────────┐
│ Planner │ ← breaks task into steps (if complex)
└──────┬──────┘
│
┌──────▼──────┐
│ Orchestrator │ ← ReAct loop (max 10 rounds)
└──────┬──────┘
│
├─→ web_search("Python news") → results
├─→ web_read(best_url) → markdown
├─→ write_file("summary.txt", …) → saved
├─→ remember("task", "done") → persisted
│
▼
"Done. Saved summary to summary.txt."
```
The agent **plans** complex tasks before executing, then loops through **observe → think → act → observe** until done. It handles errors with strategy hints, detects loops, compacts context via LLM when the window fills up, and remembers across sessions.
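The observe → think → act loop can be sketched in a few lines. This is an illustrative simplification, not the project's `orchestrator.py`; `llm` and `run_tool` are stand-in callables:

```python
def react_loop(llm, run_tool, task: str, max_rounds: int = 10) -> str:
    """Minimal ReAct sketch: think, act, observe, repeat until done."""
    history = [("user", task)]
    seen_calls = set()                      # naive loop detection
    for _ in range(max_rounds):
        action = llm(history)               # think: model picks the next step
        if action["type"] == "final":
            return action["text"]           # done: return the answer
        call = (action["tool"], str(action["args"]))
        if call in seen_calls:              # repeated call -> strategy hint
            history.append(("system", "You already tried that; change strategy."))
            continue
        seen_calls.add(call)
        result = run_tool(action["tool"], **action["args"])  # act
        history.append(("tool", result))    # observe
    return "Stopped: round limit reached."
```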
---
## Architecture
```
app/
├── main.py # CLI client (connects via SSE)
├── server.py # FastAPI + SSE streaming
├── session.py # Session store (TTL, eviction, locks)
├── config.py # Env-based configuration
├── agent/
│ ├── orchestrator.py # ReAct loop + parallel tools + loop detection
│ ├── planner.py # Plan-then-Execute for complex tasks
│ ├── context.py # Sliding window + LLM compaction
│ ├── memory.py # Persistent memory + scratchpad
│ ├── prompt.py # System prompt + memory injection
│ └── display.py # Terminal rendering (Rich)
├── llm/
│ ├── port.py # LLM protocol (swappable backend)
│ ├── ollama.py # Async Ollama client (httpx streaming)
│ └── profiles.py # Auto-detected model profiles
├── tools/
│ ├── registry.py # Tool registry (add tools without touching core)
│ ├── workspace.py # Sandboxed file operations
│ ├── browser.py # Stealth Playwright + DuckDuckGo
│ └── system.py # Code execution + shell commands
├── logger/
│ └── logger.py # JSONL audit logs
└── prompts/
├── system.md # Default agent prompt (structured)
├── coder.md # Coding-focused profile
└── researcher.md # Research-focused profile
```
Fully async. Each module does one thing. The LLM backend is swappable — implement `chat()` and `close()`, plug it in.
---
## Tools
| Tool | What it does |
|------|-------------|
| `web_search` | Search via DuckDuckGo |
| `web_read` | Read a page as clean markdown (smart extraction) |
| `web_go` | Navigate and list interactive elements |
| `web_click` | Click buttons, links |
| `web_type` | Type into form fields |
| `write_file` | Create/overwrite files in workspace |
| `read_file` | Read files (4000 char cap) |
| `list_files` | List workspace contents |
| `delete_file` | Delete files or directories |
| `run_file` | Execute .py/.js scripts (30s timeout, sandboxed) |
| `run_command` | Shell commands: pip, curl, grep, jq, ls, etc. (allowlist) |
| `get_date` | Current date and time |
| `remember` | Save a fact to persistent memory |
| `recall` | Search persistent memory |
| `note_progress` | Track progress on long tasks (read/append/clear) |
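Adding a tool without touching the core might look like this decorator sketch; the real `registry.py` API is not shown here, so the names below (`TOOLS`, `tool`, `word_count`) are hypothetical:

```python
from typing import Callable

TOOLS: dict[str, Callable] = {}

def tool(fn: Callable) -> Callable:
    """Hypothetical registration decorator; the real registry API may differ."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def word_count(text: str) -> int:
    """Example custom tool: count the words in a string."""
    return len(text.split())
```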
---
## Under the hood
| Feature | What it does |
|---------|-------------|
| **Plan-then-Execute** | Complex tasks get a 3-7 step plan before the agent starts acting |
| **Context compaction** | When context hits 80%, the LLM summarizes old messages to free space |
| **Parallel tool execution** | Independent tools (reads, searches) run simultaneously via `asyncio.gather` |
| **Dynamic tool filtering** | Only shows relevant tools per round (web tools during browsing, file tools during editing) |
| **Err
[truncated…]
PUBLIC HISTORY
First discovered: Mar 23, 2026
IDENTITY
inferred
Identity inferred from code signals. No PROVENANCE.yml found.
METADATA
platform: github
first seen: Mar 22, 2026
last updated: Mar 22, 2026
last crawled: today
version: —