pebkac-chrome

provenance:github:michaelsoftmd/pebkac-chrome

WHAT THIS AGENT DOES

This agent acts like a digital assistant that can browse the internet and complete tasks for you. It’s designed to handle complex online jobs, like gathering information from multiple websites or navigating tricky forms, without triggering security measures that often block automated programs. Business analysts, researchers, or anyone needing to regularly collect data from the web would find it helpful. What sets it apart is its ability to think through problems step-by-step and learn from its mistakes, all while operating securely and remembering your login details. It essentially gives a large language model the ability to use a web browser in a more human-like way.


README

# pebkac: The AI-Powered Web Automaton Without The Automation

Update: I've written a more detailed guide on setting up the pebkac environment. Not required, but helpful. It's on [Medium.](https://medium.com/ai-in-plain-english/building-your-own-secure-local-ai-web-co-browser-in-linux-mint-7bd2144fd64e)

Update 2: This project is now on hold, and I am moving on to other, more exciting projects. Watch this space. Check out the release notes for more project info.

Update 3: https://www.youtube.com/watch?v=BVRAr1iQyQQ watch this one. I've been paying attention to Cory Doctorow since 2009.

Update 4: New thing: https://akickintheteeth.com/ SLOP FIGHTER

## **What This Is**

pebkac browses the web for you. It is a web nonautomation framework powered by SmolAgents and Zendriver. Synchronous communication becomes asynchronous communication in an elegant double helix of English language-powered Python interpretation driven by you, the user. There is no MCP, no n8n, no LangChain or LangGraph. pebkac employs the LLM's native ability to control a web browser by writing Python directly into it.

- Zendriver is described as "A blazing fast, async-first, undetectable webscraping/web automation framework based on ultrafunkamsterdam/nodriver."
- SmolAgents is "a barebones library for agents that think in code."

Together, they fit to give your localised, secure, rambunctiously stupid LLM a manual and a set of tools to operate a web browser.

## ✨ Features

### Core Capabilities

**Autonomous Intelligence**
- 🧠 **Code-Writing LLM** - Writes Python with loops, conditions, error handling (not limited to sequential tool calls like LangChain)
- 🎯 **Multi-Step Reasoning** - Works through complex tasks independently over a configurable number of steps (10 by default)
- 🔄 **Self-Correction** - Tries alternative strategies when approaches fail
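
The step budget and fallback behaviour above can be pictured as a bounded loop. This is a minimal sketch, not pebkac's or SmolAgents' actual implementation: `run_with_step_budget` and the two strategy functions are hypothetical names made up for illustration.

```python
from typing import Callable, Optional


def run_with_step_budget(
    strategies: list[Callable[[], str]],
    max_steps: int = 10,
) -> tuple[Optional[str], list[str]]:
    """Try strategies in order until one succeeds or the step budget runs out.

    Hypothetical illustration: each strategy stands in for one attempt the
    agent might make (a different selector, a different page, and so on).
    """
    log: list[str] = []
    for step, strategy in enumerate(strategies[:max_steps], start=1):
        try:
            result = strategy()
            log.append(f"step {step}: ok")
            return result, log
        except Exception as exc:  # a failed attempt still costs one step
            log.append(f"step {step}: failed ({exc})")
    return None, log


def click_missing_button() -> str:
    # Hypothetical first attempt that fails.
    raise LookupError("button not found")


def read_page_text() -> str:
    # Hypothetical fallback that succeeds.
    return "page text"


result, log = run_with_step_budget([click_missing_button, read_page_text])
```

Here the first attempt fails, the failure is logged, and the second attempt returns a result within the budget. If every attempt fails, the function returns `None` along with the log.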

**Stealth & Persistence**
- 👻 **Undetectable Automation** - Bypasses anti-bot detection (Cloudflare, etc.) using real Chrome instead of WebDriver
- 🔐 **Persistent Sessions** - Remembers logins and cookies across restarts (sign in once, done)
- 🛡️ **Anti-Bot Bypass** - Automatically handles challenges and verification pages

**Smart Data Extraction**
- 📊 **Intelligent Content Extraction** - Trafilatura parses web pages like a human reader (ignores ads, navigation, footers)
- 🎯 **API Response Capture** - Extracts structured JSON from modern websites instead of scraping messy HTML
- ⚡ **Extreme Caching** - 500-2000x faster on repeat visits (Redis + DuckDB two-tier cache with hit rate tracking)
- 📑 **Tab Management** - Opens relevant pages (max 3) in background after extracting their content for user exploration
- 📜 **Execution History** - Automatically saves all agent runs to SQLite with query, result, and step tracking
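
To make the two-tier cache idea concrete, here is a rough sketch of a small fast tier in front of a larger one, with hit-rate tracking. It assumes plain dicts as stand-ins for Redis (L1) and DuckDB (L2) so it runs without any services. `TwoTierCache` and its methods are hypothetical names, not pebkac's real code.

```python
from typing import Callable


class TwoTierCache:
    """Small fast tier (L1) in front of a larger tier (L2), with hit counters.

    Assumption: dicts stand in for Redis (L1) and DuckDB (L2).
    """

    def __init__(self, l1_capacity: int = 2) -> None:
        self.l1: dict[str, str] = {}
        self.l2: dict[str, str] = {}
        self.l1_capacity = l1_capacity
        self.stats = {"l1_hits": 0, "l2_hits": 0, "misses": 0}

    def _put_l1(self, key: str, value: str) -> None:
        # Evict the oldest entry when L1 is full (dicts keep insertion order).
        if key not in self.l1 and len(self.l1) >= self.l1_capacity:
            self.l1.pop(next(iter(self.l1)))
        self.l1[key] = value

    def get(self, key: str, fetch: Callable[[str], str]) -> str:
        if key in self.l1:
            self.stats["l1_hits"] += 1
            return self.l1[key]
        if key in self.l2:
            self.stats["l2_hits"] += 1
            value = self.l2[key]
            self._put_l1(key, value)  # promote to the fast tier
            return value
        self.stats["misses"] += 1
        value = fetch(key)  # slow path: actually load the page
        self.l2[key] = value
        self._put_l1(key, value)
        return value

    def hit_rate(self) -> float:
        total = sum(self.stats.values())
        hits = self.stats["l1_hits"] + self.stats["l2_hits"]
        return hits / total if total else 0.0


cache = TwoTierCache(l1_capacity=1)
fetch = lambda url: f"content of {url}"
cache.get("a", fetch)  # miss: fetched and stored in both tiers
cache.get("a", fetch)  # L1 hit
cache.get("b", fetch)  # miss: evicts "a" from L1, "a" stays in L2
cache.get("a", fetch)  # L2 hit: promoted back into L1
```

The point of the second tier is that an entry pushed out of the small fast tier is still cheaper to recover than a fresh page load.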

**User Experience**
- 💬 **Chat Interface** - Type natural language commands at localhost:8888
- 👁️ **Live Browser View** - Watch what it's doing via noVNC (1280x720)
- 📝 **Detailed Logging** - See every decision and action in real-time
- 🔍 **Web Search Integration** - Searches DuckDuckGo and filters out junk results
- 📊 **Cache Statistics** - Monitor L1/L2 cache performance, hit rates, memory usage, and execution history

**Performance Features**
- 🚀 **Parallel Operations** - Extracts multiple page elements simultaneously
- 🎨 **Form Automation** - Types with human-like delays, handles keyboard navigation
- 📸 **Screenshot Capture** - Visual verification of page state
- 📊 **Selector Learning** - Remembers which CSS selectors work per site and reuses them automatically (survives restarts)
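
Extracting several elements at once is a natural fit for `asyncio.gather`. The sketch below assumes a hypothetical `extract_element` coroutine standing in for a real async browser call; it is not Zendriver's API.

```python
import asyncio


async def extract_element(selector: str) -> tuple[str, str]:
    """Hypothetical stand-in for an async browser call that reads one element."""
    await asyncio.sleep(0.01)  # simulates waiting on the browser
    return selector, f"text of {selector}"


async def extract_many(selectors: list[str]) -> dict[str, str]:
    # gather() runs the coroutines concurrently and returns results in input order.
    pairs = await asyncio.gather(*(extract_element(s) for s in selectors))
    return dict(pairs)


results = asyncio.run(extract_many(["h1", ".price", "#summary"]))
```

Because the waits overlap, the total time is roughly that of the slowest single extraction rather than the sum of all of them.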

### 🚀 Why pebkac Outperforms Traditional Solutions
**The Game-Changer: LLMs Write Python, Not JSON**

Unlike LangChain's rigid JSON tool-calling or MCP's predefined functions, pebkac's LLM writes actual Python code that executes browser actions. This means your AI will look at its own tools and write Python code to utilise them. This is impossible with LangChain/MCP's approach. They can only call predefined tools sequentially. pebkac's LLM can write loops, conditions, error handling, and complex logic.
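
As a rough picture of the difference, here is the kind of code an LLM could write in a single step. `navigate` and `extract_text` are hypothetical stubs backed by a fake `PAGES` dict so the sketch runs on its own; pebkac's real tool names and signatures may differ.

```python
# Hypothetical stand-ins for browser tools; pebkac's real tools may differ.
PAGES = {
    "https://example.com/a": "Widget A costs $10",
    "https://example.com/b": "",  # an empty page, to exercise the skip branch
}


def navigate(url: str) -> None:
    if url not in PAGES:
        raise ConnectionError(f"cannot load {url}")


def extract_text(url: str) -> str:
    return PAGES[url]


# One step of agent-written code: a loop, a condition, and error handling,
# instead of one tool call per round trip to the model.
found: dict[str, str] = {}
errors: list[str] = []
for url in [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]:
    try:
        navigate(url)
        text = extract_text(url)
        if not text:
            continue  # nothing useful on this page, move on
        found[url] = text
    except ConnectionError as exc:
        errors.append(str(exc))
```

The whole visit-check-skip-recover cycle happens inside one block of code, so the model only needs to be consulted again once the loop has finished.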

This also means that pebkac is only as capable as the LLM that runs it, and the prompts you give it! It is fundamentally of no mind. It has no real understanding of what it is asked to do. All it has is Google Chrome dev tools, a couple libraries, and an API.

Frankly, no LLM has been made that **is supposed to** fully operate Google Chrome.

The browser runs with noVNC and loads about:blank on startup. You are warned. pebkac is not C-3PO. pebkac is a garden path. pebkac will click the wrong buttons. It will go off on tangents. It works independently through ten (adjustable) steps using its own logic and processes, providing entirely self-directed browsing. While pebkac is active you can check the highly detailed log output below the browser window to see what your LLM is up to.

Or just give it a job and go do something else. Eat an apple. [Read a book.](https://www.amazon.com/Wells-Rest-Mitch-Davis/dp/0646826778?ref_=ast_author_mpb)

You operate it simply by opening the pebkac Control Panel in your browser (localhost:8888) and typing into the chat window. The control panel displays the browser via noVNC and shows live logs from both the browser automation service and the LLM. pebkac will perform its duties and return nicely-formatted results in the chat window.

### 🚀 How does pebkac know what to do?
By reading the page, of course, same as you. State-of-the-art extraction technologies are built into Zendriver's existing framework, giving it an enormous capability boost. I used Trafilatura to achieve this.

- Trafilatura is "a cutting-edge Python package and command-line tool designed to gather text on the Web and simplify the process of turning raw HTML into structured, meaningful data."

Basically, pebkac's vision is augmented. Not only is Trafilatura excellent at text/data extraction (check it out on GitHub: https://github.com/adbar/trafilatura), pebkac also uses what it extracts (along with native Zendriver CSS detection) to figure out what to do! This makes things like handling Cloudflare and popups a lot easier.

YOU CAN ALSO interact with the Chrome browser pebkac uses. You can manually sign into websites and ask pebkac to perform actions on the page. Think of it like a co-browser. It can go off on its own, collect the day's news, find out about things, and (maybe) handle little jobs while you do other things, or you can drop in, hang ten over the keyboard, and surf collaboratively. Remember, pebkac and its browser are fully contained, so there's no way the LLM can access your host PC.

This whole project is both an entirely useful web co-browsing service and a stark artistic reminder of the realities of our modular, chronically online existences. We all exist in our little boxes with internet connections to view the outside world, and now more than ever our little boxes are subject to oversight and control by forces far more intelligent than us. I view this project as a black mirror (lol) to our modern life.

It's also never been done before.

It's also incredibly capable.

## ✨ Technicals

With a powerful enough LLM behind it, this setup is capable of:
- Thinking (via LLM)
- Seeing (via CSS selection/Trafilatura)
- Acting (via SmolAgents and Zendriver)
- Remembering (via elaborate, lightweight caching)
- Learning (via CSS selector tracking)
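
The "learning" part can be as simple as counting which selectors worked on which site and preferring the winners next time. Below is a minimal sketch using `sqlite3`. The table layout and the `open_store`, `record`, and `best_selector` helpers are assumptions for illustration, not pebkac's actual schema.

```python
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the selector store. A file path instead of ':memory:' would survive restarts."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS selectors (
               domain TEXT, purpose TEXT, selector TEXT,
               successes INTEGER DEFAULT 0, failures INTEGER DEFAULT 0,
               PRIMARY KEY (domain, purpose, selector))"""
    )
    return conn


def record(conn: sqlite3.Connection, domain: str, purpose: str,
           selector: str, worked: bool) -> None:
    # The column name comes from a fixed pair, never from user input.
    column = "successes" if worked else "failures"
    conn.execute(
        "INSERT OR IGNORE INTO selectors (domain, purpose, selector) VALUES (?, ?, ?)",
        (domain, purpose, selector),
    )
    conn.execute(
        f"UPDATE selectors SET {column} = {column} + 1 "
        "WHERE domain = ? AND purpose = ? AND selector = ?",
        (domain, purpose, selector),
    )
    conn.commit()


def best_selector(conn: sqlite3.Connection, domain: str, purpose: str) -> Optional[str]:
    # Prefer the selector with the best success/failure balance on this site.
    row = conn.execute(
        "SELECT selector FROM selectors WHERE domain = ? AND purpose = ? "
        "AND successes > 0 ORDER BY successes - failures DESC, selector LIMIT 1",
        (domain, purpose),
    ).fetchone()
    return row[0] if row else None


conn = open_store()
record(conn, "example.com", "search_box", "input#q", worked=False)
record(conn, "example.com", "search_box", "input[name='search']", worked=True)
record(conn, "example.com", "search_box", "input[name='search']", worked=True)
```

After these three observations, `best_selector(conn, "example.com", "search_box")` returns the selector that worked, and a site with no history returns `None`, so the caller knows to fall back to discovery.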

Here's what it does:
- Avoids the need to pay for API calls. The LLM now works like you do.
- Remembers your logins across Podman/Docker sessions.
- Interprets your commands with versatility. If you ask it to "search amazon", it'll go to Amazon and search. If you ask it to "wait 1min and reload", it will figure it out.
- Coordinates its own tool use so it doesn't get confused. It won't extract before navigating, and knows what page it's already on.
- Combines its usage of tools mid-step (with async). Remember how I said it has ten steps to complete a task? Inside each of those steps the LLM makes its own dec

[truncated…]

PUBLIC HISTORY

First discovered: Mar 22, 2026

IDENTITY

inferred

Identity inferred from code signals. No PROVENANCE.yml found.


METADATA

platform: github

first seen: Sep 11, 2025

last updated: Mar 21, 2026

last crawled: 1 day ago

version—
