financial-document-analyzer-debug

provenance:github:honestlyBroke/financial-document-analyzer-debug
WHAT THIS AGENT DOES

This agent analyzes an uploaded financial document, such as a company's earnings report. It extracts key financial figures, gives investment recommendations, and highlights potential risks in plain language, so you do not have to spend hours reviewing complex filings by hand. It is aimed at investors, financial analysts, and anyone who needs a quick read on a company's financial health.

README
# Financial Document Analyzer

🚀 **Live Demo:** [https://finance.yato.foo/](https://finance.yato.foo/)

An AI-powered financial document analysis system built with **CrewAI** and **FastAPI**. Upload a financial PDF (e.g., Tesla Q2 2025 earnings report) and receive a comprehensive analysis including document verification, financial metrics extraction, investment recommendations, and risk assessment.

## Table of Contents

- [Architecture](#architecture)
- [Frontend](#frontend)
- [Setup & Installation](#setup--installation)
- [Docker Deployment](#docker-deployment)
- [API Documentation](#api-documentation)
- [Bugs Found & Fixes Applied](#bugs-found--fixes-applied)
- [Prompt Engineering Improvements](#prompt-engineering-improvements)
- [Bonus: Scaling Architecture](#bonus-scaling-architecture)

---

## Architecture

The system uses a **CrewAI sequential pipeline** with four specialized agents:

```
Upload PDF → Verifier → Financial Analyst → Investment Advisor → Risk Assessor → Response
```

| Agent | Role |
|---|---|
| **Verifier** | Validates the document is a legitimate financial report |
| **Financial Analyst** | Extracts key metrics (revenue, EPS, margins, cash flow) |
| **Investment Advisor** | Provides data-driven investment recommendations |
| **Risk Assessor** | Identifies and rates financial risks with mitigation strategies |
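
The hand-off between the four agents can be pictured with a small plain-Python sketch. It deliberately does not use CrewAI: the `Stage` type, the stage functions, and `run_pipeline` are hypothetical names chosen for illustration, and each stage body is a placeholder rather than the project's actual logic.

```python
from typing import Callable

# A stage receives the accumulated context and returns its own report text.
Stage = Callable[[dict], str]


def verifier(ctx: dict) -> str:
    return f"verified: {ctx['document']}"


def financial_analyst(ctx: dict) -> str:
    return "metrics extracted"


def investment_advisor(ctx: dict) -> str:
    return "recommendation drafted"


def risk_assessor(ctx: dict) -> str:
    return "risks rated"


# Order mirrors the diagram: Verifier -> Analyst -> Advisor -> Risk Assessor.
PIPELINE: list[tuple[str, Stage]] = [
    ("verifier", verifier),
    ("financial_analyst", financial_analyst),
    ("investment_advisor", investment_advisor),
    ("risk_assessor", risk_assessor),
]


def run_pipeline(document: str, query: str) -> dict:
    """Run every stage in order, storing each output under the stage name."""
    ctx = {"document": document, "query": query}
    for name, stage in PIPELINE:
        ctx[name] = stage(ctx)  # later stages can read earlier outputs
    return ctx
```

Because each stage sees the outputs of the stages before it, the order of the list matters, which is the property a sequential process is meant to guarantee.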

---

## Frontend

The app includes a retro **NES.css**-styled single-page frontend served directly by FastAPI at the root URL (`/`). No build step required.

**Features:**
- **Analyze** — Upload a PDF, enter a query, choose sync or async mode, view results
- **History** — Browse past analyses stored in MongoDB
- **Status** — Live health check showing API, MongoDB, and Celery worker status
- Async mode includes real-time progress polling with a retro progress bar

The frontend uses [NES.css](https://nostalgic-css.github.io/NES.css/) v2.3.0 with the Press Start 2P font for an 8-bit aesthetic.

---

## Setup & Installation

### Prerequisites

- **Python 3.12** (required: `crewai-tools==0.47.1` depends on `embedchain`, which does not support Python 3.13+)
- An [OpenRouter API key](https://openrouter.ai/) (used for Gemini 2.0 Flash via OpenRouter)
- A [Serper API key](https://serper.dev/) (for web search)
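
If you are unsure which interpreter a virtual environment picked up, a quick check like the following can help. This is a minimal sketch, not part of the repository; `check_python` is a hypothetical helper, and it assumes that exactly 3.12 is wanted, since the README only states that 3.13+ is unsupported.

```python
import sys

REQUIRED = (3, 12)  # taken from the prerequisite above


def check_python(version: tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True when the (major, minor) version matches the required one."""
    return tuple(version) == REQUIRED
```

Calling `check_python()` with no arguments inspects the running interpreter; passing a tuple makes the function easy to test.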

### Steps

```bash
# 1. Clone the repository
git clone https://github.com/honestlyBroke/financial-document-analyzer-debug.git
cd financial-document-analyzer-debug

# 2. Create and activate a virtual environment with Python 3.12
python3.12 -m venv venv       # or: /path/to/python3.12 -m venv venv
source venv/bin/activate       # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure environment variables
cp .env.example .env
# Edit .env and add your API keys:
#   OPENROUTER_API_KEY=sk-or-...
#   SERPER_API_KEY=your_serper_api_key

# 5. Run the server
cd src
python main.py
```

The API will be available at `http://localhost:8000`.
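
Before starting the server, it can be useful to confirm that both keys from step 4 are visible to the process. The sketch below is an assumption-laden helper rather than project code: `missing_keys` is a hypothetical name, and it only inspects a mapping such as `os.environ`, so it will not see values that exist solely in an unloaded `.env` file.

```python
import os
from typing import Mapping

# Variable names as listed in the .env instructions above.
REQUIRED_KEYS = ("OPENROUTER_API_KEY", "SERPER_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

An empty list means both keys are present; otherwise the list names what still needs to be set.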

### Sample Document

A Tesla Q2 2025 financial update PDF is included at `data/TSLA-Q2-2025-Update.pdf`.

---

## Docker Deployment

The project includes a full Docker Compose setup for production deployment.

```bash
# 1. Clone and configure
git clone https://github.com/honestlyBroke/financial-document-analyzer-debug.git
cd financial-document-analyzer-debug
cp .env.example .env
# Edit .env with your API keys

# 2. Build and start all services
docker compose up -d --build

# 3. Check logs
docker compose logs -f
```

This starts 4 services:
- **app** — FastAPI server (port 8000)
- **celery_worker** — Background task processor
- **redis** — Message broker for Celery
- **mongodb** — Persistent storage for analysis results

All containers join the `nginx-network` for use with Nginx Proxy Manager.

---

## API Documentation

### `GET /`

Serves the frontend UI.

### `GET /api/health`

Health check endpoint.

**Response:**
```json
{"message": "Financial Document Analyzer API is running"}
```
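
A script can use this endpoint as a readiness probe. The following sketch uses only the standard library; `parse_health` and `is_api_up` are hypothetical helper names, and the check assumes the response keeps the single `message` field shown above.

```python
import json
import urllib.request

HEALTH_URL = "http://localhost:8000/api/health"


def parse_health(body: bytes) -> bool:
    """Return True when the body is a JSON object with a non-empty 'message'."""
    try:
        data = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    if not isinstance(data, dict):
        return False
    message = data.get("message")
    return isinstance(message, str) and bool(message)


def is_api_up(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Return True when the endpoint answers 200 with a valid health body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and parse_health(resp.read())
    except OSError:  # connection refused, timeout, HTTP error status
        return False
```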

### `POST /analyze`

Upload a financial PDF and receive a comprehensive analysis.

**Request** (`multipart/form-data`):

| Field | Type | Required | Description |
|---|---|---|---|
| `file` | File (PDF) | Yes | The financial document to analyze |
| `query` | String | No | Specific analysis question (default: "Analyze this financial document for investment insights") |

**Example using cURL:**
```bash
curl -X POST http://localhost:8000/analyze \
  -F "file=@data/TSLA-Q2-2025-Update.pdf" \
  -F "query=What are Tesla's key financial metrics for Q2 2025?"
```

**Response:**
```json
{
  "status": "success",
  "query": "What are Tesla's key financial metrics for Q2 2025?",
  "analysis": "...(comprehensive multi-agent analysis)...",
  "file_processed": "TSLA-Q2-2025-Update.pdf",
  "output_saved": "outputs/analysis_<uuid>.json"
}
```
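
The same request can be sent from Python without third-party packages. This is a minimal sketch using only the standard library; `build_multipart` and `analyze` are hypothetical helpers, and the field names `file` and `query` are taken from the request table above.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields: dict[str, str], file_field: str,
                    filename: str, file_bytes: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one PDF as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n".encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def analyze(pdf_path: str, query: str,
            base_url: str = "http://localhost:8000") -> dict:
    """POST a PDF and a query to /analyze and return the decoded JSON."""
    path = Path(pdf_path)
    body, content_type = build_multipart(
        {"query": query}, "file", path.name, path.read_bytes()
    )
    request = urllib.request.Request(
        f"{base_url}/analyze",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

With the server running, `analyze("data/TSLA-Q2-2025-Update.pdf", "What are Tesla's key financial metrics for Q2 2025?")` should return a dictionary shaped like the response above.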

### `POST /analyze/async`

Submit a document for background analysis via Celery. The endpoint returns immediately with a task ID; it requires Redis and a running Celery worker.

**Request:** Same as `POST /analyze`.

**Response:**
```json
{
  "status": "queued",
  "task_id": "a1b2c3d4-...",
  "celery_task_id": "...",
  "message": "Analysis submitted. Poll GET /result/{task_id} for results."
}
```

### `GET /result/{task_id}`

Poll for the result of an async analysis task.

**Response (complete):**
```json
{
  "status": "success",
  "task_id": "a1b2c3d4-...",
  "query": "...",
  "filename": "TSLA-Q2-2025-Update.pdf",
  "analysis": "...(comprehensive multi-agent analysis)..."
}
```
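
A client typically wraps this endpoint in a polling loop. The sketch below is illustrative only: `poll_result` is a hypothetical helper, the `fetch` callable stands in for whatever performs `GET /result/{task_id}` and returns the decoded JSON, and any status other than `"success"` is treated as "not finished yet" because the README does not document the pending or failure payloads.

```python
import time
from typing import Callable


def poll_result(fetch: Callable[[str], dict], task_id: str,
                interval: float = 2.0, timeout: float = 300.0,
                sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch(task_id) until it reports success or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(task_id)
        if result.get("status") == "success":
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout}s")
        sleep(interval)  # wait before asking again
```

Injecting `fetch` and `sleep` keeps the loop independent of any particular HTTP library and makes it testable without a running server.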

### `GET /analyses`

List past analyses from MongoDB (most recent first). Supports `?limit=20&skip=0` pagination.
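
To walk through the history page by page, the `limit` and `skip` parameters can be combined as below. This is a small sketch with hypothetical helper names (`analyses_url`, `page_urls`); it only builds URLs and makes no assumption about the shape of the response body.

```python
from urllib.parse import urlencode


def analyses_url(limit: int = 20, skip: int = 0,
                 base_url: str = "http://localhost:8000") -> str:
    """Build the URL for one page of GET /analyses."""
    return f"{base_url}/analyses?{urlencode({'limit': limit, 'skip': skip})}"


def page_urls(pages: int, limit: int = 20,
              base_url: str = "http://localhost:8000") -> list[str]:
    """Build URLs for the first `pages` pages, advancing skip by limit."""
    return [analyses_url(limit, page * limit, base_url) for page in range(pages)]
```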

### `GET /analyses/{task_id}`

Retrieve a specific past analysis from MongoDB by its `task_id`.

**Interactive docs:** Visit `http://localhost:8000/docs` for the Swagger UI.

---

## Bugs Found & Fixes Applied

### tools.py (6 bugs)

| # | Bug | Fix |
|---|---|---|
| 1 | `from crewai_tools import tools` — wrong import, `tools` does not exist as a module export | Changed to `from crewai.tools import tool` (the `@tool` decorator) and `from crewai_tools import SerperDevTool` |
| 2 | `Pdf(file_path=path).load()`: the `Pdf` class is never imported and does not exist | Replaced with `pypdf.PdfReader`, a widely used pure-Python PDF reader |
| 3 | `async def read_data_tool(path=...)` inside a class — CrewAI tools cannot be async coroutines | Converted to a synchronous function decorated with `@tool` |
| 4 | Methods inside classes missing `self` parameter (`read_data_tool`, `analyze_investment_tool`, `create_risk_assessment_tool`) | Converted from class methods to standalone `@tool`-decorated functions (CrewAI pattern) |
| 5 | Tools defined as class methods but CrewAI expects callable tool objects | Used `@tool("Tool Name")` decorator which is the correct CrewAI tool pattern |
| 6 | `SerperDevTool` imported via wrong path `from crewai_tools.tools.serper_dev_tool import SerperDevTool` | Changed to `from crewai_tools import SerperDevTool` (public API) |

### agents.py (7 bugs)

| # | Bug | Fix |
|---|---|---|
| 1 | `llm = llm` — self-referencing undefined variable, causes `NameError` | Changed to `llm = LLM(model="openrouter/google/gemini-2.0-flash-001", api_key=os.getenv("OPENROUTER_API_KEY"))` |
| 2 | `from crewai.agents import Agent` — wrong import path | Changed to `from crewai import Agent, LLM` |
| 3 | `tool=[FinancialDocumentTool.read_data_tool]` — parameter name is `tools` (plural), and the value was a class method reference | Changed to `tools=[read_data_tool, search_tool]` with proper tool function imports |
| 4 | `max_iter=1` on all agents — limits agents to 1 iteration, making them unable to complete multi-step analysis | Increased to `max_iter=25` |
| 5 | `max_rpm=1` on all agents — limits to 1 request per minute, causing extreme throttling | Increased to `max_rpm=10` |
| 6 | `verifier`, `investment_advisor`, `risk_assessor` agents have no tools assigned | Added appropriate tools (`read_data_tool`, `search_tool`) to each agent |
| 7 | `from tools import search_tool, FinancialDocumentTool` — i

[truncated…]

PUBLIC HISTORY

First discovered: Mar 21, 2026

IDENTITY

inferred

Identity inferred from code signals. No PROVENANCE.yml found.

METADATA

platform: github
first seen: Feb 24, 2026
last updated: Feb 24, 2026
last crawled: 27 days ago
version
