autonomous-deep-research-flow

provenance:github:Pradeep-Kumar25th/autonomous-deep-research-flow

An autonomous AI research pipeline that routes queries, runs parallel multi-agent research, fact-checks results and generates structured reports — built with CrewAI Flows & OpenAI



<div align="center">

# 🔬 Autonomous Deep Research Flow

### *An AI-powered research pipeline that thinks, plans, researches, fact-checks and reports — automatically*

---

![Python](https://img.shields.io/badge/Python-3.10+-3776AB?style=for-the-badge&logo=python&logoColor=white)
![CrewAI](https://img.shields.io/badge/CrewAI-1.3.0-FF6B6B?style=for-the-badge&logo=robot&logoColor=white)
![OpenAI](https://img.shields.io/badge/OpenAI-GPT--4o--mini-412991?style=for-the-badge&logo=openai&logoColor=white)
![Exa](https://img.shields.io/badge/Search-Exa_AI-00B4D8?style=for-the-badge&logoColor=white)
![Jupyter](https://img.shields.io/badge/Jupyter-Notebook-F37626?style=for-the-badge&logo=jupyter&logoColor=white)
![License](https://img.shields.io/badge/License-MIT-22C55E?style=for-the-badge)

---

> 💡 *"I wanted to build something that could do real, deep research on any topic — not just a single web search, but a full pipeline that plans, researches in parallel, fact-checks, and produces a structured report. So I built it."*
>
> **— Pradeep Kumar**

---

![Banner](https://images.unsplash.com/photo-1507003211169-0a1dd7228f2d?w=1200&h=400&fit=crop&q=80)

</div>

---

## 📌 Table of Contents

- [🌟 Why I Built This](#-why-i-built-this)
- [✨ Key Features](#-key-features)
- [🏗️ System Architecture](#-system-architecture)
- [🔄 How the Flow Works](#-how-the-flow-works)
- [🤖 The AI Agents](#-the-ai-agents)
- [🛠️ Tools Used](#-tools-used)
- [⚙️ Installation & Setup](#-installation--setup)
- [🚀 How to Run](#-how-to-run)
- [📊 Example Output](#-example-output)
- [📁 Project Structure](#-project-structure)
- [🔮 Future Improvements](#-future-improvements)
- [👤 Author](#-author)

---

## 🌟 Why I Built This

Standard AI chatbots give you a single answer from a single model. But real research requires planning, exploring multiple angles, verifying facts, and synthesizing everything into a structured report.

I built this system to do exactly that — autonomously. It uses a **CrewAI Flow** to intelligently decide whether a query needs deep research or a simple answer, then routes it through a **parallel multi-agent pipeline** that plans, researches, fact-checks, and produces a professional report — all without human intervention.

The system also **remembers previous queries across runs** (via flow state persistence) and **generates charts** from research data automatically.

---

## ✨ Key Features

| Feature | Description |
|--------|-------------|
| 🧠 **Intelligent Query Routing** | Automatically decides: simple answer vs. deep research |
| 🔄 **CrewAI Flow Orchestration** | Full stateful flow with persistence across sessions |
| ⚡ **Parallel Research** | Main and secondary topics researched simultaneously |
| 🔍 **Exa AI Search** | Semantic web search for high-quality, relevant sources |
| ✅ **Built-in Guardrails** | Enforces Summary, Insights, and Citations sections |
| 📊 **Auto Chart Generation** | Automatically creates charts from research data |
| 💾 **Session Persistence** | Remembers your previous queries across runs |
| 🔎 **Observability & Tracing** | Full tracing enabled for monitoring and debugging |
| 📄 **Markdown Report Output** | Saves complete research report to `research_report.md` |
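
To make the Session Persistence row concrete, here is a minimal sketch of the idea in plain Python. It is not the project's code and does not use CrewAI's own persistence layer: the `SessionState` class, the helper names, and the JSON file format are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class SessionState:
    """Hypothetical flow state: only the list of past queries is tracked."""
    past_queries: list[str] = field(default_factory=list)


def load_state(path: Path) -> SessionState:
    """Return the saved state, or a fresh one if no file exists yet."""
    if path.exists():
        return SessionState(**json.loads(path.read_text(encoding="utf-8")))
    return SessionState()


def save_state(state: SessionState, path: Path) -> None:
    """Write the state to disk so the next run can pick it up."""
    path.write_text(json.dumps(asdict(state)), encoding="utf-8")


def remember_query(query: str, path: Path) -> SessionState:
    """Load the state, append the new query, persist it, and return it."""
    state = load_state(path)
    state.past_queries.append(query)
    save_state(state, path)
    return state
```

Calling `remember_query` in two separate runs with the same path yields a state whose `past_queries` contains both queries, which is the behaviour the feature table describes.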

---

## 🏗️ System Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                    DEEP RESEARCH FLOW                           │
│                                                                 │
│  [User Query]                                                   │
│       │                                                         │
│  ┌────▼──────────┐                                              │
│  │ start_        │ ← Entry point, remembers past sessions       │
│  │ conversation  │                                              │
│  └────┬──────────┘                                              │
│       │                                                         │
│  ┌────▼──────────┐                                              │
│  │ analyze_query │ ← Router: SIMPLE or RESEARCH?               │
│  └────┬──────────┘                                              │
│       │                                                         │
│  ┌────┴──────────────────────┐                                  │
│  │                           │                                  │
│  ▼ "SIMPLE"              ▼ "RESEARCH"                          │
│  ┌──────────┐        ┌───────────────┐                         │
│  │ simple_  │        │ clarify_query │                         │
│  │ answer   │        └───────┬───────┘                         │
│  └────┬─────┘                │                                  │
│       │               ┌──────▼──────────────────────────────┐  │
│       │               │     PARALLEL DEEP RESEARCH CREW     │  │
│       │               │                                     │  │
│       │               │  Research Planner → breaks query    │  │
│       │               │  Topic Researcher → main topics ──┐ │  │
│       │               │  Topic Researcher → secondary   ──┤ │  │
│       │               │  Fact Checker → validates main  ◄─┤ │  │
│       │               │  Fact Checker → validates sec.  ◄─┘ │  │
│       │               │  Report Writer → final report       │  │
│       │               └──────┬──────────────────────────────┘  │
│       │               ┌──────▼──────────┐                      │
│       │               │ save_report_    │ ← Saves .md + charts │
│       │               │ and_summarize   │                      │
│       │               └──────┬──────────┘                      │
│       │                      │                                  │
│  ┌────▼──────────────────────▼──┐                               │
│  │     return_final_answer      │ ← Returns to user             │
│  └──────────────────────────────┘                               │
└─────────────────────────────────────────────────────────────────┘
```
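
The routing step in the diagram above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's implementation: the real router asks an LLM to pick the branch, whereas `keyword_classifier` is a toy stand-in, and `run_flow` and its callable parameters are hypothetical names.

```python
from typing import Callable

Classifier = Callable[[str], str]


def keyword_classifier(query: str) -> str:
    """Toy stand-in for the LLM router (assumption, not the real logic).

    Long queries, or queries containing a research-style keyword,
    are sent down the RESEARCH branch; everything else is SIMPLE.
    """
    research_markers = ("compare", "analysis", "trends", "impact", "report")
    text = query.lower()
    if len(text.split()) > 12 or any(m in text for m in research_markers):
        return "RESEARCH"
    return "SIMPLE"


def run_flow(
    query: str,
    classify: Classifier,
    simple_answer: Callable[[str], str],
    deep_research: Callable[[str], str],
) -> str:
    """Route the query down exactly one branch and return the final answer."""
    route = classify(query)
    if route == "SIMPLE":
        return simple_answer(query)
    if route == "RESEARCH":
        return deep_research(query)
    # Fail loudly if the classifier returns a label no branch handles.
    raise ValueError(f"Unknown route: {route!r}")
```

Passing the branch handlers in as callables keeps the router testable without any API keys: the LLM call and the research crew can each be swapped for a lambda.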

---

## 🔄 How the Flow Works

```
1. 🎤  User enters research query
         │
2. 🤔  Flow analyzes complexity → routes to SIMPLE or RESEARCH
         │
   ┌─────┴──────────────────────┐
   │ SIMPLE                RESEARCH
   │                            │
3a.💬 Direct LLM answer   3b.❓ Clarify query if needed
         │                      │
         │               4. 🚀  Parallel research crew kicks off:
         │                      ├── Research Planner splits topics
         │                      ├── Researcher covers main topics ─┐
         │                      ├── Researcher covers secondary  ──┤ parallel
         │                      ├── Fact checker validates main  ◄─┤
         │                      ├── Fact checker validates sec.  ◄─┘
         │                      └── Report writer synthesizes all
         │                      │
         │               5. 💾  Full report saved to research_report.md
         │                      │
         └──────────────────────┘
                  │
6. 📤  Final answer returned to user
```
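
Step 4 runs the main and secondary branches side by side. A minimal sketch of that pattern with the standard library is shown below. It assumes each branch is "research, then fact-check" and that the writer only needs the two checked results. The function name, the `plan` dictionary keys, and the callable parameters are hypothetical, and the actual project delegates this scheduling to CrewAI rather than to threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def research_pipeline(
    plan: dict[str, list[str]],
    research: Callable[[list[str]], str],
    fact_check: Callable[[str], str],
    write_report: Callable[[str, str], str],
) -> str:
    """Research and fact-check both branches concurrently, then merge them."""

    def run_branch(topics: list[str]) -> str:
        # Each branch does its own research followed by its own fact check.
        return fact_check(research(topics))

    with ThreadPoolExecutor(max_workers=2) as pool:
        main_future = pool.submit(run_branch, plan["main"])
        secondary_future = pool.submit(run_branch, plan["secondary"])
        # result() blocks until the branch finishes and re-raises its errors.
        main_findings = main_future.result()
        secondary_findings = secondary_future.result()

    # The writer only runs after both branches are complete.
    return write_report(main_findings, secondary_findings)
```

Threads suit this workload because the branches spend most of their time waiting on network calls (search, scraping, LLM requests) rather than on CPU.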

---

## 🤖 The AI Agents

### 🗺️ Agent 1 — Research Planner
- **Role:** Breaks the query into MAIN (core) and SECONDARY (supporting) topics
- **Goal:** Create a strategic research plan for parallel execution

### 🔍 Agent 2 — Topic Researcher
- **Role:** Researches both topic branches simultaneously using live web search
- **Goal:** Gather comprehensive, cited information from credible sources
- **Tools:** `EXASearchTool`, `ScrapeWebsiteTool`

### ✅ Agent 3 — Fact Checker
- **Role:** Validates all research data for accuracy and consistency
- **Goal:** Identify misinformation, hallucinations, and unsupported claims
- **Tools:** `EXASearchTool`, `ScrapeWebsiteTool`

### 📝 Agent 4 — Report Writer
- **Role:** Synthesizes all validated data into a structured report
- **Goal:** Produce a clear, well-cited report that answers the original query
- **Guardrail:** Report must include Summary, Insights, and Citations sections
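
The Report Writer guardrail can be illustrated with a small validator. This is a sketch, assuming the report is Markdown and that each required section appears as a heading line. The `(passed, payload)` return shape follows the tuple convention that CrewAI task guardrails use as far as I know, but wiring the function into a task is not shown, and `validate_report` is a hypothetical name.

```python
import re

REQUIRED_SECTIONS = ("Summary", "Insights", "Citations")


def validate_report(report_text: str) -> tuple[bool, str]:
    """Return (True, report) if every required heading exists, else (False, reason)."""
    missing = [
        name
        for name in REQUIRED_SECTIONS
        # Look for a Markdown heading such as "## Summary" at the start of a line.
        if not re.search(rf"^#+\s*{name}\b", report_text, re.IGNORECASE | re.MULTILINE)
    ]
    if missing:
        return False, "Report is missing sections: " + ", ".join(missing)
    return True, report_text
```

Returning the reason string on failure gives the writer agent something specific to fix on a retry, instead of a bare pass/fail flag.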

---

## 🛠️ Tools Used

| Tool | Purpose |
|------|---------|
| 🔍 `EXASearchTool` | Semantic AI-power

[truncated…]

PUBLIC HISTORY

First discovered: Mar 21, 2026

IDENTITY

inferred

Identity inferred from code signals. No PROVENANCE.yml found.


METADATA

platform: github

first seen: Mar 4, 2026

last updated: Mar 4, 2026

last crawled: 2 months ago

version—
