An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models.
The goal of this repo is to provide the simplest implementation of a deep research agent, i.e. an agent that can refine its research direction over time and dive deep into a topic. The repo is kept under 500 lines of code (LoC) so that it is easy to understand and build on.
If you like this project, please consider starring it and giving me a follow on X/Twitter. This project is sponsored by Aomni.
How It Works
flowchart TB
subgraph Input
Q[User Query]
B[Breadth Parameter]
D[Depth Parameter]
end
DR[Deep Research] -->
SQ[SERP Queries] -->
PR[Process Results]
subgraph Results[Results]
direction TB
NL((Learnings))
ND((Directions))
end
PR --> NL
PR --> ND
DP{depth > 0?}
RD["Next Direction:
- Prior Goals
- New Questions
- Learnings"]
MR[Markdown Report]
%% Main Flow
Q & B & D --> DR
%% Results to Decision
NL & ND --> DP
%% Circular Flow
DP -->|Yes| RD
RD -->|New Context| DR
%% Final Output
DP -->|No| MR
%% Styling
classDef input fill:#7bed9f,stroke:#2ed573,color:black
classDef process fill:#70a1ff,stroke:#1e90ff,color:black
classDef recursive fill:#ffa502,stroke:#ff7f50,color:black
classDef output fill:#ff4757,stroke:#ff6b81,color:black
classDef results fill:#a8e6cf,stroke:#3b7a57,color:black
class Q,B,D input
class DR,SQ,PR process
class DP,RD recursive
class MR output
class NL,ND results
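The loop in the flowchart can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `ResearchDeps`, `generateQueries`, and `searchAndProcess` are hypothetical names standing in for the LLM and search calls, and halving the breadth at each level is an assumption made for the example.

```typescript
// Hypothetical result of processing one search: what was learned and where to look next.
interface ProcessedResult {
  learnings: string[];
  directions: string[];
}

// Injected dependencies keep the sketch independent of any real search or LLM API.
interface ResearchDeps {
  generateQueries: (topic: string, learnings: string[], breadth: number) => Promise<string[]>;
  searchAndProcess: (query: string) => Promise<ProcessedResult>;
}

async function deepResearch(
  topic: string,
  breadth: number,
  depth: number,
  deps: ResearchDeps,
  learnings: string[] = [],
): Promise<string[]> {
  // Ask for up to `breadth` search queries, informed by what is already known.
  const queries = (await deps.generateQueries(topic, learnings, breadth)).slice(0, breadth);

  // Run the searches for this level in parallel.
  const results = await Promise.all(queries.map((q) => deps.searchAndProcess(q)));

  const collected = new Set<string>(learnings);
  const remainingDepth = depth - 1;

  for (const result of results) {
    result.learnings.forEach((l) => collected.add(l));

    // "depth > 0?" in the diagram: go one level deeper using the new directions.
    if (remainingDepth > 0 && result.directions.length > 0) {
      const nextTopic = result.directions.join("\n");
      const nextBreadth = Math.max(1, Math.ceil(breadth / 2)); // assumption: narrow as we go deeper
      const deeper = await deepResearch(
        nextTopic,
        nextBreadth,
        remainingDepth,
        deps,
        Array.from(collected),
      );
      deeper.forEach((l) => collected.add(l));
    }
  }

  // The accumulated learnings would feed the final markdown report.
  return Array.from(collected);
}
```

Passing the search and LLM steps in as functions is a design choice for the sketch only; it makes the recursion easy to follow and to exercise with stubs.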
Features
Iterative Research: Performs deep research by iteratively generating search queries, processing results, and diving deeper based on findings
Intelligent Query Generation: Uses LLMs to generate targeted search queries based on research goals and previous findings
Depth & Breadth Control: Configurable parameters to control how wide (breadth) and deep (depth) the research goes
Smart Follow-up: Generates follow-up questions to better understand research needs
Comprehensive Reports: Produces detailed markdown reports with findings and sources
Concurrent Processing: Handles multiple searches and result processing in parallel for efficiency
Requirements
Node.js environment
API keys for:
Firecrawl API (for web search and content extraction)
OpenAI API (for the o3-mini model)
Setup
Node.js
Clone the repository
Install dependencies:
npm install
Set up environment variables in a .env.local file:
FIRECRAWL_KEY="your_firecrawl_key"
# If you want to use your self-hosted Firecrawl, add the following below:
# FIRECRAWL_BASE_URL="http://localhost:3002"
OPENAI_KEY="your_openai_key"
To use a local LLM, comment out OPENAI_KEY and instead uncomment OPENAI_ENDPOINT and OPENAI_MODEL:
Set OPENAI_ENDPOINT to the address of your local server (e.g. "http://localhost:1234/v1")
Set OPENAI_MODEL to the name of the model loaded in your local server.
Docker
Clone the repository
Rename .env.example to .env.local and set your API keys
Run docker build -f Dockerfile . (the trailing dot is the build context)
Run the Docker image:
docker compose up -d
Execute npm run docker in the docker service:
docker exec -it deep-research npm run docker
Usage
Run the research assistant:
npm start
You’ll be prompted to:
Enter your research query
Specify research breadth (recommended: 3-10, default: 4)
Specify research depth (recommended: 1-5, default: 2)
Answer follow-up questions to refine the research direction
The system will then:
Generate and execute search queries
Process and analyze search results
Recursively explore deeper based on findings
Generate a comprehensive markdown report
The final report will be saved as report.md or answer.md in your working directory, depending on which modes you selected.
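As a rough illustration of the last step, the sketch below assembles a markdown report from collected learnings and source URLs. It is only an assumption about the report's shape: `buildReport` and the section titles are hypothetical, and the real project may well have the LLM write the report body instead.

```typescript
// Hypothetical helper: turn accumulated learnings and source URLs into markdown text.
function buildReport(query: string, learnings: string[], sources: string[]): string {
  // Drop repeated URLs while keeping first-seen order.
  const uniqueSources = sources.filter((url, i) => sources.indexOf(url) === i);

  const lines = [
    `# Research Report: ${query}`,
    "",
    "## Findings",
    "",
    ...learnings.map((l) => `- ${l}`),
    "",
    "## Sources",
    "",
    ...uniqueSources.map((url) => `- ${url}`),
  ];
  return lines.join("\n") + "\n";
}
```

The returned string could then be written to report.md in the working directory, for example with Node's `fs.writeFileSync`.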
Concurrency
If you have a paid version of Firecrawl or a local version, feel free to increase the ConcurrencyLimit by setting the CONCURRENCY_LIMIT environment variable so it runs faster.
If you are on the free version, you may sometimes run into rate limit errors. In that case, reduce the limit to 1 (it will run a lot slower).
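To make the effect of the limit concrete, here is a minimal sketch of a concurrency-limited runner. `parseLimit` and `mapWithLimit` are hypothetical helpers, not the project's code, and the fallback value passed to `parseLimit` is a placeholder rather than the project's real default.

```typescript
// Parse a CONCURRENCY_LIMIT-style value; fall back when it is missing or invalid.
function parseLimit(raw: string | undefined, fallback: number): number {
  const n = parseInt(raw ?? "", 10);
  return n >= 1 ? n : fallback; // NaN >= 1 is false, so bad input uses the fallback
}

// Run `task` over `items` with at most `limit` tasks in flight; results keep input order.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index] as T);
    }
  }

  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

With a limit of 1 the searches run strictly one after another, which is why the free tier becomes slower but is less likely to hit rate limits. A call might look like `mapWithLimit(queries, parseLimit(process.env.CONCURRENCY_LIMIT, 2), runSearch)`, where `runSearch` is whatever function performs a single search.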
DeepSeek R1
Deep research performs great on R1! We use Fireworks as the main provider for the R1 model. To use R1, simply set a Fireworks API key:
FIREWORKS_KEY="api_key"
The system will automatically switch over to use R1 instead of o3-mini when the key is detected.
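The switching behaviour described above could look something like the sketch below. It is an illustration under assumptions: `selectModel` and `ModelChoice` are hypothetical, the precedence order is a guess, and "deepseek-r1" is a placeholder rather than the model identifier Fireworks actually uses. Only the environment variable names and "o3-mini" come from this README.

```typescript
type Env = Record<string, string | undefined>;

interface ModelChoice {
  provider: "fireworks" | "openai";
  model: string;
  apiKey: string | undefined;
  baseUrl: string | undefined;
}

function selectModel(env: Env): ModelChoice {
  // A Fireworks key takes priority and switches the model to R1.
  if (env.FIREWORKS_KEY) {
    return {
      provider: "fireworks",
      model: "deepseek-r1", // placeholder identifier (assumption)
      apiKey: env.FIREWORKS_KEY,
      baseUrl: undefined,
    };
  }

  // Otherwise use OpenAI, or an OpenAI-compatible server when OPENAI_ENDPOINT is set.
  if (env.OPENAI_KEY || env.OPENAI_ENDPOINT) {
    return {
      provider: "openai",
      model: env.OPENAI_MODEL ?? "o3-mini",
      apiKey: env.OPENAI_KEY,
      baseUrl: env.OPENAI_ENDPOINT,
    };
  }

  throw new Error("Set FIREWORKS_KEY, OPENAI_KEY, or OPENAI_ENDPOINT");
}
```

In a Node.js process this would typically be called as `selectModel(process.env)` after the .env.local file has been loaded.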
Custom endpoints and models
There are two other optional environment variables that let you change the endpoint (for other OpenAI-compatible APIs such as OpenRouter or Gemini) and the model string.
Open Deep Research
How It Works
Initial Setup
Deep Research Process
Recursive Exploration
Report Generation
License
MIT License - feel free to use and modify as needed.