AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations [ICLR 2026]
From Text to Publication-Ready Diagrams
AutoFigure is an intelligent system that leverages Large Language Models (LLMs) with iterative refinement to generate high-quality scientific figures from text descriptions or research papers.
https://github.com/user-attachments/assets/d0c954a9-9cf3-4c8b-8b04-71d75a68854c
🔥 News
[2026.03.24] 🧠 Our sister project DeepScientist v1.5 is now officially released. It is a local-first open-source autonomous research system for end-to-end scientific discovery. Explore it on GitHub or read the ICLR 2026 paper.
[2026.03.11] 📄 Our AutoFigure-Edit paper is now available on arXiv and featured in 🤗Hugging Face Daily Papers! If you find our work helpful, please consider giving us an upvote on Hugging Face and citing our paper. Thank you! ❤️
[2026.02.17] 🚀 The AutoFigure-Edit online platform is now live! It is free for all scholars to use. Try it out at deepscientist.cc or check out our open-source code on GitHub. This new Edit version achieves much better performance!
[2026.01.26] 🎉 AutoFigure has been accepted to ICLR 2026! You can read the paper on arXiv.
✨ Features
| Feature | Description |
| --- | --- |
| 📝 Text-to-Figure | Generate figures directly from natural language descriptions. |
| 📄 Paper-to-Figure | Extract methodology from PDFs and create visual diagrams automatically. |
| 🔄 Iterative Refinement | Dual-agent system (Generation + Evaluation) for continuous quality optimization. |
| 🎨 Multiple Formats | Output as SVG or mxGraph XML (fully compatible with draw.io). |
| 💅 Image Enhancement | Optional AI-powered post-processing for aesthetic beautification. |
| 🖥️ Web Interface | Interactive Next.js frontend for easy generation and editing. |
🚀 How It Works
AutoFigure employs a Review-Refine loop to ensure high accuracy and aesthetic quality.
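Process:
1. Generate: The generation agent creates an initial SVG/XML figure from the description and reference figures.
2. Evaluate: The critic scores the figure's quality (0-10) and provides specific feedback.
3. Refine: The generation agent revises the figure using that feedback, and the loop repeats until the figure meets publication standards (see quality_threshold and max_iterations under Pipeline Settings).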
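The loop above can be sketched in a few lines of Python. This is a minimal illustration, not AutoFigure's implementation: `review_refine`, `LoopResult`, and the two stand-in agents are hypothetical names, and the stopping rule (stop once the score reaches the threshold or the iteration budget runs out) is an assumption based on the `max_iterations` and `quality_threshold` settings.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class LoopResult:
    figure: str        # final figure source (e.g. SVG text)
    score: float       # last critic score on the 0-10 scale
    iterations: int    # number of generate/evaluate rounds used


def review_refine(
    description: str,
    generate: Callable[[str, str], str],
    evaluate: Callable[[str], Tuple[float, str]],
    max_iterations: int = 5,
    quality_threshold: float = 9.0,
) -> LoopResult:
    """Run generate -> evaluate rounds until the score passes or rounds run out."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    feedback = ""  # no feedback exists before the first round
    for iteration in range(1, max_iterations + 1):
        figure = generate(description, feedback)
        score, feedback = evaluate(figure)
        if score >= quality_threshold:
            break
    return LoopResult(figure=figure, score=score, iterations=iteration)


# Stand-in agents so the sketch runs without any LLM: the "generator" embeds the
# feedback it received, and the "critic" scores by counting revision marks.
def fake_generate(description: str, feedback: str) -> str:
    return f"<svg><!-- {description} -->{feedback}</svg>"


def fake_evaluate(figure: str) -> Tuple[float, str]:
    revisions = figure.count("*")
    return min(10.0, 6.0 + 1.5 * revisions), "*" * (revisions + 1)


result = review_refine("transformer training pipeline", fake_generate, fake_evaluate)
print(result.iterations, result.score)
```

With these stand-ins the critic's score rises by 1.5 per round, so the loop stops on the third round when the score reaches 9.0.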
🌟 Generated Examples
Here are examples of figures generated by AutoFigure across different domains, showcasing its versatility in handling various levels of complexity.
Example categories: 📄 Paper Case, 📊 Survey Case, 📝 Blog Case, 📘 Textbook Case.
⚡ Quick Start
Option 1: Python SDK (Recommended)
Install by cloning the repository:
git clone https://github.com/ResearAI/AutoFigure.git
cd AutoFigure
pip install -e .
playwright install chromium # Required for rendering
1. Basic Usage (Text-to-Figure)
from autofigure import AutoFigureAgent, Config

# 1. Configure
config = Config(
    generation_api_key="your-api-key",
    generation_provider="openrouter",  # 'openrouter', 'gemini', 'bianxie', or any custom name
    generation_model="google/gemini-3.1-pro-preview",
)

# 2. Generate
agent = AutoFigureAgent(config)
result = agent.generate(
    description="A flowchart showing transformer training pipeline",
    max_iterations=5,
    output_format="svg",
    topic="paper",  # 'paper', 'survey', 'blog', 'textbook'
)
print(f"✅ Generated: {result.svg_path} (Score: {result.final_score}/10)")
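Before dropping a generated figure into a manuscript, it can help to confirm that the file is well-formed SVG. The following is a small sketch using only the Python standard library; `summarize_svg` is a hypothetical helper, not part of AutoFigure, and the sample SVG is hand-written for the demo.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

SVG_NS = "{http://www.w3.org/2000/svg}"


def summarize_svg(svg_path: str) -> dict:
    """Parse an SVG file and report its root tag and element count."""
    root = ET.parse(svg_path).getroot()  # raises ET.ParseError on malformed XML
    return {
        "is_svg": root.tag in ("svg", SVG_NS + "svg"),
        "elements": sum(1 for _ in root.iter()),
    }


# Demo on a tiny hand-written SVG; in practice pass result.svg_path instead.
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="120" height="40">'
    '<rect width="120" height="40" fill="white"/>'
    '<text x="10" y="25">Encoder</text>'
    "</svg>"
)
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "figure.svg"
    path.write_text(sample, encoding="utf-8")
    summary = summarize_svg(str(path))

print(summary)
```

A parse error or a non-`svg` root is a cheap signal that a run should be repeated before any manual editing.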
Using a custom OpenAI-compatible provider:
config = Config(
    generation_api_key="your-api-key",
    generation_provider="my-provider",  # any name you choose
    generation_base_url="https://api.example.com/v1",
    generation_model="gpt-4o",
)
2. Generate from Paper (PDF/Markdown)
Extract methodology from a paper and generate a figure automatically.
# Generate figure from paper (PDF or Markdown)
result = agent.generate_from_paper(
    paper_path="./paper.pdf",
    max_iterations=5,
    output_format="svg",
    enable_enhancement=True,  # Enhance the result
)
if result.success:
    print(f"Extracted methodology: {result.methodology_text[:200]}...")
    print(f"Generated figure: {result.svg_path}")
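When several papers need figures, the same call can be wrapped in a loop that separates successes from failures. The sketch below is an illustration under stated assumptions: `generate_for_papers` and `FakeAgent` are hypothetical names, and `FakeAgent` only imitates the `success` and `svg_path` fields of the result so the example runs without an API key.

```python
from types import SimpleNamespace
from typing import Dict, Iterable, List


def generate_for_papers(agent, paper_paths: Iterable[str]) -> Dict[str, List[str]]:
    """Call agent.generate_from_paper for each path; split outcomes by result.success."""
    report: Dict[str, List[str]] = {"generated": [], "failed": []}
    for paper_path in paper_paths:
        result = agent.generate_from_paper(
            paper_path=paper_path,
            max_iterations=5,
            output_format="svg",
        )
        if result.success:
            report["generated"].append(result.svg_path)
        else:
            report["failed"].append(paper_path)
    return report


class FakeAgent:
    """Stand-in for AutoFigureAgent so the sketch runs offline; it only accepts PDFs."""

    def generate_from_paper(self, paper_path, max_iterations, output_format):
        ok = paper_path.endswith(".pdf")
        svg_path = paper_path[:-4] + ".svg" if ok else None
        return SimpleNamespace(success=ok, svg_path=svg_path)


report = generate_for_papers(FakeAgent(), ["a.pdf", "notes.txt", "b.pdf"])
print(report)
```

In real use, pass the `agent` created earlier instead of `FakeAgent()`; the failed list can then be retried or inspected by hand.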
3. With Image Enhancement
Generate multiple enhanced aesthetic variants of the figure.
result = agent.generate(
    description="Neural network architecture diagram",
    enable_enhancement=True,
    enhancement_count=3,  # Generate 3 variants
    art_style="Modern scientific illustration with clean lines",
    enhancement_input_type="code2prompt",  # Best quality mode
)
if result.success:
    print(f"Original Preview: {result.preview_path}")
    print(f"Enhanced variants: {result.enhanced_paths}")
Option 2: Web Interface
Ideally suited for visual interaction and editing.
./start.sh
# Then open http://localhost:6002 in your browser
📊 FigureBench Dataset
We introduce FigureBench, the first large-scale benchmark for generating scientific illustrations from long-form text.
Dataset Overview
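| Category | Samples | Avg. Tokens | Text Density | Complexity |
| --- | --- | --- | --- | --- |
| 📄 Paper | 3,200 | 12,732 | 42.1% | High |
| 📝 Blog | 20 | 4,047 | 46.0% | Med |
| 📊 Survey | 40 | 2,179 | 43.8% | High |
| 📘 Textbook | 40 | 352 | 25.0% | Low |
| Total | 3,300 | 10k+ | 41.2% | ~5.3 Components |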
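The totals row can be sanity-checked from the per-category rows. The snippet below sums the sample counts and computes a sample-weighted mean of the average token counts; the weighting scheme is an assumption made for this check, since the table does not state how its aggregate was computed.

```python
# (samples, average tokens) per category, copied from the table above
figurebench = {
    "Paper": (3200, 12732),
    "Blog": (20, 4047),
    "Survey": (40, 2179),
    "Textbook": (40, 352),
}

total_samples = sum(n for n, _ in figurebench.values())
weighted_avg_tokens = sum(n * t for n, t in figurebench.values()) / total_samples

print(total_samples)
print(round(weighted_avg_tokens))
```

The sample count matches the reported total of 3,300, and the weighted mean of roughly 12.4k tokens is consistent with the "10k+" entry, driven almost entirely by the Paper category.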
Download
from datasets import load_dataset
dataset = load_dataset("WestlakeNLP/FigureBench")
⚙️ Configuration
AutoFigure is highly configurable. You can set these in Config() or via environment variables.
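One way to keep API keys out of source files is to read them from environment variables of your own choosing and pass them to `Config()` explicitly. The sketch below assumes nothing about which variable names AutoFigure itself reads: `MY_LLM_API_KEY`, `MY_LLM_PROVIDER`, `MY_LLM_MODEL`, and `MY_LLM_BASE_URL` are placeholder names invented for this example, and `config_kwargs_from_env` is a hypothetical helper.

```python
import os
from typing import Dict, Mapping


def config_kwargs_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect Config() keyword arguments from environment variables.

    MY_LLM_* are placeholder variable names chosen for this sketch.
    """
    api_key = env.get("MY_LLM_API_KEY")
    if not api_key:
        raise RuntimeError("Set MY_LLM_API_KEY before running")
    kwargs = {
        "generation_api_key": api_key,
        "generation_provider": env.get("MY_LLM_PROVIDER", "openrouter"),
    }
    # Only pass optional settings that were actually provided, so the
    # provider defaults still apply otherwise.
    if env.get("MY_LLM_MODEL"):
        kwargs["generation_model"] = env["MY_LLM_MODEL"]
    if env.get("MY_LLM_BASE_URL"):
        kwargs["generation_base_url"] = env["MY_LLM_BASE_URL"]
    return kwargs


# Demo with an explicit mapping instead of the real environment.
demo = config_kwargs_from_env({"MY_LLM_API_KEY": "your-api-key", "MY_LLM_MODEL": "gpt-4o"})
print(demo)
```

The resulting dictionary can then be unpacked as `Config(**config_kwargs_from_env())`, using the keyword names shown in the Quick Start.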
Supported LLM Providers
| Provider | Base URL | Recommended Text / SVG Model | Recommended Image Model |
| --- | --- | --- | --- |
| OpenRouter | openrouter.ai/api/v1 | google/gemini-3.1-pro-preview | google/gemini-3.1-flash-image-preview |
| Bianxie | api.bianxie.ai/v1 | gemini-3.1-pro-preview | gemini-3.1-flash-image-preview |
| Google | generativelanguage... | gemini-3.1-pro-preview | gemini-3.1-flash-image-preview |
| Custom | Any OpenAI-compatible endpoint | Any model | Any image-capable model |
Generation Settings
| Option | Description | Default |
| --- | --- | --- |
| generation_api_key | API key for figure generation | Required |
| generation_base_url | Base URL for API | Provider default |
| generation_model | Model name | Provider default |
| generation_provider | Provider name: 'openrouter', 'bianxie', 'gemini', or any custom name for OpenAI-compatible APIs | 'openrouter' |
Methodology Extraction Settings
| Option | Description | Default |
| --- | --- | --- |
| methodology_api_key | API key for methodology extraction | Same as generation |
| methodology_model | Model for methodology extraction | Same as generation |
| methodology_provider | Provider for methodology extraction | Same as generation |
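The "Same as generation" defaults amount to a simple fallback rule: an unset methodology option inherits the corresponding generation option. The sketch below illustrates that rule with a hypothetical `SettingsSketch` dataclass; it is not AutoFigure's `Config` class, only a model of the behaviour the table describes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SettingsSketch:
    """Hypothetical container; field names mirror the option tables above."""

    generation_api_key: str
    generation_provider: str = "openrouter"
    generation_model: Optional[str] = None
    methodology_api_key: Optional[str] = None
    methodology_provider: Optional[str] = None
    methodology_model: Optional[str] = None

    def methodology_settings(self) -> dict:
        """Return the methodology settings, inheriting unset ones from generation."""
        return {
            "api_key": self.methodology_api_key or self.generation_api_key,
            "provider": self.methodology_provider or self.generation_provider,
            "model": self.methodology_model or self.generation_model,
        }


settings = SettingsSketch(
    generation_api_key="gen-key",
    generation_model="google/gemini-3.1-pro-preview",
    methodology_model="gpt-4o",
)
resolved = settings.methodology_settings()
print(resolved)
```

Here only the methodology model is overridden, so the key and provider fall back to the generation values.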
Enhancement Settings
| Option | Description | Default |
| --- | --- | --- |
| enhancement_api_key | API key for image enhancement | None |
| enhancement_provider | Enhancement provider | 'openrouter' |
| enhancement_model | Model for image enhancement | Provider default |
| enhancement_input_type | Input type: 'none', 'code', 'code2prompt' | 'code2prompt' |
| enhancement_count | Number of enhanced variants to generate | 1 |
| art_style | Art style description for enhancement | '' |
Pipeline Settings
| Option | Description | Default |
| --- | --- | --- |
| max_iterations | Maximum refinement iterations | 5 |
| quality_threshold | Quality threshold (0-10) | 9.0 |
| output_dir | Output directory | './autofigure_output' |
| custom_references | Custom reference figure paths | None |
📚 API Reference
generate() Parameters
| Parameter | Description |
| --- | --- |
| description | Text description of the figure to generate |
| max_iterations | Maximum iterations (overrides config) |
| output_format | 'svg' or 'mxgraphxml' |
| quality_threshold | Quality threshold (overrides config) |
| enable_enhancement | Whether to enhance the final image |
| art_style | Art style for enhancement (overrides config) |
| enhancement_input_type | 'none', 'code', or 'code2prompt' (overrides config) |
🤝 Community & Support
WeChat Discussion Group
Scan the QR code to join our community. If the code has expired, please add WeChat ID nauhcutnil or contact tuchuan@mail.hfut.edu.cn.
---
📜 Citation & License
If you use AutoFigure, AutoFigure-Edit, or FigureBench in your research, please cite:
@inproceedings{
zhu2026autofigure,
title={AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations},
author={Minjun Zhu and Zhen Lin and Yixuan Weng and Panzhong Lu and Qiujie Xie and Yifan Wei and Sifan Liu and Qiyao Sun and Yue Zhang},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=5N3z9JQJKq}
}
@misc{lin2026autofigureeditgeneratingeditablescientific,
title={AutoFigure-Edit: Generating Editable Scientific Illustration},
author={Zhen Lin and Qiujie Xie and Minjun Zhu and Shichen Li and Qiyao Sun and Enhao Gu and Yiran Ding and Ke Sun and Fang Guo and Panzhong Lu and Zhiyuan Ning and Yixuan Weng and Yue Zhang},
year={2026},
eprint={2603.06674},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2603.06674},
}
The optimal configuration for this project uses gemini-3.1-flash-image-preview from Google AI Studio (https://aistudio.google.com/) as the image generation model and gemini-3.1-pro-preview as the text model. Each run costs approximately $0.50, consumes about 30,000 tokens, and takes around 20 minutes.
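For planning a batch of figures, the per-run estimates above scale in a straightforward way. The sketch below assumes linear scaling and sequential runs, which is a simplification: actual cost and time will vary with description length, iteration count, and provider pricing. `estimate_batch` is a hypothetical helper.

```python
# Approximate per-run figures quoted in the note above.
COST_PER_RUN_USD = 0.50
TOKENS_PER_RUN = 30_000
MINUTES_PER_RUN = 20


def estimate_batch(num_figures: int) -> dict:
    """Scale the per-run estimates linearly, assuming runs execute one after another."""
    return {
        "cost_usd": num_figures * COST_PER_RUN_USD,
        "tokens": num_figures * TOKENS_PER_RUN,
        "hours": num_figures * MINUTES_PER_RUN / 60,
    }


estimate = estimate_batch(12)
print(estimate)
```

Under these assumptions, a dozen figures would come to about $6, 360,000 tokens, and four hours of wall-clock time.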
[Mainland China Notice] Gemini’s Terms of Service do not permit access or usage by users in mainland China. If OpenRouter throws an error, it is often because an account registered in mainland China lacks the necessary permissions to use Gemini. It is recommended to use an OpenRouter account registered in the United States or Europe and to ensure compliant usage.
Additional generate() Parameters
Besides the parameters listed under generate() Parameters above, generate() also accepts enhancement_count, topic, and custom_references.
generate_from_paper() Parameters
Accepts all parameters from generate() plus: paper_path, methodology_api_key, methodology_provider.
Result Object (GenerationResult)
Fields: success, svg_path, mxgraph_path, preview_path, enhanced_paths, final_score, methodology_text, error.
Enhancement Modes
none, code, code2prompt
This project is licensed under the MIT License; see LICENSE for details. Name and logo usage are covered separately in TRADEMARK.md.