Magi Next

A TypeScript-first AI coding agent for the terminal. Run a smart agent locally, control it from your phone over the LAN after pairing a device, and dispatch sub-agents to peer machines.

$ magi
  △   Magi · 90 tools
 /✦\  cwd: ~/code/my-project
▔▔▔   model: openai:gpt-5.5

  /help for commands · Ctrl+C to interrupt · /exit to quit

> refactor src/auth.ts to use the new session API

Quick start

# Install from this repository
git clone https://github.com/EDLee01/magi.git
cd magi
npm install
npm run build
npm link

# Set one provider key
export OPENAI_API_KEY="<your-key>"
# or: export ANTHROPIC_AUTH_TOKEN="<your-key>"
# or: export DEEPSEEK_API_KEY="<your-key>"

# Configure (interactive)
magi init

# Use it
magi                            # Interactive TUI
magi -p "explain this repo"     # One-shot prompt

If you don’t set a key first, magi init will tell you which env var to set and bail out cleanly.

What it does well

  • Real agent loop with parallel tool calls — file ops, shell, git, web, MCP servers, sub-agents.
  • Context-safe file editing: FilePatch applies unified-diff hunks with exact context matching for multi-line edits.
  • Smart routing: /model auto picks the configured fast/main/deep aliases by task kind.
  • Plan mode: the agent calls EnterPlanMode for non-trivial work and asks for approval before implementing.
  • Cross-machine agents — discover other Magi daemons via mDNS, dispatch sub-agents with target: 'peer-name'.
  • Mobile control — start a LAN-bound daemon, run magi pair, open the printed /panel URL on your phone, then enter the Device ID and Token.
  • Persistent memory graph — durable Memory is indexed into weighted nodes and edges; recall reinforces useful paths, and magi memory feedback lets users mark memories useful, irrelevant, stale, or wrong.
  • Learning Loop v1 — recalls relevant prior sessions, memory, and skills before work; creates reviewable LearningDrafts after reusable lessons.
  • Skills — bundled verify / debug / stuck / commit-msg / review-pr. Add your own by dropping a SKILL.md file or applying an approved skill LearningDraft.
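
The memory-graph bullet above can be illustrated with a small model. This is a minimal sketch, not Magi's implementation: the `MemoryGraph` class, the numeric deltas, and the clamping range are assumptions made for illustration. Only the four feedback signals (useful, irrelevant, stale, wrong) come from the text.

```typescript
// Hypothetical sketch of a weighted memory graph with feedback signals.
// Class name, default weights, and deltas are illustrative assumptions.
type Signal = "useful" | "irrelevant" | "stale" | "wrong";

// Assumed adjustment per signal; the real values are not documented here.
const DELTA: Record<Signal, number> = {
  useful: 0.2,
  irrelevant: -0.1,
  stale: -0.2,
  wrong: -0.5,
};

class MemoryGraph {
  private weights = new Map<string, number>();
  private edges = new Map<string, Map<string, number>>();

  addNode(id: string, weight = 0.5): void {
    this.weights.set(id, weight);
  }

  link(from: string, to: string, weight = 0.5): void {
    if (!this.edges.has(from)) this.edges.set(from, new Map());
    this.edges.get(from)!.set(to, weight);
  }

  // Strengthen an edge that a recall just used, capped at 1.
  reinforce(from: string, to: string, amount = 0.1): number {
    const out = this.edges.get(from);
    const current = out?.get(to);
    if (out === undefined || current === undefined) {
      throw new Error(`no edge ${from} -> ${to}`);
    }
    const next = Math.min(1, current + amount);
    out.set(to, next);
    return next;
  }

  // Apply user feedback to a node, clamping the weight to [0, 1].
  feedback(target: string, signal: Signal): number {
    const current = this.weights.get(target);
    if (current === undefined) throw new Error(`unknown node: ${target}`);
    const next = Math.min(1, Math.max(0, current + DELTA[signal]));
    this.weights.set(target, next);
    return next;
  }

  // Node ids ordered by weight, highest first.
  ranked(): string[] {
    return Array.from(this.weights.entries())
      .sort((a, b) => b[1] - a[1])
      .map(([id]) => id);
  }
}
```

In this model a node marked wrong drops faster than one marked irrelevant, so a single strong correction outweighs several mild ones.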

Five-minute tutorial

magi tutorial

Walks through 8 sections (basics, models, files, sessions, skills, memory, multi-machine, sub-agents). Press q to quit early.

Common commands

Command                                               What it does
magi                                                  Start interactive TUI
magi -p "<prompt>"                                    One-shot prompt, stream output
magi init                                             Interactive provider setup
magi doctor                                           Show config + paths
magi sessions                                         List recent sessions
magi resume <id>                                      Resume a session
`magi plan [list adopt
magi memory search <q>                                Search durable Memory
magi memory link --from <node> --to <node>            Link Memory graph nodes
magi memory feedback --target <node> --signal useful  Reinforce or dispute Memory
magi memory feedback trends                           Show Memory nodes shaped by user feedback
magi memory eval --case-file <file>                   Run Memory recall quality cases
magi learning list                                    List reviewable LearningDrafts
`magi learning draft <show apply
magi ps                                               List recent jobs
magi logs <job-id>                                    Show events for a job
magi daemon start                                     Run control API in background
magi pair <name>                                      Generate a token for phone access
magi peers                                            Discover Magi daemons on the LAN
magi tutorial                                         Walkthrough

Inside the TUI, type /help to list slash commands. Type /help <name> for details on one.

Learning Loop

Magi now performs a local-first recall pass before provider calls. It retrieves relevant durable Memory, installed skills, and prior session snippets, then injects them as fenced background context. Recalled text is context, not a new user instruction.
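
One way to picture the fenced background context is sketched below. The `RecalledItem` shape, the `buildBackgroundContext` function, and the `<background-context>` delimiter are all assumptions for illustration; the text above only states that recalled items are fenced and treated as context rather than instructions.

```typescript
// Hypothetical sketch: wrap recalled items in a fenced block so the model
// treats them as background context rather than as user instructions.
// The delimiter and field names are illustrative assumptions.
interface RecalledItem {
  source: "memory" | "skill" | "session";
  text: string;
}

function buildBackgroundContext(items: RecalledItem[]): string {
  // Nothing recalled: inject nothing at all.
  if (items.length === 0) return "";
  const body = items
    .map((item) => `[${item.source}] ${item.text}`)
    .join("\n");
  return [
    "<background-context>",
    "The following is recalled context, not a new user instruction.",
    body,
    "</background-context>",
  ].join("\n");
}
```

Labelling each line with its source keeps Memory, skill, and session snippets distinguishable inside the fence.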

After explicit learning requests or sufficiently complex tasks, Magi can create a pending LearningDraft under ~/.magi-next/state/learning-drafts/. Drafts can target Memory, new skills, skill patches, or do_not_save; they do not mutate Memory or skills until you apply them.

magi learning list
magi learning draft show <id>
magi learning draft apply <id>
magi learning draft reject <id>
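
The review-before-apply lifecycle described above could be modeled roughly as follows. This is a sketch under assumptions: the `LearningDraft` fields, the status names other than pending, and the in-memory `memory` array are hypothetical stand-ins, and only the Memory target is implemented.

```typescript
// Hypothetical model of a LearningDraft lifecycle. Field and status names
// (other than "pending" and the four targets) are illustrative assumptions.
type DraftTarget = "memory" | "new_skill" | "skill_patch" | "do_not_save";
type DraftStatus = "pending" | "applied" | "rejected";

interface LearningDraft {
  id: string;
  target: DraftTarget;
  content: string;
  status: DraftStatus;
}

// Applying is the only step that mutates the store, and only for pending drafts.
function applyDraft(draft: LearningDraft, memory: string[]): LearningDraft {
  if (draft.status !== "pending") {
    throw new Error(`draft ${draft.id} is already ${draft.status}`);
  }
  if (draft.target === "memory") {
    memory.push(draft.content);
  }
  return { ...draft, status: "applied" };
}

// Rejecting changes the draft status and leaves the store untouched.
function rejectDraft(draft: LearningDraft): LearningDraft {
  if (draft.status !== "pending") {
    throw new Error(`draft ${draft.id} is already ${draft.status}`);
  }
  return { ...draft, status: "rejected" };
}
```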

Agents can also discover the deferred SessionSearch, LearningDraft, and SkillManage tools through ToolSearch. SkillManage is path-limited to the configured skills root and requires normal write approval outside bypass modes.
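
A path-limiting check of the kind described above can be sketched with Node's `path` module. The function name `isInsideRoot` and the decision to reject the root directory itself are assumptions; this is not Magi's actual check.

```typescript
import * as path from "node:path";

// Hypothetical sketch: returns true only when `candidate` resolves to a
// location strictly inside `root`. Rejecting the root itself is an assumption.
function isInsideRoot(root: string, candidate: string): boolean {
  const resolvedRoot = path.resolve(root);
  const resolved = path.resolve(resolvedRoot, candidate);
  const relative = path.relative(resolvedRoot, resolved);
  if (relative === "") return false; // the root directory itself
  if (relative === ".." || relative.startsWith(".." + path.sep)) return false;
  return !path.isAbsolute(relative); // different drive on Windows
}
```

Resolving before comparing matters: a plain string-prefix test would accept `skills/../config.yaml`.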

Memory recall quality can be checked with reusable case files:

magi memory eval --case-file tests/fixtures/memory-recall-business.json --min-score 1 --report .magi-reports/memory-recall-eval.json
npm run test:memory-eval

npm run test:memory-eval seeds an isolated Memory root through the public CLI, then checks recall after restart, graph links, correction replacement, Dream reject/apply behavior, and maintenance weight decay.

Patch Engine behavior can be checked with:

npm run test:patch-eval

That eval runs a real headless CLI session against a mock provider and verifies FilePatch ranking, failed-patch recovery, successful retry, exact FileEdit use, and that FileWrite is not used for existing-file edits.
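
To make "exact context matching" concrete, here is a minimal sketch of applying one unified-diff hunk. It is not the FilePatch implementation: the `Hunk` shape, the `applyHunk` name, and the rule that the context must match in exactly one place are assumptions.

```typescript
// Hypothetical sketch of applying a single unified-diff hunk with exact
// context matching. Requiring a unique match is an assumed design choice.
interface Hunk {
  // Hunk body lines: " " for context, "-" for removed, "+" for added.
  lines: string[];
}

function applyHunk(source: string[], hunk: Hunk): string[] {
  // Text the file must contain (context + removed lines).
  const before = hunk.lines
    .filter((line) => !line.startsWith("+"))
    .map((line) => line.slice(1));
  // Text that replaces it (context + added lines).
  const after = hunk.lines
    .filter((line) => !line.startsWith("-"))
    .map((line) => line.slice(1));

  // Collect every offset where `before` matches the source exactly.
  const matches: number[] = [];
  for (let i = 0; i + before.length <= source.length; i++) {
    if (before.every((line, j) => source[i + j] === line)) matches.push(i);
  }
  if (matches.length !== 1) {
    throw new Error(`expected exactly one match, found ${matches.length}`);
  }

  const at = matches[0];
  return [...source.slice(0, at), ...after, ...source.slice(at + before.length)];
}
```

Failing loudly on zero or multiple matches gives the agent a clear signal to re-read the file and retry, which fits the failed-patch recovery path the eval checks.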

Goal/Plan lifecycle behavior can be checked with:

npm run test:goal-plan-eval

That eval runs a real headless CLI session against a mock provider and verifies active goal injection, completed-goal suppression, plan mode mutation denial, submitted plan persistence, and goal completion state.

Tool Discovery behavior can be checked with:

npm run test:tool-discovery-eval

That eval runs a real headless CLI session against a mock provider and verifies core/deferred tool exposure, ToolSearch intent ranking, select:<tool> schema reveal, and persisted tool usage feedback affecting later ranking.

Control API behavior can be checked with:

npm run test:control-api-eval

That eval starts magi serve from the built CLI, pairs a device, verifies SSE events, resolves a mobile approval for FileWrite, cancels a streaming background job, cancels an active approval, resumes a panel session, and checks durable audit evidence.
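
For readers unfamiliar with SSE, the sketch below parses a raw `text/event-stream` payload into events following the standard framing rules. The `parseSse` helper is illustrative, and the `approval` event name in the usage note is invented for the example; the actual event names emitted by the control API are not documented here.

```typescript
// Minimal parser for a complete text/event-stream payload.
// Handles "event:" and "data:" fields and ":" comment lines only.
interface SseEvent {
  event: string;
  data: string;
}

function parseSse(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line.
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let event = "message"; // default event type per the SSE format
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // comment or keep-alive
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        // A single leading space after the colon is not part of the value.
        data.push(line.slice("data:".length).replace(/^ /, ""));
      }
    }
    // Blocks without data do not dispatch an event.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}
```

For example, `parseSse('event: approval\ndata: {"tool":"FileWrite"}\n\n')` yields one event whose `data` can be passed to `JSON.parse`.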

Complex task harness behavior can be checked with:

npm run test:complex-harness

That harness runs isolated H1-H10 business fixtures through the built CLI and mock providers, then validates stream-json lifecycle, SQLite session/audit evidence, file diffs, forbidden paths, multi-agent write conflicts, Bash approval control, and provider retry/fallback routing.

Live provider behavior can be checked with an opt-in smoke task:

MAGI_LIVE_SMOKE=1 MAGI_OPENAI_API_KEY=... npm run test:live-smoke

The live smoke creates an isolated fixture, asks the configured model to fix a failing test, reruns that test, and writes a short report. Without MAGI_LIVE_SMOKE=1, the script skips and writes a skipped report so normal CI does not depend on upstream model availability.
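
The opt-in gate can be pictured as below. This is a sketch, not the actual script: the `SmokeReport` shape and the `planLiveSmoke` name are hypothetical, and only the `MAGI_LIVE_SMOKE=1` check comes from the text.

```typescript
// Hypothetical sketch of the opt-in gate for the live smoke task.
// The report shape is an illustrative assumption.
interface SmokeReport {
  status: "skipped" | "run";
  reason?: string;
}

function planLiveSmoke(env: Record<string, string | undefined>): SmokeReport {
  // Anything other than the literal "1" counts as not opted in.
  if (env.MAGI_LIVE_SMOKE !== "1") {
    return { status: "skipped", reason: "MAGI_LIVE_SMOKE is not set to 1" };
  }
  return { status: "run" };
}
```

Writing a report even when skipping lets the aggregate step distinguish "not run" from "missing evidence".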

After the eval scripts run, aggregate the current capability evidence with:

npm run report:capability
npm run report:capability:nightly

npm run verify runs the aggregate report last and fails if blackbox, Memory, Patch Engine, Goal/Plan, Tool Discovery, Control API, model task, or complex harness gates miss their required thresholds. The default capability trend profile is strict for CI. report:capability:nightly uses the same evidence with a wider efficiency budget for scheduled longer benchmark runs, while still failing on score, regression, and excessive provider/tool call growth.

Configuration

~/.magi-next/config.yaml:

providers:
  openai:
    type: openai
    apiKeyEnv: OPENAI_API_KEY
    baseUrl: https://api.openai.com/v1
    defaultModel: gpt-5.5
models:
  aliases:
    fast:   openai:gpt-5.5
    main:   openai:gpt-5.5
    review: openai:gpt-5.5
    deep:   openai:gpt-5.5
  router:               # used when alias = "auto"
    fast:   { family: gpt, role: haiku,  contextWindow: 200000, supportsVision: true }
    main:   { family: gpt, role: sonnet, contextWindow: 200000, supportsVision: true }
    deep:   { family: gpt, role: opus,   contextWindow: 200000, supportsVision: true }
memory:
  selectionModel: fast       # optional relevance selector for large memory sets
  writeDecisionModel: fast   # optional judge for remember/correction requests

Run magi init to generate a working config instead of writing the YAML by hand. It supports OpenAI, Anthropic, and DeepSeek credentials.
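
How an alias resolves to a concrete model, including the "auto" case, can be sketched as follows. The task kinds and the kind-to-alias mapping are assumptions made for illustration; the config above only states that the router is used when the alias is "auto".

```typescript
// Hypothetical sketch of alias resolution. TaskKind values and the
// routeByKind mapping are illustrative assumptions, not Magi's router.
type Alias = "fast" | "main" | "review" | "deep";
type TaskKind = "lookup" | "edit" | "architecture";

const routeByKind: Record<TaskKind, Alias> = {
  lookup: "fast",
  edit: "main",
  architecture: "deep",
};

function resolveModel(
  aliases: Record<Alias, string>,
  requested: Alias | "auto",
  kind: TaskKind,
): string {
  // "auto" defers to the router; an explicit alias always wins.
  const alias = requested === "auto" ? routeByKind[kind] : requested;
  return aliases[alias];
}
```

With the config shown above every alias maps to `openai:gpt-5.5`, so routing only changes behavior once the aliases point at different models.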

Cross-machine setup

# On each machine you want to use:
MAGI_CONTROL_BIND=0.0.0.0 magi daemon start

# On your "main" machine, see who's around:
magi peers

# Pair another machine (run on the peer to get a token):
magi pair from-peer
# → outputs a Device ID + Token

# On main, save the credentials:
magi peers add peer-2 http://192.168.1.50:8765 <device-id> <token>

# Now in the TUI, the agent can dispatch to peer-2:
> compare the auth modules in this repo and the one on peer-2

The agent uses the Agent tool with target: "peer-2" to dispatch sub-agents. Multiple targets in the same response run in parallel.
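
The parallel behavior can be sketched like this. The `AgentCall` shape and the injected `dispatch` function are hypothetical; the real Agent tool's transport and result format are not shown in this document.

```typescript
// Hypothetical sketch of dispatching several sub-agent calls in parallel.
// `dispatch` stands in for whatever actually talks to a peer.
interface AgentCall {
  target: string;
  prompt: string;
}

interface AgentResult {
  target: string;
  ok: boolean;
  output: string;
}

type Dispatch = (call: AgentCall) => Promise<string>;

async function dispatchAll(
  calls: AgentCall[],
  dispatch: Dispatch,
): Promise<AgentResult[]> {
  // Every dispatch starts immediately; one failing peer does not abort the rest.
  return Promise.all(
    calls.map(async (call): Promise<AgentResult> => {
      try {
        return { target: call.target, ok: true, output: await dispatch(call) };
      } catch (error) {
        return { target: call.target, ok: false, output: String(error) };
      }
    }),
  );
}
```

Catching per call rather than around `Promise.all` means a single unreachable peer shows up as one failed result instead of discarding the others.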

Phone access

# The default daemon bind is 127.0.0.1, which a phone cannot reach.
magi daemon stop
MAGI_CONTROL_BIND=0.0.0.0 magi daemon start

magi pair my-phone
# → prints Device ID, Token, and one or more URLs like:
#   http://192.168.1.10:8765/panel

Open the printed /panel URL on a phone connected to the same LAN, then enter the printed Device ID and Token. Tokens are not placed in the URL. The current CLI prints URLs and credentials; it does not generate a QR code.

State and isolation

Everything lives at ~/.magi-next/ by default:

~/.magi-next/
  config.yaml          # provider + model setup
  state/sessions.sqlite  # persisted sessions, jobs, audit, usage
  state/learning-drafts/ # reviewable post-task learning proposals
  memory/              # formal review-applied Memory files
  memdir/              # typed long-term memory (user/feedback/project/reference)
  skills/<name>/SKILL.md
  logs/                # daemon logs
  cache/
  plugins/
  devices/

Override the root with MAGI_CONFIG_DIR=/path for testing or sandboxing.
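
A minimal sketch of how the root could be resolved, assuming an empty MAGI_CONFIG_DIR is treated as unset. The function names are hypothetical; the file locations mirror the tree above.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical sketch: pick the state root, honoring MAGI_CONFIG_DIR.
// Treating an empty value as unset is an assumption.
function resolveMagiRoot(
  env: Record<string, string | undefined> = process.env,
): string {
  const override = env.MAGI_CONFIG_DIR;
  if (override !== undefined && override.length > 0) {
    return path.resolve(override);
  }
  return path.join(os.homedir(), ".magi-next");
}

// Derive a few well-known locations from the root.
function statePaths(root: string) {
  return {
    config: path.join(root, "config.yaml"),
    sessions: path.join(root, "state", "sessions.sqlite"),
    learningDrafts: path.join(root, "state", "learning-drafts"),
  };
}
```

Deriving every path from one root is what makes a sandboxed run fully isolated: tests can point MAGI_CONFIG_DIR at a temp directory and never touch real sessions or Memory.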

Building from source

git clone <this-repo>
cd magi
npm install
npm run build
npm test
npm run test:memory-eval
npm run test:patch-eval
npm run test:goal-plan-eval
npm run test:tool-discovery-eval
npm run test:control-api-eval
npm run test:live-smoke
npm run report:capability

Requires Node ≥ 20.

Status

Active development. The core agent loop, routing, MCP, daemon, multi-machine dispatch, and mobile web panel are implemented and covered by tests. Beta quality; APIs and UX may still change.

Filing bugs: open a GitHub issue and include the output of magi doctor and magi --version.