AgentOS Runtime is a lightweight operating-system-style runtime for multi-agent execution. It upgrades the original single-router Skill demo into a runtime prototype with Agent registration, DAG scheduling, resource-aware execution, context compression, fault isolation, and observability.
Runtime Environment
Verified local development environment:
Windows 11
Python 3.12.8
Streamlit dashboard
Domestic Linux compatibility target:
openEuler 22.03 LTS or later
openKylin 2.0 or later
Python 3.10+
The runtime core is pure Python and avoids OS-specific APIs in scheduler, context management, message bus, Agent Control Block management, and benchmark modules. Domestic Linux verification commands are provided in docs/linux_openEuler_test.md.
Expected result:
Fault injection triggers one patch_agent failure and retry recovery.
ACB metrics, direct messages, broadcast messages, context compression, and system resource snapshots are reported.
Dashboard opens at http://127.0.0.1:8501.
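The fault-injection result above (one patch_agent failure, then recovery by retry) can be illustrated with a minimal sketch. This is not the code in runtime/scheduler.py: make_flaky_agent, run_with_retry, and RetryResult are hypothetical names, and the retry policy (a fixed retry count, then an optional fallback) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RetryResult:
    """Outcome of one task run, including a per-attempt event log."""
    value: str
    attempts: int
    events: List[str] = field(default_factory=list)


def make_flaky_agent(name: str, failures: int) -> Callable[[str], str]:
    """Hypothetical fault injector: the agent raises on its first `failures` calls."""
    remaining = {"count": failures}

    def agent(goal: str) -> str:
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise RuntimeError(f"{name}: injected fault")
        return f"{name} handled: {goal}"

    return agent


def run_with_retry(
    agent: Callable[[str], str],
    goal: str,
    max_retries: int = 1,
    fallback: Optional[Callable[[str], str]] = None,
) -> RetryResult:
    """Try the agent up to 1 + max_retries times, then use the fallback if given."""
    events: List[str] = []
    attempts = 0
    for attempt in range(1, max_retries + 2):
        attempts = attempt
        try:
            value = agent(goal)
            events.append(f"attempt {attempt}: succeeded")
            return RetryResult(value, attempts, events)
        except Exception as exc:  # isolate the failure instead of crashing the run
            events.append(f"attempt {attempt}: failed ({exc})")
    if fallback is None:
        raise RuntimeError("all attempts failed and no fallback is configured")
    events.append("fallback: used")
    return RetryResult(fallback(goal), attempts, events)


if __name__ == "__main__":
    patch_agent = make_flaky_agent("patch_agent", failures=1)
    for line in run_with_retry(patch_agent, "fix add()").events:
        print(line)
```

Keeping the event list on the result is one way to feed an event log like the one shown in the dashboard.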
Run
Install dependencies:
pip install -r requirements.txt
Optional dependencies for framework comparison:
pip install -r requirements-framework.txt
Run the CLI demo:
python run_demo.py
Run the benchmark:
python -m benchmarks.run_benchmark
Run the real local repository demo:
python run_repo_demo.py
Run the Agent message bus demo:
python run_message_demo.py
Run the real code repair loop:
python run_repair_demo.py
Run the concurrent stress benchmark:
python -m benchmarks.run_stress_benchmark
Run the KV Cache COW demo:
python run_kv_cache_demo.py
Run the framework comparison benchmark:
python -m benchmarks.run_framework_comparison
If langgraph is installed, the comparison runs a real StateGraph repair adapter. If AutoGen AgentChat is installed but no model client is configured, the AutoGen row is explicitly marked as local fallback rather than claimed as official framework runtime data.
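A minimal sketch of how such an availability check could be written with the standard library. framework_mode and its return labels are hypothetical, and the "skipped" case for a framework that is not installed is an assumption, since the text above does not say what happens then.

```python
import importlib.util


def framework_mode(module_name: str, model_client_configured: bool = True) -> str:
    """Classify how a comparison row would be produced.

    "real"           -> the module is importable and usable as-is
    "local_fallback" -> installed, but no model client is configured
    "skipped"        -> not installed (assumed behavior, not confirmed by the README)
    """
    # find_spec returns None for a top-level module that cannot be found.
    if importlib.util.find_spec(module_name) is None:
        return "skipped"
    if not model_client_configured:
        return "local_fallback"
    return "real"


if __name__ == "__main__":
    # "langgraph" is the module named above; it may or may not be installed locally.
    print("langgraph:", framework_mode("langgraph"))
```

Labeling the row by mode keeps fallback numbers from being read as official framework runtime data.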
Run the dashboard:
streamlit run demo/gui.py
Demo Story for Judges
Enter a complex code-repair goal.
Enable fault injection.
Run the runtime.
Show the DAG: planner -> analyzer -> patch -> test -> report.
Show the event log where patch_agent fails once and is retried or falls back.
Show context metrics: shared memory items, compression count, saved characters.
Compare this with the old single-router demo: the new version manages execution instead of only choosing one expert.
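The context metrics in the demo story (shared memory items, compression count, saved characters) can be pictured with a small sketch. It is not the implementation in runtime/context_manager.py: SharedContext is a hypothetical class, and head truncation stands in for whatever compression the runtime actually applies.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SharedContext:
    """Toy shared memory that shortens long entries and tracks the savings.

    Assumes max_chars is larger than 3 so the "..." marker fits.
    """
    max_chars: int = 80
    items: Dict[str, str] = field(default_factory=dict)
    compression_count: int = 0
    saved_chars: int = 0

    def put(self, key: str, text: str) -> None:
        if len(text) > self.max_chars:
            # Stand-in for real summarization: keep the head and mark the cut.
            compressed = text[: self.max_chars - 3] + "..."
            self.compression_count += 1
            self.saved_chars += len(text) - len(compressed)
            text = compressed
        self.items[key] = text

    def view_for(self, dependencies: List[str]) -> Dict[str, str]:
        """Dependency-local view: a task sees only its upstream tasks' entries."""
        return {k: self.items[k] for k in dependencies if k in self.items}

    def metrics(self) -> Dict[str, int]:
        return {
            "shared_memory_items": len(self.items),
            "compression_count": self.compression_count,
            "saved_characters": self.saved_chars,
        }


if __name__ == "__main__":
    ctx = SharedContext(max_chars=20)
    ctx.put("planner", "short plan")
    ctx.put("analyzer", "x" * 50)
    print(ctx.metrics())
```

Restricting each task to its dependencies' entries is one simple way to keep unrelated Agent output out of a prompt.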
Key Files
runtime/models.py: Agent, task, state, event, and report data models.
runtime/acb.py: Agent Control Block (ACB) table, modeled after an OS process control block (PCB).
runtime/message_bus.py: direct message and publish/subscribe communication.
runtime/resource_monitor.py: CPU/memory pressure snapshot for resource-aware scheduling.
runtime/process_isolation.py: optional child-process Agent runner for crash containment.
runtime/kv_cache.py: copy-on-write shared KV Cache pool prototype.
runtime/scheduler.py: dependency-aware scheduling, retry, fallback, state transitions.
runtime/context_manager.py: shared memory compression and dependency-local isolation.
runtime/repo_tools.py: safe local repository scanner used by the real repo demo.
agents/*/agent.yaml: declarative Agent Registry.
examples/buggy_math/: small failing repository used by the real repair loop.
run_repair_demo.py: scans a real bug, patches code, runs tests, and exports diff.
benchmarks/run_benchmark.py: comparison against single-Agent and fixed-workflow baselines.
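As an illustration of the copy-on-write idea behind runtime/kv_cache.py, here is a minimal sketch. SharedKVPool and AgentKVView are hypothetical names, and a plain dict stands in for real KV cache blocks: reads fall through to the shared pool, and writes land only in the Agent's private delta.

```python
from typing import Dict, Optional


class SharedKVPool:
    """Read-only base entries shared by every Agent view."""

    def __init__(self, base: Dict[str, str]) -> None:
        self._base = dict(base)

    def read(self, key: str, default: Optional[str] = None) -> Optional[str]:
        return self._base.get(key, default)

    def fork(self) -> "AgentKVView":
        # Forking copies nothing; the view starts with an empty private delta.
        return AgentKVView(self)


class AgentKVView:
    """Per-Agent view: private delta first, shared pool second."""

    def __init__(self, pool: SharedKVPool) -> None:
        self._pool = pool
        self._delta: Dict[str, str] = {}

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        if key in self._delta:
            return self._delta[key]
        return self._pool.read(key, default)

    def set(self, key: str, value: str) -> None:
        # Copy-on-write: the shared entry is never modified in place.
        self._delta[key] = value

    def private_size(self) -> int:
        return len(self._delta)


if __name__ == "__main__":
    pool = SharedKVPool({"system_prompt": "You are a repair agent."})
    patcher, tester = pool.fork(), pool.fork()
    patcher.set("system_prompt", "You are a patch agent.")
    print(patcher.get("system_prompt"), "|", tester.get("system_prompt"))
```

The per-view delta size gives a rough measure of how much memory an Agent adds on top of the shared pool.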
Competition Fit and Scoring Strategy
This project directly maps to the Agent Runtime competition requirements:
runtime/planner.py creates task DAGs; runtime/scheduler.py resolves dependencies.
dependencies; the planner chooses a graph from the user goal.
agents/*/agent.yaml declares role, tools, retry, timeout, cost, fallback.
runtime/acb.py manages Agent lifecycle, pid, quota, mailbox, and token usage.
runtime/message_bus.py supports direct point-to-point messages and pub/sub broadcast.
TaskState covers pending, ready, running, succeeded, failed, skipped.
RuntimeScheduler._run_one.
ContextManager isolates local memory and compresses shared memory.
runtime/resource_monitor.py reads CPU/memory pressure and recommends concurrency.
RuntimeScheduler can insert new DAG nodes at runtime based on Agent output.
runtime/kv_cache.py simulates shared KV Cache pool with copy-on-write private deltas.
It also maps to the official scoring criteria:
Baseline comparison in benchmarks/, generated JSON and Markdown results.
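A minimal sketch of dependency-aware scheduling over the six task states named above. It is not the RuntimeScheduler implementation: run_dag is a hypothetical helper, the TaskState enum here is a stand-in for the one in the runtime, and skipping every task downstream of a failure is an assumed policy.

```python
from enum import Enum
from typing import Callable, Dict, List


class TaskState(Enum):
    PENDING = "pending"
    READY = "ready"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    SKIPPED = "skipped"


def run_dag(
    deps: Dict[str, List[str]], run: Callable[[str], bool]
) -> Dict[str, TaskState]:
    """Run tasks in dependency order; skip anything downstream of a failure.

    `deps` maps each task to its upstream tasks; every upstream name must also
    be a key. Tasks caught in a dependency cycle are left PENDING.
    """
    state = {task: TaskState.PENDING for task in deps}
    progressed = True
    while progressed:
        progressed = False
        for task, upstream in deps.items():
            if state[task] is not TaskState.PENDING:
                continue
            up = [state[u] for u in upstream]
            if any(s in (TaskState.FAILED, TaskState.SKIPPED) for s in up):
                state[task] = TaskState.SKIPPED
                progressed = True
            elif all(s is TaskState.SUCCEEDED for s in up):
                state[task] = TaskState.READY    # all dependencies resolved
                state[task] = TaskState.RUNNING  # handed to the agent
                ok = run(task)
                state[task] = TaskState.SUCCEEDED if ok else TaskState.FAILED
                progressed = True
    return state


if __name__ == "__main__":
    pipeline = {
        "planner": [],
        "analyzer": ["planner"],
        "patch": ["analyzer"],
        "test": ["patch"],
        "report": ["test"],
    }
    for name, st in run_dag(pipeline, lambda task: True).items():
        print(name, st.value)
```

The pipeline in the usage block mirrors the planner -> analyzer -> patch -> test -> report graph shown in the demo story.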
Additional key files:
benchmarks/run_stress_benchmark.py: 10/50/100 concurrent-task stress benchmark.
benchmarks/run_framework_comparison.py: comparison harness for AgentOS Runtime vs application-layer framework adapters.
docs/scoring_matrix.md: direct mapping to the judging rubric.
docs/presentation_outline.md: suggested defense structure.
Current Scope and Next Upgrades
run_repo_demo.py.