[CICD] Support Agent assistant in Workflow (#1192)
### PR Category

CICD

### PR Types

New Features

### PR Description
## Add AI-Powered CI Failure Analysis and Interactive Code Assistant Workflows

### Summary
Adds two Claude AI-powered GitHub Actions workflows to automate test failure analysis and enable interactive code collaboration via @claude mentions.
### Changes
#### `test_failure_analysis.yml` — Automated Test Failure Analysis

Triggered when any of the following workflows fail: `ascend_tests`, `cuda_tests`, `metax_c500_tests`, `format_check`.

What it does:
- **Flaky test detection** — Analyzes CI logs to determine whether a failure is intermittent (timeouts, race conditions, network errors, etc.)
  - High confidence (≥ 0.7): automatically reruns the failed workflow
  - Low confidence: logs a warning and recommends manual review
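The confidence gate above can be sketched in shell. This is an illustrative sketch, not the workflow's actual step: `IS_FLAKY` and `CONFIDENCE` stand in for fields parsed from Claude's JSON verdict, and `RUN_ID` is a placeholder for the failed run's id.

```shell
# Hypothetical values; the real step derives these from Claude's structured output.
IS_FLAKY=true
CONFIDENCE=0.9
RUN_ID=123456789

# awk handles the floating-point comparison that plain [ ] cannot.
if [ "$IS_FLAKY" = "true" ] && awk -v c="$CONFIDENCE" 'BEGIN { exit !(c >= 0.7) }'; then
  DECISION="rerun"           # real workflow would call: gh run rerun "$RUN_ID" --failed
else
  DECISION="manual-review"   # low confidence: log a warning instead of rerunning
fi
echo "$DECISION"
```

With the values above this prints `rerun`; dropping `CONFIDENCE` below 0.7 flips the decision to `manual-review`.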
- **Deep failure analysis (PR only)** — When the failure is a real bug on a PR:
  - Fetches failed job logs and the PR diff
  - Performs root cause analysis correlating code changes with test failures
  - Generates actionable fix suggestions with file paths and code snippets
  - Posts a structured analysis comment on the PR
Safety:

- Runs in read-only mode — Claude cannot modify files, push commits, or create branches
- Tool whitelist: only `Bash(gh *)`, `Read`, `Glob`, `Grep` are allowed
- Explicit read-only instruction in the prompt as an additional safeguard
Configuration:

- Model controlled via the `CLAUDE_MODEL` repository variable
- API endpoint via the `ANTHROPIC_BASE_URL` secret
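Pulled together, the invocation might look roughly like the fragment below. This is a hypothetical shape only: the action reference and input keys (`allowed_tools`, `model`) are assumptions for illustration — only the `CLAUDE_MODEL` variable, the `ANTHROPIC_BASE_URL` secret, and the tool whitelist come from this PR.

```yaml
# Hypothetical shape — input names are not taken from the actual workflow.
- uses: anthropics/claude-code-action@v1   # assumed action reference
  env:
    ANTHROPIC_BASE_URL: ${{ secrets.ANTHROPIC_BASE_URL }}  # API endpoint via secret
  with:
    model: ${{ vars.CLAUDE_MODEL }}                        # model via repository variable
    allowed_tools: "Bash(gh *),Read,Glob,Grep"             # read-only tool whitelist
```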
#### `claude.yml` — Interactive AI Code Assistant

Triggered when `@claude` is mentioned in issue comments, PR review comments, PR reviews, or issue titles/bodies.

- Responds to user requests for code review, debugging, code changes, etc.
- Uses the `mco-4` model with write permissions for code-modification scenarios

### Technical Details
- **Structured output**: Uses `--json-schema` to enforce standardized JSON responses from Claude, enabling reliable downstream parsing
- **Shell-safe JSON handling**: Writes Claude output to temp files via heredoc, then reads with `jq` — avoids single-quote breakage when the JSON contains apostrophes (e.g. *what's*, *can't*)
- **Stable PR identification**: Uses `github.event.workflow_run.pull_requests[0].number` directly from the event payload instead of `gh pr list --head`, which is unreliable for fork PRs
- **Debug step**: Includes a `Debug detect output` step with `if: always()` to surface Claude's raw response even when subsequent steps fail

### PR Comment Example
When a real bug is detected, the workflow posts:
> ## ❌ Test Failure Analysis
>
> 🔴 **Severity**: critical
>
> ### Failed Tests
> - `test_distributed_optimizer`
>
> ### Root Cause
> The learning rate scheduler change causes NaN gradients in distributed training with mixed precision.
>
> ### Suggested Fixes
>
> ### 1. Set minimum learning rate to avoid zero division
> 📄 **File**: `flagscale/train/optimizer.py`
>
> ---
>
> **Flaky Detection**: Not a flaky test (confidence: 0.9)

### Permissions
- `test_failure_analysis.yml`: `contents: read`, `actions: write`, `pull-requests: write`, `id-token: write`
- `claude.yml`: `contents: write`, `pull-requests: write`, `issues: write`, `id-token: write`, `actions: read`
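The heredoc-plus-`jq` pattern from Technical Details can be sketched as follows; the JSON payload and field names here are made up for illustration.

```shell
# Simulated Claude output containing apostrophes that would break naive
# single-quote interpolation into a shell command line.
CLAUDE_JSON='{"is_flaky": false, "summary": "what'\''s failing here can'\''t be retried"}'

# Write the JSON to a temp file via heredoc instead of inlining it in a command.
TMP=$(mktemp)
cat > "$TMP" <<EOF
$CLAUDE_JSON
EOF

# Read fields back with jq; quoting inside the JSON no longer matters.
IS_FLAKY=$(jq -r '.is_flaky' "$TMP")
SUMMARY=$(jq -r '.summary' "$TMP")
rm -f "$TMP"
echo "$IS_FLAKY"
echo "$SUMMARY"
```

Because the JSON only ever travels through a file, apostrophes in values like `what's` reach `jq` intact.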
## Introduction

FlagScale is a core component of FlagOS. FlagOS is a unified open-source AI system software stack that builds an open technology ecosystem by seamlessly integrating diverse models, systems, and chips. Following the principle of "develop once, migrate across chips," FlagOS aims to fully unlock hardware compute potential, break down the barriers between different chips' software stacks, and effectively reduce migration costs.

As the core toolkit of this ecosystem, FlagScale provides a unified interface covering the complete lifecycle of large language models, multimodal models, and embodied intelligence models. Under a unified configuration scheme and command-line interface, it integrates multiple open-source backend engines, supports key workflows such as model training, reinforcement learning, and inference, and delivers a consistent experience across chips from different vendors. To get started, see the Quick Start Guide.
Within the FlagOS ecosystem, FlagScale works together with the following components:

FlagOS plugin projects are built on widely used upstream open-source frameworks and extend them to support a variety of AI chips, providing hardware compatibility and runtime integration for training, reinforcement learning, and inference.

The table below maps FlagOS plugins to their corresponding upstream projects:

| FlagOS Plugin | Upstream Project |
| --- | --- |
| TransformerEngine-FL | TransformerEngine |
## Resources

Supported lists:

- Model training
- Serving and inference
## Contributing

Please join our WeChat group.
## License

FlagScale is licensed under the Apache License, Version 2.0. This project also contains third-party components under other open-source licenses.