yuzhuoLi

[CICD] Support Agent assistant in Workflow (#1192)

PR Category

CICD

PR Types

New Features

PR Description

Add AI-Powered CI Failure Analysis and Interactive Code Assistant

Workflows

Summary

Adds two Claude AI-powered GitHub Actions workflows to automate test failure analysis and enable interactive code collaboration via @claude mentions.

Changes

test_failure_analysis.yml — Automated Test Failure Analysis

Triggered when any of the following workflows fail: ascend_tests, cuda_tests, metax_c500_tests, format_check.
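
The trigger wiring itself is not reproduced in this description. As a rough sketch, a `workflow_run` block for the listed workflows typically looks like the following; only the workflow names come from this PR, and the explicit failure guard is an assumption, since `workflow_run` also fires on success:

```yaml
# Sketch only: workflow names are from the PR description; the rest follows
# the standard workflow_run pattern and may differ from the actual file.
on:
  workflow_run:
    workflows: [ascend_tests, cuda_tests, metax_c500_tests, format_check]
    types: [completed]

jobs:
  analyze:
    # workflow_run fires on success too; filter to failures explicitly
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
```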

What it does:

  1. Flaky test detection — Analyzes CI logs to determine if a failure is intermittent (timeouts, race conditions, network errors, etc.)

    • High confidence (≥0.7): automatically reruns the failed workflow
    • Low confidence: logs a warning, recommends manual review
  2. Deep failure analysis (PR only) — When the failure is a real bug on a PR:

    • Fetches failed job logs and PR diff
    • Performs root cause analysis correlating code changes with test failures
    • Generates actionable fix suggestions with file paths and code snippets
    • Posts a structured analysis comment on the PR
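
The rerun decision in step 1 can be sketched as a small shell helper. This is illustrative only: the helper and the JSON field names (`is_flaky`, `confidence`) are assumptions; only the 0.7 threshold and the rerun-vs-review behavior come from the description.

```shell
#!/usr/bin/env sh
# Hypothetical sketch of the flaky-detection decision. Field names
# (is_flaky, confidence) are assumed; the >= 0.7 threshold is from the PR.
decide_rerun() {
  analysis_json="$1"
  is_flaky=$(printf '%s' "$analysis_json" | jq -r '.is_flaky')
  # Do the float comparison in jq, since POSIX shell arithmetic is integer-only.
  high=$(printf '%s' "$analysis_json" | jq -r '.confidence >= 0.7')
  if [ "$is_flaky" = "true" ] && [ "$high" = "true" ]; then
    echo "rerun"          # the workflow would run: gh run rerun <run-id> --failed
  else
    echo "manual-review"  # log a warning and ask for human triage
  fi
}

decide_rerun '{"is_flaky": true, "confidence": 0.9}'   # rerun
decide_rerun '{"is_flaky": true, "confidence": 0.4}'   # manual-review
```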

Safety:

  • Runs in read-only mode — Claude cannot modify files, push commits, or create branches
  • Tool whitelist: only Bash(gh *), Read, Glob, Grep are allowed
  • Explicit read-only instruction in the prompt as an additional safeguard

Configuration:

  • Model controlled via CLAUDE_MODEL repository variable
  • API endpoint via ANTHROPIC_BASE_URL secret
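
A plausible sketch of how these two settings and the tool whitelist above might be passed to the Claude step follows. The action name and input names are assumptions; only `CLAUDE_MODEL`, `ANTHROPIC_BASE_URL`, and the tool list come from this description:

```yaml
# Sketch only: action and input names are assumptions.
- name: Analyze test failure
  uses: anthropics/claude-code-action@v1
  with:
    model: ${{ vars.CLAUDE_MODEL }}             # repository variable
    allowed_tools: "Bash(gh *),Read,Glob,Grep"  # read-only tool whitelist
  env:
    ANTHROPIC_BASE_URL: ${{ secrets.ANTHROPIC_BASE_URL }}
```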

claude.yml — Interactive AI Code Assistant

Triggered when @claude is mentioned in issue comments, PR review comments, PR reviews, or issue titles/bodies.

  • Responds to user requests for code review, debugging, code changes, etc.
  • Uses mco-4 model with write permissions for code modification scenarios
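
The exact trigger block is not shown in this description. A minimal sketch of listening across the event types listed above might look like this; the `contains()` guard is simplified to the comment body, whereas the real workflow presumably also checks review bodies and issue titles:

```yaml
# Sketch only: these are standard GitHub Actions events;
# the mention check is simplified for illustration.
on:
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]
  pull_request_review:
    types: [submitted]
  issues:
    types: [opened]

jobs:
  claude:
    if: contains(github.event.comment.body, '@claude')
    runs-on: ubuntu-latest
```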

Technical Details

  • Structured output: Uses --json-schema to enforce standardized JSON responses from Claude, enabling reliable downstream parsing
  • Shell-safe JSON handling: Writes Claude output to temp files via heredoc, then reads with jq — avoids single-quote breakage when JSON contains apostrophes (e.g. what's, can't)
  • Stable PR identification: Uses github.event.workflow_run.pull_requests[0].number directly from the event payload instead of gh pr list --head which is unreliable for fork PRs
  • Debug step: Includes a Debug detect output step with if: always() to surface Claude’s raw response even when subsequent steps fail
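
The shell-safe JSON handling can be illustrated with a self-contained sketch. The JSON payload below is fabricated for illustration; in the workflow, the heredoc body would be the model's actual output:

```shell
#!/usr/bin/env sh
# Sketch of the temp-file + heredoc + jq pattern described above.
tmpfile=$(mktemp)

# Quoted heredoc delimiter: the body is written verbatim, so embedded
# apostrophes (what's, can't) cannot break shell quoting.
cat > "$tmpfile" <<'EOF'
{"severity": "critical", "root_cause": "it's a real bug, not flaky"}
EOF

# Extract fields with jq instead of interpolating JSON into shell strings.
severity=$(jq -r '.severity' "$tmpfile")
root_cause=$(jq -r '.root_cause' "$tmpfile")
echo "$severity"
echo "$root_cause"
rm -f "$tmpfile"
```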

PR Comment Example

When a real bug is detected, the workflow posts:

## ❌ Test Failure Analysis

🔴 **Severity**: critical

### Failed Tests
- `test_distributed_optimizer`

### Root Cause
The learning rate scheduler change causes NaN gradients in distributed training with mixed precision.

### Suggested Fixes
### 1. Set minimum learning rate to avoid zero division
📄 **File**: `flagscale/train/optimizer.py`

---
**Flaky Detection**: Not a flaky test (confidence: 0.9)

Permissions

  • test_failure_analysis.yml: contents: read, actions: write, pull-requests: write, id-token: write
  • claude.yml: contents: write, pull-requests: write, issues: write, id-token: write, actions: read

[Chinese | English]

[!IMPORTANT]

2026/03 update

v1.0.0 is now officially released as the first stable version. The codebase has been substantially refactored since v1.0.0-alpha.0. Hardware-specific multi-chip support has moved into plugin repositories such as TransformerEngine-FL and vllm-plugin-FL. These plugins are built on FlagOS, a unified open-source AI system software stack. If you are using, or upgrading from, a version earlier than v1.0.0-alpha.0, please use the main-legacy branch, which will continue to receive critical bug fixes and minor releases for some time.

Introduction

FlagScale is a core component of FlagOS, a unified open-source AI system software stack that builds an open technology ecosystem by seamlessly integrating diverse models, systems, and chips. Guided by the principle of "develop once, migrate across chips," FlagOS aims to fully unlock hardware compute potential, break down the barriers between different chips' software stacks, and substantially reduce migration costs.

As the core toolkit of this ecosystem, FlagScale provides a unified interface covering the full lifecycle of large language models, multimodal models, and embodied AI models. Under a single configuration scheme and command-line interface, it integrates multiple open-source backend engines, supports key workflows such as training, reinforcement learning, and inference, and delivers a consistent experience across chips from different vendors. To get started, see the Quick Start guide.

Within the FlagOS ecosystem, FlagScale works alongside the following components:

  • FlagOS plugins — integration components that adapt upstream AI frameworks to specific hardware
  • FlagCX — a scalable, adaptive cross-chip communication library
  • FlagOS-Robo — infrastructure for embodied-AI workloads

The FlagOS plugin projects are built on widely used upstream open-source frameworks and extend them to support multiple AI chips, providing hardware compatibility and runtime integration for training, reinforcement learning, and inference.

The table below maps each FlagOS plugin project to its upstream project:

| Task | FlagOS plugin project | Upstream project |
| --- | --- | --- |
| Training | Megatron-LM-FL, TransformerEngine-FL | Megatron-LM, TransformerEngine |
| Reinforcement learning | VeRL-FL | veRL |
| Inference / serving | vllm-plugin-FL | vllm |

Resources

Support List

Model Training

| Model | Example config file |
| --- | --- |
| DeepSeek-V3 | 16b_a3b.yaml |
| Qwen2/2.5/3 | 235b_a22b.yaml |
| Qwen2.5-VL | 7b.yaml |
| QwQ | 32b.yaml |
| LLaMA2 | 7b.yaml |
| LLaMA3/3.1 | 70b.yaml |
| LLaVA-OneVision | 7b.yaml |
| LLaVA1.5 | 7b.yaml |
| Mixtral | 8x7b.yaml |
| RWKV | 7b.yaml |
| Aquila | 7b.yaml |
Serving and Inference

| Model | Example config file |
| --- | --- |
| DeepSeek-V3 | 671b.yaml |
| DeepSeek-R1 | 671b.yaml |
| Qwen2.5 | 72b.yaml |
| Qwen3 | 8b.yaml |
| Qwen2.5-VL | 32b_instruct.yaml |
| Qwen3-Omni | 30b.yaml |
| QwQ | 32b.yaml |
| Grok2 | 270b.yaml |
| Kimi-K2 | 1t.yaml |

Contributing

Please join our WeChat group via the 开源小助手 (open-source assistant) account.

License

FlagScale is licensed under the Apache License, Version 2.0. The project also includes third-party components distributed under other open-source licenses.
