gitlink-gatekeeper

Policy-as-Code PR merge gate for GitLink — aggregate multi-signal evidence into a transparent 0–100 scorecard and a reproducible three-state verdict (PASS / REQUEST_CHANGES / COMMENT). Never auto-merges by default.

Real-platform validated — run end-to-end against live GitLink PRs via gitlink-cli: a real PR scored PASS 90/100 (dry-run) and a fork PR scored REQUEST_CHANGES 40/100 with a real scorecard comment + tracking issue written back. Evidence below.

Adopted upstream and shipped in the official release — all four upstream PRs merged into Gitlink/gitlink-cli master:

PR #89 (label shortcut group + gitlink-label Skill, Sub-task 1) — merged 2026-05-31 (a91d553);

PR #90 (this gitlink-gatekeeper Skill, Sub-task 2) — merged 2026-06-08 (f3e5d4a);

PR #219 (pr-quality-gatekeeper end-to-end workflow, Sub-task 3) — merged into official examples/ 2026-06-14 (d3485fce);

PR #220 (gitlink-issueops Skill, the only delivery for maintainer-requested Issue #6) — merged 2026-06-14 (6454645f).

The first two ship in the official npm package @gitlink-ai/cli 0.2.0 (published 2026-06-08). The author is listed on the upstream README contributors wall; the reported bug #20 is formally referenced by two independent fix PRs.

AI-agent verified (Claude Code) — given nothing but the merged SKILL.md and a live open PR number, a Claude Code subagent independently completed the whole loop: policy fallback → five-signal collection → real code review → digit-by-digit scoring 54/100 → REQUEST_CHANGES verdict (strict dry-run). Full transcript: workflow/demo/agent-session-2026-06-12/.

Languages: English · 简体中文
Jump to: SKILL.md · REFERENCE.md · Design (SSOT) · LICENSE
Workflows (Sub-task 3): 1. Route · 2. Decide · 3. Write-back & follow-up

Demo

2-minute screen recording: docs/assets/demo/demo.mp4 — official npm 0.2.0 → dry-run scorecard on a live PR (PASS 90) → real review findings flip the verdict (REQUEST_CHANGES 55) → repo-wide triage of 113 open PRs → determinism tests green.

[Screenshots: real write-back on the platform · adopted upstream]

More evidence: workflow/demo/ (six real-platform runs, incl. the Claude Code agent-session transcript).

Why gitlink-gatekeeper

PR governance on open-source repos suffers from three real pain points:

Inconsistent standards. Merge bars drift across reviewers and over time; new contributors have no fixed target.
AI review cannot decide. Existing AI code review only emits subjective comments — gitlink-code-review produces review remarks with no thresholds and no clear “pass/fail” conclusion.
Opaque gate. Whether a PR is mergeable is judged by feel, with no reproducible, auditable evidence traceable to a concrete rule.

gitlink-gatekeeper turns the merge bar into versioned code. A team writes its standards into gatekeeper.yaml; the gate then aggregates multi-signal evidence for a PR, computes a transparent 0–100 scorecard, issues a three-state verdict, and writes the conclusion back to the PR as a structured comment plus a verdict label.

Core values: consistency · auditability · reproducibility · safety.

Architecture: the PR gatekeeper loop — policy + PR signals → route → decide (0–100 scorecard) → write-back, safe by default

How it differs from existing Skills

| Dimension | gitlink-code-review | gitlink-commit-quality | gitlink-insight | gitlink-gatekeeper |
|-----------|---------------------|------------------------|-----------------|--------------------|
| Output | Subjective comments | Commit-convention check | Repo health report | Reproducible merge verdict |
| Decision | None (comments only, no pass/fail) | None | None | Three-state + scorecard |
| Source of truth | Implicit in the prompt | Fixed rules | Fixed weights | Versioned YAML policy (auditable / shareable) |
| Merge gate | None | None | None | Yes, with a safe default (never auto-merges) |
| Reproducibility | Low | Medium | Medium | High (same policy + same input → same verdict) |

How it differs from the wider industry

Compared with mainstream PR-gate tooling — Danger.js, reviewdog, Mergify, and GitHub branch-protection rulesets — gitlink-gatekeeper differentiates on four points:

Deterministic, hand-recomputable scoring. A transparent 0–100 scorecard whose every number can be reproduced digit-for-digit by hand from the policy — not a pass/fail boolean or an opaque bot vote.
GitLink-native by design. It reuses GitLink’s PR-behind-an-issue label model and respects GitLink’s common/approved/rejected three-state review status, rather than assuming a GitHub-style data model.
Safe default: never auto-merges. Even when auto_merge: true, a merge still requires verdict == PASS and an explicit --apply; the default writes nothing.
Policy-as-Code, versioned. The merge bar lives in a versioned gatekeeper.yaml that is auditable, diffable, and shareable across repos.

(These are complementary positioning notes, not a knock on those tools — each excels in its own ecosystem.)

Install & prerequisites

# 1. Install the GitLink CLI (provides the gitlink-cli binary)
npm install -g @gitlink-ai/cli

# 2. Authenticate (GitLink tokens last 7 days; re-run when expired)
gitlink-cli auth login
gitlink-cli auth status        # verify you are logged in

The Skill drives the GitLink platform entirely through gitlink-cli. The companion gitlink-shared Skill (installed alongside gitlink-cli) covers authentication, global flags, and platform-specific API caveats — read it before first use; see also SKILL.md.

Note: GitLink’s main branch is master (not main). GitLink PR review status is common/approved/rejected; the gate deliberately writes every automated verdict as an advisory common comment (carrying the verdict in the scorecard title + a label) and leaves the stronger approved/rejected to humans — see Design §7.

Directory structure

gitlink-gatekeeper/
├── README.md / README.zh-CN.md      # this front door (bilingual)
├── LICENSE                          # MulanPSL-2.0
├── docs/
│   └── design.md                    # single source of truth (SSOT)
├── skills/
│   └── gitlink-gatekeeper/
│       ├── SKILL.md                 # the Agent workflow (core)
│       ├── REFERENCE.md             # full gatekeeper.yaml schema + scoring algorithm + API map
│       ├── TROUBLESHOOTING.md       # common issues
│       └── examples/
│           ├── gatekeeper.yaml          # annotated default policy
│           ├── gatekeeper.strict.yaml   # strict preset
│           ├── gatekeeper.lenient.yaml  # lenient preset
│           ├── scorecard-sample.md      # sample scorecard output
│           └── decision-*.md            # real-PR verdict records
└── workflow/                        # Sub-task 3: end-to-end "PR gatekeeper loop"
    ├── README.md                    # workflow overview & quick start
    ├── config.example.yaml          # workflow config (owner/repo/policy/rules)
    ├── owner-rules.yaml             # step 1: file-path → reviewer routing
    ├── scripts/
    │   └── gatekeeper_workflow.py   # runnable orchestrator (pure stdlib)
    └── docs/
        ├── architecture.md          # architecture & data flow
        ├── runbook.md               # operations runbook
        └── verification.md          # real-repo verification record

Quick start (dry-run — see the scorecard, change nothing)

The gate is dry-run by default: without --apply it only prints the scorecard. It does not comment, label, or merge.

# Score one PR against the repo-root gatekeeper.yaml (or the built-in default policy)
gitlink-cli pr +view -i <pr_id> --format json     # gatekeeper collects PR context, files, diff, commits, CI
# → then runs the deterministic 5-dimension scoring and prints a scorecard like:

## 🛡️ Gatekeeper Report — PR #42 feat: add retry to client

**Verdict: 💬 COMMENT**  ·  Score: 68/100  ·  policy: gatekeeper.yaml@v1

| Dimension       | Weight | Score | Notes |
|-----------------|:------:|:-----:|-------|
| Review findings | 40 | 23/40 | 0 blocker / 0 major / 3 minor / 2 nit |
| Test coverage   | 20 | 15/20 | 2 src / 1 test files |
| PR hygiene      | 15 | 10/15 | desc ✓ / linked issue ✗ / size ✓ |
| Commit quality  | 15 | 10/15 | 2/3 conventional |
| CI status       | 10 | 10/10 | passing |

Every number is reproducible from Design §3: review_findings round(40×(1−(3×5+2×1)/40))=23, test_coverage round(20×(0.5+0.5×min(1,1/2)))=15, pr_hygiene round(15×2/3)=10, commit_quality round(15×2/3)=10, ci 10; total 68 ∈ [60,85) and no hard-gate failure → COMMENT. Same case as examples/decision-comment.md.
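The arithmetic above can be replayed in a few lines of Python. This is a minimal sketch, not the shipped scorer: the helper names are hypothetical, the per-finding penalties (minor = 5, nit = 1) are read off the worked example rather than taken from REFERENCE.md, and only the case shown above is covered (no blocker or major findings, at least one test file, CI passing).

```python
# Illustrative sketch only: helper names are hypothetical, and the per-finding
# penalties (minor=5, nit=1) are inferred from the worked example above.
WEIGHTS = {
    "review_findings": 40,
    "test_coverage": 20,
    "pr_hygiene": 15,
    "commit_quality": 15,
    "ci_status": 10,
}
PENALTY = {"minor": 5, "nit": 1}


def review_findings_score(minor: int, nit: int) -> int:
    """round(w * (1 - penalty / w)) for a PR with no blocker or major findings."""
    w = WEIGHTS["review_findings"]
    penalty = minor * PENALTY["minor"] + nit * PENALTY["nit"]
    return round(w * (1 - penalty / w))


def coverage_score(src_files: int, test_files: int) -> int:
    """round(w * (0.5 + 0.5 * min(1, tests / src))); assumes both counts are >= 1."""
    w = WEIGHTS["test_coverage"]
    return round(w * (0.5 + 0.5 * min(1, test_files / src_files)))


def ratio_score(weight: int, passed: int, total: int) -> int:
    """round(weight * passed / total), used here for hygiene and commit quality."""
    return round(weight * passed / total)


scorecard = {
    "review_findings": review_findings_score(minor=3, nit=2),
    "test_coverage": coverage_score(src_files=2, test_files=1),
    "pr_hygiene": ratio_score(WEIGHTS["pr_hygiene"], passed=2, total=3),
    "commit_quality": ratio_score(WEIGHTS["commit_quality"], passed=2, total=3),
    "ci_status": WEIGHTS["ci_status"],  # CI passing earns the full weight
}
total = sum(scorecard.values())
print(scorecard, total)  # total is 68, matching the scorecard above
```

One caveat: Python's `round` rounds exact halves to the nearest even integer. None of the values in this example land on a half, so the choice does not matter here; how the gate itself rounds halves is defined in Design §3, not by this sketch.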

Once you’ve reviewed the verdict, add --apply to write it back (comment + label). Even then, the gate merges only when all three conditions hold: auto_merge: true in the policy, a PASS verdict, and an explicit --apply.

Lock the scoring (reproducible): run python3 workflow/tests/test_scoring.py to recompute the four authoritative verdict cases (decision-pass → 90/PASS, decision-request-changes → 38/REQUEST_CHANGES, decision-comment → 68/COMMENT, scorecard-sample → 35/REQUEST_CHANGES) digit-for-digit — your guarantee that same policy + same input → same score.

Real-platform validation (not a mock)

The scorecard above is not a hand-written sample — the gate has been run end-to-end against live GitLink PRs through gitlink-cli. Two runs, two opposite verdicts, both reproducible:

① Dry-run on a real upstream PR → ✅ PASS 90/100 · Gitlink/gitlink-cli #15222 — “feat(org): add team project binding shortcuts” · evidence: workflow/demo/real-run-pr15222/scorecard.md

| Dimension       | Weight | Score | Notes |
|-----------------|:------:|:-----:|-------|
| Review findings | 40 | 40/40 | 0 blocker / 0 major / 0 minor / 0 nit |
| Test coverage   | 20 | 20/20 | 1 src / 1 test files |
| PR hygiene      | 15 | 10/15 | desc ✓ / linked issue ✗ / size ✓ |
| Commit quality  | 15 | 15/15 | conventional |
| CI status       | 10 | 5/10  | unknown |

→ total 90 ≥ pass(85) and no hard-gate failure → PASS. Dry-run, so nothing was written to the upstream repo.

② Apply on a fork PR → ❌ REQUEST_CHANGES 40/100, with a real write-back · recorder/gitlink-cli #1 (pull_request_id 15289) — a deliberately under-tested change · evidence: workflow/demo/real-run-apply-pr15289/scorecard.md

| Dimension       | Weight | Score | Notes |
|-----------------|:------:|:-----:|-------|
| Review findings | 40 | 10/40 | 0 blocker / 1 major / 1 minor / 0 nit |
| Test coverage   | 20 | 0/20  | 1 src / 0 test files |
| PR hygiene      | 15 | 10/15 | desc ✓ / linked issue ✗ / size ✓ |
| Commit quality  | 15 | 15/15 | conventional |
| CI status       | 10 | 5/10  | unknown |

→ hard gate require_tests_for_src_changes fired (source changed, no tests), short-circuiting to REQUEST_CHANGES at total 40. With --apply, the gate actually wrote back to the live platform:

a scorecard comment posted on the PR (GitLink comment id 472741);
a tracking issue opened on recorder/gitlink-cli summarising the must-fix items (the divide-by-zero panic when qps=0) and the failed hard gate, linked back to the PR.
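To make the hard-gate short-circuit in run ② concrete, here is a hedged Python sketch of a hard-gate check. The gate names come from the default policy; everything else is an assumption for illustration: the `signals` dictionary and its keys are hypothetical, the `changed_files` values are made up, and treating an "unknown" CI result as not tripping `require_ci_pass` is inferred from the two runs above (both had CI unknown, and run ① still passed).

```python
# Illustrative sketch only: the signals dict and its keys are hypothetical.
DEFAULT_HARD_GATES = {
    "forbid_blocker_findings": True,
    "require_ci_pass": True,
    "require_tests_for_src_changes": True,
    "require_linked_issue": False,
    "max_changed_files": 80,
}


def fired_hard_gates(signals: dict, gates: dict = DEFAULT_HARD_GATES) -> list:
    """Return the names of every hard gate that the PR signals trip."""
    fired = []
    if gates["forbid_blocker_findings"] and signals["blockers"] > 0:
        fired.append("forbid_blocker_findings")
    # Assumption: only an explicit CI failure trips this gate, "unknown" does not.
    if gates["require_ci_pass"] and signals["ci"] == "failing":
        fired.append("require_ci_pass")
    if (
        gates["require_tests_for_src_changes"]
        and signals["src_files"] > 0
        and signals["test_files"] == 0
    ):
        fired.append("require_tests_for_src_changes")
    if gates["require_linked_issue"] and not signals["linked_issue"]:
        fired.append("require_linked_issue")
    if signals["changed_files"] > gates["max_changed_files"]:
        fired.append("max_changed_files")
    return fired


# Run 2 (fork PR): source changed, no tests. changed_files is an assumed value.
fork_pr = {"blockers": 0, "ci": "unknown", "src_files": 1, "test_files": 0,
           "linked_issue": False, "changed_files": 1}
# Run 1 (upstream PR): one source file, one test file. changed_files is assumed.
upstream_pr = {"blockers": 0, "ci": "unknown", "src_files": 1, "test_files": 1,
               "linked_issue": False, "changed_files": 2}

print(fired_hard_gates(fork_pr))
print(fired_hard_gates(upstream_pr))
```

Under these assumptions the fork PR trips exactly one gate, `require_tests_for_src_changes`, while the upstream PR trips none, which is consistent with the two verdicts reported above.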

Determinism, machine-checked. Same policy + same PR input → the same scorecard, bit-for-bit. The output of the runnable orchestrator (workflow/scripts/gatekeeper_workflow.py, pure stdlib) matches every number in these documented scorecards exactly. See workflow/docs/verification.md for the full reproduction commands and for the parse bug found and fixed during the live run.

`gatekeeper.yaml` at a glance

The policy is read from gatekeeper.yaml at the repo root (override with --policy <path>), falling back to a built-in default. The five dimension weights must sum to 100; any failed hard gate short-circuits the verdict to REQUEST_CHANGES; otherwise the thresholds map the total score to a verdict.

version: 1
weights:                  # must sum to 100
  review_findings: 40     # AI review findings, penalised by severity
  test_coverage:   20     # do changed source files ship with tests?
  pr_hygiene:      15     # description / linked issue / size
  commit_quality:  15     # Conventional Commits conformance
  ci_status:       10     # did CI pass?
hard_gates:               # any hit → REQUEST_CHANGES regardless of score
  forbid_blocker_findings: true
  require_ci_pass: true
  require_tests_for_src_changes: true
  require_linked_issue: false
  max_changed_files: 80
thresholds:
  pass: 85                # total ≥ pass AND no hard-gate failure → PASS
  request_changes: 60     # total < request_changes → REQUEST_CHANGES; in between → COMMENT
behavior:
  dry_run_default: true   # default: preview only, write nothing
  auto_merge: false       # NEVER auto-merge by default

See REFERENCE.md for every field and the full scoring algorithm.
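As a rough illustration of the "weights must sum to 100" constraint, the sketch below validates a policy that has already been loaded into a Python dictionary mirroring the YAML above. The function name `validate_policy` is hypothetical, and the threshold-ordering check (`request_changes` ≤ `pass`) is an assumption that follows from the three-state mapping; the authoritative rules are in REFERENCE.md.

```python
# Illustrative sketch only: validate_policy is a hypothetical helper, and the
# policy is written as a dict literal to stay within the standard library.
EXPECTED_DIMENSIONS = {
    "review_findings", "test_coverage", "pr_hygiene", "commit_quality", "ci_status",
}

DEFAULT_POLICY = {
    "version": 1,
    "weights": {
        "review_findings": 40,
        "test_coverage": 20,
        "pr_hygiene": 15,
        "commit_quality": 15,
        "ci_status": 10,
    },
    "thresholds": {"pass": 85, "request_changes": 60},
    "behavior": {"dry_run_default": True, "auto_merge": False},
}


def validate_policy(policy: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    weights = policy["weights"]
    if set(weights) != EXPECTED_DIMENSIONS:
        errors.append("weights must define exactly the five scoring dimensions")
    if sum(weights.values()) != 100:
        errors.append(f"weights sum to {sum(weights.values())}, expected 100")
    thresholds = policy["thresholds"]
    # Assumption: the COMMENT band only makes sense if request_changes <= pass.
    if not 0 <= thresholds["request_changes"] <= thresholds["pass"] <= 100:
        errors.append("thresholds must satisfy 0 <= request_changes <= pass <= 100")
    return errors


print(validate_policy(DEFAULT_POLICY))
```

Returning a list of problems instead of raising on the first one lets a caller report every policy mistake in a single pass.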

Three-state verdict

| Verdict | When | Action |
|---------|------|--------|
| ✅ PASS | `total ≥ thresholds.pass` and no hard-gate failure | label `gatekeeper:pass`; merge only if `auto_merge: true` + `--apply` |
| ❌ REQUEST_CHANGES | any hard gate fails, or `total < thresholds.request_changes` | label `gatekeeper:needs-changes`; posted as an advisory `common` comment carrying the label, not as a `rejected` review |
| 💬 COMMENT | `thresholds.request_changes ≤ total < thresholds.pass` and no hard-gate failure | label `gatekeeper:review`; advisory comment |
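The verdict rules above reduce to a small pure function. The following is a minimal sketch with a hypothetical name (`decide`) and the default thresholds as keyword defaults; it is meant to show the order of the checks, not to reproduce the orchestrator's code.

```python
# Illustrative sketch only: decide() is a hypothetical helper.
def decide(total: int, hard_gate_failures: list,
           pass_threshold: int = 85, request_changes_threshold: int = 60) -> str:
    """Map a total score plus hard-gate failures to a three-state verdict."""
    # Hard gates are checked first: any failure wins regardless of the score.
    if hard_gate_failures or total < request_changes_threshold:
        return "REQUEST_CHANGES"
    if total >= pass_threshold:
        return "PASS"
    return "COMMENT"


# The three cases documented in this README.
print(decide(90, []))                                  # upstream PR, dry-run
print(decide(68, []))                                  # worked COMMENT example
print(decide(40, ["require_tests_for_src_changes"]))   # fork PR with a hard-gate hit
```

Because the function depends only on its arguments, the same total and the same hard-gate list always give the same verdict, which is the reproducibility property the gate relies on.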

Safe defaults (never auto-merge)

Dry-run by default — no --apply means nothing is written, labelled, or merged.
Never auto-merges by default — auto_merge is false; even when true, a merge requires verdict == PASS and an explicit --apply.
Every write operation is restated to the user before it runs; tokens are never echoed.
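One way to picture these defaults is as a function that lists what the gate would do for a given verdict and set of flags. This is a sketch under stated assumptions: `planned_actions` and the action strings are hypothetical, and only the label names and the merge conditions are taken from this README.

```python
# Illustrative sketch only: planned_actions and its action strings are hypothetical.
LABELS = {
    "PASS": "gatekeeper:pass",
    "REQUEST_CHANGES": "gatekeeper:needs-changes",
    "COMMENT": "gatekeeper:review",
}


def planned_actions(verdict: str, apply: bool = False, auto_merge: bool = False) -> list:
    """List the side effects the gate would perform; dry-run yields none."""
    actions = ["print scorecard"]  # always safe: nothing is written
    if not apply:
        return actions
    actions.append("post scorecard comment")
    actions.append(f"apply label {LABELS[verdict]}")
    # A merge needs all three: --apply, auto_merge: true, and a PASS verdict.
    if auto_merge and verdict == "PASS":
        actions.append("merge")
    return actions


print(planned_actions("PASS"))
print(planned_actions("PASS", apply=True))
print(planned_actions("PASS", apply=True, auto_merge=True))
print(planned_actions("REQUEST_CHANGES", apply=True, auto_merge=True))
```

Only the third call includes a merge. Dropping any one of the three conditions removes it, which is the point of the safe default.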

Contest mapping (CCF GitLink Contribution Track)

gitlink-gatekeeper is the flagship deliverable of Sub-task 2 (intelligent PR governance) for the CCF GitLink intelligent-service contribution track, and it threads through all three sub-tasks:

Sub-task 1 — label command. The gate writes verdicts as labels (gatekeeper:pass / :needs-changes / :review), which is exactly why this work contributes the new label shortcut group to gitlink-cli — upstream PR Gitlink/gitlink-cli #89, merged 2026-05-31 (together with the companion gitlink-label Skill) and shipped in official npm 0.2.0. Label CRUD has been separately verified all-green on the real platform. The gate writes every verdict as an advisory common comment + a label, leaving strong approve/reject to humans.
Sub-task 2 — the gatekeeper Skill. gatekeeper.yaml policy → deterministic scorecard → three-state verdict → structured write-back. This README + SKILL.md + REFERENCE.md are the core. Upstream PR #90 was officially adopted into skills/ on 2026-06-08 and ships in npm 0.2.0.
Sub-task 3 — end-to-end workflow/. A reproducible “PR gatekeeper loop” of ≥3 steps: route (suggest reviewers by changed-file path) → decide (run the gate) → write-back & follow-up (post the scorecard, apply the label, and on REQUEST_CHANGES open a tracking issue summarising the must-fix items, linked to the PR).
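The route step of the loop (suggesting reviewers from changed-file paths) could look roughly like the sketch below. It is only an illustration: the real schema of owner-rules.yaml is not shown in this README, so the rule format, the glob patterns, and the reviewer names (`alice`, `bob`, `carol`) are all invented for the example.

```python
# Illustrative sketch only: the rule format and reviewer names are hypothetical
# and do not reflect the actual owner-rules.yaml schema.
from fnmatch import fnmatchcase

RULES = [
    ("workflow/*", ["alice"]),
    ("*.py", ["bob"]),
    ("*.md", ["alice", "carol"]),
]


def suggest_reviewers(changed_files: list, rules: list = RULES) -> list:
    """Collect reviewers whose pattern matches any changed file, in first-seen order."""
    suggested = []
    for path in changed_files:
        for pattern, reviewers in rules:
            # fnmatchcase is case-sensitive on every OS, keeping results deterministic.
            if fnmatchcase(path, pattern):
                for reviewer in reviewers:
                    if reviewer not in suggested:
                        suggested.append(reviewer)
    return suggested


print(suggest_reviewers(["workflow/scripts/gatekeeper_workflow.py", "README.md"]))
```

Keeping the result in first-seen order rather than using a set means the same changed-file list always produces the same reviewer list, in line with the determinism goal of the rest of the loop.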

License

MulanPSL-2.0 — consistent with the upstream gitlink-cli.
