Thesis Skills is not an AI writing assistant, not a thesis template, and not a tool that writes thesis content for you.
It is a CLI workflow system that connects the tools many graduate students and researchers already use: Word, Zotero, EndNote, LaTeX, structured check reports, safe fix patches, review handoff artifacts, and pre-submission readiness checks.
The goal is simple: turn scattered, manual, error-prone thesis finishing work into a workflow that is checkable, repeatable, and auditable.
For repetitive finishing work, the expected time savings are concrete:
| Workflow | Manual baseline | With Thesis Skills | Speedup |
| --- | --- | --- | --- |
| Bibliography intake | 30-60 min | 2-5 min | ~10× faster |
| Word ↔ LaTeX review handoff | 1-3 hrs | 5-10 min | ~15× faster |
| Deterministic format checks | 1-3 hrs | 2-5 min | ~20× faster |
| Safe report-driven fixes | 1-2 hrs | 5-10 min | ~10× faster |
| Pre-submission readiness review | 30-60 min | 1-2 min | ~30× faster |
| Defense prep inventory | 2-4 hrs | 10-15 min | ~15× faster |
Time savings are conservative estimates for repetitive formatting and handoff work. Thesis Skills does not replace writing, thinking, advisor judgment, or institutional confirmation.
What’s new in v3.4.0
Readiness Gate Integration remains in place from V3.2, and V3.4 extends that citation evidence stack with final-audit and local HTML report surfaces.
Final-audit surfaces: new deterministic final cleanup, statistical consistency, and manual-anchor checks feed reports/final-audit-report.json.
Reference audit handoff: 28-reference-audit-ledger/build_reference_audit_ledger.py writes a spreadsheet-friendly reports/reference-audit-ledger.csv from existing reference evidence.
Static local report UX: reports/index.html, reports/final-audit-report.html, and reports/reference-audit-ledger.html make JSON / CSV artifacts easier to review without replacing them as source of truth.
Claim-citation support review now includes conservative advisory signals such as possible_topic_mismatch, possible_outdated_support, and possible_overclaim.
V3.3 reference verification hardening remains in place: final reference set parsing, DOI candidates, URL verification, scoped/resumable external verification, and the unified evidence pipeline runner run_evidence_pipeline.py.
Quickstart
Run the built-in sample project through the check pipeline:
```bash
git clone https://github.com/quzhiii/thesis-skills.git
cd thesis-skills
test -d examples/minimal-latex-project
python run_check_once.py \
  --project-root examples/minimal-latex-project \
  --ruleset university-generic \
  --skip-compile
```
Expected result: JSON reports are written to examples/minimal-latex-project/reports/, including run-summary.json and readiness-report.json, without requiring a local LaTeX installation.
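A minimal sketch for confirming the expected result after the run. The two file names come from the sentence above; the helper name `missing_reports` is hypothetical and not part of Thesis Skills.

```python
from pathlib import Path

# File names taken from the expected result above.
EXPECTED_REPORTS = ("run-summary.json", "readiness-report.json")


def missing_reports(reports_dir, expected=EXPECTED_REPORTS):
    """Return the expected report file names that are absent from reports_dir."""
    base = Path(reports_dir)
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_reports("examples/minimal-latex-project/reports")
    print("all expected reports present" if not missing else f"missing: {missing}")
```

An empty list means both reports were written; anything else names the files to look for in the run output.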
```text
1. Intake       2. Check        3. Fix safely        4. Gate      5. Handoff
──────────      ───────────     ─────────────        ─────────    ─────────────
Zotero          references      dry-run patches      PASS         advisor Word
EndNote      →  language     →  preview first     →  WARN      →  review TODOs
Word/LaTeX      format          apply explicitly     BLOCK        defense pack
```
Readiness gate preview
```text
┌──────────────────────────────────────────────────────────────┐
│ Readiness verdict: WARN                                      │
├───────────────────────┬────────┬─────────────────────────────┤
│ Dimension             │ Status │ Why it matters              │
├───────────────────────┼────────┼─────────────────────────────┤
│ References            │ PASS   │ all cite keys resolve       │
│ Language              │ WARN   │ 2 style warnings remain     │
│ Format                │ PASS   │ labels and refs are stable  │
│ Compile evidence      │ WARN   │ skipped in demo mode        │
│ Export evidence       │ WARN   │ not produced by smoke test  │
│ Review-loop evidence  │ WARN   │ not produced by smoke test  │
└───────────────────────┴────────┴─────────────────────────────┘

Next actions:
1. Review reports/check_language-report.json
2. Generate Word export / review-loop artifacts when those handoffs are needed
3. Re-run without --skip-compile before final submission
```
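One way to read the preview is that the overall verdict is the most severe status across dimensions. The sketch below encodes that reading. The severity order and the function name are assumptions for illustration, not the actual readiness-gate logic.

```python
# Assumed severity order: BLOCK outranks WARN, which outranks PASS.
SEVERITY = {"PASS": 0, "WARN": 1, "BLOCK": 2}


def overall_verdict(dimensions):
    """Return the most severe status among the per-dimension statuses."""
    return max(dimensions.values(), key=SEVERITY.__getitem__, default="PASS")


# Dimension statuses copied from the preview above.
preview = {
    "References": "PASS",
    "Language": "WARN",
    "Format": "PASS",
    "Compile evidence": "WARN",
    "Export evidence": "WARN",
    "Review-loop evidence": "WARN",
}
print(overall_verdict(preview))  # WARN
```

Under this reading, a single BLOCK dimension would block the whole gate, and a run with only PASS dimensions would pass.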
The baseline run_check_once.py command writes machine-readable artifacts such as:
reports/check_bib_quality-report.json
reports/check_references-report.json
reports/citation-integrity-report.json
reports/citation-integrity-report.md
reports/citation-issues.csv
reports/check_language-report.json
reports/check_language_deep-report.json
reports/check_format-report.json
reports/check_content-report.json
reports/readiness-report.json
reports/run-summary.json
Optional final-audit foundation artifacts:
reports/final-cleanup-report.json from 23-check-final-cleanup/check_final_cleanup.py
reports/statistical-consistency-report.json from 25-check-statistical-consistency/check_statistical_consistency.py
reports/manual-anchor-report.json from 26-check-manual-anchor/check_manual_anchor.py
reports/final-audit-report.json from 27-final-audit-report/build_final_audit_report.py
reports/reference-audit-ledger.csv from 28-reference-audit-ledger/build_reference_audit_ledger.py
reports/index.html from 29-report-index/build_report_index.py
reports/final-audit-report.html from 30-final-audit-html/build_final_audit_html.py
reports/reference-audit-ledger.html from 31-reference-ledger-html/build_reference_audit_ledger_html.py
reports/claim-citation-triage.html from 32-claim-citation-html/build_claim_citation_html.py
The optional v3.3 evidence pipeline writes the citation evidence artifacts:
reports/final-reference-set-report.json
reports/final-reference-set-report.csv
reports/external-verification-report.json when external verification is not skipped
reports/missing-doi-candidates.json when external verification is not skipped
reports/missing-doi-candidates.csv when external verification is not skipped
reports/url-verification-report.json when external verification is not skipped
reports/url-verification-flagged.csv when external verification is not skipped
The current v3.4.0 release line keeps local Citation Integrity as the first layer of pre-submission reference checking:
References: BLOCK
- cited keys missing from bibliography files
- duplicate citation keys with conflicting metadata
- DOI/year field warnings
- LaTeX undefined-citation warnings from local compile logs
Boundary: the current Citation Integrity workflow only checks local citation integrity. It does not query external databases and never auto-inserts or rewrites citations. Use the external verification and hallucination risk layers for evidence-based screening.
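The first local check (cited keys missing from bibliography files) can be sketched as a set difference between keys cited in TeX and keys defined in BibTeX. The regular expressions and function names below are simplified assumptions, not the Citation Integrity engine; for instance, they do not skip commented-out TeX lines.

```python
import re

# Matches \cite, \citep, \citet*, ... with optional [..] arguments (simplified).
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Matches the key in entries such as "@article{smith2020," (simplified).
BIB_KEY_RE = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def cited_keys(tex):
    """Collect every citation key used in the TeX source."""
    keys = set()
    for group in CITE_RE.findall(tex):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def bib_keys(bib):
    """Collect every entry key defined in the BibTeX source."""
    return set(BIB_KEY_RE.findall(bib))


def missing_keys(tex, bib):
    """Return cited keys that have no matching bibliography entry."""
    return sorted(cited_keys(tex) - bib_keys(bib))


tex = r"See \cite{smith2020,lee2021} and \citep[p.~3]{wang2019}."
bib = "@article{smith2020,\n  title={A}\n}\n@book{wang2019,\n  title={B}\n}\n"
print(missing_keys(tex, bib))  # ['lee2021']
```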
External Verification (v2.0.0)
An optional external metadata verification layer queries CrossRef, OpenAlex, and Semantic Scholar for each bibliography entry and writes reports/external-verification-report.json.
Use this when you want a fast authenticity screen for AI-drafted or suspicious-looking references before a manual final check.
V3.3 also adds advisory follow-up reports:
- reports/missing-doi-candidates.json and .csv for likely DOI additions
- reports/url-verification-report.json and reports/url-verification-flagged.csv for URL resolution checks
Boundaries:
No LLM usage.
No automatic DOI write-back to .bib files.
No automatic URL replacement.
URL verification checks whether a URL resolves; it does not judge document authenticity.
Hallucination Risk (v3.0.0)
Score each bibliography entry for hallucination risk using local metadata and optional external verification evidence. The hallucination risk scorer reads reports/external-verification-report.json if present and writes reports/hallucination-risk-report.json plus reports/high-risk-references.csv.
UNSUPPORTED: Chinese-language or non-standard entry that cannot be auto-verified
V3.0 boundaries:
No LLM usage. Scoring is deterministic based on local metadata and external verification evidence.
No automatic citation or bibliography rewriting.
No live network calls. Reads external-verification-report.json if present.
UNSUPPORTED means “cannot be automatically judged by enabled evidence,” not “safe.”
HIGH_RISK means “manual verification strongly recommended,” not “fake.”
Claim-Citation Support Triage (v3.1.0)
Extract the sentence surrounding each \cite{} command from .tex files and pair it with the cited bibliography metadata and V3.0 hallucination risk data. The result is a set of deterministic triage labels that help identify claim-citation pairs that may lack credible structural support, without using an LLM.
| Triage label | Condition |
| --- | --- |
| WELL_SUPPORTED | Cited reference PASS in V3.0, complete metadata, substantive context |
| SUPPORTED | Reference PASS/WARN in V3.0, minor risk signals |
| WEAK | Reference REVIEW in V3.0, or vague context, or incomplete metadata |
| ORPHANED | Citation key not found in bibliography files |
| UNVERIFIABLE | Cited reference UNSUPPORTED in V3.0 (CJK, thesis type) |
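The label conditions above could be encoded roughly as follows. This is a sketch under stated assumptions: the function name, the boolean inputs, the order in which conditions are tested, and the conservative fallback for labels the list does not mention (such as HIGH_RISK) are all hypothetical.

```python
def triage_label(key_in_bib, risk_label, metadata_complete,
                 context_substantive, minor_risk_signals=False):
    """Map one claim-citation pair to a triage label (illustrative only)."""
    if not key_in_bib:
        return "ORPHANED"        # citation key not found in bibliography files
    if risk_label == "UNSUPPORTED":
        return "UNVERIFIABLE"    # reference could not be judged in V3.0
    if risk_label == "REVIEW" or not metadata_complete or not context_substantive:
        return "WEAK"
    if risk_label == "PASS" and not minor_risk_signals:
        return "WELL_SUPPORTED"
    if risk_label in ("PASS", "WARN"):
        return "SUPPORTED"
    # Assumption: anything else (e.g. HIGH_RISK or missing risk data)
    # is treated conservatively.
    return "WEAK"


print(triage_label(True, "WARN", True, True))  # SUPPORTED
```

Testing ORPHANED first reflects that a missing key leaves no reference metadata to evaluate at all.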
The report also includes a backward-compatible support-review layer: claim_type, support_review_label, support_review_reason, support_signals, risk_signals, cluster_keys, cluster_risk_summary, and next_actions. These fields explain why a pair or grouped citation cluster deserves manual review; they do not replace the original triage_label or make final truth claims. Local lexical evidence can use title, abstract, and keyword token overlap when those .bib fields are present. Conservative risk signals such as possible_topic_mismatch, possible_outdated_support, and possible_overclaim are advisory prompts for human review, not automatic judgments. The JSON/Markdown reports may also include advisory citation_needed_candidates for uncited high-assertion sentences; these are manual review prompts, not blocking findings.
V3.1 boundaries:
No LLM usage. Scoring is deterministic based on V3.0 risk labels, metadata, context quality, grouping, and citation frequency.
No semantic similarity between claim text and reference content.
No automatic citation rewrite or suggestion.
Reads reports/hallucination-risk-report.json if present; if the file is missing, pairs are treated conservatively.
Exit code 1 when any pair is ORPHANED.
Final Cleanup Checker
Before final PDF or submission handoff, scan LaTeX sources for process residue such as TODO, FIXME, ???, \textcolor{blue}, \color{blue}, draft, debug, and Chinese review notes like 待修改 or 待核查.
Output: reports/final-cleanup-report.json. This checker is report-only: it does not delete markers, rewrite prose, or change source files. The JSON artifact is designed to be folded later into reports/final-audit-report.json and static HTML report surfaces.
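A report-only scan of this kind can be sketched with a few regular expressions. The marker list is taken from the description above. The patterns, the finding fields, and the function name are assumptions, and the plain words draft and debug are left out because the source does not say how they are matched.

```python
import re

# Markers from the description above; the matching rules are assumptions.
RESIDUE_PATTERNS = {
    "TODO": re.compile(r"\bTODO\b"),
    "FIXME": re.compile(r"\bFIXME\b"),
    "???": re.compile(r"\?\?\?"),
    "blue text": re.compile(r"\\(?:textcolor|color)\{blue\}"),
    "待修改": re.compile("待修改"),
    "待核查": re.compile("待核查"),
}


def find_residue(tex):
    """Return one finding per (line, marker) hit; the source text is never modified."""
    findings = []
    for lineno, line in enumerate(tex.splitlines(), start=1):
        for name, pattern in RESIDUE_PATTERNS.items():
            if pattern.search(line):
                findings.append({"line": lineno, "marker": name, "text": line.strip()})
    return findings


sample = "Intro.\n% TODO: cite source\n\\textcolor{blue}{new text}\n结果待核查\nDone."
for finding in find_residue(sample):
    print(finding["line"], finding["marker"])
```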
Statistical Consistency Checker
Before final submission, scan for mixed statistical notation such as p值/P值, p=/P=, 95%CI/95\%CI/95%置信区间, Bootstrap/自助法, and SMD/标准化均数差.
Output: reports/statistical-consistency-report.json. The checker reports the dominant style in the current project and flags deviations; it does not force a universal notation preference or rewrite source files.
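The "dominant style, flag deviations" idea can be illustrated for one notation pair, p= versus P=. The regular expression, the report fields, and the function name are assumptions for this sketch; the actual checker covers more notation families.

```python
import re
from collections import Counter

# A "p" or "P" that is not part of a longer word, followed by "=".
P_EQUALS_RE = re.compile(r"(?<![A-Za-z])([pP])\s*=")


def p_value_style_report(tex):
    """Count p= / P= usages and report how many deviate from the dominant one."""
    counts = Counter(m.group(1) for m in P_EQUALS_RE.finditer(tex))
    if not counts:
        return {"dominant": None, "counts": {}, "deviations": 0}
    dominant, dominant_count = counts.most_common(1)[0]
    return {
        "dominant": dominant,
        "counts": dict(counts),
        "deviations": sum(counts.values()) - dominant_count,
    }


text = "p = 0.03; p=0.20; P = 0.01; bootstrap=5"
print(p_value_style_report(text))
```

The dominant style is derived from the project itself rather than hard-coded, which matches the stated goal of not forcing a universal notation preference.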
Manual Anchor Checker
If the project uses manual contents entries, scan for \addcontentsline commands that may be missing a nearby preceding \phantomsection anchor.
Output: reports/manual-anchor-report.json. The checker reports likely TOC / LOF / LOT hyperlink-jump risks, but it does not repair labels, captions, numbering, figures, tables, or references.
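A minimal sketch of the anchor check, assuming "nearby preceding" means "on the same line before the command, or within the previous few lines". The window size of 3 and the function name are hypothetical choices, not the checker's actual rule.

```python
def addcontentsline_without_anchor(tex, window=3):
    # Returns 1-based line numbers of \addcontentsline commands that have no
    # \phantomsection earlier on the same line or in the preceding `window` lines.
    lines = tex.splitlines()
    flagged = []
    for i, line in enumerate(lines):
        pos = line.find("\\addcontentsline")
        if pos == -1:
            continue
        same_line = "\\phantomsection" in line[:pos]
        nearby = any("\\phantomsection" in prev for prev in lines[max(0, i - window):i])
        if not (same_line or nearby):
            flagged.append(i + 1)
    return flagged


sample_tex = "\n".join([
    "\\phantomsection",
    "\\addcontentsline{toc}{chapter}{References}",
    "", "", "", "",
    "\\addcontentsline{toc}{chapter}{Appendix}",
])
print(addcontentsline_without_anchor(sample_tex))  # [7]
```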
Final Audit Report
After generating the source-of-truth JSON reports, aggregate them into a single final-audit handoff artifact:
Output: reports/final-audit-report.json. This report imports existing JSON evidence and groups dimensions, blockers, warnings, next actions, and source links. It does not rerun checks, call external services, modify thesis sources, or replace the raw JSON reports.
Reference Audit Ledger
For spreadsheet review and advisor/service handoff, aggregate existing reference evidence into one CSV ledger:
Output: reports/reference-audit-ledger.csv. The ledger preserves source-specific statuses from local citation integrity, final reference set, external verification, DOI candidates, URL verification, and hallucination-risk reports. It does not edit .bib, insert DOI values, replace URLs, call external services, or treat NO_CANDIDATE as fake.
Static Report Index
Generate a local HTML landing page for the reports directory:
Output: reports/index.html. This page links available JSON / CSV artifacts and shows present / missing / unreadable counts. It is a local reading surface only; JSON and CSV remain the source of truth.
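The present / missing / unreadable counts can be sketched as a simple classification over expected artifact names. What counts as "unreadable" here (a file that cannot be decoded, or a .json file that fails to parse) is an assumption, as is the function name.

```python
import json
from pathlib import Path


def classify_reports(reports_dir, expected):
    """Count expected artifacts that are present, missing, or unreadable."""
    counts = {"present": 0, "missing": 0, "unreadable": 0}
    for name in expected:
        path = Path(reports_dir) / name
        if not path.is_file():
            counts["missing"] += 1
            continue
        try:
            text = path.read_text(encoding="utf-8")
            if path.suffix == ".json":
                json.loads(text)  # a JSON artifact must at least parse
            counts["present"] += 1
        except (OSError, UnicodeDecodeError, json.JSONDecodeError):
            counts["unreadable"] += 1
    return counts
```

The function only reads files, in keeping with the index being a reading surface rather than a source of truth.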
Final Audit HTML
Generate a readable local detail page for the aggregated final-audit JSON:
Output: reports/final-audit-report.html. This static page is generated from final-audit-report.json and shows the overall verdict, KPI row, dimension matrix, issues, next actions, and source links. JSON remains authoritative.
Reference Audit Ledger HTML
Generate a readable local detail page for the reference-audit CSV ledger:
Output: reports/reference-audit-ledger.html. This static page is generated from reference-audit-ledger.csv and shows summary stats, scope slices, citation-key groupings, and the full ledger table. CSV remains authoritative.
Claim-Citation HTML
Generate a readable local detail page for claim-citation support review:
Output: reports/claim-citation-triage.html. This static page is generated from claim-citation-triage-report.json and shows triage groups, citation-needed candidates, uncited references, cluster review details, support/risk signals, and next actions. JSON remains authoritative.
Rule packs are the most important concept in Thesis Skills: they encode your institution’s formatting requirements as structured YAML so the checkers know what counts as “correct” and what counts as an issue.
Built-in Packs
90-rules/packs/
├── university-generic/ # Generic university thesis starter (default, permissive)
├── journal-generic/ # Generic journal article starter (English, minimal)
├── tsinghua-thesis/ # Tsinghua University Master's/PhD thesis pack
│ # First-pass calibrated against 《研究生学位论文写作指南(202503)》
│ # CJK/English rules, figure numbering, and reference defaults tuned to the guide
└── demo-university-thesis/ # Concrete non-Tsinghua example pack
university-generic is suitable for most Chinese universities — broad coverage, moderate thresholds.
tsinghua-thesis is specifically calibrated for Tsinghua students: GB/T 7714 reference style, mixed CJK/English rules per the university writing guide, and Chinese chapter naming conventions. For many Tsinghua thesis projects this works as a direct starting point, but you should still verify against your department template and local requirements.
journal-generic targets English journal submissions, with CJK-specific rules disabled.
Inside a Rule Pack
Each pack is a folder with three files:
90-rules/packs/your-school/
├── pack.yaml # Metadata: name, kind, version
├── rules.yaml # Rules: what to check, severity, thresholds
└── mappings.yaml # File/path mappings (main tex candidates, bib paths)
rules.yaml is organized by dimension:
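| Section | Controls | Examples |
| --- | --- | --- |
| project | Project structure: main tex file names, chapter globs, bib paths | main_tex_candidates, chapter_globs |
| reference | | missing_key: error |
| language | | cjk_latin_spacing, bracket_mismatch |
| language_deep | | inference_overclaim, boundary_signpost |
| consistency | | terminology_consistency |
| format | | require_list_of_figures |
| content | | required_sections |
| compile | | engine_hint: xelatex |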
Creating Your Own School Rule Pack
If you are not a Tsinghua student, or your department/journal has specific requirements, create a custom pack from one of the built-in starters.
Step 1: Scaffold the pack
```bash
python 90-rules/create_pack.py \
  --pack-id my-university \
  --display-name "My University Master's Thesis" \
  --starter university-generic \
  --kind university-thesis
```
This generates three files under 90-rules/packs/my-university/, copied from university-generic as a starting point.
Step 2: Adjust project structure
Edit rules.yaml → project to match your thesis directory layout:
```yaml
project:
  main_tex_candidates:   # Possible names for your main tex file, in priority order
    - thesis.tex
    - main.tex
  chapter_globs:         # Where chapter files live and their naming pattern
    - chapters/*.tex
  bibliography_files:    # Paths to .bib files
    - ref/refs.bib
```
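Because main_tex_candidates is described as a priority-ordered list, project discovery can be pictured as "first existing candidate wins". The sketch below illustrates that reading. The function name and the None fallback are assumptions, not the actual discovery code.

```python
from pathlib import Path


def resolve_main_tex(project_root, candidates):
    """Return the first candidate that exists under project_root, else None."""
    root = Path(project_root)
    for name in candidates:
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None  # corresponds to a "project discovery failed" situation


if __name__ == "__main__":
    print(resolve_main_tex("examples/minimal-latex-project", ["thesis.tex", "main.tex"]))
```

If this returns None for your project, the list probably needs your actual main file name.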
Step 3: Tune rules to your school’s guide
Check your institutional thesis writing guide and decide rule by rule:
Keep enabled: Rules that your guide explicitly requires and the checker can reliably detect (e.g., missing citation keys, figure/table numbering)
Demote: Rules your guide does not mandate — change severity from warning to info (e.g., CJK/Latin spacing if not required)
Disable: Rules clearly irrelevant to your institution or discipline — set enabled: false (e.g., CJK rules for English-only theses)
Example — demoting CJK spacing when your guide doesn’t require it:
```yaml
# Before
cjk_latin_spacing:
  enabled: true
  severity: warning

# After (school guide does not mandate CJK-Latin spacing)
cjk_latin_spacing:
  enabled: true
  severity: info
```
Step 4: Update required section names
If your thesis uses Chinese section naming (not English IMRaD), sync the content rules:
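Step 5: Run checks with your custom pack
Pass your pack id to --ruleset when running run_check_once.py (for example, --ruleset my-university).
Step 6: Validate and iterate
After running, inspect the JSON reports under reports/. If you notice: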
Too many false positives in a category → demote or disable that rule
Real issues not detected → check if the rule is enabled and severity is set high enough
Project discovery failed → adjust main_tex_candidates or chapter_globs
Tweak → re-run → review reports. Most packs converge in 1–2 calibration rounds.
For non-Tsinghua users: If your calibrated rule pack is stable and you’d like it featured, PRs adding new packs to 90-rules/packs/ are welcome. Future students from your school won’t have to start from scratch.
Tested on
Python 3.10+
Windows / macOS / Linux
LaTeX optional for the --skip-compile demo; run without --skip-compile when you want compile-log diagnostics
Boundaries
| Thesis Skills is | Thesis Skills is not |
| --- | --- |
| A bridge between Word, Zotero, EndNote, and LaTeX | A thesis template or document class |
| A deterministic checker for formatting and structural rules | An AI writing assistant that generates thesis content |
| A report-driven workflow with dry-run previews | A replacement for Grammarly or other prose editors |
| A pre-submission readiness gate | An automatic final defense PPT generator |
| Extensible through institution-specific rule packs | A guarantee that every school or journal rule is already supported |
Thesis Skills v3.4.0
Deterministic thesis workflow tools for citation sync, format checks, review handoff, and pre-submission readiness.
Spend your time thinking, not fixing formatting.
If you already have a LaTeX thesis project, see docs/quickstart.md for more details.
Outputs
The optional v3.3 evidence pipeline also writes:
- reports/hallucination-risk-report.json
- reports/high-risk-references.csv
- reports/claim-citation-triage-report.json
- reports/claim-citation-triage.md
- reports/claim-citation-triage.csv
Example JSON snippets and demo walkthroughs: docs/examples.md.
External verification can also be run via the existing reference checker with an explicit flag.
V2.0 boundaries:
- external_verification is advisory only.
- Failures are reported as UNAVAILABLE, never as a crash.
Final Reference Set + DOI / URL checks (v3.3.0)
V3.3 hardens citation evidence by separating three scopes:
- final: references that actually entered the compiled bibliography via .aux / .bbl
- cited: citation keys extracted from TeX source \cite{} commands
- all: every entry in active .bib files
The final reference set builder writes reports/final-reference-set-report.json and reports/final-reference-set-report.csv.
The external verification layer can now resume long runs and write partial results safely.
Risk labels: PASS, WARN, REVIEW, HIGH_RISK, UNSUPPORTED.
Triage labels: WELL_SUPPORTED, SUPPORTED, WEAK, ORPHANED, UNVERIFIABLE.
Scenarios
1. I just switched from Word to LaTeX
2. I already use LaTeX and want to check my thesis
3. My advisor needs a Word version for review
4. I received Word feedback and need to update LaTeX
5. I am preparing for defense
6. I want to screen AI-generated or suspicious references
Use this when you want a fast authenticity screen for references drafted by AI or copied from sources you do not fully trust. It produces a hallucination_risk_score per entry and a high-risk-references.csv for manual review, without rewriting the bibliography. Chinese-language references are marked UNSUPPORTED since external databases do not cover them.
More scenarios: docs/examples.md.
Rule Packs
rules.yaml example keys by section: project (main_tex_candidates, chapter_globs); reference (missing_key: error); language (cjk_latin_spacing, bracket_mismatch); language_deep (inference_overclaim, boundary_signpost); consistency (terminology_consistency); format (require_list_of_figures); content (required_sections); compile (engine_hint: xelatex).
Documentation
- docs/quickstart.md
- docs/examples.md
- docs/modules.md
- docs/architecture.md
- docs/getting-started-zh.md
- CHANGELOG.md
Release history
- v3.4.0: added final-audit report surfaces, reference-audit ledger HTML, and conservative claim-citation support-risk signals.
- v3.3.0: hardened reference verification with final reference set parsing, resumable external verification, DOI candidate suggestions, and URL verification.
- v3.2.0: integrated hallucination risk and claim-citation triage into the readiness gate and added the unified evidence pipeline runner, run_evidence_pipeline.py.
- v3.1.0: added claim-citation support triage, claim-citation-triage-report.json, deterministic triage scoring, and three demo projects.
- v3.0.0: added hallucination risk scoring, hallucination-risk-report.json, high-risk-references.csv, Chinese-language UNSUPPORTED handling, and three demo projects.
- v2.0.0: added CrossRef / OpenAlex / Semantic Scholar external verification, consensus candidates, and an external_verification readiness advisory.
- v1.2.0: added Markdown/CSV Citation Integrity outputs, clean/broken demos, and public version-line alignment.
- v1.1.0: added the local-first Citation Integrity engine and readiness integration.
- v1.0.0: stabilized the public workflow story across README, roadmap, site, examples, and code paths.
See CHANGELOG.md for the full changelog.
Updating your local copy
Downloading or cloning the repository once does not make future updates appear automatically on your machine.
Choose the update path that matches how you got Thesis Skills:
If you cloned with Git
Run git pull. This fetches the newest committed changes from GitHub into your local checkout.
If you want to see what changed before pulling:
If you downloaded a ZIP
A ZIP download is just a snapshot. It will not sync by itself.
To get updates, either:
git pull
If you edited the repository locally
Pulling new changes is easiest when your local copy has no uncommitted edits.
Before updating, check:
If you have local modifications, commit or back them up first so git pull does not create conflicts unexpectedly.
Module reference
The long module table lives in docs/modules.md so this README stays focused on the product workflow.
Template recommendations
Thesis Skills is designed to work alongside mature templates and institution-specific document classes.
Acknowledgments
Special thanks to tuna/thuthesis and other open-source thesis template projects. These projects make high-quality LaTeX thesis writing more accessible and inspired the workflow design of Thesis Skills.
License
MIT License