openvela Local Audio Event Detector
Author: Li Jiabin, School of Computer Science, Wuhan University
This project implements an offline, local audio event detection pipeline for
openvela-class embedded systems. It detects cough and glass_break events
against a background class, then emits local alarm records through terminal,
serial-style logs, JSONL files, or board-level hooks.
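As a rough illustration of the JSONL output path, the sketch below appends one alarm record per line. The field names mirror the aed_result_t members listed under Public C API; the actual schema of reports/alarm_log.jsonl is not shown in this README, so the record layout, the rule that only non-background labels raise an alarm, and the helper names append_alarm and read_alarms are all assumptions.

```python
import json
import os
import tempfile


def append_alarm(path, label, confidence, start_ms, end_ms):
    """Append one event as a single JSON line (hypothetical schema)."""
    record = {
        "label": label,
        "confidence": round(confidence, 3),
        "start_ms": start_ms,
        "end_ms": end_ms,
        # Assumption: only non-background events raise an alarm.
        "alarm": label != "background",
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def read_alarms(path):
    """Read the log back, one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


log_path = os.path.join(tempfile.mkdtemp(), "alarm_log.jsonl")
append_alarm(log_path, "cough", 0.91, 0, 1000)
append_alarm(log_path, "background", 0.88, 1000, 2000)
events = read_alarms(log_path)
```

Because each line is an independent JSON object, a log written this way can be appended to without rewriting earlier events and can be tailed like a serial log.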
What Is Included
Pure C inference core with a fixed public API.
Host CLI for file and stream-style verification.
Python tooling for reproducible sample generation, feature inspection,
training/export, and evaluation.
openvela/NuttX integration fragments and board-porting notes.
Initial-round report, no-hardware submission strategy, model card,
reproducibility guide, scoring map, evidence manifest, and demo script.
Reviewer pitch deck, voiceover script, and reproducible no-hardware demo
video generation notes.
PPT/video quality audit for layout, stream, audio, and narration-boundary checks.
The runtime path is fully offline: audio input -> features -> model inference
-> local alarm. No cloud service or network request is used during detection.
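The four stages can be sketched in a few lines of Python. This is only a toy stand-in for the shape of the pipeline: the two features (RMS level and zero-crossing rate), the thresholds, and the mapping of those features to labels are invented for illustration and are not the project's feature set or model.

```python
import math


def extract_features(pcm):
    """Toy features: normalized RMS level and zero-crossing rate."""
    n = len(pcm)
    rms = math.sqrt(sum(s * s for s in pcm) / n) / 32768.0
    crossings = sum(1 for a, b in zip(pcm, pcm[1:]) if (a < 0) != (b < 0))
    return {"rms": rms, "zcr": crossings / (n - 1)}


def classify(features):
    """Rule-based placeholder for model inference; thresholds are made up."""
    if features["rms"] < 0.05:
        return "background"
    return "glass_break" if features["zcr"] > 0.25 else "cough"


def detect(pcm):
    """audio input -> features -> inference -> local alarm flag."""
    label = classify(extract_features(pcm))
    return {"label": label, "alarm": label != "background"}


def tone(freq_hz, amplitude, sample_rate=16000, seconds=0.1):
    """Synthetic int16-range sine wave used as test input."""
    count = int(sample_rate * seconds)
    return [int(amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate))
            for i in range(count)]


quiet = detect(tone(200, 500))     # low level -> background
low = detect(tone(200, 12000))     # loud, few zero crossings
high = detect(tone(4000, 12000))   # loud, many zero crossings
```

Nothing in this path touches a socket or a remote service; every stage is a pure function of the PCM buffer, which is the property the offline claim relies on.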
One-Command Verification
Run from WSL2 Ubuntu-26.04:
cd /mnt/d/06_MYD/竞赛/CCF开源创新大赛2/基于\ openvela\ 的本地音频事件检测系统/openvela-local-audio-event-detector
python3 scripts/run_tests.py
The script creates .venv, installs development dependencies inside the
repository, generates deterministic sample WAV files, runs Python tests, builds
the C code with CMake, runs CTest, executes CLI smoke tests, checks submission
readiness, and creates a review ZIP under dist/.
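Deterministic sample generation usually comes down to seeding a private random generator and writing fixed-format PCM. The following is a minimal sketch of that idea using only the standard library; the waveform (a sine tone plus seeded noise) and the function name synth_wav_bytes are placeholders, not the project's actual generator.

```python
import io
import math
import random
import struct
import wave

SAMPLE_RATE = 16000  # 16 kHz, matching the PCM samples described in this README


def synth_wav_bytes(seed, seconds=1.0, tone_hz=440.0):
    """Return a mono 16-bit WAV as bytes; the same seed gives the same bytes."""
    rng = random.Random(seed)  # private generator, independent of global state
    count = int(SAMPLE_RATE * seconds)
    samples = []
    for i in range(count):
        tone = 8000 * math.sin(2 * math.pi * tone_hz * i / SAMPLE_RATE)
        noise = rng.uniform(-500, 500)
        samples.append(int(tone + noise))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = PCM16
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % count, *samples))
    return buf.getvalue()


first = synth_wav_bytes(seed=7)
second = synth_wav_bytes(seed=7)
other = synth_wav_bytes(seed=8)
```

Byte-identical output for a fixed seed is what makes downstream hashes and evaluation numbers comparable between runs on the same toolchain.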
If python3.14-venv is missing while another apt process holds the package
manager lock, the script reports VENV_PIP_UNAVAILABLE=1 and falls back to an
existing pytest executable. Installing python3.14-venv later restores the
preferred isolated .venv path.
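The fallback decision can be pictured as follows. This is a sketch under assumptions, not the logic of scripts/run_tests.py: it assumes the script could detect a missing venv package by checking for the ensurepip module (which Debian and Ubuntu typically ship in the python3-venv packages), and the function names and returned dictionary are hypothetical.

```python
import importlib.util
import shutil


def choose_test_runner(ensurepip_available, pytest_path):
    """Pick the isolated .venv when possible, else an existing pytest."""
    if ensurepip_available:
        return {"mode": "venv", "VENV_PIP_UNAVAILABLE": 0}
    if pytest_path is not None:
        return {
            "mode": "system-pytest",
            "VENV_PIP_UNAVAILABLE": 1,
            "pytest": pytest_path,
        }
    raise RuntimeError("no isolated venv and no pytest executable available")


def probe_environment():
    """Inspect the current interpreter and PATH (results vary by machine)."""
    has_ensurepip = importlib.util.find_spec("ensurepip") is not None
    return has_ensurepip, shutil.which("pytest")


preferred = choose_test_runner(True, None)
fallback = choose_test_runner(False, "/usr/bin/pytest")
```

Separating the probe from the decision keeps the decision testable without changing the machine's packages.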
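Manual Quick Start
Expected event labels:
samples/cough.wav -> cough
samples/glass_break.wav -> glass_break
samples/background.wav -> background
No-Hardware Initial Round Mode
This repository is prepared for the initial round when no physical board is
available. The submission evidence is: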
the same pure C detector core that the openvela port calls;
deterministic 16 kHz PCM samples that exercise background, cough, and
glass-break cases;
reports/evaluation.json with clean/noisy metrics;
reports/robustness.json and .csv with gain/noise/time-shift/background-mix
stress evidence;
reports/runtime_parity.json proving Python reference and compiled C runtime
agree on labels and alarms;
reports/latency_profile.json and .csv with repeatable host CLI latency
statistics;
reports/footprint.json with host-build binary size evidence for embedded
review;
reports/embedded_readiness.json verifying fixed-size API constants,
compiled model constants, no dynamic allocation/cloud/network in the core
path, and a static openvela PCM buffer instead of a one-second stack buffer;
reports/artifact_manifest.json with SHA-256 hashes for review-relevant
source and evidence files;
reports/reviewer_summary.md as a one-page evidence entry point;
reports/openvela_port_check.json verifying transplant-ready openvela hooks,
CMake, and Kconfig fragments;
reports/consistency_check.json verifying labels, API constants, and
documentation stay aligned;
reports/demo_run.md capturing deterministic demo command outputs;
reports/alarm_log.jsonl with repeatable local alarm events;
openvela/aed_openvela_app.c with board hooks for microphone, LED, buzzer,
serial, or screen output and a 32 KB static PCM input buffer;
deliverables/openvela_audio_event_detector_pitch_deck.pptx for the
reviewer-facing presentation;
deliverables/demo_video_readme.md documenting how the real no-hardware MP4
was generated from rendered slides, live CLI output, and Chinese narration;
deliverables/demo_voiceover_script.md with the reviewable narration text;
deliverables/demo_video_zh_cn.srt with reviewable Chinese subtitle timing;
deliverables/ppt_video_quality_audit.md with the PPT/video layout, stream,
audio, and narration-boundary audit.
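For readers unfamiliar with hash manifests such as reports/artifact_manifest.json, the sketch below shows one common way to build a path-to-SHA-256 mapping. The JSON layout (a flat object keyed by relative path) and the helper names are assumptions; the real manifest may carry more fields.

```python
import hashlib
import json
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large artifacts are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, relative_paths):
    """Map each relative path to its SHA-256 hex digest (assumed layout)."""
    return {rel: sha256_of(os.path.join(root, rel))
            for rel in sorted(relative_paths)}


# Demonstrate on a throwaway file containing the bytes b"abc".
root = tempfile.mkdtemp()
with open(os.path.join(root, "example.txt"), "wb") as fh:
    fh.write(b"abc")
manifest = build_manifest(root, ["example.txt"])
manifest_json = json.dumps(manifest, indent=2, sort_keys=True)
```

A reviewer can recompute the digests on their own checkout and compare them to the manifest to confirm that the evidence files were not altered after generation.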
This mode does not claim that a real DshanPixVela-DevKit V1 run has already
been completed. Hardware integration remains a direct porting step documented
in docs/openvela_porting.md.
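When sizing the PCM buffer during porting, the arithmetic is straightforward. The sketch below assumes mono input, which this README does not state, together with the 16 kHz rate and 16-bit samples it does mention; whether "32 KB" means 32000 or 32768 bytes is also not stated, so both readings are computed.

```python
SAMPLE_RATE_HZ = 16000   # 16 kHz, as used by the sample WAV files
BYTES_PER_SAMPLE = 2     # int16_t PCM
CHANNELS = 1             # assumption: mono input


def pcm_bytes(duration_ms):
    """Bytes needed to hold duration_ms of audio."""
    samples = SAMPLE_RATE_HZ * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * CHANNELS


def duration_ms(buffer_bytes):
    """Whole milliseconds of audio that fit in buffer_bytes."""
    samples = buffer_bytes // (BYTES_PER_SAMPLE * CHANNELS)
    return samples * 1000 // SAMPLE_RATE_HZ


one_second = pcm_bytes(1000)        # 32000 bytes
kib32_ms = duration_ms(32 * 1024)   # 1024 ms
```

Under these assumptions either reading of 32 KB holds at least one second of audio, which is a large allocation for a typical embedded task stack and consistent with keeping the buffer static.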
Public C API
int aed_init(const aed_model_t *model);
int aed_process_pcm16(const int16_t *pcm,
                      size_t count,
                      uint32_t sample_rate,
                      aed_result_t *out);
aed_result_t contains label, confidence, start_ms, end_ms, alarm,
and feature values. Passing NULL to aed_init() loads the default embedded
model.
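To make the start_ms and end_ms fields concrete, the sketch below splits a PCM stream into consecutive windows and derives millisecond bounds from sample offsets. It is a Python illustration of the arithmetic only; the one-second window length, the non-overlapping layout, and the assumption that the C core derives its timestamps this way are not confirmed by this README.

```python
def iter_windows(pcm, sample_rate, window_ms=1000):
    """Yield (start_ms, end_ms, chunk) for consecutive non-overlapping windows."""
    step = sample_rate * window_ms // 1000
    for offset in range(0, len(pcm), step):
        chunk = pcm[offset:offset + step]
        start_ms = offset * 1000 // sample_rate
        end_ms = (offset + len(chunk)) * 1000 // sample_rate
        yield start_ms, end_ms, chunk


# 2.5 seconds of silence at 16 kHz: two full windows and one partial window.
windows = list(iter_windows([0] * 40000, sample_rate=16000))
spans = [(start, end) for start, end, _ in windows]
```

Each chunk produced this way corresponds to one (pcm, count, sample_rate) call in the C API, with count equal to the chunk length.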
Repository Layout
include/aed/ - public C API.
src/ - C inference, feature extraction, default model, WAV reader.
app/ - host CLI.
tools/ - Python data, training, export, evaluation, and deck/video tools.
tests/ - Python and C tests.
openvela/ - openvela/NuttX integration fragments.
docs/ - reproducibility, model, architecture, and porting documents.
deliverables/ - initial-round report, 5-minute demo script, checklist.
For reviewers, docs/scoring_map.md maps the initial-round scoring criteria to
specific files and commands in this repository.
The generated no-hardware demo videos are stored under dist/ locally to avoid
putting large media files in Git. Regenerate the PPT and slide PNGs with
tools/build_pitch_deck.mjs in an artifact-tool workspace, create the silent
base video with tools/build_demo_video.py, then create the narrated MP4,
subtitle sidecar, and embedded optional subtitle track with
tools/build_demo_voiceover.py.
Hardware Status
The selected initial-round mode is no-hardware participation. The host-side
path is complete and repeatable without a board. If a DshanPixVela-DevKit V1 or
another openvela board becomes available later, follow docs/openvela_porting.md
to wire the PCM input and alarm hooks to the board microphone, LED, buzzer,
screen, or serial output.
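About
An offline, local audio event detection system that recognizes cough and glass-break sounds. It includes a pure C inference core, reproducible WSL2 verification, openvela porting code, evaluation reports, and initial-round submission documents.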