# INSPIRE: Instruction-based Multi-Task Speech and Audio Processing Benchmark

## Introduction

INSPIRE is an INstruction-based multi-task SPeech and audIo pRocessing bEnchmark. It is built to help benchmark speech foundation models and includes a dataset and models. INSPIRE covers cross-modal tasks, including speech-to-text, text-to-speech, speech-to-speech, and audio-to-text, that range from recognition and understanding to generation.

## Dataset

INSPIRE dataset (coming soon)

## Models

(coming soon)

## License

This project is licensed under [The MIT License](https://opensource.org/licenses/MIT). INSPIRE also contains various third-party components and some code modified from other repos under other open source licenses.