🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
Updated Jun 6, 2024
Tadabur is a large-scale, high-diversity Qur'anic speech dataset designed to advance Arabic speech research.
The Abuse Project Audio Dataset (TAPAD). Think MNIST for audio profanity.
[AAAI 2023] AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work
Heart and Lung Sounds Dataset Recorded from a Clinical Manikin using Digital Stethoscope (HLS-CMDS)
Voice activity detection and speaker gender segmentation audiovisual corpus
Lo-Fi Chords Dataset is an open audio dataset containing 8,000 chord progressions.
LibriVox dataset for Bulgarian language TTS
ParquetToHuggingFace processes raw audio data, converts it into Parquet files, and uploads them to Hugging Face. The README explains how to set up the environment, configure paths, and run the scripts to generate and upload the data.
A utility for wrapping the Free Spoken Digit Dataset into PyTorch-ready data set splits.
This repository contains data preprocessing and analysis techniques for audio data.
A comprehensive voice persona dataset for character consistency in voice synthesis, generated with the Qwen2-Audio-7B audio-language model using a GPU-optimized pipeline.
HH-TRP is an open dataset of 15,000 trap-style hip hop drum loops.
Source code for obtaining the baseline results
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🎼️🎷️ The audio:music:vocals category for AI2001, containing music datasets
This repository contains a custom Arabic digits (0-9) dataset contributed by multiple individuals and a neural network model designed to accurately recognize these digits.
Top dataset for voice conversion models
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🎼️🎶️ The audio:sound effects category for AI2001, containing sound effect (SFX) datasets
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🎼️🎷️ The audio:music category for AI2001, containing music datasets
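As a rough illustration of what "PyTorch-ready data set splits" can mean for the Free Spoken Digit Dataset entry above, here is a minimal, standard-library-only sketch. It assumes the dataset's files are named `{digit}_{speaker}_{index}.wav` and that recordings with index 0 to 4 form the test set; both are assumptions about the dataset's conventions, not details taken from the listed utility, which may work differently. The names `Recording`, `parse_name`, and `split_recordings` are hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple, Optional, Tuple

# Assumed naming convention: {digit}_{speaker}_{index}.wav
_NAME_PATTERN = re.compile(r"^(\d)_([A-Za-z0-9]+)_(\d+)\.wav$")


class Recording(NamedTuple):
    filename: str
    digit: int      # spoken digit, usable as the class label
    speaker: str
    index: int      # per-speaker, per-digit recording number


def parse_name(filename: str) -> Optional[Recording]:
    """Return the parsed fields, or None if the name does not match."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        return None
    return Recording(filename, int(match.group(1)), match.group(2), int(match.group(3)))


def split_recordings(
    filenames: Iterable[str], test_max_index: int = 4
) -> Tuple[List[Recording], List[Recording]]:
    """Split by recording index: index <= test_max_index goes to test (assumed rule)."""
    train: List[Recording] = []
    test: List[Recording] = []
    for name in filenames:
        rec = parse_name(name)
        if rec is None:
            continue  # skip files that do not follow the assumed convention
        (test if rec.index <= test_max_index else train).append(rec)
    return train, test
```

Each resulting list could then back a `torch.utils.data.Dataset` whose `__getitem__` loads the waveform for `filename` and returns it together with `digit`.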