Welcome to the SVM-Based Cough Detector project! This tool uses machine learning to automatically detect coughs in audio recordings.
- Loads and Segments Audio: Takes in `.wav` files and fetches positive (cough) segments based on the labels, along with negative (non-cough) segments whose length distribution is similar to that of the positive samples.
 - Extracts Features: Pulls out MFCC features from each segment and aggregates them with mean and standard deviation to create uniform feature vectors.
 - Trains an SVM Model: Uses extracted features to train a Support Vector Machine (SVM) that can differentiate between cough and non-cough sounds.
 - Predicts Coughs in Self-Recorded Audio: Applies the trained model to new audio files to identify and timestamp cough events.
 - Stores Results: Generates label files marking where coughs are detected in the audio waveform.
 
- Clone the Repository:
git clone https://github.com/yourusername/svm-cough-detector.git
cd svm-cough-detector
 - Install Dependencies (a virtual environment is recommended):
pip install -r requirements.txt
 - Inference:
python3 src/inference.py --model svm --input-dir dir/contains/one/data.wav
 - Want to record your own audio files?
python3 src/record_audio.py --output-dir dir/to/save/the/recording --duration 5
 
- File type: The audio is saved as a mono `.wav` file with a 16 kHz sample rate.
 - File name: The audio file is saved as `data.wav`.
- Usage example:
# Record a 5-second audio clip and save it to my_data/positive/cough_example
python3 src/record_audio.py --output-dir my_data/positive/cough_example --duration 5
# Use the recorded audio for inference
python3 src/inference.py --model svm --input-dir my_data/positive/cough_example
 
- Loading: Uses `librosa` to load audio files at a consistent sampling rate.
 - Segmentation: Splits audio into positive and negative segments. During this step, we try to keep the two classes similar in both segment-length distribution and number of samples.
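
The segmentation step above can be illustrated with a minimal sketch. It assumes the labels are available as `(start, end)` times in seconds, and it balances the classes by drawing one negative segment per positive segment with the same length. The function name `sample_negative_segments` and its parameters are hypothetical and are not taken from this repository.

```python
import random

def sample_negative_segments(positives, total_duration, seed=0, max_tries=1000):
    """Pick one non-cough segment per cough segment, reusing its length.

    positives: list of (start, end) times in seconds for labelled coughs.
    total_duration: length of the recording in seconds.
    Note: negatives may overlap each other in this sketch, but never a cough.
    """
    rng = random.Random(seed)
    negatives = []
    for start, end in positives:
        length = end - start
        if length > total_duration:
            continue  # a segment longer than the recording cannot be placed
        for _ in range(max_tries):
            neg_start = rng.uniform(0.0, total_duration - length)
            neg_end = neg_start + length
            # Reject candidates that overlap any labelled cough.
            overlaps = any(neg_start < p_end and p_start < neg_end
                           for p_start, p_end in positives)
            if not overlaps:
                negatives.append((neg_start, neg_end))
                break
    return negatives
```

Reusing each positive length for a negative draw keeps the two length distributions and sample counts matched by construction, which is one simple way to get the balance described above.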
 
- MFCCs: Extracts Mel-Frequency Cepstral Coefficients (MFCCs) from each segment.
 - Aggregation: Calculates the mean and standard deviation of MFCCs to create a fixed-length feature vector for each segment.
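
As a rough sketch of the aggregation step, the snippet below collapses an MFCC matrix of shape `(n_mfcc, n_frames)` into a single vector of length `2 * n_mfcc`. The helper names and the choice of `n_mfcc=13` are assumptions for illustration; the project may use different settings.

```python
import numpy as np

def aggregate_mfcc(mfcc):
    """Collapse an (n_mfcc, n_frames) matrix into a 2 * n_mfcc vector."""
    mfcc = np.asarray(mfcc, dtype=float)
    # Mean and standard deviation are taken over the time (frame) axis.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def segment_features(segment, sr=16000, n_mfcc=13):
    """Compute MFCCs for one audio segment and aggregate them."""
    import librosa  # imported lazily so aggregate_mfcc works without it
    mfcc = librosa.feature.mfcc(y=segment, sr=sr, n_mfcc=n_mfcc)
    return aggregate_mfcc(mfcc)
```

Because the statistics are computed over frames, segments of different durations all map to vectors of the same length, which is what the SVM needs.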
 
- Scaling: Standardizes features using `StandardScaler` for better SVM performance.
 - SVM Training: Trains an SVM classifier with hyperparameter tuning using `GridSearchCV` to find the best settings.
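
The two training steps above might be combined as in the following sketch. The parameter grid, the 5-fold cross-validation, the F1 scoring, and the function name `train_svm` are illustrative assumptions, not the settings used in this repository. Labels are assumed to be 0 (non-cough) and 1 (cough).

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_svm(X, y):
    """Fit a scaler + SVM pipeline and tune it with a small grid search."""
    pipeline = make_pipeline(StandardScaler(), SVC())
    # Hypothetical grid; make_pipeline names the SVC step "svc".
    param_grid = {
        "svc__C": [0.1, 1, 10],
        "svc__kernel": ["linear", "rbf"],
        "svc__gamma": ["scale", "auto"],
    }
    search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```

Placing the scaler inside the pipeline means it is refit on the training portion of each fold, so the validation folds do not leak into the scaling statistics.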
- Segmenting New Audio: Processes new `.wav` files under the `my_data` folder and breaks each audio signal down into fixed-size (0.1 s) chunks.
 - Feature Processing: Extracts and scales features from new segments.
 - Classification: Predicts whether each segment contains a cough.
 - Mapping: Associates predictions with their corresponding time frames and filters out non-cough segments.
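
The chunking and mapping steps above can be sketched as follows. The function names are hypothetical, and merging consecutive cough chunks into one interval is an assumption made here for readability; the project may report each 0.1 s chunk separately.

```python
import numpy as np

def chunk_audio(signal, sr=16000, chunk_seconds=0.1):
    """Split a 1-D signal into fixed-size chunks, dropping the ragged tail."""
    chunk_len = int(round(sr * chunk_seconds))
    n_chunks = len(signal) // chunk_len
    return np.asarray(signal[: n_chunks * chunk_len]).reshape(n_chunks, chunk_len)

def cough_intervals(predictions, chunk_seconds=0.1):
    """Turn per-chunk 0/1 predictions into merged (start, end) times in seconds."""
    intervals = []
    run_start = None  # index of the first chunk in the current cough run
    for i, label in enumerate(predictions):
        if label == 1 and run_start is None:
            run_start = i
        elif label != 1 and run_start is not None:
            intervals.append((run_start * chunk_seconds, i * chunk_seconds))
            run_start = None
    if run_start is not None:  # the recording ended during a cough run
        intervals.append((run_start * chunk_seconds, len(predictions) * chunk_seconds))
    return intervals
```

Chunk `i` covers the time span from `i * chunk_seconds` to `(i + 1) * chunk_seconds`, so non-cough chunks are simply skipped and only the cough spans are returned.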