VOCOVID uses a Convolutional Neural Network (CNN) to assess whether a cough recording shows patterns associated with COVID-19. It is intended to screen users for possible symptoms, but VOCOVID is strictly a screening tool, and its results cannot be considered an official diagnosis.
Mel Spectrograms provide a visual representation of the audio frequencies in a recording over time, which helps identify restricted breathing and distinguish between “wet” and “dry” coughs. The x-axis represents time, the y-axis represents frequency, and the color intensity represents amplitude. Mapping frequencies onto the mel scale also reduces data dimensionality, making the input computationally efficient for our CNN.
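As a rough illustration of this preprocessing step, the sketch below converts a recording into a log-scaled Mel Spectrogram with librosa. The function name, sample rate, and mel-band count here are illustrative assumptions, not VOCOVID's actual settings.

```python
import librosa
import numpy as np

def cough_to_mel(path, sr=22050, n_mels=64):
    """Load a cough recording and convert it to a log-scaled mel spectrogram."""
    audio, sr = librosa.load(path, sr=sr)  # resample to a fixed rate
    mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_mels=n_mels)
    # Log scaling compresses the dynamic range, which tends to help CNNs.
    return librosa.power_to_db(mel, ref=np.max)

# spec has shape (n_mels, time_frames): rows are frequency bands, columns
# are time steps, and values are amplitude in dB -- the three axes above.
spec = cough_to_mel("cough.wav")  # "cough.wav" is a placeholder path
```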
By concentrating on the frequencies and amplitude patterns most relevant to cough sounds, Mel Spectrograms make COVID-19 cough patterns efficient to identify. Emphasizing these features also helps the CNN train more effectively and classify more accurately.
CNNs are the backbone of VOCOVID because of how well they process Mel Spectrograms. Although CNNs can lose some information (for example, through pooling), they are preferred over RNNs for this project because they produce feature maps. These maps enable dataset augmentation and enhance the key features used to identify COVID-19 symptoms in audio data.
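For a concrete picture, here is a minimal two-block CNN over Mel Spectrogram input, written in PyTorch. The layer sizes, pooling choices, and class count are hypothetical and do not reflect the actual VOCOVID architecture.

```python
import torch
import torch.nn as nn

class CoughCNN(nn.Module):
    """Toy CNN over 1-channel mel spectrograms of shape (n_mels, time_frames)."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # first set of feature maps
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # deeper feature maps
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # tolerates variable-length recordings
            nn.Flatten(),
            nn.Linear(32, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = CoughCNN()
dummy = torch.randn(1, 1, 64, 128)  # (batch, channel, n_mels, time_frames)
print(model(dummy).shape)           # torch.Size([1, 2])
```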
The VOCOVID model is trained on preprocessed Mel Spectrograms using K-fold cross-validation. The model currently has two layers for processing data, with plans to increase its complexity as testing progresses.
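A sketch of what the K-fold loop might look like over precomputed spectrograms, using scikit-learn's StratifiedKFold; the `specs` and `labels` arrays are placeholders, and the per-fold training and evaluation are elided.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

specs = np.random.randn(100, 1, 64, 128).astype("float32")  # placeholder spectrograms
labels = np.array([0, 1] * 50)                              # placeholder binary labels

kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(kfold.split(specs, labels)):
    train_x, val_x = specs[train_idx], specs[val_idx]
    train_y, val_y = labels[train_idx], labels[val_idx]
    # Train a fresh model on (train_x, train_y) and evaluate on (val_x, val_y).
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} validation samples")
```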
ResNet50 serves as the benchmark model for VOCOVID due to its proven accuracy on audio classification tasks without requiring extra preprocessing to handle class imbalance. Results from testing ResNet50 on our dataset will be added soon.
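For reference, a benchmark of this kind could be set up roughly as below, adapting torchvision's ResNet50 to single-channel spectrograms with a binary head; the input adaptation and weight initialization shown here are assumptions, not necessarily VOCOVID's exact benchmark configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

benchmark = resnet50(weights=None)  # ImageNet weights could be loaded instead
# Swap the stem to accept 1-channel spectrograms rather than RGB images.
benchmark.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
# Replace the 1000-way ImageNet head with a binary COVID / non-COVID head.
benchmark.fc = nn.Linear(benchmark.fc.in_features, 2)

dummy = torch.randn(1, 1, 64, 128)  # (batch, channel, n_mels, time_frames)
print(benchmark(dummy).shape)       # torch.Size([1, 2])
```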
Disclaimer: VOCOVID is a screening tool and cannot provide an official COVID-19 diagnosis. Please consult a medical professional for an official diagnosis.