ESC: Dataset for Environmental Sound Classification
K. Piczak. Proceedings of the 23rd ACM International Conference on Multimedia, pages 1015–1018. New York, NY, USA, Association for Computing Machinery (2015)
DOI: 10.1145/2733373.2806390
Abstract
One of the obstacles in research activities concentrating on environmental sound classification is the scarcity of suitable and publicly available datasets. This paper tries to address that issue by presenting a new annotated collection of 2000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250000 unlabeled auditory excerpts extracted from recordings available through the Freesound project. The paper also provides an evaluation of human accuracy in classifying environmental sounds and compares it to the performance of selected baseline classifiers using features derived from mel-frequency cepstral coefficients and zero-crossing rate.
%0 Conference Paper
%1 piczak2015dataset
%A Piczak, Karol J.
%B Proceedings of the 23rd ACM International Conference on Multimedia
%C New York, NY, USA
%D 2015
%I Association for Computing Machinery
%K audio_classification dataset thema:cnn_and_attention_methods_for_audio_classification
%P 1015–1018
%R 10.1145/2733373.2806390
%T ESC: Dataset for Environmental Sound Classification
%U https://doi.org/10.1145/2733373.2806390
%X One of the obstacles in research activities concentrating on environmental sound classification is the scarcity of suitable and publicly available datasets. This paper tries to address that issue by presenting a new annotated collection of 2000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250000 unlabeled auditory excerpts extracted from recordings available through the Freesound project. The paper also provides an evaluation of human accuracy in classifying environmental sounds and compares it to the performance of selected baseline classifiers using features derived from mel-frequency cepstral coefficients and zero-crossing rate.
%@ 9781450334594
@inproceedings{piczak2015dataset,
abstract = {One of the obstacles in research activities concentrating on environmental sound classification is the scarcity of suitable and publicly available datasets. This paper tries to address that issue by presenting a new annotated collection of 2000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250000 unlabeled auditory excerpts extracted from recordings available through the Freesound project. The paper also provides an evaluation of human accuracy in classifying environmental sounds and compares it to the performance of selected baseline classifiers using features derived from mel-frequency cepstral coefficients and zero-crossing rate.},
added-at = {2022-07-12T01:36:53.000+0200},
address = {New York, NY, USA},
author = {Piczak, Karol J.},
biburl = {https://www.bibsonomy.org/bibtex/2fadb5f654132f1fa1f7325327a436201/fachter},
booktitle = {Proceedings of the 23rd ACM International Conference on Multimedia},
doi = {10.1145/2733373.2806390},
interhash = {77f6ded04d5e43923e3bdf9622812e2d},
intrahash = {fadb5f654132f1fa1f7325327a436201},
isbn = {9781450334594},
keywords = {audio_classification dataset thema:cnn_and_attention_methods_for_audio_classification},
location = {Brisbane, Australia},
numpages = {4},
pages = {1015--1018},
publisher = {Association for Computing Machinery},
series = {MM '15},
timestamp = {2022-07-12T10:20:53.000+0200},
title = {ESC: Dataset for Environmental Sound Classification},
url = {https://doi.org/10.1145/2733373.2806390},
year = 2015
}