Inferring social contexts from audio recordings using deep neural networks

Meysam Asgari; Izhak Shafran; Alireza Bayestehtashk

doi:10.1109/MLSP.2014.6958853

Institute on Development and Disability

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

In this paper, we investigate the problem of detecting social contexts from the audio recordings of everyday life such as in life-logs. Unlike the standard corpora of telephone speech or broadcast news, these recordings have a wide variety of background noise. By nature, in such applications, it is difficult to collect and label all the representative noise for learning models in a fully supervised manner. The amount of labeled data that can be expected is relatively small compared to the available recordings. This lends itself naturally to unsupervised feature extraction using sparse auto-encoders, followed by supervised learning of a classifier for social contexts. We investigate different strategies for training these models and report results on a real-world application.
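The two-stage idea in the abstract (unsupervised feature learning on plentiful unlabeled audio, then a supervised classifier on the small labeled set) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic random feature vectors, the single hidden layer, the KL-divergence sparsity penalty, the per-label logistic classifier, and all hyperparameters (`n_hidden`, `rho`, `beta`, `lr`, `epochs`).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sparse_autoencoder(X, n_hidden=8, rho=0.05, beta=0.1,
                             lr=0.1, epochs=200, seed=0):
    """Fit a one-hidden-layer autoencoder with a KL sparsity penalty.

    Loss (assumed for this sketch):
      (1 / 2n) * sum ||x_hat - x||^2 + beta * sum_j KL(rho || rho_hat_j)
    Uses only unlabeled data X and returns the encoder parameters.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)              # hidden activations
        X_hat = H @ W2 + b2                   # linear reconstruction
        rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)
        d_out = (X_hat - X) / n               # gradient of reconstruction term
        # Gradient of the sparsity term with respect to each hidden activation.
        d_sparse = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat)) / n
        d_H = (d_out @ W2.T + d_sparse) * H * (1 - H)
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_H)
        b1 -= lr * d_H.sum(axis=0)
    return W1, b1

def encode(X, W1, b1):
    """Map inputs to the learned hidden representation."""
    return sigmoid(X @ W1 + b1)

def train_multilabel_logreg(F, Y, lr=0.5, epochs=500):
    """One independent logistic output per label, trained by gradient descent."""
    n, k = F.shape
    W = np.zeros((k, Y.shape[1]))
    b = np.zeros(Y.shape[1])
    for _ in range(epochs):
        P = sigmoid(F @ W + b)
        G = (P - Y) / n                       # gradient of mean cross-entropy
        W -= lr * (F.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b

# Synthetic stand-ins: many unlabeled frames, few labeled ones (hypothetical data).
rng = np.random.default_rng(1)
X_unlabeled = rng.normal(size=(500, 12))
X_labeled = rng.normal(size=(20, 12))
Y_labeled = (rng.random((20, 2)) > 0.5).astype(float)  # two binary context labels

W1, b1 = train_sparse_autoencoder(X_unlabeled)         # stage 1: unsupervised
F_labeled = encode(X_labeled, W1, b1)
W, b = train_multilabel_logreg(F_labeled, Y_labeled)   # stage 2: supervised
probs = sigmoid(F_labeled @ W + b)                     # per-label probabilities
```

Because the data are random, the resulting probabilities carry no meaning; the sketch only shows how the encoder is fit without labels and then reused as a fixed feature extractor for the labeled subset.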

Original language	English (US)
Title of host publication	IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Editors	Mamadou Mboup, Tulay Adali, Eric Moreau, Jan Larsen
Publisher	IEEE Computer Society
ISBN (Electronic)	9781479936946
DOIs	https://doi.org/10.1109/MLSP.2014.6958853
State	Published - Nov 14 2014
Event	2014 24th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2014 - Reims, France
Duration	Sep 21 2014 → Sep 24 2014

Publication series

Name	IEEE International Workshop on Machine Learning for Signal Processing, MLSP
ISSN (Print)	2161-0363
ISSN (Electronic)	2161-0371

Conference

Conference	2014 24th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2014
Country/Territory	France
City	Reims
Period	9/21/14 → 9/24/14

Keywords

Deep neural networks
Harmonic model
Multi-label classification

ASJC Scopus subject areas

Human-Computer Interaction
Signal Processing

Cite this

Asgari, M., Shafran, I., & Bayestehtashk, A. (2014). Inferring social contexts from audio recordings using deep neural networks. In M. Mboup, T. Adali, E. Moreau, & J. Larsen (Eds.), IEEE International Workshop on Machine Learning for Signal Processing, MLSP Article 6958853 (IEEE International Workshop on Machine Learning for Signal Processing, MLSP). IEEE Computer Society. https://doi.org/10.1109/MLSP.2014.6958853

Inferring social contexts from audio recordings using deep neural networks. / Asgari, Meysam; Shafran, Izhak; Bayestehtashk, Alireza.
IEEE International Workshop on Machine Learning for Signal Processing, MLSP. ed. / Mamadou Mboup; Tulay Adali; Eric Moreau; Jan Larsen. IEEE Computer Society, 2014. 6958853 (IEEE International Workshop on Machine Learning for Signal Processing, MLSP).

Asgari, M, Shafran, I & Bayestehtashk, A 2014, Inferring social contexts from audio recordings using deep neural networks. in M Mboup, T Adali, E Moreau & J Larsen (eds), IEEE International Workshop on Machine Learning for Signal Processing, MLSP., 6958853, IEEE International Workshop on Machine Learning for Signal Processing, MLSP, IEEE Computer Society, 2014 24th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2014, Reims, France, 9/21/14. https://doi.org/10.1109/MLSP.2014.6958853

Asgari M, Shafran I, Bayestehtashk A. Inferring social contexts from audio recordings using deep neural networks. In Mboup M, Adali T, Moreau E, Larsen J, editors, IEEE International Workshop on Machine Learning for Signal Processing, MLSP. IEEE Computer Society. 2014. 6958853. (IEEE International Workshop on Machine Learning for Signal Processing, MLSP). doi: 10.1109/MLSP.2014.6958853

Asgari, Meysam ; Shafran, Izhak ; Bayestehtashk, Alireza. / Inferring social contexts from audio recordings using deep neural networks. IEEE International Workshop on Machine Learning for Signal Processing, MLSP. editor / Mamadou Mboup ; Tulay Adali ; Eric Moreau ; Jan Larsen. IEEE Computer Society, 2014. (IEEE International Workshop on Machine Learning for Signal Processing, MLSP).

@inproceedings{6c8d302d6d3342e4bc3c4f68cc504c83,

title = "Inferring social contexts from audio recordings using deep neural networks",

abstract = "In this paper, we investigate the problem of detecting social contexts from the audio recordings of everyday life such as in life-logs. Unlike the standard corpora of telephone speech or broadcast news, these recordings have a wide variety of background noise. By nature, in such applications, it is difficult to collect and label all the representative noise for learning models in a fully supervised manner. The amount of labeled data that can be expected is relatively small compared to the available recordings. This lends itself naturally to unsupervised feature extraction using sparse auto-encoders, followed by supervised learning of a classifier for social contexts. We investigate different strategies for training these models and report results on a real-world application.",

keywords = "Deep neural networks, Harmonic model, Multi-label classification",

author = "Meysam Asgari and Izhak Shafran and Alireza Bayestehtashk",

note = "Publisher Copyright: {\textcopyright} 2014 IEEE.; 2014 24th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2014 ; Conference date: 21-09-2014 Through 24-09-2014",

year = "2014",

month = nov,

day = "14",

doi = "10.1109/MLSP.2014.6958853",

language = "English (US)",

series = "IEEE International Workshop on Machine Learning for Signal Processing, MLSP",

publisher = "IEEE Computer Society",

editor = "Mamadou Mboup and Tulay Adali and Eric Moreau and Jan Larsen",

booktitle = "IEEE International Workshop on Machine Learning for Signal Processing, MLSP",

}

TY - GEN

T1 - Inferring social contexts from audio recordings using deep neural networks

AU - Asgari, Meysam

AU - Shafran, Izhak

AU - Bayestehtashk, Alireza

PY - 2014/11/14

Y1 - 2014/11/14

N2 - In this paper, we investigate the problem of detecting social contexts from the audio recordings of everyday life such as in life-logs. Unlike the standard corpora of telephone speech or broadcast news, these recordings have a wide variety of background noise. By nature, in such applications, it is difficult to collect and label all the representative noise for learning models in a fully supervised manner. The amount of labeled data that can be expected is relatively small compared to the available recordings. This lends itself naturally to unsupervised feature extraction using sparse auto-encoders, followed by supervised learning of a classifier for social contexts. We investigate different strategies for training these models and report results on a real-world application.

KW - Deep neural networks

KW - Harmonic model

KW - Multi-label classification

UR - http://www.scopus.com/inward/record.url?scp=84912544456&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84912544456&partnerID=8YFLogxK

U2 - 10.1109/MLSP.2014.6958853

DO - 10.1109/MLSP.2014.6958853

M3 - Conference contribution

AN - SCOPUS:84912544456

T3 - IEEE International Workshop on Machine Learning for Signal Processing, MLSP

BT - IEEE International Workshop on Machine Learning for Signal Processing, MLSP

A2 - Mboup, Mamadou

A2 - Adali, Tulay

A2 - Moreau, Eric

A2 - Larsen, Jan

PB - IEEE Computer Society

T2 - 2014 24th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2014

Y2 - 21 September 2014 through 24 September 2014

ER -
