Author of the publication


Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation.

S. Wu, X. Chang, G. Wichern, J. weon Jung, F. Germain, J. Roux, and S. Watanabe. CoRR, (2023)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some publications already assigned to that person.

Shinji Koiwa

Wataru Watanabe

Shinji Maeda

Rie Watanabe

Norihiro Watanabe

Other publications of authors with the same name

Bayesian Speech and Language Processing. S. Watanabe, and J. Chien. Cambridge University Press, (2015)

High-accuracy user identification using EEG biometrics. T. Koike-Akino, R. Mahajan, T. Marks, Y. Wang, S. Watanabe, O. Tuzel, and P. Orlik. EMBC, page 854-858. IEEE, (2016)

Structural Bayesian Linear Regression for Hidden Markov Models. S. Watanabe, A. Nakamura, and B. Juang. J. Signal Process. Syst., 74 (3): 341-358 (2014)

Speech Recognition Based on Student's t-Distribution Derived from Total Bayesian Framework. S. Watanabe, and A. Nakamura. IEICE Trans. Inf. Syst., 89-D (3): 970-980 (2006)

Language independent end-to-end architecture for joint language identification and speech recognition. S. Watanabe, T. Hori, and J. Hershey. ASRU, page 265-271. IEEE, (2017)

Effectiveness of discriminative training and feature transformation for reverberated and noisy speech. Y. Tachioka, S. Watanabe, and J. Hershey. ICASSP, page 6935-6939. IEEE, (2013)

Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks. H. Erdogan, J. Hershey, S. Watanabe, and J. Roux. ICASSP, page 708-712. IEEE, (2015)

Bag Of ARCS: New representation of speech segment features based on finite state machines. S. Watanabe, Y. Kubo, T. Oba, T. Hori, and A. Nakamura. ICASSP, page 4201-4204. IEEE, (2012)

End-to-end Speech Recognition With Word-Based Rnn Language Models. T. Hori, J. Cho, and S. Watanabe. SLT, page 389-396. IEEE, (2018)

Application of topic tracking model to language model adaptation and meeting analysis. S. Watanabe, T. Iwata, T. Hori, A. Sako, and Y. Ariki. SLT, page 378-383. IEEE, (2010)


Disambiguation of "Watanabe, Shinji"
