There are currently few datasets appropriate for training and evaluating models for non-goal-oriented dialogue systems (chatbots). Equally problematic, there is no standard procedure for evaluating such models beyond the classic Turing test.
The aim of our competition is therefore to establish a concrete scenario for testing chatbots that aim to engage humans, and for that scenario to become a standard evaluation tool that makes such systems directly comparable.
The Natural Language Decathlon (decaNLP) is a new benchmark for studying general NLP models that can perform a variety of complex, natural language tasks.
R. Jäschke, A. Hotho, F. Mitzlaff, and G. Stumme. Recommender Systems for the Social Web, volume 32 of Intelligent Systems Reference Library, Springer, Berlin/Heidelberg, (2012)
A. Hotho, D. Benz, R. Jäschke, and B. Krause (Eds.) Workshop at 18th Europ. Conf. on Machine Learning (ECML'08) / 11th Europ. Conf. on Principles and Practice of Knowledge Discovery in Databases (PKDD'08), (2008)