This article introduces a novel information-theoretic approach to developing machine learning and deep learning models that comply with the guidelines and principles of trustworthy AI. A unified framework for privacy-preserving, interpretable, and transferable learning is used to study and optimize the trade-offs among the privacy, interpretability, and transferability aspects of trustworthy AI. A variational membership-mapping Bayesian model provides analytical approximations of the defined information-theoretic measures of privacy leakage, interpretability, and transferability; each measure is approximated by maximizing a lower bound via variational optimization. The approach is demonstrated through extensive experiments on benchmark datasets and on a real-world biomedical application: detecting mental stress in individuals using heart rate variability analysis.
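The abstract's core computational idea, estimating an information-theoretic quantity by maximizing a variational lower bound, can be illustrated on a toy problem. The sketch below is not the paper's membership-mapping model; it is a minimal, hedged example of the same principle using the classical Barber-Agakov bound, I(X;Y) >= H(Y) - E[-log q(y|x)], where q is a fitted decoder. For jointly Gaussian data the linear-Gaussian decoder family contains the true conditional, so maximizing the bound (here, by least squares) recovers the true mutual information. All variable names and the Gaussian setup are illustrative assumptions.

```python
import numpy as np

# Toy joint distribution: (X, Y) bivariate Gaussian with correlation rho.
rng = np.random.default_rng(0)
n, rho = 100_000, 0.8
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

# Closed-form mutual information for this distribution (nats).
true_mi = -0.5 * np.log(1 - rho**2)

# Variational decoder q(y|x) = N(a*x + b, s2), fitted by least squares;
# this maximizes the Barber-Agakov lower bound within the Gaussian family.
a, b = np.polyfit(x, y, 1)
s2 = np.var(y - (a * x + b))

# Lower bound: I(X;Y) >= H(Y) - E[-log q(y|x)]
# (differential entropy and cross-entropy of Gaussians in closed form).
h_y = 0.5 * np.log(2 * np.pi * np.e * np.var(y))
cross_ent = 0.5 * np.log(2 * np.pi * np.e * s2)
mi_lower = h_y - cross_ent

print(f"true MI = {true_mi:.3f} nats, variational lower bound = {mi_lower:.3f} nats")
```

Because the decoder family is well specified here, the bound is nearly tight; with a misspecified decoder it would remain a valid underestimate, which is what makes such bounds usable as optimization objectives.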