Attended by over 70 participants, the second Workshop on Knowledge Graphs for Social Good (KG4SG) featured an amazing variety of applications of knowledge graphs as well as lively discussion by our speakers and panel of experts.
The purpose of these datasets is to support equivalence and subsumption ontology matching. There are five ontology pairs extracted from MONDO and UMLS:

Source    Ontology Pair    Category
MONDO     OMIM-ORDO        Disease
MONDO     NCIT-DOID        Disease
UMLS      SNOMED-FMA       Body
UMLS      SNOMED-NCIT      Pharm
UMLS      SNOMED-NCIT      Neoplas

Each pair is associated with three folders: "raw_data", "equiv_match", and "subs_match", corresponding to the downloaded source ontologies, the package for equivalence matching, and the package for subsumption matching. See detailed documentation at: https://krr-oxford.github.io/DeepOnto/#/om_resources. See the incoming OAEI Bio-ML track at: https://www.cs.ox.ac.uk/isg/projects/ConCur/oaei/. See our resource paper at: https://arxiv.org/abs/2205.03447.
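As a rough illustration of the layout described above, the following Python sketch enumerates the three folders expected for each of the five ontology pairs. The directory naming scheme (`<root>/<source>/<pair>_<category>/<folder>`) and the helper name `expected_folders` are assumptions made for this example only; the actual layout of the released files may differ, so check the linked documentation.

```python
from pathlib import Path

# The five ontology pairs listed above: (source, ontology pair, category).
ONTOLOGY_PAIRS = [
    ("MONDO", "OMIM-ORDO", "Disease"),
    ("MONDO", "NCIT-DOID", "Disease"),
    ("UMLS", "SNOMED-FMA", "Body"),
    ("UMLS", "SNOMED-NCIT", "Pharm"),
    ("UMLS", "SNOMED-NCIT", "Neoplas"),
]

# The three folders associated with every pair.
SUBFOLDERS = ("raw_data", "equiv_match", "subs_match")


def expected_folders(root):
    """Return the folder paths expected under `root`.

    The path pattern used here is hypothetical; it only illustrates
    that each pair comes with the same three folders.
    """
    paths = []
    for source, pair, category in ONTOLOGY_PAIRS:
        pair_dir = Path(root) / source / f"{pair}_{category}"
        for sub in SUBFOLDERS:
            paths.append(pair_dir / sub)
    return paths


folders = expected_folders("bio_ml_data")
missing = [p for p in folders if not p.is_dir()]
print(f"{len(folders)} folders expected, {len(missing)} not found locally")
```

A check like this can be run after downloading to confirm that every pair has its raw ontologies and both matching packages in place.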
Much of the knowledge and information needed for enabling high-quality clinical research is stored in free-text format. Natural language processing (NLP) has been used to extract information from these sources at scale for several decades. This paper aims to present a comprehensive review of clinical NLP for the past 15 years in the UK to identify the community, depict its evolution, analyse methodologies and applications, and identify the main barriers. We collect a dataset of clinical NLP projects (n = 94; £ = 41.97 m) funded by UK funders or the European Union’s funding programmes. Additionally, we extract details on 9 funders, 137 organisations, 139 persons and 431 research papers. Networks are created from timestamped data interlinking all entities, and network analysis is subsequently applied to generate insights. 431 publications are identified as part of a literature review, of which 107 are eligible for final analysis. Results show that, not surprisingly, clinical NLP in the UK has increased substantially in the last 15 years: the total budget in the period of 2019–2022 was 80 times that of 2007–2010. However, effort is required to deepen areas such as disease (sub-)phenotyping and broaden application domains. There is also a need to improve links between academia and industry and enable deployments in real-world settings for the realisation of clinical NLP’s great potential in care delivery. The major barriers include research and development access to hospital data, lack of capable computational resources in the right places, the scarcity of labelled data and barriers to sharing of pretrained models.
Implementation and demo of explainable coding of clinical notes with Hierarchical Label-wise Attention Networks (HLAN) - acadTags/Explainable-Automated-Medical-Coding
We propose a novel attention network for document annotation with user-generated tags. The network is designed according to the human reading and annotation behaviour. Usually, users try to digest the title and obtain a rough idea about the topic first, and then read the content of the document. Present research shows that the title metadata could largely affect the social annotation. To better utilise this information, we design a framework that separates the title from the content of a document and applies a title-guided attention mechanism over each sentence in the content. We also propose two semantic-based loss regularisers that enforce the output of the network to conform to label semantics, i.e. similarity and subsumption. We analyse each part of the proposed system with two real-world open datasets on publication and question annotation. The integrated approach, Joint Multi-label Attention Network (JMAN), significantly outperformed the Bidirectional Gated Recurrent Unit (Bi-GRU) by around 13%-26% and the Hierarchical Attention Network (HAN) by around 4%-12% on both datasets, with around 10%-30% reduction of training time.
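The idea of title-guided attention over content sentences can be sketched in a few lines of Python. This is a minimal illustration, not the JMAN implementation: the abstract does not specify the scoring function, so plain dot-product scoring is assumed here, and the function names and toy vectors are made up for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def title_guided_attention(title_vec, sentence_vecs):
    """Weight each sentence by its relevance to the title.

    Assumption: relevance is the dot product between the title vector
    and the sentence vector. A trained model would typically use
    learned projections instead.
    """
    scores = [
        sum(t * s for t, s in zip(title_vec, sent)) for sent in sentence_vecs
    ]
    weights = softmax(scores)
    dim = len(title_vec)
    # Content representation: attention-weighted sum of sentence vectors.
    content_vec = [
        sum(w * sent[i] for w, sent in zip(weights, sentence_vecs))
        for i in range(dim)
    ]
    return weights, content_vec


# Toy example: the first sentence is closest to the title.
title = [1.0, 0.0]
sentences = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, content = title_guided_attention(title, sentences)
print([round(w, 3) for w in weights])
```

Sentences whose vectors align with the title receive larger weights, so they contribute more to the document representation used for tag prediction.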
M. Falis, H. Dong, A. Birch, and B. Alex. Proceedings of the 21st Workshop on Biomedical Language Processing, page 389--401. Dublin, Ireland, Association for Computational Linguistics, (May 2022)
M. Falis, H. Dong, A. Birch, and B. Alex. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, page 907--912. Online and Punta Cana, Dominican Republic, Association for Computational Linguistics, (November 2021)
H. Dong, V. Suárez-Paniagua, W. Whiteley, and H. Wu. (2020). arXiv:2010.15728. Comment: Structured abstract in full text, 17 pages, 5 figures, 4 supplementary materials (3 extra pages), submitted to Journal of Biomedical Informatics.
H. Dong, W. Wang, K. Huang, and F. Coenen. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), page 1348--1354. Minneapolis, Minnesota, Association for Computational Linguistics, (June 2019)
Y. Chen, H. Dong, and W. Wang. Proceedings of the 2018 International Conference on Data Science and Information Technology, page 138--143. New York, NY, USA, ACM, (2018)
J. Kulshrestha. Proceedings of the 19th ACM Conference on Computer Supported Cooperative Work and Social Computing Companion, page 159--162. New York, NY, USA, ACM, (2016)
J. Lorince, K. Joseph, and P. Todd. Social Computing, Behavioral-Cultural Modeling, and Prediction, volume 9021 of Lecture Notes in Computer Science, Springer International Publishing, (2015)
S. Doerfel, D. Zoller, P. Singer, T. Niebler, A. Hotho, and M. Strohmaier. Proceedings of the 16th LWA Workshops: KDML, IR and FGWM, Aachen, Germany, September 8-10, 2014, volume 1226 of CEUR Workshop Proceedings, page 18--19. CEUR-WS.org, (2014)
C. Wagner, P. Singer, M. Strohmaier, and B. Huberman. Proceedings of the 23rd International Conference on World Wide Web, page 735--746. Republic and Canton of Geneva, Switzerland, International World Wide Web Conferences Steering Committee, (2014)
P. Heymann, D. Ramage, and H. Garcia-Molina. Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, page 531--538. New York, NY, USA, ACM, (2008)
R. Jäschke, B. Krause, A. Hotho, and G. Stumme. Proceedings of the Second International Conference on Weblogs and Social Media (ICWSM 2008), page 192--193. Menlo Park, CA, USA, AAAI Press, (2008)
A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. Proceedings of the First International Conference on Semantic and Digital Media Technologies, page 56--70. Berlin, Heidelberg, Springer-Verlag, (2006)