Techreport

Multisensor RDF Application Profile

V. Alexiev.
Multisensor Project, Ontotext Corp, (October 2016)

Abstract

The Multisensor project analyzes and extracts data from mass- and social media documents (so-called SIMMOs), covering text, images, video, speech recognition and translation across several languages. It also handles social network data, statistical data, etc. Early on, the project decided that all data exchanged between project partners (between modules inside and outside the processing pipeline) would be RDF in JSON-LD format. The final data is stored in a semantic repository and is used by various User Interface components for end-user interaction. This final data forms a corpus of semantic data over SIMMOs and is an important outcome of the project. The flexibility of the semantic web model has allowed us to accommodate a huge variety of data in the same extensible model. We use a number of ontologies for representing that data: NIF and OLIA for linguistic info, ITSRDF for NER, DBpedia and BabelNet for entities and concepts, MARL for sentiment, OA for image and cross-article annotations, W3C CUBE for statistical indicators, etc. In addition to applying existing ontologies, we extended them with the Multisensor ontology, and introduced some innovations like embedding FrameNet in NIF. The documentation of this data has been an important ongoing task. It is even more important towards the end of the project, in order to enable the efficient use of Multisensor data by external consumers. This document describes the different RDF patterns used by Multisensor, and how the data fits together. Thus it represents an "RDF Application Profile" for Multisensor. We use an example-based approach, rather than the more formal and laborious approach being standardized by the W3C RDF Shapes working group (still in development). We cover the following areas: 1. Linguistic Linked Data in NLP Interchange Format (NIF), including Part of Speech (POS), dependency parsing, sentiment, Named Entity Recognition (NER), etc. 2. Speech recognition, translation. 3. Multimedia binding and image annotation. 4. Statistical indicators and similar data. 5. Social network popularity and influence, etc.
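
As a rough illustration of the kind of data the abstract describes (a NER annotation expressed with NIF and ITSRDF and serialized as JSON-LD), the following minimal sketch builds one such document using only the Python standard library. It is not taken from the profile itself: the document URI, the sample sentence, and the helper names `entity_annotation` and `build_document` are assumptions made for this example, and the actual Multisensor patterns may differ. Only the NIF and ITSRDF namespace URIs and property names come from the published vocabularies.

```python
import json

# Namespaces of the published NIF 2.0 Core and ITS 2.0 RDF vocabularies.
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
ITSRDF = "http://www.w3.org/2005/11/its/rdf#"

# Hypothetical document URI and text, chosen only for illustration.
DOC = "http://example.org/simmo/1"
TEXT = "Ontotext is based in Sofia."


def entity_annotation(doc, text, surface, entity_uri):
    """Build a JSON-LD node for the first mention of `surface` in `text`."""
    begin = text.index(surface)  # raises ValueError if the mention is absent
    end = begin + len(surface)
    return {
        # RFC 5147-style character-offset fragment, as commonly used with NIF.
        "@id": f"{doc}#char={begin},{end}",
        "@type": "nif:Phrase",
        "nif:referenceContext": {"@id": f"{doc}#char=0,{len(text)}"},
        # Plain JSON numbers; a stricter serialization would type these
        # as xsd:nonNegativeInteger.
        "nif:beginIndex": begin,
        "nif:endIndex": end,
        "nif:anchorOf": surface,
        # Link from the mention to the entity it denotes (NER result).
        "itsrdf:taIdentRef": {"@id": entity_uri},
    }


def build_document(doc, text, mentions):
    """Wrap the full text (nif:Context) and its mentions in one JSON-LD graph."""
    context_node = {
        "@id": f"{doc}#char=0,{len(text)}",
        "@type": "nif:Context",
        "nif:isString": text,
        "nif:beginIndex": 0,
        "nif:endIndex": len(text),
    }
    nodes = [context_node]
    nodes += [entity_annotation(doc, text, s, uri) for s, uri in mentions]
    return {"@context": {"nif": NIF, "itsrdf": ITSRDF}, "@graph": nodes}


document = build_document(
    DOC, TEXT, [("Sofia", "http://dbpedia.org/resource/Sofia")]
)
print(json.dumps(document, indent=2))
```

Keeping every annotation as a separate node that points back to the full text via `nif:referenceContext` is what would let further layers (POS, sentiment, and so on) be attached to the same character offsets without changing the existing nodes.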

BibTeX key: Alexiev2016-Multisensor-profile
entry type: techreport
year: 2016
month: oct
institution: Multisensor Project, Ontotext Corp
url_source: https://github.com/VladimirAlexiev/multisensor
Document: http://rawgit2.com/VladimirAlexiev/multisensor/master/index.html

Cite this publication

@techreport{Alexiev2016-Multisensor-profile,
  abstract = {The Multisensor project analyzes and extracts data from mass- and social media documents (so-called SIMMOs), covering text, images, video, speech recognition and translation across several languages. It also handles social network data, statistical data, etc. Early on, the project decided that all data exchanged between project partners (between modules inside and outside the processing pipeline) would be RDF in JSON-LD format. The final data is stored in a semantic repository and is used by various User Interface components for end-user interaction. This final data forms a corpus of semantic data over SIMMOs and is an important outcome of the project. The flexibility of the semantic web model has allowed us to accommodate a huge variety of data in the same extensible model. We use a number of ontologies for representing that data: NIF and OLIA for linguistic info, ITSRDF for NER, DBpedia and BabelNet for entities and concepts, MARL for sentiment, OA for image and cross-article annotations, W3C CUBE for statistical indicators, etc. In addition to applying existing ontologies, we extended them with the Multisensor ontology, and introduced some innovations like embedding FrameNet in NIF. The documentation of this data has been an important ongoing task. It is even more important towards the end of the project, in order to enable the efficient use of Multisensor data by external consumers. This document describes the different RDF patterns used by Multisensor, and how the data fits together. Thus it represents an "RDF Application Profile" for Multisensor. We use an example-based approach, rather than the more formal and laborious approach being standardized by the W3C RDF Shapes working group (still in development). We cover the following areas: 1. Linguistic Linked Data in NLP Interchange Format (NIF), including Part of Speech (POS), dependency parsing, sentiment, Named Entity Recognition (NER), etc. 2. Speech recognition, translation. 3. Multimedia binding and image annotation. 4. Statistical indicators and similar data. 5. Social network popularity and influence, etc.},
  added-at = {2021-08-25T16:07:36.000+0200},
  author = {Alexiev, Vladimir},
  biburl = {https://www.bibsonomy.org/bibtex/22fd7b8d32019510ef931792ce3f0cd63/valexiev},
  institution = {Multisensor Project, Ontotext Corp},
  interhash = {d375153e459feb1c705d17295d70fdc0},
  intrahash = {2fd7b8d32019510ef931792ce3f0cd63},
  keywords = {BabelNet CUBE FrameNet ITSRDF MARL Multisensor NERD NIF NLP NLP2RDF OLIA WordNet},
  month = oct,
  timestamp = {2021-08-25T16:07:36.000+0200},
  title = {Multisensor RDF Application Profile},
  url = {http://rawgit2.com/VladimirAlexiev/multisensor/master/index.html},
  url_source = {https://github.com/VladimirAlexiev/multisensor},
  year = 2016
}
