Q: How is the indexing performed?
A: Indexing is the process of creating a Conceptual Fingerprint from a text. In Collexis, the automated indexing mechanism performs the following steps on the text: it removes stop words, normalizes the text, selects concepts by comparison with the thesaurus, clusters the concepts, and attaches a relative weight to each concept. The weights are assigned by a set of algorithms that measure the specificity, similarity, and frequency of the concepts.
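The indexing steps described above can be illustrated with a minimal Python sketch. It is not the Collexis implementation: the stop-word list, the thesaurus entries, the concept identifiers, and the function names are all hypothetical, the clustering step is omitted, and the weight is reduced to relative frequency only (the real weighting also takes specificity and similarity into account).

```python
import re
from collections import Counter

# Hypothetical stop words and thesaurus entries, for illustration only.
STOP_WORDS = {"the", "of", "a", "an", "in", "and", "is", "with", "to"}
THESAURUS = {
    "heart attack": "C_MYOCARDIAL_INFARCTION",
    "myocardial infarction": "C_MYOCARDIAL_INFARCTION",
    "aspirin": "C_ASPIRIN",
}


def normalize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def fingerprint(text, max_term_len=2):
    """Return a {concept_id: relative_weight} mapping for the text."""
    # Steps 1 and 2: normalize the text and remove stop words.
    tokens = [t for t in normalize(text) if t not in STOP_WORDS]

    # Step 3: select concepts by comparison with the thesaurus,
    # preferring the longest thesaurus term that starts at position i.
    counts = Counter()
    i = 0
    while i < len(tokens):
        for n in range(max_term_len, 0, -1):
            term = " ".join(tokens[i:i + n])
            if term in THESAURUS:
                counts[THESAURUS[term]] += 1
                i += n
                break
        else:
            i += 1  # no thesaurus term starts here; skip this token

    # Step 4 (simplified assumption): weight = relative frequency.
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}
```

In this sketch, "heart attack" and "myocardial infarction" both count toward the same concept, so a text that uses either term ends up with the same entry in its fingerprint.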
Q: How does Collexis generate its search results?
A: Collexis employs vector matching: it compares a search query with the Fingerprints of the records in a Collexion. The outcome is an accurate and relevant list of records, which can be content items and/or experts. A query can also be over-specified (i.e., a considerable piece of text is used as the query), which adds context to it. This context helps the system improve the accuracy of the query and return references to content items that are contextually related. The system administrator can enlarge or reduce the set of returned documents by entering a threshold that indicates the minimum “distance” between the records returned and the query. A search query can be matched against the records of multiple Collexions at the same time.
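As a rough illustration of vector matching with a threshold, the sketch below compares a query fingerprint with record fingerprints from several Collexions. It makes assumptions that the text above does not confirm: fingerprints are modeled as {concept: weight} dictionaries, cosine similarity stands in for the unspecified matching measure, and the threshold is expressed as a minimum similarity rather than a distance. The names `cosine`, `search`, and `min_similarity` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse {concept: weight} vectors."""
    dot = sum(w * b.get(c, 0.0) for c, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_fp, collexions, min_similarity=0.5):
    """Match a query fingerprint against every record in every Collexion.

    collexions maps a Collexion name to {record_id: fingerprint}.
    Returns (score, collexion_name, record_id) tuples, best match first.
    """
    hits = []
    for name, records in collexions.items():
        for record_id, record_fp in records.items():
            score = cosine(query_fp, record_fp)
            # Assumed threshold semantics: keep records at least this similar.
            if score >= min_similarity:
                hits.append((score, name, record_id))
    return sorted(hits, reverse=True)
```

Raising `min_similarity` shrinks the result set and lowering it enlarges the set, which mirrors the administrator-controlled threshold described above. A longer query text would simply produce a fingerprint with more concepts, giving the comparison more context to work with.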
Q: What makes Collexis different?
A: First and foremost, Collexis differs from full-text search engines in that it uses thesauri for information retrieval. The high-quality search is based on semantics that have been defined in a thesaurus or ontology: synonymous terms and terms in different languages are linked to a single concept. Hierarchical relations between concepts, links between definitions and terms, and other semantic relationships are utilized in the search applications. This process helps to highlight those terms most relevant to the searcher’s query.
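The idea of linking synonyms and translations to one concept, and of using hierarchical relations, can be sketched as follows. The concept identifiers, the term table, the `BROADER` relation, and both function names are hypothetical; a real thesaurus or ontology would be far larger and would carry more relationship types.

```python
# Hypothetical thesaurus: several terms, including a Dutch one,
# point to the same concept identifier.
TERM_TO_CONCEPT = {
    "heart attack": "C001",
    "myocardial infarction": "C001",
    "hartaanval": "C001",  # Dutch term for "heart attack"
    "heart disease": "C000",
}

# Hypothetical hierarchy: each concept maps to its broader (parent) concept.
BROADER = {"C001": "C000"}


def concept_for(term):
    """Look up the concept for a term, ignoring case and outer whitespace."""
    return TERM_TO_CONCEPT.get(term.strip().lower())


def with_ancestors(concept):
    """Return the concept followed by its broader concepts, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = BROADER.get(concept)
    return chain
```

Because all three surface terms resolve to `C001`, a search phrased with any of them would address the same concept, and `with_ancestors("C001")` shows how a search application could also reach the broader concept `C000`.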