Abstract
In this paper, we present an approach to term classification based
on verb selectional patterns (VSPs), where such a pattern is defined
as a set of semantic classes that could be used in combination with
a given domain-specific verb. VSPs are learnt automatically from
a corpus and an ontology in the biomedical domain. Prior to the
learning phase, the corpus is terminologically
processed: term recognition is performed by both looking up the dictionary
of terms listed in the ontology and applying the C/NC-value method
for on-the-fly term extraction. Subsequently, domain-specific verbs
are automatically identified in the corpus based on their frequency
of occurrence and the frequency of their co-occurrence with terms.
VSPs are then learnt automatically for these verbs. Two machine learning
approaches are presented. The first is implemented
as an iterative generalisation procedure based on a partial order
relation induced by the domain-specific ontology. The second
is based on genetic algorithms. Once the VSPs are acquired,
they can be used to classify newly recognised terms co-occurring
with domain-specific verbs. Given a term, the most frequently co-occurring
domain-specific verb is selected. Its VSP is used to constrain the
search space by focusing on potential classes of the given term.
A nearest-neighbour approach is then applied to select a class from
the constrained space of candidate classes. The most similar candidate
class is predicted for the given term. The similarity measure used
for this purpose combines contextual, lexical, and syntactic properties
of terms.
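
The classification step described above (pick the most frequently
co-occurring domain-specific verb, restrict the candidate classes to that
verb's VSP, then take the nearest neighbour) can be sketched as follows.
This is a minimal illustration and not the authors' implementation: the
function names, the data layout, and the toy token-overlap similarity are
all assumptions made for the example. In particular, the similarity measure
in the paper combines contextual, lexical, and syntactic properties, whereas
the stand-in here is purely lexical.

```python
from collections import Counter


def token_overlap(term_a, term_b):
    """Toy lexical similarity (assumption): Jaccard overlap of word tokens."""
    a, b = set(term_a.split()), set(term_b.split())
    return len(a & b) / len(a | b) if a | b else 0.0


def classify_term(term, cooccurrences, vsps, class_members,
                  similarity=token_overlap):
    """Predict a class for `term`, or return None if no verb evidence exists.

    All parameters are hypothetical structures chosen for this sketch:
    cooccurrences: dict mapping a term to a Counter of verb -> count
    vsps:          dict mapping a domain-specific verb to a set of classes
    class_members: dict mapping a class to a list of already classified terms
    """
    verb_counts = cooccurrences.get(term, Counter())
    # Only verbs for which a VSP has been learnt can constrain the search.
    usable = {v: c for v, c in verb_counts.items() if v in vsps}
    if not usable:
        return None

    # Step 1: the most frequently co-occurring domain-specific verb.
    verb = max(usable, key=usable.get)

    # Step 2: its VSP constrains the space of candidate classes.
    candidate_classes = vsps[verb]

    # Step 3: nearest neighbour over known members of the candidate classes.
    best_class, best_score = None, float("-inf")
    for cls in sorted(candidate_classes):  # sorted for deterministic ties
        for member in class_members.get(cls, []):
            score = similarity(term, member)
            if score > best_score:
                best_class, best_score = cls, score
    return best_class
```

With this layout, a class outside the selected verb's VSP is never
considered, even if one of its members happens to be lexically similar to
the new term.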