Abstract
The use of genetic programming for probabilistic
pattern matching is investigated, using a stochastic
regular expression language. The language features
statistically sound semantics, as well as a syntax that
promotes efficient manipulation by genetic programming
operators. String recognition uses an efficient algorithm
based on approaches from conventional regular
language recognition. When attempting to
recognize a particular test string, the recognition
algorithm computes the probabilities of generating that
string and all its prefixes with the given stochastic
regular expression. To promote efficiency, intermediate
computed probabilities that exceed a given cut-off
value will pre-empt particular interpretation paths,
and hence prune unconstructive interpretations. A few
experiments in recognizing stochastic regular languages
are discussed. Application of the technology in
bioinformatics is in progress.
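
The recognition step described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the expression classes (`Atom`, `Concat`, `Choice`, `Star`), the functions `derive` and `prefix_probabilities`, and the continuation probability on `Star` are all hypothetical names chosen for illustration. The sketch also assumes one particular reading of the cut-off, namely that an interpretation path is abandoned once its accumulated probability drops below the cut-off value.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple, Union


@dataclass(frozen=True)
class Atom:
    symbol: str  # literal text generated with probability 1


@dataclass(frozen=True)
class Concat:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Choice:
    # (expression, weight) pairs; weights are normalized into probabilities
    alternatives: Tuple[Tuple["Expr", float], ...]


@dataclass(frozen=True)
class Star:
    body: "Expr"
    p_continue: float  # assumed probability of performing one more iteration


Expr = Union[Atom, Concat, Choice, Star]


def derive(expr: Expr, text: str, start: int, prob: float,
           cutoff: float) -> Iterator[Tuple[int, float]]:
    """Yield (end, probability) for each way expr can generate text[start:end]."""
    if prob < cutoff:
        return  # prune: this interpretation path is already too unlikely
    if isinstance(expr, Atom):
        if text.startswith(expr.symbol, start):
            yield start + len(expr.symbol), prob
    elif isinstance(expr, Concat):
        for mid, p in derive(expr.left, text, start, prob, cutoff):
            yield from derive(expr.right, text, mid, p, cutoff)
    elif isinstance(expr, Choice):
        total = sum(weight for _, weight in expr.alternatives)
        for alt, weight in expr.alternatives:
            yield from derive(alt, text, start, prob * weight / total, cutoff)
    elif isinstance(expr, Star):
        stop = prob * (1.0 - expr.p_continue)
        if stop >= cutoff:
            yield start, stop  # stop iterating here
        step = prob * expr.p_continue
        for mid, p in derive(expr.body, text, start, step, cutoff):
            if mid > start:  # require progress so empty bodies cannot loop
                yield from derive(expr, text, mid, p, cutoff)


def prefix_probabilities(expr: Expr, text: str,
                         cutoff: float = 1e-6) -> List[float]:
    """Entry j is the probability that expr generates exactly text[:j]."""
    totals = [0.0] * (len(text) + 1)
    for end, p in derive(expr, text, 0, 1.0, cutoff):
        totals[end] += p  # sum over all surviving derivations of this prefix
    return totals


# Hypothetical expression: repeat (a with weight 3 | b with weight 1),
# continuing with probability 0.5 after each iteration.
expr = Star(Choice(((Atom("a"), 3.0), (Atom("b"), 1.0))), 0.5)
probs = prefix_probabilities(expr, "ab")
pruned = prefix_probabilities(expr, "ab", cutoff=0.1)
```

Here `probs[j]` holds the probability of generating the prefix `"ab"[:j]`, so a single pass covers the test string and all of its prefixes. Raising `cutoff` trades accuracy for speed: with `cutoff=0.1` the path that would generate the full string `"ab"` is abandoned before it completes, and its entry stays at zero.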