Stanford CoreNLP provides a set of human language technology tools. It can give the base forms of words and their parts of speech; recognize names of companies, people, and other entities; normalize dates, times, and numeric quantities; mark up the structure of sentences in terms of phrases and syntactic dependencies; indicate which noun phrases refer to the same entities; indicate sentiment; extract particular or open-class relations between entity mentions; and extract the quotes people said.
A dependency parser analyzes the grammatical structure of a sentence, establishing relationships between "head" words and words which modify those heads.
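The head/modifier relationships described above can be stored as one head index per token. The following is a minimal sketch of that data structure, not the output of any particular parser: the sentence, the head indices, the relation labels, and the helper name `modifiers_by_head` are all hand-written assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical hand-annotated parse of "The cat chased a mouse".
# HEADS[i] is the 1-based position of the head of token i;
# 0 marks the root of the sentence (it has no head word).
TOKENS = ["The", "cat", "chased", "a", "mouse"]
HEADS = [2, 3, 0, 5, 3]
RELATIONS = ["det", "nsubj", "root", "det", "obj"]


def modifiers_by_head(tokens, heads):
    """Map each head word to the list of words that modify it."""
    result = defaultdict(list)
    for token, head in zip(tokens, heads):
        if head != 0:  # the root token modifies nothing
            result[tokens[head - 1]].append(token)
    return dict(result)


if __name__ == "__main__":
    for head, modifiers in modifiers_by_head(TOKENS, HEADS).items():
        print(head, "<-", ", ".join(modifiers))
```

In this encoding the verb "chased" is the root, and both "cat" and "mouse" attach to it as modifiers.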
Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
JaMoPP is a set of Eclipse plug-ins that can be used to parse Java source code into EMF-based models and vice versa. JaMoPP consists of:
a complete Java5 Ecore Metamodel,
a complete Java5 EMFText Syntax, and
an implementation of Java5's static semantics analysis.
Through JaMoPP, every Java program can be processed like any other EMF model. JaMoPP therefore bridges the gap between modelling and Java programming. It enables the application of arbitrary EMF-based tools to full Java programs. Since JaMoPP is developed through metamodelling and code generation, extending Java and embedding Java into other modelling languages is now possible using standard metamodelling techniques and tools. To ensure its quality, JaMoPP has been tested successfully on a large code base.
YAPP XSLT is a lexical scanner and recursive descent parser generator, implemented in XSLT. No language extensions or non-standard features are used apart from the nodeset() function. Grammars are expressed in XML form and transformed by the generator stylesheet into another XSLT. A lexical scanner may also be generated from the same grammar.
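To illustrate the kind of code that a scanner and recursive descent parser generator produces, here is a hand-written sketch in Python rather than XSLT. It is not YAPP output and does not use YAPP's grammar format; the toy arithmetic grammar and the names `tokenize`, `Parser`, and `evaluate` are assumptions made for this example. Each grammar rule becomes one method that calls the methods for the rules it refers to.

```python
import re

# Lexical scanner: a token is either a run of digits or any single character.
TOKEN_PATTERN = re.compile(r"\s*(?:(\d+)|(.))")


def tokenize(text):
    """Split the input into ("NUM", int) and ("OP", str) tokens."""
    tokens = []
    for number, operator in TOKEN_PATTERN.findall(text):
        if number:
            tokens.append(("NUM", int(number)))
        elif operator.strip():  # ignore stray whitespace matches
            tokens.append(("OP", operator))
    return tokens


class Parser:
    """Recursive descent parser for the toy grammar:
    expr   := term (("+" | "-") term)*
    term   := factor ("*" factor)*
    factor := NUM | "(" expr ")"
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("EOF", None)

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            _, op = self.advance()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        value = self.factor()
        while self.peek() == ("OP", "*"):
            self.advance()
            value *= self.factor()
        return value

    def factor(self):
        kind, value = self.advance()
        if kind == "NUM":
            return value
        if (kind, value) == ("OP", "("):
            inner = self.expr()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return inner
        raise SyntaxError(f"unexpected token: {value!r}")


def evaluate(text):
    """Scan and parse the text, returning the value of the expression."""
    parser = Parser(tokenize(text))
    result = parser.expr()
    if parser.peek()[0] != "EOF":
        raise SyntaxError("trailing input")
    return result
```

For example, `evaluate("1 + 2 * (3 - 1)")` walks `expr`, `term`, and `factor` in turn, so multiplication binds tighter than addition without any explicit precedence table.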
Since programmers often build task-specific tools, one way to make them more productive is to give them better tool-making tools. When tools take the form of program generators, this idea leads to libraries for creating languages that are directly extensible. Programmers may even be encouraged to think about a problem in terms of a language that would better support the task. This approach is sometimes called language-oriented programming.
D. Mollá and B. Hutchinson. Proceedings of the EACL 2003 Workshop on Evaluation Initiatives in Natural Language Processing: are evaluation methods, metrics and resources reusable?, pp. 43--50. Association for Computational Linguistics, (2003)