Abstract
Summarization research is notorious for its lack of adequate corpora: today, there exist only a few small collections of texts whose units have been manually annotated for textual importance. Given the cost and tediousness of the annotation process, it is very unlikely that we will ever manually annotate for textual importance sufficiently large corpora of texts. To circumvent this problem, we have developed an algorithm that constructs such corpora automatically. Our algorithm takes as input an <Abstract, Text> tuple and generates the corresponding Extract, i.e., the set of clauses/sentences in the Text that were used to write the Abstract. The performance of the algorithm is shown to be close to that of humans by means of an empirical experiment. The experiment also suggests extraction strategies that could improve the performance of automatic summarization systems.
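
The abstract states only the algorithm's input and output, not how it works internally. As a rough illustration of the <Abstract, Text> to Extract mapping, the sketch below uses one plausible strategy: it greedily drops text units as long as doing so makes the remaining text more similar to the abstract. The bag-of-words representation, the cosine measure, the stopping rule, and the names `bag_of_words`, `cosine`, and `build_extract` are all assumptions made for this example and should not be read as the authors' method.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a deliberately simple text representation."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_extract(abstract, units):
    """Greedily drop units while similarity to the abstract improves.

    `units` is a list of clauses or sentences from the Text (a hypothetical
    input format). Returns the indices of the units kept as the Extract.
    """
    target = bag_of_words(abstract)
    kept = list(range(len(units)))

    def score(indices):
        joined = " ".join(units[i] for i in indices)
        return cosine(bag_of_words(joined), target)

    best = score(kept)
    while len(kept) > 1:
        # Try removing each remaining unit; pick the removal that helps most.
        candidate_score, candidate = max(
            (score([i for i in kept if i != j]), j) for j in kept
        )
        if candidate_score <= best:
            break  # no single removal improves similarity any further
        best = candidate_score
        kept.remove(candidate)
    return kept
```

Calling `build_extract(abstract, units)` with an abstract string and a list of sentences returns the positions of the sentences retained. A realistic system would need proper clause segmentation and a better similarity measure than raw word overlap.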

