Randomized Language Models via Perfect Hash Functions
D. Talbot and T. Brants. Proceedings of ACL-08: HLT, pp. 505--513. Columbus, Ohio, Association for Computational Linguistics, (June 2008)
Abstract
We propose a succinct randomized language model which employs a perfect hash function to encode fingerprints of n-grams and their associated probabilities, backoff weights, or other parameters. The scheme can represent any standard n-gram model and is easily combined with existing model reduction techniques such as entropy-pruning. We demonstrate the space-savings of the scheme via machine translation experiments within a distributed language modeling framework.
%0 Conference Paper
%1 talbot-brants:2008:ACLMain
%A Talbot, David
%A Brants, Thorsten
%B Proceedings of ACL-08: HLT
%C Columbus, Ohio
%D 2008
%I Association for Computational Linguistics
%K compression hash lm lossy perfect randomized
%P 505--513
%T Randomized Language Models via Perfect Hash Functions
%U http://www.aclweb.org/anthology/P/P08/P08-1058
%X We propose a succinct randomized language model which employs a perfect hash function to encode fingerprints of n-grams and their associated probabilities, backoff weights, or other parameters. The scheme can represent any standard n-gram model and is easily combined with existing model reduction techniques such as entropy-pruning. We demonstrate the space-savings of the scheme via machine translation experiments within a distributed language modeling framework.
@inproceedings{talbot-brants:2008:ACLMain,
abstract = {We propose a succinct randomized language model which employs a perfect hash function to encode fingerprints of n-grams and their associated probabilities, backoff weights, or other parameters. The scheme can represent any standard n-gram model and is easily combined with existing model reduction techniques such as entropy-pruning. We demonstrate the space-savings of the scheme via machine translation experiments within a distributed language modeling framework.},
added-at = {2008-11-26T10:47:05.000+0100},
address = {Columbus, Ohio},
author = {Talbot, David and Brants, Thorsten},
biburl = {https://www.bibsonomy.org/bibtex/23b4fd5433926f7a34f060d8f92f57394/jjv},
booktitle = {Proceedings of ACL-08: HLT},
interhash = {2cd29903c29c40ca1bfeb6549176eb3e},
intrahash = {3b4fd5433926f7a34f060d8f92f57394},
keywords = {compression hash lm lossy perfect randomized},
month = {June},
pages = {505--513},
publisher = {Association for Computational Linguistics},
timestamp = {2008-11-26T10:47:06.000+0100},
title = {Randomized Language Models via Perfect Hash Functions},
url = {http://www.aclweb.org/anthology/P/P08/P08-1058},
year = 2008
}