Discourse segmentation in aid of document summarization
B.K. Boguraev, Mary S. Neff
HICSS 2000
In this paper, we investigate the performance of grammar-based codes for sources with countably infinite alphabets. Let Λ denote an arbitrary class of stationary, ergodic sources with a countably infinite alphabet. It is shown that grammar-based codes can be modified so that they are universal with respect to any Λ if and only if there exists a universal code for Λ. Moreover, upper bounds on the worst case redundancies of grammar-based codes among large sets of length-n individual sequences from a countably infinite alphabet are established. Depending upon the conditions satisfied by length-n individual sequences, these bounds range from O(log log n/log n) to O(1/log1-α n) for some 0 < α < 1. These results complement the previous universality and redundancy results in the literature on the performance of grammar-based codes for sources with finite alphabets. © 2005 IEEE.
B.K. Boguraev, Mary S. Neff
HICSS 2000
Kafai Lai, Alan E. Rosenbluth, et al.
SPIE Advanced Lithography 2007
Inbal Ronen, Elad Shahar, et al.
SIGIR 2009
Robert G. Farrell, Catalina M. Danis, et al.
RecSys 2012