dc.contributor.author BOBICEV, Victoria
dc.contributor.author POPESCU, Anatol
dc.contributor.author ZIDRAŞCO, Tatiana
dc.date.accessioned 2019-11-12T10:23:03Z
dc.date.available 2019-11-12T10:23:03Z
dc.date.issued 2005
dc.identifier.citation BOBICEV, Victoria, POPESCU, Anatol, ZIDRAŞCO, Tatiana. Statistical models of language and Zipf’s law. In: Microelectronics and Computer Science: proc. of the 4th intern. conf., September 15-17, 2005. Chişinău, 2005, vol. 2, pp. 133-136. ISBN 9975-66-038-X. en_US
dc.identifier.isbn 9975-66-038-X
dc.identifier.uri http://repository.utm.md/handle/5014/6693
dc.description.abstract Statistical models based on the words of a text have become very widespread in recent years. Estimating the words never seen in a corpus is one of the subtasks of word probability estimation. Attempts to find the number of never-seen words using Zipf’s formula give rather large values. In several experiments we observed that the number of words never seen in the corpus is proportional to the number of words seen only once and depends on the text vocabulary. If subsequent texts are of the same type as the corpus, the estimate of never-seen words is fairly adequate. But if subsequent texts differ from the corpus, the number of never-seen words can either increase or decrease considerably. en_US
dc.language.iso en en_US
dc.publisher Technical University of Moldova en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject Zipf law en_US
dc.subject statistical language modelling en_US
dc.subject statistical models en_US
dc.subject zero frequency en_US
dc.title Statistical models of language and Zipf’s law en_US
dc.type Article en_US



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States
