Language Models
NGrams.LanguageModel — Type

    LanguageModel(N; bos, eos, estimator=NGrams.MLE())

Create an N-gram language model, estimating probabilities with estimator.
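A minimal construction sketch, assuming the package exports LanguageModel and that bos and eos accept sentinel tokens; the "<s>" and "</s>" values below are illustrative, not documented defaults:

    using NGrams

    # Hypothetical bigram model (N = 2) with unsmoothed MLE probabilities.
    # "<s>" and "</s>" are assumed sentence-boundary sentinels.
    lm = LanguageModel(2; bos="<s>", eos="</s>", estimator=NGrams.MLE())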
Training
NGrams.fit! — Function

    NGrams.fit!(lm::LanguageModel, tokens)

Train the language model by observing a sequence of tokens.
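For example, fitting the model above on a toy token sequence (the corpus is illustrative; any sequence of tokens should work):

    # fit! observes the sequence and updates the model's n-gram counts.
    tokens = ["the", "cat", "sat", "on", "the", "mat"]
    NGrams.fit!(lm, tokens)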
Probability and Smoothing
NGrams.MLE — Type

    NGrams.MLE()

Maximum Likelihood Estimation for n-gram language modeling.
NGrams.AddK — Type

    NGrams.AddK(k::Number)

Add-k probability smoothing for n-gram language modeling.
NGrams.Laplace — Type

    NGrams.Laplace()

Laplace (add-1) smoothing for n-gram language modeling.
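Add-k smoothing conventionally estimates P(w | h) as (C(h, w) + k) / (C(h) + k · |V|), where C counts occurrences and |V| is the vocabulary size; Laplace smoothing is the k = 1 special case. A sketch of plugging each estimator into the constructor (keyword values are illustrative):

    # The estimator keyword selects how probabilities are estimated.
    lm_mle     = LanguageModel(2; bos="<s>", eos="</s>", estimator=NGrams.MLE())
    lm_addk    = LanguageModel(2; bos="<s>", eos="</s>", estimator=NGrams.AddK(0.5))
    lm_laplace = LanguageModel(2; bos="<s>", eos="</s>", estimator=NGrams.Laplace())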
NGrams.LinearInterpolation — Type

    LinearInterpolation(λ)

Linear interpolation for probability smoothing in n-gram language modeling.

λ should be a vector or tuple of linear coefficients for smoothing the model. The coefficients are ordered by decreasing n-gram order: the first element is the weight for the full n-gram model (no backoff), and the final element is the weight for the unigram model.
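For a trigram model, for example, λ has three entries, ordered from the trigram weight down to the unigram weight; the values below are illustrative and should sum to 1:

    # λ = (trigram weight, bigram weight, unigram weight).
    interp = NGrams.LinearInterpolation((0.6, 0.3, 0.1))
    lm3 = LanguageModel(3; bos="<s>", eos="</s>", estimator=interp)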
NGrams.AbsoluteDiscounting — Type

    NGrams.AbsoluteDiscounting(d::Number)

Absolute discounting for n-gram language modeling.
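A sketch with a typical discount; 0.75 is a common choice in the literature, not a package default:

    # Subtract a fixed discount d from each observed n-gram count,
    # freeing probability mass for unseen events.
    lm_ad = LanguageModel(2; bos="<s>", eos="</s>",
                          estimator=NGrams.AbsoluteDiscounting(0.75))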
Sampling from a Language Model
NGrams.sample — Function

    NGrams.sample([rng::AbstractRNG,] lm, [vocabulary])

Sample a single token from the language model.
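A sketch of both call forms; MersenneTwister comes from Julia's standard Random library, and passing an explicit RNG makes the draw reproducible:

    using Random

    token = NGrams.sample(lm)                       # uses the global RNG
    token = NGrams.sample(MersenneTwister(42), lm)  # reproducible draw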
NGrams.generate — Function

    NGrams.generate(lm, num_words=1, text_seed=[])

Randomly generate num_words tokens from the language model.
If text_seed is provided, output is conditioned on that history. The seed is included in the return value and counts against num_words.
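For example, generating ten tokens conditioned on a two-token seed (the seed is illustrative):

    # The two seed tokens count against num_words, so eight new tokens
    # are generated and the seed is echoed at the front of the result.
    text = NGrams.generate(lm, 10, ["the", "cat"])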