trlda.models.OnlineLDA

Class OnlineLDA

Inheritance: object -> Distribution -> LDA -> OnlineLDA

An implementation of an online trust region method for latent Dirichlet allocation.

>>> from trlda.models import OnlineLDA
>>> model = OnlineLDA(num_words=7000, num_topics=100, num_documents=10000, alpha=.1, eta=.3)

alpha can be a scalar or an array with one entry for each topic.
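The two accepted forms of alpha can be sketched as follows. This is a minimal illustration that assumes NumPy arrays are an acceptable array type; the specific values are hypothetical, and the constructor call is left as a comment because it requires trlda to be installed.

```python
import numpy as np

num_topics = 100

# Scalar form: a single value for alpha.
alpha_scalar = 0.1

# Array form: one entry per topic. Here every topic gets the same value.
alpha_uniform = np.full(num_topics, alpha_scalar)

# Array form with a different value per topic (illustrative values only).
alpha_decaying = np.linspace(0.5, 0.05, num_topics)

# Either form would then be passed as the alpha argument, for example:
# model = OnlineLDA(num_words=7000, num_topics=num_topics,
#                   num_documents=10000, alpha=alpha_decaying, eta=.3)
print(alpha_uniform.shape, alpha_decaying.shape)
```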

Instance Methods

do_e_step(...)
Alias for update_variables. (Inherited from trlda.models.LDA)


lower_bound(docs, num_documents=-1, inference_method='VI', max_iter=100, num_samples=1, burn_in=2)
Estimates the lower bound, $\mathcal{L}(\boldsymbol{\lambda})$, for the given set of documents. Returns a float. (Inherited from trlda.models.LDA)


sample(num_documents, length)
Samples a specified number of documents from the model. Returns a list. (Inherited from trlda.models.LDA)


update_parameters(docs, max_iter_tr=10, max_iter_inference=20, kappa=.7, tau=100, **kwargs)
Updates beliefs over parameters. Returns a float.


update_variables(docs, latents=None, inference_method='VI', max_iter=100, threshold=0.001, num_samples=1, burn_in=2)
Computes beliefs over topic assignments ($z_{di}$) for the given documents. Returns a tuple. (Inherited from trlda.models.LDA)


Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __init__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Properties

alpha
Controls Dirichlet prior over topic weights, $\theta_k$. (Inherited from trlda.models.LDA)

eta
Controls Dirichlet prior over topics, $\beta_{ki}$. (Inherited from trlda.models.LDA)

lambdas
Parameters governing beliefs over topics, $\beta_{ki}$. (Inherited from trlda.models.LDA)

num_documents
Number of documents in the complete dataset.

num_topics
Number of topics. (Inherited from trlda.models.LDA)

num_words
Number of words. (Inherited from trlda.models.LDA)

update_count
Number of parameter updates.

Inherited from object: __class__

Method Details

update_parameters(docs, max_iter_tr=10, max_iter_inference=20, kappa=.7, tau=100, **kwargs)


Updates beliefs over parameters.

Set max_iter_tr to zero to perform the standard natural gradient step of stochastic variational inference; in that case, consider increasing max_iter_inference.

By default, the learning rate is automatically set to

$$\rho_t = (\tau + t)^{-\kappa},$$

where $t$ is the number of calls to this function.
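The default schedule can be evaluated directly to see how kappa and tau shape the decay. The sketch below is a standalone re-implementation of the formula, not the library's code; learning_rate is a hypothetical helper, and whether $t$ counts the current call is not stated here, so the starting index is an assumption.

```python
def learning_rate(t, kappa=0.7, tau=100.0):
    """Evaluate rho_t = (tau + t) ** (-kappa) for update number t."""
    return (tau + t) ** (-kappa)


# The rate shrinks as the number of updates grows.
for t in (0, 1, 10, 100, 1000):
    print(t, learning_rate(t))

# A larger tau lowers the early rates; a larger kappa makes the decay faster.
print(learning_rate(0, tau=1000.0))
print(learning_rate(1000, kappa=0.9))
```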

Parameters:

docs (list) - a batch of documents
max_iter_tr (int) - number of steps in trust-region optimization
max_iter_inference (int) - number of variational inference steps per trust-region step
kappa (float) - controls the learning rate decay
tau (float) - larger values decrease the initial learning rates
rho (float) - can be used to manually set the learning rate
adaptive (bool) - automatically adapt the learning rate (see Ranganath et al., 2013)
init_gamma (bool) - initialize beliefs over $\boldsymbol{\theta}$ with beliefs of previous trust-region step (default: True)
update_lambda (bool) - if False, don't update beliefs over topics, $\boldsymbol{\lambda}$ (default: True)
update_alpha (bool) - if True, update $\boldsymbol{\alpha}$ via empirical Bayes (default: False)
update_eta (bool) - if True, update $\eta$ via empirical Bayes (default: False)
min_alpha (float) - constrain the $\alpha_k$ to be at least this large (default: 1e-6)
min_eta (float) - constrain $\eta$ to be at least this large (default: 1e-6)
verbosity (int) - controls how many messages are printed

Returns: float - the learning rate used in this update

See Also: update_count
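A typical use is to call update_parameters once per mini-batch and keep the returned learning rates. The sketch below assumes only the documented signature and return value; run_updates is a hypothetical helper, the format of each batch is left to the caller, and the value max_iter_inference=100 for the natural gradient variant is an illustrative choice rather than a recommendation from the library.

```python
def run_updates(model, batches, **kwargs):
    """Call model.update_parameters once per batch and collect the returned learning rates."""
    rates = []
    for docs in batches:
        # Each call returns the learning rate used in that update.
        rates.append(model.update_parameters(docs, **kwargs))
    return rates


# Trust-region updates using the documented default values.
trust_region_kwargs = dict(max_iter_tr=10, max_iter_inference=20, kappa=.7, tau=100)

# Standard natural gradient step: no trust-region steps, more inference steps
# (the value 100 is an assumption chosen for illustration).
natural_gradient_kwargs = dict(max_iter_tr=0, max_iter_inference=100, kappa=.7, tau=100)
```

With a model constructed as shown at the top of this page, run_updates(model, batches, **trust_region_kwargs) would perform one update per batch.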