Package trlda :: Package models :: Class OnlineLDA

Class OnlineLDA

  object --+        
           |        
Distribution --+    
               |    
             LDA --+
                   |
                  OnlineLDA

An implementation of an online trust region method for latent Dirichlet allocation.

>>> from trlda.models import OnlineLDA
>>> model = OnlineLDA(
...     num_words=7000,
...     num_topics=100,
...     num_documents=10000,
...     alpha=.1,
...     eta=.3)

alpha can be a scalar or an array with one entry for each topic.
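
The two accepted forms of alpha can be illustrated with a small stand-alone sketch. The helper expand_alpha below is hypothetical and not part of trlda; it assumes that a scalar alpha stands for a symmetric prior with the same value for every topic.

```python
def expand_alpha(alpha, num_topics):
    """Return alpha as a list with one entry per topic.

    Hypothetical helper (not part of trlda). Assumes a scalar means
    "use this value for every topic".
    """
    if isinstance(alpha, (int, float)):
        # Scalar: repeat the same value for each topic.
        return [float(alpha)] * num_topics

    # Array-like: must already have one entry per topic.
    values = [float(a) for a in alpha]
    if len(values) != num_topics:
        raise ValueError(
            "alpha needs %d entries, got %d" % (num_topics, len(values)))
    return values
```

With this reading, alpha=.1 and alpha=[.1] * num_topics would describe the same prior.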

Instance Methods
 
do_e_step(...)
Alias for update_variables. (Inherited from trlda.models.LDA)

lower_bound(docs, num_documents=-1, inference_method='VI', max_iter=100, num_samples=1, burn_in=2)
Estimate lower bound, $\mathcal{L}(\boldsymbol{\lambda})$, for the given set of documents. (Inherited from trlda.models.LDA)
Returns: float

sample(self, num_documents, length)
Samples a specified number of documents from the model. (Inherited from trlda.models.LDA)
Returns: list

update_parameters(docs, max_iter_tr=10, max_iter_inference=20, kappa=.7, tau=100, **kwargs)
Updates beliefs over parameters.
Returns: float

update_variables(docs, latents=None, inference_method='VI', max_iter=100, threshold=0.001, num_samples=1, burn_in=2)
Computes beliefs over topic assignments ($z_{di}$) for the given documents. (Inherited from trlda.models.LDA)
Returns: tuple

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __init__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Properties
  alpha
Controls Dirichlet prior over topic weights, $\theta_k$. (Inherited from trlda.models.LDA)
  eta
Controls Dirichlet prior over topics, $\beta_{ki}$. (Inherited from trlda.models.LDA)
  lambdas
Parameters governing beliefs over topics, $\beta_{ki}$. (Inherited from trlda.models.LDA)
  num_documents
Number of documents in the complete dataset.
  num_topics
Number of topics. (Inherited from trlda.models.LDA)
  num_words
Number of words. (Inherited from trlda.models.LDA)
  update_count
Number of parameter updates.

Inherited from object: __class__

Method Details

update_parameters(docs, max_iter_tr=10, max_iter_inference=20, kappa=.7, tau=100, **kwargs)

Updates beliefs over parameters.

Set max_iter_tr to zero to perform the standard natural gradient step of stochastic variational inference; in this case, increase max_iter_inference.

By default, the learning rate is automatically set to

$$\rho_t = (\tau + t)^{-\kappa},$$

where $t$ is the number of calls to this function.
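
The schedule above can be explored with a short stand-alone sketch. The functions learning_rate and blend are hypothetical names, not part of trlda. The sketch assumes that $t$ starts at zero for the first call. The blend function shows how a rate of this kind weights a new estimate against the current parameters in the textbook natural gradient step of stochastic variational inference; it is not a description of trlda's internal trust-region update.

```python
def learning_rate(t, kappa=0.7, tau=100.0):
    """Evaluate rho_t = (tau + t) ** (-kappa).

    Assumption: t is the number of earlier calls, starting at 0.
    """
    return (tau + t) ** (-kappa)


def blend(current, estimate, rho):
    """Textbook SVI step: (1 - rho) * current + rho * estimate, elementwise."""
    return [(1.0 - rho) * c + rho * e for c, e in zip(current, estimate)]


# Learning rates for the first three updates under the default settings.
rates = [learning_rate(t) for t in range(3)]

# A larger tau lowers the initial rate; a larger kappa speeds up the decay.
slow_start = learning_rate(0, tau=1000.0)
fast_decay = learning_rate(50, kappa=0.9)
```

The rates shrink with every update, so later batches change the parameters less than earlier ones.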

Parameters:
  • docs (list) - a batch of documents
  • max_iter_tr (int) - number of steps in trust-region optimization
  • max_iter_inference (int) - number of variational inference steps per trust-region step
  • kappa (float) - exponent controlling the learning rate decay
  • tau (float) - offset that decreases initial learning rates
  • rho (float) - can be used to manually set the learning rate
  • adaptive (bool) - automatically adapt the learning rate (see Ranganath et al., 2013)
  • init_gamma (bool) - initialize beliefs over $\boldsymbol{\theta}$ with beliefs of previous trust-region step (default: True)
  • update_lambda (bool) - if False, don't update beliefs over topics, $\boldsymbol{\lambda}$ (default: True)
  • update_alpha (bool) - if True, update $\boldsymbol{\alpha}$ via empirical Bayes (default: False)
  • update_eta (bool) - if True, update $\eta$ via empirical Bayes (default: False)
  • min_alpha (float) - constrain the $\alpha_k$ to be at least this large (default: 1e-6)
  • min_eta (float) - constrain $\eta$ to be at least this large (default: 1e-6)
  • verbosity (int) - controls how many messages are printed
Returns: float
the learning rate used in this update

See Also: update_count
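
A minimal sketch of an online training loop built on update_parameters follows. The function run_updates is a hypothetical helper, not part of trlda. It assumes only what is documented above: that the model exposes update_parameters with the listed keyword arguments and that each call returns the learning rate it used. The format of the individual documents in a batch is not specified here and is left to the caller.

```python
def run_updates(model, batches, max_iter_tr=10, max_iter_inference=20,
                kappa=0.7, tau=100):
    """Call model.update_parameters once per batch.

    Hypothetical helper. `model` is any object with an update_parameters
    method matching the signature documented above; `batches` is an
    iterable of document lists. Returns the learning rates in call order.
    """
    rates = []
    for docs in batches:
        rho = model.update_parameters(
            docs,
            max_iter_tr=max_iter_tr,
            max_iter_inference=max_iter_inference,
            kappa=kappa,
            tau=tau)
        rates.append(rho)
    return rates
```

Passing max_iter_tr=0 together with a larger max_iter_inference would request the standard natural gradient step described above instead of the trust-region update.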