Package trlda :: Package models :: Class BatchLDA

Class BatchLDA

object --+
         |
  Distribution --+
                 |
               LDA --+
                     |
                    BatchLDA

An implementation of latent Dirichlet allocation (LDA).


>>> documents = load_data('data_train.dat')
>>> model = BatchLDA(num_words=7000, num_topics=100, alpha=.1, eta=.3)
>>> model.update_parameters(documents, max_epochs=100)

alpha can be a scalar or an array with one entry for each topic.
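
The two accepted forms of alpha can be illustrated with a small stand-alone sketch. The helper `expand_alpha` below is hypothetical and not part of trlda; it only shows one way a scalar could be broadcast to one entry per topic, under the assumption that a scalar means the same value for every topic.

```python
def expand_alpha(alpha, num_topics):
    """Return alpha as a list with one float per topic.

    Hypothetical helper for illustration only; not part of trlda.
    """
    try:
        # Array-like input: keep one value per topic.
        values = [float(a) for a in alpha]
    except TypeError:
        # Scalar input: assume the same value is used for every topic.
        values = [float(alpha)] * num_topics
    if len(values) != num_topics:
        raise ValueError("alpha needs exactly one entry per topic")
    return values


symmetric = expand_alpha(0.1, 4)
asymmetric = expand_alpha([0.5, 0.1, 0.1, 0.3], 4)
```

An array of the wrong length is rejected here; whether BatchLDA itself performs such a check is not stated on this page.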

Instance Methods
Alias for update_variables. (Inherited from trlda.models.LDA)
lower_bound(docs, num_documents=-1, inference_method='VI', max_iter=100, num_samples=1, burn_in=2)
Estimate lower bound, $\mathcal{L}(\boldsymbol{\lambda})$, for the given set of documents. (Inherited from trlda.models.LDA)
sample(self, num_documents, length)
Samples a specified number of documents from the model. (Inherited from trlda.models.LDA)
update_parameters(docs, max_epochs=100, max_iter_inference=100, **kwargs)
Updates beliefs over parameters.
update_variables(docs, latents=None, inference_method='VI', max_iter=100, threshold=0.001, num_samples=1, burn_in=2)
Computes beliefs over topic assignments ($z_{di}$) for the given documents. (Inherited from trlda.models.LDA)

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __init__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Properties
Controls Dirichlet prior over topic weights, $\theta_k$. (Inherited from trlda.models.LDA)
Controls Dirichlet prior over topics, $\beta_{ki}$. (Inherited from trlda.models.LDA)
Parameters governing beliefs over topics, $\beta_{ki}$. (Inherited from trlda.models.LDA)
Number of topics. (Inherited from trlda.models.LDA)
Number of words. (Inherited from trlda.models.LDA)

Inherited from object: __class__

Method Details

update_parameters(docs, max_epochs=100, max_iter_inference=100, **kwargs)

Updates beliefs over parameters.

  • docs (list) - a batch of documents
  • max_epochs (int) - number of repeated updates to parameters and hyperparameters
  • max_iter_inference (int) - number of variational inference steps per iteration
  • max_iter_alpha (int) - number of Newton steps applied to $\boldsymbol{\alpha}$ per iteration
  • max_iter_eta (int) - number of Newton steps applied to $\eta$ per iteration
  • update_lambda (bool) - if False, don't update beliefs over topics, $\boldsymbol{\lambda}$ (default: True)
  • update_alpha (bool) - if True, update $\boldsymbol{\alpha}$ via empirical Bayes (default: False)
  • update_eta (bool) - if True, update $\eta$ via empirical Bayes (default: False)
  • min_alpha (float) - constrain the $\alpha_k$ to be at least this large (default: 1e-6)
  • min_eta (float) - constrain $\eta$ to be at least this large (default: 1e-6)
  • emp_bayes_threshold (float) - used to stop empirical Bayes updates when parameters don't change (default: 1e-8)
  • verbosity (int) - controls how many messages are printed
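
Most of the options above are passed through **kwargs rather than named in the signature, so a misspelled name may not be flagged by Python itself. The following sketch collects the documented options in one place and rejects unknown names before the call. The names `DOCUMENTED_DEFAULTS`, `NO_STATED_DEFAULT` and `build_options` are hypothetical and not part of trlda; only the defaults stated on this page are filled in.

```python
# Defaults as stated in the signature and parameter list of update_parameters.
DOCUMENTED_DEFAULTS = {
    "max_epochs": 100,
    "max_iter_inference": 100,
    "update_lambda": True,
    "update_alpha": False,
    "update_eta": False,
    "min_alpha": 1e-6,
    "min_eta": 1e-6,
    "emp_bayes_threshold": 1e-8,
}

# Documented options for which this page states no default value.
NO_STATED_DEFAULT = {"max_iter_alpha", "max_iter_eta", "verbosity"}


def build_options(**overrides):
    """Merge overrides into the documented defaults (hypothetical helper)."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS) - NO_STATED_DEFAULT
    if unknown:
        raise TypeError("unknown option(s): " + ", ".join(sorted(unknown)))
    options = dict(DOCUMENTED_DEFAULTS)
    options.update(overrides)
    return options


# Enable empirical Bayes updates of alpha and shorten training.
options = build_options(update_alpha=True, max_epochs=50, max_iter_alpha=20)

# With trlda installed, the result would be passed on as:
#     model.update_parameters(documents, **options)
```

Passing every documented default explicitly is a design choice of this sketch; calling update_parameters with only the overrides should be equivalent if the library's defaults match the ones listed above.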