sezanzeb.de

Machine Learning

Ensemble LDA

July 28, 2019

LDA is a widely known model that finds topics in text. But, similar to k-means, you have to decide how many topics to look for. The stochastic nature of machine learning algorithms also causes the model to have instabilities, which results in topics being a mixture of multiple topics. This is the issue that led to the development of the “Ensemble LDA” model, which detects the stable topics in the text by using an ensemble of LDA topic models. The model detects and uses topic mixtures for stabilization.

Python Package

The pull request is ongoing: https://github.com/RaRe-Technologies/gensim/pull/2980.

Thesis Download

EnsembleLDA.pdf

Special Thanks

Thanks to aloosley for his support during the thesis and contributions to the PR.

Also thanks to mpenkov and radim for their code reviews and continuous interest in the work.