Optimal Strategy for Big Topic Modeling on Data Streams
Topic modeling for big data remains an active research area, with many algorithms proposed in pursuit of an efficient topic modeling strategy. Expectation-maximization (EM) is typically used to compute maximum likelihood estimates of model parameters from incomplete samples. This paper proposes a novel framework, “Online Expectation-Maximization for Latent Dirichlet Allocation”, that infers the topic distribution of previously unseen documents incrementally with constant memory requirements. It is more efficient for some lifelong topic modeling tasks.
Keywords - Big Data, Topic Model, Data Streaming, Expectation-Maximization.
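
To make the idea of incremental EM with constant memory concrete, the following is a minimal sketch of a generic stepwise EM update for a topic model. It is not the framework proposed in this paper: the function name `online_em_step`, the smoothing pseudo-counts `alpha` and `beta`, the step-size schedule, and the synthetic data are all assumptions chosen for illustration. Each incoming document is processed once and then discarded, so the only state kept is a K x V matrix of topic-word statistics.

```python
import numpy as np

def online_em_step(phi_stats, doc_counts, step, alpha=0.1, beta=0.01, n_iter=20):
    """One stepwise-EM update from a single document (illustrative sketch).

    phi_stats : (K, V) running topic-word sufficient statistics
    doc_counts: (V,) word counts of the incoming document
    step      : step size in (0, 1] for blending old and new statistics
    alpha/beta: assumed smoothing pseudo-counts (hypothetical defaults)
    n_iter    : number of inner E/M iterations, must be at least 1
    """
    K, V = phi_stats.shape
    # Current topic-word probabilities, smoothed so that no entry is zero.
    smoothed = phi_stats + beta
    phi = smoothed / smoothed.sum(axis=1, keepdims=True)
    theta = np.full(K, 1.0 / K)  # document-topic proportions
    for _ in range(n_iter):
        # E-step: responsibility of each topic for each word type.
        resp = theta[:, None] * phi
        resp /= resp.sum(axis=0, keepdims=True)
        # M-step restricted to this document's topic proportions.
        theta = (resp * doc_counts).sum(axis=1) + alpha
        theta /= theta.sum()
    # Expected topic-word counts contributed by this document.
    doc_stats = resp * doc_counts
    # Blend into the running statistics; memory stays O(K * V).
    new_stats = (1.0 - step) * phi_stats + step * doc_stats
    return new_stats, theta

# Usage on a synthetic stream: each document is seen once, then dropped.
rng = np.random.default_rng(0)
K, V = 3, 8
stats = rng.random((K, V))
for t, doc in enumerate(rng.integers(0, 5, size=(50, V)), start=1):
    stats, theta = online_em_step(stats, doc, step=(t + 1) ** -0.7)
```

The decaying step size `(t + 1) ** -0.7` is one common choice for stochastic-approximation updates; other schedules would work as well, and a full Latent Dirichlet Allocation treatment would handle the Dirichlet priors more carefully than the simple pseudo-counts used here.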