Anmol Raj

Hands-On Topic Modeling with Python

A tutorial on topic modeling using Latent Dirichlet Allocation (LDA) and visualization with pyLDAvis

Jan 15

4 min read

Topic modeling is a popular technique in Natural Language Processing (NLP) and text mining for extracting topics from a given text. With topic modeling, we can scan large volumes of unstructured text to detect keywords, topics, and themes.

Topic modeling is an unsupervised machine learning technique, so it does not need labeled data for training. It should not be confused with topic classification, which is a supervised machine learning technique and requires labeled data to fit a model. In some cases, the two can be used together: we first perform topic modeling to detect topics in a given text and label each record with its corresponding topic, then use this labeled data to train a classifier that performs topic classification on unseen data.
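To make the labeling step concrete, here is a minimal sketch of how each record could be tagged with its dominant topic before a classifier is trained. The per-document topic weights are made-up illustrative values standing in for the output of a topic model, and `dominant_topic` is a hypothetical helper, not part of any library.

```python
def dominant_topic(topic_weights):
    """Return the index of the topic with the largest weight."""
    return max(range(len(topic_weights)), key=lambda i: topic_weights[i])

# Hypothetical per-document topic weights, standing in for the
# document-topic distribution a topic model would produce.
doc_topic_weights = {
    "doc_a": [0.80, 0.15, 0.05],
    "doc_b": [0.10, 0.20, 0.70],
    "doc_c": [0.30, 0.60, 0.10],
}

# Label each record with its dominant topic; these labels could then
# serve as training targets for a supervised topic classifier.
labels = {doc_id: dominant_topic(w) for doc_id, w in doc_topic_weights.items()}
print(labels)
```

Taking the single highest-weight topic is only one possible labeling rule; documents that mix several topics may need a threshold or multiple labels instead.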

In this article, we will focus on topic modeling and cover how to prepare data with text preprocessing, assign the best number of topics with coherence score, extract topics using Latent Dirichlet Allocation (LDA), and visualize topics using pyLDAvis.
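As a rough preview of the first step, text preprocessing, the sketch below lowercases text, tokenizes it, drops stopwords and very short tokens, and builds a bag-of-words count. It uses only the Python standard library as a stand-in; the tiny `STOPWORDS` set, the sample sentences, and the helper names are assumptions made for illustration and are not taken from the accompanying notebook.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption; real pipelines
# usually rely on a much larger list from an NLP library).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with"}

def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stopwords and short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]

def bag_of_words(tokens):
    """Count how often each token occurs in one document."""
    return Counter(tokens)

# Made-up sample documents
docs = [
    "Topic modeling is a technique for text mining.",
    "The model extracts topics and themes in the text.",
]

tokenized = [preprocess(d) for d in docs]
bows = [bag_of_words(t) for t in tokenized]
print(tokenized[0])
```

Token lists and counts like these are the kind of input a topic model such as LDA is typically fitted on.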

As you follow along, I encourage you to check out the Jupyter Notebook on my GitHub for the full analysis and code.

We have a lot to cover, so let’s get started! 🤓

