Response blog on topic modeling

Anyone invited to a buffet, or planning to attend one, should have some idea of what is on the menu. Background knowledge is what I am referring to. It is perhaps more important than the computational process, or even the decision about how many latent topics are waiting to be discovered within a seemingly large pool of documents, aka the corpus. Magic is fascinating, but it is overwhelming if the magician cannot communicate with the audience. In topic modeling, the magic is the machine: it has the computational ability but cannot express itself on its own (not sentient!!!). That leaves a role for the magician, an expert who makes the magic enchanting for the audience by putting the result in context.

Let me provide an example. A while back, I ran a topic modeling experiment on the DigitalNZ archive of historical newspapers. The result is the following interactive illustration of topics. I decided to uncover the 20 topics that were most prevalent during the New Zealand Wars in the 1800s.

LDA

The interactive visualization is available at the following URL:

https://zicoabhidey.github.io/pyldavis-topic-modeling-visualization#topic=0&lambda=1&term=
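The `lambda=1` in the URL above is the relevance weight used by the pyLDAvis view. As a rough sketch of what that setting does: relevance mixes a term's probability within a topic with its "lift" over the whole corpus, and at lambda = 1 terms are ranked purely by their in-topic probability. The probabilities below are made up for illustration and are not taken from the DigitalNZ model.

```python
import math

def relevance(p_term_given_topic, p_term, lam):
    """Relevance of a term to a topic for a weight lam between 0 and 1.

    lam = 1 ranks by in-topic probability only;
    lam = 0 ranks by lift, i.e. p(term | topic) / p(term).
    """
    return (lam * math.log(p_term_given_topic)
            + (1 - lam) * math.log(p_term_given_topic / p_term))

# Hypothetical numbers: term -> (p(term | topic), p(term) in the whole corpus)
terms = {
    "make": (0.050, 0.040),   # frequent everywhere, so not very distinctive
    "land": (0.040, 0.010),
    "acre": (0.020, 0.002),   # rarer overall, but concentrated in this topic
}

def rank(lam):
    """Order the terms from most to least relevant for a given lam."""
    return sorted(terms, key=lambda t: relevance(*terms[t], lam), reverse=True)

print(rank(1.0))  # ordering by in-topic probability
print(rank(0.0))  # ordering by lift
```

With these made-up numbers, a generic word such as "make" leads the list at lambda = 1 and drops to the bottom at lambda = 0, so sliding lambda down in the visualization may surface more distinctive terms for a topic.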

I played the role of the magician, demystifying the result and presenting it to a broader audience who are by no means historians. I used my intuition, and borrowed the superpower of Google to support it, to work out what each topic represents. Below is what I came up with. I could not interpret all 20 of the topics I was hoping to find.

| Topics | Explanation |
| --- | --- |
| “gun”, “heavy”, “colonial_secretary”, “news”, “urge”, “tax”, “thank”, “mail”, “night” | Implying political movement and communication during the pre-independence declaration period. |
| “bill”, “payment”, “say”, “issue”, “sum”, “notice”, “pay”, “deed”, “amount”, “person” | Business-related affairs after the independence declaration. |
| “distance”, “iron”, “firm”, “dress”, “black”, “mill”, “cloth”, “box”, “wool”, “bar” | Representing industrial affairs, mostly related to garments. |
| “vessel”, “day”, “take”, “place”, “leave”, “fire”, “ship”, “native”, “water”, “captain” | Representing maritime activities or war from a port city like Wellington. |
| “land”, “acre”, “company”, “town”, “sale”, “road”, “country”, “plan”, “district”, “section” | Representing real-estate-related activities. |
| “year”, “make”, “receive”, “take”, “last”, “state”, “new”, “colony”, “great”, “give” | No clear association. |
| “sail”, “master”, “day”, “passage”, “auckland”, “port”, “brig”, “passenger”, “agent”, “freight” | Representing shipping activities related to Auckland port. |
| “say”, “go”, “court”, “take”, “kill”, “prisoner”, “try”, “come”, “witness”, “give” | Representing judicial activities and crime news. |
| “boy”, “pull”, “flag_staff”, “mount_albert”, “white_pendant”, “descriptive_signal”, “lip”, “battle”, “bride”, “signals_use” | Representing traditional stories about Maori myth and legend regarding Mount Albert. |

Table 1: Some of the topics and explanations from the gensim LDA model

| Topics | Explanation |
| --- | --- |
| ‘land’, ‘company’, ‘purchase’, ‘colony’, ‘claim’, ‘price’, ‘acre’, ‘make’, ‘system’, ‘title’ | Representing real-estate-related activities. |
| ‘native’, ‘man’, ‘fire’, ‘captain’, ‘leave’, ‘place’, ‘officer’, ‘arrive’, ‘chief’, ‘make’ | Representing news regarding the New Zealand Wars. |
| ‘government’, ‘native’, ‘country’, ‘settler’, ‘colony’, ‘man’, ‘act’, ‘people’, ‘law’ | Representing news about the sovereignty treaty signed in 1835. |
| ‘mile’, ‘water’, ‘river’, ‘vessel’, ‘foot’, ‘island’, ‘native’, ‘side’, ‘boat’, ‘harbour’ | Representing maritime activities from a port city like Wellington. |
| ‘settlement’, ‘company’, ‘make’, ‘war’, ‘place’, ‘port_nicholson’, ‘settler’, ‘state’, ‘colonist’, ‘colony’ | Representing news about Port Nicholson during the war in Wellington in 1839. |

Table 2: Some of the topics and explanations from the gensim Mallet model

After working long hours on this project, I am super delighted to have produced an interactive visualization of the Mallet model, which is hard to produce. Still, there is a lingering feeling of disappointment that I did not have a historian's knowledge. A historian with specialist knowledge of New Zealand’s history might have judged the topics better.

Attending a buffet without knowing the menu is like sailing a boat without a compass, but isn’t that what distant reading is? A calculated leap of faith.