Early topic detection from topic frequency transition


This paper presents a method for early topic detection from blog articles. Existing topic detection methods are usually based on burst detection. However, topics that have already burst are generally popular ones, so they offer little value from a marketing viewpoint. Valuable topics are those that are described in only a few blogs but have the potential to spread to many blogs.

Therefore, we propose a method for detecting potential topics from blogs. First, we collect blog articles from bloggers and classify each blogger into one category according to their blog articles. The system extracts remarkable topics in each blog community (the bloggers in one category). We then filter the topics based on their document frequency (DF). Next, we create a classifier that detects potential topics from the topic frequency transition in the blog community and in all blogs. Finally, the system recommends the topics that the classifier judges to be potential topics. Experimental results using actual blog data show a precision of 78.4% and a recall of 83.4% in potential topic detection. The results indicate that our method is effective for early topic detection.
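
The DF filtering and frequency-transition steps can be illustrated with a minimal Python sketch. It is not the implementation used in the paper. The function names, the substring-based topic matching, the representation of an article as a (period, text) pair, and the `min_df` and `max_df` thresholds are all assumptions made for illustration. The sketch only shows how a topic's DF per time period in a community and in all blogs could be combined into one feature vector for a classifier.

```python
from collections import defaultdict

def document_frequency(articles, topic):
    """Count, per time period, the articles whose text mentions the topic.

    `articles` is a list of (period, text) pairs (an assumed representation).
    Matching is plain substring search, which is a simplification.
    """
    df = defaultdict(int)
    for period, text in articles:
        if topic in text:
            df[period] += 1
    return df

def frequency_transition(articles, topic, periods):
    """Return the topic's DF for each period, in order (0 where it is absent)."""
    df = document_frequency(articles, topic)
    return [df.get(p, 0) for p in periods]

def candidate_features(community_articles, all_articles, topic, periods,
                       min_df=1, max_df=50):
    """Build a feature vector for one topic, or None if the DF filter rejects it.

    The thresholds are hypothetical: the filter here keeps topics whose total
    DF over all blogs lies in [min_df, max_df], i.e. not yet widely spread.
    """
    community = frequency_transition(community_articles, topic, periods)
    overall = frequency_transition(all_articles, topic, periods)
    if not (min_df <= sum(overall) <= max_df):
        return None
    # Concatenate community-level and global transitions as classifier input.
    return community + overall

# Toy data: (period, text) pairs for one community and for all blogs.
community = [(0, "new gadget x"), (1, "gadget x review"), (1, "gadget x again")]
everything = community + [(1, "weather"), (2, "gadget x everywhere")]
features = candidate_features(community, everything, "gadget x", [0, 1, 2])
```

Vectors like `features`, paired with labels saying whether each topic later spread, could then be used to train any standard binary classifier.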