Abstract
As information technology has developed, electronic texts have become increasingly widespread. Topic detection and tracking can identify hot information in otherwise isolated texts, and obtaining hot topics has become an important problem in recent years. This study combines statistical techniques with natural language processing to discover hot topics in texts. First, statistical techniques are used to obtain frequent, highly weighted words. Then, linguistic grammar rules are applied to generate candidate phrases. Finally, hot phrase topics are selected using a weight computation method for phrases. Experimental results show that the proposed approach is effective, in that the extracted phrase topics express more comprehensive information. The results are relevant to text classification, text clustering, information retrieval, and the construction of a high-quality Tibetan corpus.
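The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the word-weighting formula, the grammar rules, or the phrase-weight computation, so everything here is an assumption made for illustration. Words are weighted with a TF-IDF-style score, "grammar rules" are replaced by a simple stand-in (contiguous runs of selected words), and a phrase's weight is its frequency times the sum of its word weights. Whitespace tokenisation is also a placeholder and would not be adequate for Tibetan text. The function names `word_weights`, `candidate_phrases`, and `hot_phrases` are hypothetical.

```python
import math
from collections import Counter


def word_weights(docs):
    """Assumed scheme: corpus term frequency times a smoothed IDF factor."""
    n = len(docs)
    tf = Counter(w for d in docs for w in d)       # total occurrences
    df = Counter(w for d in docs for w in set(d))  # documents containing w
    return {w: tf[w] * math.log(1 + n / df[w]) for w in tf}


def candidate_phrases(doc, keep, max_len=3):
    """Stand-in for grammar rules: contiguous runs of kept words, 2..max_len long."""
    out = []
    for i in range(len(doc)):
        for k in range(2, max_len + 1):
            span = doc[i:i + k]
            if len(span) == k and all(w in keep for w in span):
                out.append(tuple(span))
    return out


def hot_phrases(texts, top_words=10, top_phrases=3):
    # Placeholder tokenisation: lowercase and split on whitespace.
    docs = [t.lower().split() for t in texts]

    # Step 1: keep the highest-weighted words.
    weights = word_weights(docs)
    keep = set(sorted(weights, key=weights.get, reverse=True)[:top_words])

    # Step 2: generate candidate phrases from the kept words.
    counts = Counter(p for d in docs for p in candidate_phrases(d, keep))

    # Step 3 (assumed formula): phrase frequency times summed word weights.
    scored = {p: c * sum(weights[w] for w in p) for p, c in counts.items()}
    return sorted(scored, key=lambda p: (-scored[p], p))[:top_phrases]


texts = [
    "earthquake relief effort begins",
    "earthquake relief funds arrive",
    "local market opens",
]
print(hot_phrases(texts))
```

In this toy corpus the phrase that recurs across documents ranks first, which is the behaviour the abstract attributes to phrase-level topics: a phrase carries more context than either of its words alone.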
Original language | English (US) |
---|---|
Pages (from-to) | 267-272 |
Number of pages | 6 |
Journal | Journal of Digital Information Management |
Volume | 12 |
Issue number | 4 |
State | Published - Aug 1 2014 |
Keywords
- Feature selection
- Hot topic discovery
- Information extraction
- Topic detection
- Weight computation
ASJC Scopus subject areas
- Management Information Systems
- Information Systems
- Library and Information Sciences