Research On Key Technologies Of Programming Forum Search

Posted on: 2011-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: X R Yang
Full Text: PDF
GTID: 2178330338479971
Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, forums have emerged and grown rapidly. Forums now cover almost every area of daily life and work, and over the years they have accumulated a large amount of high-quality knowledge. However, forum posts are hard to reuse because they are quickly buried under later posts. To meet users' needs and make full use of this knowledge, retrieval technologies suited to forums must be studied and designed. In this context, this paper presents research on key technologies for a programming forum search engine.

Firstly, after analyzing the page source code of programming forums and taking into account the needs of the retrieval system, we design a forum data collection system based on regular expressions. It converts semi-structured data into structured data and stores it, and we optimize the collector according to practical needs.

Secondly, after analyzing the data features of programming forums, we propose a strategy for each feature. Because forums contain many useless posts that degrade search results, we extract the key posts of each thread and design a retrieval model based on key posts. Because posts in the same thread stand in a dialogue relationship, we use a ranking support vector machine to mine the forum structure and design a retrieval model based on that structure. Experimental results show that key post extraction and forum structure mining improve the performance of the forum retrieval system.

Finally, because programming forums contain many technical terms, cross-language associations, and synonyms, we adopt a Chinese word segmentation system better suited to programming forums and apply knowledge-based query expansion to address these issues. At the end of the paper, we implement a forum retrieval system based on Lucene.
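
As an illustration of the regular-expression-based collection step described in the abstract, the following is a minimal sketch of how a semi-structured thread page might be turned into structured records. The HTML layout, the class names (post, author, time, body), the sample page, and the function name extract_posts are all assumptions made for this example; the abstract does not give the actual page format or patterns used in the thesis.

```python
import re
from typing import Dict, List

# Hypothetical post markup; the real forum's HTML is not given in the abstract.
POST_PATTERN = re.compile(
    r'<div class="post">\s*'
    r'<span class="author">(?P<author>.*?)</span>\s*'
    r'<span class="time">(?P<time>.*?)</span>\s*'
    r'<div class="body">(?P<body>.*?)</div>\s*'
    r'</div>',
    re.DOTALL,  # let post bodies span several lines
)


def extract_posts(html: str) -> List[Dict[str, str]]:
    """Return one record (author, time, body) per post found in the page."""
    return [
        {field: value.strip() for field, value in match.groupdict().items()}
        for match in POST_PATTERN.finditer(html)
    ]


# Invented sample page, used only to exercise the pattern above.
SAMPLE_PAGE = """
<div class="post">
  <span class="author">alice</span>
  <span class="time">2011-05-01 10:00</span>
  <div class="body">How do I reverse a list in Python?</div>
</div>
<div class="post">
  <span class="author">bob</span>
  <span class="time">2011-05-01 10:05</span>
  <div class="body">Use list.reverse() or slicing.</div>
</div>
"""

if __name__ == "__main__":
    for post in extract_posts(SAMPLE_PAGE):
        print(post["author"], post["time"], post["body"], sep=" | ")
```

Records in this form could then be stored and indexed; a pattern like this only works while the page layout stays fixed, so a real collector would need a separate pattern for each forum template.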
Keywords/Search Tags: forum search engine, forum structure mining, key post extraction, ranking support vector machine