
High Quality Microblogging Retrieval Method Based On Clustering Constraints

Posted on: 2019-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: K Wang
Full Text: PDF
GTID: 2428330593950369
Subject: Computer technology
Abstract/Summary:
The rise of social media has not only reduced the cost of communication but also changed how people consume information. People are no longer satisfied with passively consuming information; they have become the main producers and disseminators of it, so data spreads rapidly and in unprecedented volume. Taking microblogging as an example, the characteristics of microblog posts, such as short length, heavy use of special characters, and colloquial expression, cause traditional long-text retrieval methods to degrade in performance on microblog retrieval, or even to become completely unusable. Meanwhile, mainstream social media platforms, such as microblogging platforms, Twitter and Facebook, are eager to build fast and intelligent information filtering systems that provide users with more effective information push services. This calls for in-depth study of retrieval methods for short microblog texts.

Introducing external information to improve retrieval performance is simple and effective, and it has attracted wide attention from researchers. However, as this approach has been studied in more depth, researchers have found several problems in short text retrieval:

1. Relevant microblogs are difficult to rank. A large number of relevant microblogs can usually be retrieved, but how to rank them so that a limited push contains more information and the push service is of higher quality remains to be studied.
2. Effective clustering of microblogs is difficult. Because of the large data volume, short texts, and colloquial expression of microblogs, conventional clustering methods are not effective.

To address these problems, this thesis proposes a microblog retrieval method that incorporates the clustering information of microblogs in order to capture users' actual search intent and improve retrieval performance. The main contributions of this thesis are summarized as follows:

1. A microblog search framework is proposed, and the impact of several basic query expansion methods on retrieval performance is explored.
2. A multiple retrieval model is proposed, and its retrieval performance is compared and verified.
3. A clustering method based on non-negative matrix factorization (BNMF, Basic Non-negative Matrix Factorization) is proposed, which improves the performance of the retrieval model through clustering constraints.
4. A clustering method based on relevance constraints (RNMF, Relevance Nonnegative Matrix Factorization) is proposed and compared with BNMF to verify its performance.

Experiments on the microblog data set provided by TREC (Text REtrieval Conference) show that the high-quality microblog retrieval method based on clustering constraints effectively improves microblog retrieval performance compared with the basic retrieval method.
Keywords/Search Tags:microblogging search, multiple retrieval model, non-negative matrix factorization, microblogging clustering
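
The abstract does not give the BNMF or RNMF formulations, so the following is only a minimal sketch of generic non-negative matrix factorization used for clustering short texts. It assumes the standard multiplicative update rules for the squared-error objective, a toy term-count matrix `V` invented for illustration, and hypothetical helper names (`nmf`, `assign_clusters`); it is not the thesis's method and omits the relevance constraints entirely.

```python
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a dense matrix."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH (the reconstruction error)."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor V (posts x terms) into W (posts x k) and H (k x terms).

    Uses multiplicative updates, which keep every entry non-negative.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


def assign_clusters(w):
    """Assign each post to the latent component with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]


# Toy term-count matrix (made up for illustration): rows are posts,
# columns are terms. Posts 0-2 and posts 3-5 use different vocabulary.
V = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 0, 1, 1, 2],
]

W, H = nmf(V, k=2)
labels = assign_clusters(W)
print(labels)
```

In a retrieval setting, labels of this kind could in principle be used to diversify a ranked list, for example by limiting how many results are taken from one cluster; how the thesis actually applies its clustering constraints is not specified in the abstract.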