Research On Fusion Sorting Model Of Domainized Word Vectors

Posted on: 2019-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: M Qiao
GTID: 2428330566476963
Subject: Engineering
Abstract/Summary:
With the rapid development of the economy and the Internet, a growing volume of transactions and information is stored as data. How to use and manage this data scientifically and effectively is an active research topic in information retrieval. One form of implementation is the expert system, an application of information retrieval whose core task falls within natural language processing: the similarity matching of short texts.

At present, most natural language processing uses the word vector as its basic element, which solves the problem of word mapping to some extent. However, the result of word vector training depends heavily on the corpus, so word vectors trained on two different corpora can differ considerably. In an expert system, conversations are relatively concise and the phrasing is sometimes non-standard. The more important information must therefore be extracted from a short question and matched against the data in the knowledge base, so the quality of the word vectors affects the overall efficiency and quality of retrieval to a certain degree. In practice, however, proper nouns turn out to be scarce, and continuing to rely on a large-scale non-domain corpus causes large deviations in text matching. In addition, some potential stop words have a greater impact when the similarity of short texts is compared.

To solve these problems, this thesis studies the optimization of word vectors and the fusion of different word vectors. The optimized results are trained with a supervised learning-to-rank algorithm and finally applied to expert robots in a specific domain. The main tasks include:
(1) In word vector training, rule-based weighted overlap and spatial mapping are added to eliminate the overall influence of potential stop words and to increase the weight of proper nouns.
(2) The two sets of word vectors are correlated in training so that they can adapt to a version-free expert system. The result is applied to the RWMD algorithm so that it can rank scores more accurately without training.
(3) Based on the improved word vectors, additional scoring criteria are added to make the learning-to-rank model more robust. The final model is trained with the single-document (pointwise) and document-pair (pairwise) methods, which makes it similar in effect to the document-list (listwise) method while reducing time complexity.

The results show that the fusion ranking model outperforms the list model in the expert system.
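
As a rough illustration of the RWMD step named in task (2), the sketch below implements the standard Relaxed Word Mover's Distance, not the thesis's fused or reweighted variant, whose details the abstract does not give. The function name `rwmd`, the two-dimensional `toy_vectors`, and the example questions are all hypothetical and chosen only to make the ranking visible. It assumes every token has an entry in the vector table.

```python
import numpy as np

def rwmd(doc_a, doc_b, vectors):
    """Standard Relaxed Word Mover's Distance between two tokenized texts.

    `vectors` maps each token to a NumPy embedding (hypothetical toy data here).
    """
    def nbow(doc):
        # Normalized bag-of-words weights over the distinct tokens.
        words = sorted(set(doc))
        counts = np.array([doc.count(w) for w in words], dtype=float)
        return words, counts / counts.sum()

    words_a, weights_a = nbow(doc_a)
    words_b, weights_b = nbow(doc_b)
    emb_a = np.array([vectors[w] for w in words_a])
    emb_b = np.array([vectors[w] for w in words_b])

    # Pairwise Euclidean distances: cost[i, j] = ||emb_a[i] - emb_b[j]||.
    cost = np.linalg.norm(emb_a[:, None, :] - emb_b[None, :, :], axis=2)

    # Each relaxation moves every word entirely to its nearest counterpart.
    bound_a = float(weights_a @ cost.min(axis=1))
    bound_b = float(weights_b @ cost.min(axis=0))
    # The tighter of the two lower bounds is the RWMD.
    return max(bound_a, bound_b)

# Hypothetical 2-D embeddings, for illustration only.
toy_vectors = {
    "reset": np.array([1.0, 0.0]),
    "password": np.array([0.0, 1.0]),
    "change": np.array([0.9, 0.1]),
    "weather": np.array([-1.0, -1.0]),
}

query = ["reset", "password"]
candidates = [["change", "password"], ["weather"]]

# Smaller distance means a closer match, so sort ascending.
ranked = sorted(candidates, key=lambda c: rwmd(query, c, toy_vectors))
```

Because the distance needs only the embeddings and token counts, knowledge-base questions can be ranked against a query without any supervised training, which is the property task (2) relies on.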
Keywords/Search Tags: expert system, ranking learning algorithm, word vector mapping, fusion ranking model, information retrieval