Tag Recommendation For Dialogue Corpus

Posted on:2013-08-09

Degree:Master

Type:Thesis

Country:China

Candidate:G N Fang

Full Text:PDF

GTID:2248330371966323

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid growth of online information, there is strong demand for marking large volumes of text with appropriate tags, that is, describing the content of a text with one or a few words. Such tags can greatly speed up browsing. High-quality tags can also improve the performance of natural language processing tasks such as text classification and information retrieval. As a result, much research has focused on automatic tag generation (tag recommendation). At the same time, with the fast development of social networks, people increasingly express and exchange their views through tools such as instant messaging, Twitter and microblogs. This kind of data differs greatly from web pages: it has the characteristics of dialogue, and it is usually short and loosely structured. These characteristics make tag recommendation for dialogue corpora more difficult. At present, research that directly tackles this kind of data is still very rare, and it remains unknown whether tag recommendation methods that perform well on web pages are suitable for dialogues.

This thesis focuses on data that has the characteristics of dialogue. We study tag recommendation, relevant words and dialogue characteristics in depth and propose an unsupervised method for generating informative tags for multi-party dialogue in an open domain. Our model first extracts keywords from the text through a multi-weighting framework, which includes frequency weighting, sentence weighting, speaker weighting and position weighting. We then obtain bigrams of these keywords through frequent pattern matching. To generate more flexible and socialized tags, we expand the keywords and their bigrams by exploring tag associations mined from the well-known social bookmarking site Delicious. Finally, we rank the three groups of tag candidates under a uniform metric.

The main research contents are as follows:

1) We conduct an in-depth study of the characteristics of dialogue data and analyze it from five aspects: dialogue format, discourse mode, discourse style, discourse field and turn-taking.

2) Based on the characteristics of dialogue data, for keyword extraction we propose a multi-weighting framework that considers four weightings: frequency weighting, sentence weighting, speaker weighting and position weighting. On the basis of the extracted keywords, we obtain bigrams through POS pattern matching. Experimental results on two dialogue datasets indicate that the algorithm is effective.

3) For social tag expansion, we use the classic association rule mining algorithm Apriori to obtain social tags that are highly related to the existing keywords and bigrams. The results show that the method is feasible.
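
The abstract names four weightings (frequency, sentence, speaker, position) but does not give their formulas. The following Python sketch is only an illustration of how such a multi-weighting score could be assembled. Every formula in it is an assumption made for this example and not the thesis's definition: sentence weight grows with utterance length, speaker weight is the speaker's share of turns, position weight decays with turn index, and frequency enters by summing over occurrences. The names `tokenize`, `score_keywords` and `STOPWORDS` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Tiny illustrative stopword list (assumption, not from the thesis).
STOPWORDS = {"a", "an", "the", "is", "it", "to", "i", "you", "we",
             "and", "of", "for", "on", "in", "do", "can"}

def tokenize(text):
    # Lowercase, strip punctuation at word edges, drop stopwords.
    words = (w.strip(".,!?;:'\"").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]

def score_keywords(dialogue, top_k=3):
    """Rank words in a dialogue given as (speaker, utterance) pairs in turn order."""
    turns_per_speaker = Counter(speaker for speaker, _ in dialogue)
    total_turns = len(dialogue)
    scores = defaultdict(float)
    for index, (speaker, utterance) in enumerate(dialogue):
        tokens = tokenize(utterance)
        if not tokens:
            continue
        # Sentence weighting (assumed): longer utterances count slightly more.
        sentence_w = math.log1p(len(tokens))
        # Speaker weighting (assumed): speakers with more turns count more.
        speaker_w = turns_per_speaker[speaker] / total_turns
        # Position weighting (assumed): earlier turns count more.
        position_w = 1.0 / (1 + index)
        for token in tokens:
            # Frequency weighting: each occurrence adds to the word's score.
            scores[token] += sentence_w * speaker_w * position_w
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]

if __name__ == "__main__":
    demo = [
        ("alice", "Which camera should I buy for travel?"),
        ("bob", "A mirrorless camera is light."),
        ("alice", "Is the camera lens included?"),
        ("carol", "Yes."),
    ]
    for word, score in score_keywords(demo):
        print(f"{word}\t{score:.3f}")
```

Multiplying the three per-utterance weights is one simple way to combine them; a weighted sum would be an equally plausible choice, and the abstract does not say which the thesis uses.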
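
The social tag expansion step relies on association rules mined with Apriori. As a rough illustration, the Python sketch below runs only the first two Apriori passes (frequent single tags, then frequent pairs built from them) and derives one-to-one rules with a confidence threshold. It is a simplification, not the thesis's implementation: the toy tag sets, the thresholds, and the function names `mine_pair_rules` and `expand_tags` are all assumptions, and no real Delicious data is used.

```python
from collections import Counter
from itertools import combinations

def mine_pair_rules(tag_sets, min_support=2, min_confidence=0.6):
    """Return {antecedent: [(consequent, confidence), ...]} for tag pairs."""
    # Pass 1: count single tags and keep those meeting the support threshold.
    item_counts = Counter(tag for tags in tag_sets for tag in set(tags))
    frequent = {t for t, c in item_counts.items() if c >= min_support}
    # Pass 2: count only pairs made of frequent tags (the Apriori pruning idea).
    pair_counts = Counter()
    for tags in tag_sets:
        kept = sorted(set(tags) & frequent)
        pair_counts.update(combinations(kept, 2))
    rules = {}
    for (a, b), count in pair_counts.items():
        if count < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.setdefault(lhs, []).append((rhs, confidence))
    return rules

def expand_tags(keywords, rules):
    """Suggest extra tags whose rules are triggered by the given keywords."""
    expanded = {}
    for keyword in keywords:
        for tag, confidence in rules.get(keyword, []):
            if tag not in keywords:
                expanded[tag] = max(confidence, expanded.get(tag, 0.0))
    return sorted(expanded.items(), key=lambda kv: (-kv[1], kv[0]))

if __name__ == "__main__":
    # Toy bookmark tag sets, invented for this example.
    toy_tag_sets = [
        {"camera", "photography", "travel"},
        {"camera", "photography"},
        {"camera", "lens", "photography"},
        {"travel", "hotel"},
        {"camera", "lens"},
    ]
    rules = mine_pair_rules(toy_tag_sets)
    print(expand_tags(["camera"], rules))
```

A full Apriori run would continue to larger itemsets; for expanding a single keyword or bigram into related tags, pairwise rules are often enough to show the idea.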

Keywords/Search Tags:

dialogue, tag recommendation, multi-weighting, association rule, social tag

Related items

1	A Personalized Recommendation System Based On Multi-level Association Rule In E-commerce
2	Research On Mobile Tourism Route Recommendation Model Based On Optimized Social Tags And Association Rules Algorithm
3	Application Of Association Rule Mining In E-commerce Recommendation System
4	Research Of Cross-Domain Recommendation System For Multi-Source Data
5	Research On Mining Algorithm Of Association Rule And Its Application For Biological Data
6	The Application Study Of Multi-Agent And The Association Rule Mining
7	Personalized Recommendation Model Of App Based On WLAN User Group
8	Research On Personalized Recommendation Method Based On Association Rule Mining
9	The Research On Personalized Recommendation Technology Based On XML And Association Rule
10	Association Rule Mining Algorithms Based On Boundary Idea