
Personalized Project Search In Software Crowdsourcing Platform Based On Text Mining

Posted on: 2019-08-24
Degree: Master
Type: Thesis
Country: China
Candidate: N Li
Full Text: PDF
GTID: 2428330590992450
Subject: Software engineering
Abstract/Summary:
In recent years, software crowdsourcing has attracted widespread attention in industry and academia, because projects are published on the Internet and solved by drawing on the wisdom of the general public. Helping users find suitable projects on crowdsourcing platforms is therefore a key issue. The search approaches currently used on these platforms have several disadvantages: (1) most are based on word matching and do not mine the context of projects and queries; (2) queries may not fully express users' intents, so the search results fail to satisfy them; (3) they ignore users' personalized needs, so all users get the same results. As a result, users cannot find appropriate projects in time. To address this, we propose a personalized project search approach based on text mining, which uses text mining techniques to build project and user models for the platforms and to expand queries. A learning-to-rank algorithm is then used to sort the project candidates.

The main contributions of this paper are:

(1) We propose a project modeling method based on text mining. We mine project titles and demands at the lexical level, the topic level, and the neural-network semantic level. To build the project topic model, we use the labels in the software crowdsourcing platform as supervisory information for sampling-based training. A method based on a time decay function is used to estimate the heat (popularity) of a project.

(2) We propose a user interest modeling method based on time windows. By dividing the timeline into time windows, we establish a short-term interest model in each window. We then calculate continuation and volatility factors to establish the long-term user interest model.

(3) We propose query expansion methods based on semantic-topic pseudo-relevance feedback (PRF) and StackOverflow. For common terms, we expand queries with the semantic-topic PRF technique and select words related to both the queries and the topics as expansion candidates. For domain terms, we use a method based on StackOverflow.

(4) We propose pre-filtering strategies and use a learning-to-rank algorithm. By identifying the entities in the queries, we construct SQL statements from customized templates and analyze users' hidden constraints to select the project candidates suitable for the users. We then use a listwise algorithm, LambdaMART, to sort the project candidates.

In this paper, we use data from several software crowdsourcing platforms to conduct a series of experiments. In these experiments, the user interest modeling method and the query expansion method proposed in this paper increase NDCG by 19.8% and 27.2% on average, respectively.
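The improvements above are reported in NDCG (normalized discounted cumulative gain), a standard metric for ranked result lists. The abstract does not say which NDCG variant or cutoff the thesis uses, so the following is only a minimal sketch under assumptions: it uses the common exponential-gain formulation, and the function names, the cutoff `k`, and the relevance grades in the usage note are hypothetical illustrations, not values from the thesis.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain over the top-k graded relevances.

    Assumption: exponential gain (2^rel - 1) with a log2 rank discount;
    the thesis may use a different variant.
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based, so position 1 -> log2(2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg(relevances, k):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```

As a usage example, `ndcg([3, 2, 1], 3)` describes a list that is already in ideal order and evaluates to 1.0, while `ndcg([1, 3], 2)` scores lower because the most relevant project is ranked second. A relative increase in this score is the kind of quantity the reported percentages refer to.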
Keywords/Search Tags: Software Crowdsourcing, Text Mining, User Model, Query Expansion, Personalized Project Search