Scientific Data Acquiring And Experts Similarity Researching Based On Open Access

Posted on:2017-04-21

Degree:Master

Type:Thesis

Country:China

Candidate:X Huang

Full Text:PDF

GTID:2308330482981786

Subject:Computer application technology

Abstract/Summary:

With the rapid development of the Internet, the way people obtain scientific literature has changed fundamentally, and reading and accessing literature online has become the mainstream. Against this background, Open Access (OA) emerged to promote the dissemination and use of research output. The core feature of OA is that academic information and research output are provided free of charge over the Internet while the interests of rights holders are respected.

This thesis is part of a project of the National Natural Science Foundation, the open access scientific data platform. The platform aims to ease the current difficulty of obtaining domestic scientific literature data and to strengthen the ability of government and research institutions to manage scientific data. The research data in the National Natural Science Foundation's database has many problems, such as non-standard data formats, incorrect information, and missing attributes. To solve these problems, we build multi-mode data acquisition and processing components. We obtain about 8,000,000 papers from many different authorized data sources, fill in missing attributes, and delete duplicate records. In this way we establish and maintain a massive academic database.

Academic papers are rigorous and information-rich, and exploiting that information makes it possible to analyse the similarity between experts and between domains. Based on the database we established, this thesis proposes a new method for generating an expert's scientific labels using a topic model. The method computes the topic distribution of each expert from that expert's papers. Combined with the word distribution of each topic, this yields the words that contribute most to the expert, which are used as labels. We compare the method with the traditional TF-IDF method.
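
The label-generation step described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the thesis's implementation: the abstract does not say how an expert's topic distribution is aggregated from the papers, so the sketch simply averages per-paper topic distributions, and it scores each word by summing, over topics, the expert's topic weight times the word's probability in that topic. The function names and the toy numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def expert_topic_distribution(paper_topics: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-paper topic distributions of one expert.

    Averaging is an assumption; the abstract does not state the aggregation.
    """
    if not paper_topics:
        raise ValueError("expert has no papers")
    num_topics = len(paper_topics[0])
    return [
        sum(paper[k] for paper in paper_topics) / len(paper_topics)
        for k in range(num_topics)
    ]


def expert_labels(
    theta: Sequence[float],
    phi: Sequence[Dict[str, float]],
    top_n: int = 3,
) -> List[str]:
    """Return the top_n words by score(w) = sum_k theta[k] * phi[k][w]."""
    scores: Dict[str, float] = {}
    for topic_weight, word_dist in zip(theta, phi):
        for word, prob in word_dist.items():
            scores[word] = scores.get(word, 0.0) + topic_weight * prob
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Toy topic-word distributions for two topics (hypothetical values).
phi = [
    {"retrieval": 0.5, "index": 0.3, "query": 0.2},
    {"neural": 0.6, "network": 0.3, "training": 0.1},
]
# Topic distributions of two papers by the same expert (hypothetical values).
paper_topics = [[0.9, 0.1], [0.7, 0.3]]

theta = expert_topic_distribution(paper_topics)
labels = expert_labels(theta, phi, top_n=3)
print(labels)
```

In practice the topic and word distributions would come from a trained topic model rather than hand-written dictionaries; the scoring step stays the same.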
In addition, we compute the similarity between experts using distributed representations. Specifically, we compute the similarity between labels and then derive the similarity between experts from the labels of each expert. The expert similarity is used for recommendation. Experiments show that this similarity reflects well both the connections between experts and the similarity of their domains and research content.
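
One way to turn label similarity into expert similarity is sketched below. The abstract does not specify the embedding model or the aggregation rule, so both are assumptions here: labels are mapped to hand-written toy vectors, label similarity is cosine similarity, and expert similarity is the symmetric average of each label's best match among the other expert's labels. All names and values are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expert_similarity(
    labels_a: Sequence[str],
    labels_b: Sequence[str],
    vectors: Dict[str, Sequence[float]],
) -> float:
    """Symmetric average of best-match label similarities (an assumed rule)."""
    if not labels_a or not labels_b:
        return 0.0

    def directed(src: Sequence[str], dst: Sequence[str]) -> float:
        # For each source label, take its most similar destination label.
        return sum(
            max(cosine(vectors[s], vectors[d]) for d in dst) for s in src
        ) / len(src)

    return 0.5 * (directed(labels_a, labels_b) + directed(labels_b, labels_a))


# Toy 2-dimensional label vectors (hypothetical; real ones would be learned).
vectors = {
    "retrieval": [1.0, 0.0],
    "search": [0.9, 0.1],
    "protein": [0.0, 1.0],
    "genome": [0.1, 0.9],
}

same_field = expert_similarity(["retrieval"], ["search"], vectors)
cross_field = expert_similarity(["retrieval", "search"], ["protein", "genome"], vectors)
print(same_field > cross_field)
```

Experts can then be ranked by this score to produce recommendations; other aggregation rules, such as averaging all pairwise label similarities, would fit the same interface.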

Keywords/Search Tags:

Open Access, expert similarity, scientific papers, research domain, topic model, distributed representation

Related items

1	Research On The Representation Of Scientific Papers
2	The Research On Quality Evaluation Model Of Open Access Scientific Paper
3	Topic Detection On Scientific Research Papers Based On Topic Model
4	Research On Classification Algorithm Of Scientific Papers Based On Topic Model
5	Research On Similarity Of Scientific Documents Based On Semantics And Graphs
6	Automatic Recommendation System For Matching Scientific Research Projects With Experts
7	Machine Learning Based Model For Detecting Similarity Of Scientific Papers
8	Research On Key Technologies Of Schema Induction Based Open-domain Event Extraction
9	Research On Scientific Papers Sharing System In The Web 2.0 Environment
10	Research Of Copy Detection Of Chinese Scientific Papers Based On Text Structure And Content