Research And Realization Of Chinese And English Vertical Search Engines On Policy

Posted on:2019-06-18

Degree:Master

Type:Thesis

Country:China

Candidate:L P Wang

Full Text:PDF

GTID:2348330542454345

Subject:Computer application technology

Abstract/Summary:

As human society has entered the information age, the Internet has penetrated every aspect of public life and plays an increasingly important role in people's lives. However, it is becoming harder to find exactly the information one needs in the huge volume of data on the network. Given the size and rapid growth of the web, manual browsing clearly cannot satisfy people's demand for information and knowledge. Search engines have therefore become an important way to obtain useful data from the network. A vertical search engine is a specialized search tool for a specific industry or domain, which can provide users with accurate, timely, and complete information for that domain.

Comparing our province's policies with those of other provinces makes it possible to locate problems in the province's scientific and technological innovation activities and to identify and screen the province's needs for science and technology innovation policy, that is, to pick out policies that "others have but we lack" as well as homogeneous policies, and to define and plan for the problems that arise in the process of regulating science and technology innovation policy in the province. This provides a reference for policy design and selection.

This paper builds on a previously developed vertical search engine for policy information and adds three functional modules: "others have, we lack" search, policy public opinion search, and English policy retrieval. It first introduces the theoretical basis and main techniques for implementing Chinese and English search engines, and then describes how two text classification algorithms are applied and improved in the "others have, we lack" search module. The two improved algorithms are: 1) a policy-text keyword extraction algorithm based on word co-occurrence: building on an existing word co-occurrence keyword extraction algorithm, this paper computes the two important coefficients of a co-occurring word's key degree dynamically, so that the extracted keywords are more consistent with the general intent of the article; 2) a method for calculating the keyword similarity threshold based on feature-word weights: this paper calculates the similarity threshold dynamically by taking into account how the weights of policy-text feature words are distributed, so that in text classification the similarity computed for policy-specific terms better matches the actual situation.

The improved algorithms are compared with the traditional algorithms, and the experimental results show that the improved algorithms perform better. Finally, the overall design of the system and the design and implementation of each module are presented. All modules were tested, and the results show that each module functions as intended and meets the usage requirements.
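
As a rough illustration of the word co-occurrence keyword extraction mentioned in the abstract, the following minimal Python sketch ranks words by a weighted mix of term frequency and co-occurrence degree. It is not the thesis's algorithm: the function name `cooccurrence_keywords`, the scoring formula, and the fixed weights `alpha` and `beta` are assumptions made for this example. The thesis computes its two coefficients dynamically, but the abstract does not state the rule, so fixed values stand in for them here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_keywords(sentences, top_k=3, alpha=0.5, beta=0.5):
    """Rank words by a weighted mix of frequency and co-occurrence degree.

    sentences: list of token lists (already segmented, stop words removed).
    alpha, beta: hypothetical fixed weights; the thesis derives its two
    coefficients dynamically, but the abstract does not give the rule.
    """
    freq = Counter()
    cooc = Counter()
    for tokens in sentences:
        freq.update(tokens)
        # Count each unordered pair of distinct words once per sentence.
        for a, b in combinations(sorted(set(tokens)), 2):
            cooc[a] += 1
            cooc[b] += 1
    if not freq:
        return []
    max_freq = max(freq.values())
    max_cooc = max(cooc.values(), default=0) or 1
    scores = {
        w: alpha * freq[w] / max_freq + beta * cooc[w] / max_cooc
        for w in freq
    }
    # Sort by score descending, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_k]]


# Toy, made-up input: three pre-tokenized "sentences" of policy text.
sentences = [
    ["innovation", "policy", "funding"],
    ["innovation", "policy", "talent"],
    ["policy", "tax"],
]
keywords = cooccurrence_keywords(sentences, top_k=2)
```

Normalizing both terms to the range 0 to 1 keeps the two weights comparable; a dynamic scheme such as the one the thesis describes would replace the fixed `alpha` and `beta` with values derived from the document itself.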

Keywords/Search Tags:

Anti-crawler, Distributed crawler, Keyword extraction, Text classification, English retrieval, Online translation

Related items

1	Research On Topic Focused Web Crawler And Related Technologies
2	Research And Implementation Of Topic Crawler In The Field Of Inspection And Quarantine
3	A Focused Crawler Based On Statistical Machine Translation And Topic Propagation
4	Design And Implementation Of News Website Crawler And Classification Retrieval Platform Based On Microservice
5	Research On English Text Summarization And Machine Translation Based On Machine Learning
6	Design And Implementation Of Anti-Crawler System Based On Spark Streaming
7	Product Tag Extraction Based On User Reviews Under Distributed Crawler
8	Content Resource Evaluation Base On Web Crawler
9	Research And Application Of Distributed Crawler Technology Based On Ant Colony Algorithm
10	The Design And Development Of Deep-Customizable Crawler Tool System