Research On Corporate Internet Negative Information Capture

Posted on:2019-12-01

Degree:Master

Type:Thesis

Country:China

Candidate:N D Wang

GTID:2428330548481889

Subject:Computer technology

Abstract/Summary:

With the globalization of the Internet in the information era and the arrival of the age of big data and artificial intelligence, people have moved from the concept of "data ownership" to "data creation" and have started to reflect on value creation through data mining in existing industries. Among all industries, practitioners in the financial industry are especially eager to promote economic development and obtain a return on value from data. Timely and accurate Internet data is of strategic significance to bank risk control. In an era of rapid Internet development, how to accurately collect and analyze the intricate information about lenders according to the bank's own needs is an urgent problem. External data sources supplement the bank's first-hand understanding of lenders' information, and timely screening and early warning of potential risks is of great significance for improving the level of risk management.

The traditional method of information collection turns nothing away ("not to be denied"): information is collected first and only screened and extracted afterwards. The resulting information is of low quality, data crawling is inefficient, and the cost of later data processing is considerable. In view of these issues, this paper addresses data source acquisition, data collection efficiency, data preprocessing, and data storage. The work can be divided into the following three parts:

1) Chinese company abbreviation generation and detection. A new machine learning method is proposed that combines a two-level conditional random field (CRF) with rule derivation and web crawler verification. A two-level CRF model is constructed to identify the class of each word within the company name; a feature set is built and fed into the CRF model, and the abbreviations obtained from the output are then checked by a web crawler for statistical evaluation. The method has practical value for accurately generating the abbreviations that refer to a given company.

2) Corporate negative information collection solution. A "spread first product" mode is used to collect information according to the extraction requirements, with whole-network crawlers performing the collection. For topics related to the basic company name, the crawler can selectively crawl predefined web pages that match the topic, and incremental crawlers are then used to generate a directional crawling strategy for each topic of corporate negative information. A large number of machine learning algorithms are used to preprocess the collected data, including deduplication, denoising, and screening.

3) Design and implementation of a corporate negative information collection system. The system is a sub-project serving the risk management and monitoring platform used by bank venture capital business personnel. Users interact with the risk warning platform to send information acquisition requirements to the value acquisition system; the dispatch center then analyzes the requirements and issues the collection tasks. Finally, the system preprocesses and analyzes the collected data and provides data support for the risk analysis system.
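
The abbreviation step in part 1 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the feature set, the tag set, or the two levels of the CRF, so everything below is an assumption made for illustration. The tags "K" (keep) and "S" (skip), the function names, and the hit counts are hypothetical, and the hand-written label sequence stands in for the output of a trained CRF tagger. The sketch only shows how per-character features might be built, how a label sequence could be turned into an abbreviation, and how candidates might be ranked by how often a crawler observed them.

```python
from typing import Dict, List

# Hypothetical tag set (not from the thesis):
# "K" = character kept in the abbreviation, "S" = character skipped.
KEEP, SKIP = "K", "S"


def char_features(name: str, i: int) -> Dict[str, str]:
    """Build an illustrative feature dict for the character at index i."""
    last = len(name) - 1
    return {
        "char": name[i],
        "prev": name[i - 1] if i > 0 else "<BOS>",
        "next": name[i + 1] if i < last else "<EOS>",
        "position": "first" if i == 0 else "last" if i == last else "middle",
    }


def name_features(name: str) -> List[Dict[str, str]]:
    """Feature sequence for a whole name, one dict per character."""
    return [char_features(name, i) for i in range(len(name))]


def derive_abbreviation(name: str, labels: List[str]) -> str:
    """Join the characters labelled KEEP, preserving their original order."""
    if len(name) != len(labels):
        raise ValueError("need exactly one label per character")
    return "".join(ch for ch, tag in zip(name, labels) if tag == KEEP)


def rank_candidates(hit_counts: Dict[str, int]) -> List[str]:
    """Order candidates by observed frequency, most frequent first."""
    return sorted(hit_counts, key=lambda cand: (-hit_counts[cand], cand))


name = "中国工商银行"
# Hand-written labels standing in for the output of a trained CRF tagger.
labels = [SKIP, SKIP, KEEP, SKIP, SKIP, KEEP]
abbr = derive_abbreviation(name, labels)  # "工行"

# Made-up counts standing in for crawler statistics.
ranked = rank_candidates({"工行": 120, "工商": 45, "中工": 3})
```

In a real pipeline the list returned by name_features would be passed to a sequence labeller, and the ranking would use counts gathered by the crawler rather than fixed numbers.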

Keywords/Search Tags:

Bank venture, Abbreviation, CRFs, Information Collection

Related items

1	Design And Implementation Of The Venture Capital Management System Of Liaoning Rural Credit Cooperatives
2	Research On The Choice And Effect Of Introducing Venture Capital Into IC Equipment Enterprises
3	Design And Implementation Of Bank Collection System For Non-tax
4	Based On The Same Field CRFs And Interdisciplinary Under Brand Word Extraction
5	The Design And Implementation For Banking Information Network Venture Management System
6	Text Mining Based On The Data Of Venture Capital Media For Analyzing Venture Capital Market Trends
7	Design And Implementation Of Core Business System Of Bank Of Kunlun
8	CRFs-Based Chinese Named Entity Recognition With Improved Tag Set
9	The Research And Implementation Of Transaction Log Collection System For Bank
10	Research On Matching Method Of Full Name And Abbreviation By Fusing Multi-modal Features