
Research on Intelligent Identification of Illegal Descriptions in Online Advertisements

Posted on: 2019-03-21 | Degree: Master | Type: Thesis
Country: China | Candidate: H F Guo | Full Text: PDF
GTID: 2428330590465903 | Subject: Management Science and Engineering
Abstract/Summary:
In the internet era, online advertising has become an extremely valuable advertising medium, and identifying the huge volume of online advertising information is an important task in big data analysis. However, as online advertising has developed rapidly, a large number of illegal online advertisements keep emerging because of problems in advertising supervision, laws and regulations, and the professional quality of advertising practitioners. Effective supervision and management of online advertising is therefore of real practical significance. To identify illegal descriptions in online advertising more intelligently, this thesis starts from two common kinds of violations and proposes one algorithm for each. The main work is as follows:

1. Identifying illegal words that may exist in online advertising. Because the volume of online advertising is huge and hard to review manually item by item, a scheme that identifies illegal advertising words based on an illegal-word thesaurus is proposed. First, illegal seed words are extracted according to the newly revised Advertising Law. The seed words are then expanded with synonyms, and the expanded vocabulary is filtered by semantic similarity to obtain the illegal-word thesaurus for online advertising. Finally, following natural language rules, illegal words in online advertisements are identified by combining string matching with contextual semantic information. The experimental results show that this method identifies illegal words in online advertising effectively and can assist the supervision of online advertising, so it has good application potential.

2. Identifying illegal descriptions that may exist in online advertising. Because online advertising text is short and loses semantic information, an identification method based on Word2vec and the deep learning model LSTM is proposed. First, since traditional text representations tend to produce sparse data and suffer from the curse of dimensionality, Word2vec, the tool developed by Google, is used to build semantic word-vector and sentence-vector representations of online advertising text. Second, the vectorized text is passed to a Long Short-Term Memory (LSTM) network, a model designed for sequential data, to judge whether it contains illegal words. The experimental results show that this method identifies illegal words in online advertising effectively, and that accuracy is especially high for illegal words with similar character forms or similar semantics.

The two methods proposed in this thesis can effectively identify illegal words and sentences containing illegal descriptions. They lay a good foundation for the intelligent identification of illegal content in online advertising, can reduce the workload of law enforcement personnel to some degree, and help build a healthier online advertising market that consumers can trust.
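The first method (seed words, synonym expansion, similarity filtering, then context-aware string matching) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the thesis: the English toy words, the two-dimensional vectors, the 0.8 similarity threshold, and the simple negation rule used as the "context" check. The thesis targets Chinese advertising text and does not state its actual rules or parameters.

```python
from math import sqrt

# Hypothetical toy data; none of it is taken from the thesis.
SEED_WORDS = {"best"}
SYNONYMS = {"best": ["top", "finest", "good"]}
VECTORS = {
    "best": (1.0, 0.0),
    "top": (0.9, 0.1),
    "finest": (0.8, 0.2),
    "good": (0.2, 0.9),
}
NEGATIONS = {"not", "never"}  # assumed context rule: a nearby negation cancels a hit


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def build_thesaurus(seeds, synonyms, vectors, threshold=0.8):
    """Expand seeds with synonyms, keeping only candidates similar enough to the seed."""
    thesaurus = set(seeds)
    for seed in seeds:
        for candidate in synonyms.get(seed, []):
            if candidate in vectors and cosine(vectors[seed], vectors[candidate]) >= threshold:
                thesaurus.add(candidate)
    return thesaurus


def find_illegal_words(text, thesaurus, window=2):
    """Match thesaurus words, skipping matches preceded by a negation within `window` tokens."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    hits = []
    for i, word in enumerate(tokens):
        if word in thesaurus and not NEGATIONS & set(tokens[max(0, i - window):i]):
            hits.append(word)
    return hits


thesaurus = build_thesaurus(SEED_WORDS, SYNONYMS, VECTORS)
print(sorted(thesaurus))
print(find_illegal_words("The best cream, top quality!", thesaurus))
print(find_illegal_words("This is not the best offer", thesaurus))
```

In this sketch "good" is dropped from the expanded vocabulary because its toy vector is not similar enough to the seed, which mirrors the filtering step described above. A real system would need Chinese word segmentation and a trained embedding model in place of the whitespace split and hand-written vectors.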
Keywords/Search Tags: online advertisement, construction of thesaurus, LSTM, Word2vec, illegal advertising recognition