| As technological innovation accelerates and technical difficulty grows, achieving fast and efficient innovation in a field first requires mastering that field's core and key technologies. Core patents often represent those core and key technologies, so identifying the core patents in a field helps uncover its core technical information and is key to realizing technological innovation. Core patent identification therefore plays an important role in guiding technological innovation. Existing core patent identification methods rely on structured data and depend heavily on expert opinion, which leads to low identification efficiency, insufficient credibility, and incomplete information. To identify core patents more quickly and accurately, this paper proposes a core patent identification model oriented to both structured and unstructured patent data. Firstly, this paper proposes an improved random forest algorithm based on hierarchical weighting for feature optimization of structured data. When calculating feature importance, the traditional random forest algorithm considers only the simple sum of the changes in Gini value at the nodes where a feature is used, which loses the information about where the feature is located in the tree. To make up for this shortcoming in the feature selection process, this paper replaces the original simple summation with a layered weighting method to screen out the important features of structured data, and verifies the effectiveness of the improved algorithm on UCI standard data sets. Secondly, to address the problem of feature representation for unstructured data, a word vector model is used to represent the unstructured data as word vector features. Finally, a core patent identification model based on TextCNN improved with a gated attention mechanism is proposed, in which structured and unstructured data are combined through a data fusion method, and the validity of the proposed model is verified on real patent data from the Wisdom Bud patent database. Experimental results show that, on the UCI standard data sets, the improved random forest algorithm based on hierarchical weighting has better stability and feature selection performance, which demonstrates its effectiveness in feature selection. On the real patent data from the Wisdom Bud patent database, the proposed core patent identification model achieves higher accuracy and efficiency, which demonstrates its effectiveness. |
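
The difference between simple summation and layered weighting of Gini decreases can be illustrated with a minimal sketch. The abstract does not give the paper's actual weighting formula, so the weight `1 / (depth + 1)` used here is purely an assumption, as are the `Node` structure, the function name `feature_importance`, and the toy tree values; none of them come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """One node of a fitted decision tree (hypothetical minimal structure)."""
    n_samples: int
    gini: float
    feature: Optional[int] = None      # None marks a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def feature_importance(root: Node, n_features: int,
                       depth_weight: Callable[[int], float]) -> List[float]:
    """Sum the Gini decrease per feature, scaled by a depth-dependent weight."""
    scores = [0.0] * n_features
    total = root.n_samples
    stack = [(root, 0)]                # (node, depth), root is at depth 0
    while stack:
        node, depth = stack.pop()
        if node.feature is None:       # leaves contribute nothing
            continue
        # Sample-weighted Gini decrease produced by this split.
        decrease = (node.n_samples * node.gini
                    - node.left.n_samples * node.left.gini
                    - node.right.n_samples * node.right.gini) / total
        scores[node.feature] += depth_weight(depth) * decrease
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    norm = sum(scores)
    return [s / norm for s in scores] if norm > 0 else scores


# Toy tree: feature 0 splits at the root, feature 1 splits one level below.
tree = Node(100, 0.48, feature=0,
            left=Node(50, 0.32, feature=1,
                      left=Node(40, 0.0), right=Node(10, 0.0)),
            right=Node(50, 0.0))

# Traditional importance: every split counts equally, regardless of depth.
plain = feature_importance(tree, 2, lambda depth: 1.0)

# Layered weighting (assumed form): splits nearer the root count more.
layered = feature_importance(tree, 2, lambda depth: 1.0 / (depth + 1))

print(plain, layered)
```

In this toy tree the layered variant shifts importance toward the feature used at the root, which is the location information that plain summation discards. For a forest, the per-tree scores would be averaged over all trees; the paper's own weighting scheme may differ from the one assumed here.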