| With the rapid development of informatization, sensitive information related to individual privacy, trade secrets, and state secrets takes more and more forms, is growing in quantity, and carries different secrecy classifications. There is currently little research on automatic classification, while traditional manual marking and classification are often inefficient and give poor results. Using computers to analyze and classify sensitive information automatically has therefore become an important and practical research topic. Many information sensitivity studies require sensitive information analysis at multiple granularities. First, this paper presents SSAD, an algorithm for sentence sensitivity analysis based on dependency parsing. After a sentence is dependency-parsed, SSAD extracts the core structure of the sentence that contains sensitive information. It then analyzes the semantic distance and the position of the sentence and, taking into account the sensitive words the sentence contains, calculates the sensitivity value of the whole sentence. Finally, it stores the sentence frame for further processing. To analyze the sensitivity of a document, building on effective analysis of sentence-level sensitivity, this paper presents DSAD, an algorithm for document sensitivity analysis based on dependency parsing. After segmentation and dependency parsing, DSAD calculates the sensitivity value of the document according to its similarity to the stored sentence frames. If the document has already been classified, its sensitivity value is calculated from the sensitivity of its sentences and the document's secrecy classification. 
If the document has not been classified, its security classification is first calculated from the distribution of sensitive words and sentence frames, and this is then combined with information on the sensitive sentences the document contains to calculate the document's sensitivity value. To address the problem that the same sensitive information can have different sensitivity at different times, this paper proposes a strategy for dynamically updating the sensitivity of sensitive words. Experimental results show that these algorithms calculate the sensitivity of sentence-level and document-level information effectively, and correctly classify and rank documents that have no secrecy classification. |
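
The abstract gives no formulas, so the following is only a minimal sketch of how a sentence-level score and a document-level score of the kind described above might fit together. Everything in it is an assumption made for illustration rather than the paper's SSAD or DSAD: the names `SensitiveHit`, `sentence_sensitivity`, and `document_sensitivity`, the exponential `decay` over dependency distance, the `position_weight` factor, and the use of the share of sensitive sentences as a stand-in for an estimated classification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SensitiveHit:
    """A sensitive word found in a sentence (hypothetical structure)."""
    word: str
    weight: float   # lexicon sensitivity of the word, assumed to lie in [0, 1]
    distance: int   # assumed: dependency-tree hops from the word to the sentence core


def sentence_sensitivity(hits: List[SensitiveHit],
                         position_weight: float = 1.0,
                         decay: float = 0.5) -> float:
    """Sum word weights, discounted by distance to the core, scaled by position.

    The result is clamped to at most 1.0. The discounting scheme is an
    assumption, not taken from the paper.
    """
    if not hits:
        return 0.0
    score = sum(h.weight * decay ** h.distance for h in hits)
    return min(1.0, position_weight * score)


def document_sensitivity(sentence_scores: List[float],
                         classification_weight: Optional[float] = None) -> float:
    """Combine sentence scores with a classification weight.

    If the document is already classified, the caller passes a weight for
    its classification. Otherwise a weight is estimated here from the share
    of sentences with a non-zero score (a simple stand-in for the
    distribution-based estimate mentioned in the abstract).
    """
    if not sentence_scores:
        return 0.0
    average = sum(sentence_scores) / len(sentence_scores)
    if classification_weight is None:
        sensitive = sum(1 for s in sentence_scores if s > 0)
        classification_weight = sensitive / len(sentence_scores)
    return average * classification_weight
```

In this sketch a classified document would be scored with `document_sensitivity(scores, classification_weight=w)`, where `w` is chosen by the caller for the document's classification level, while an unclassified document is scored with `document_sensitivity(scores)` alone.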