
Exploring Semantic Structure And Pragmatic Information With Language Models

Posted on: 2024-11-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Zhou
GTID: 1528307373471004
Subject: Computer Science and Technology

Abstract/Summary:
Natural Language Processing (NLP), a crucial branch of artificial intelligence, endeavors to enable computers to comprehend, analyze, and generate human language. It has significantly enhanced the intelligence of human-computer interaction and propelled the development of fields such as information retrieval, intelligent question answering, and machine translation; NLP is pivotal in improving productivity, enhancing user experience, and driving technological progress. Language models, a fundamental technique in NLP, employ statistical and probabilistic methods to estimate the likelihood of a given word sequence, laying the groundwork for word prediction in textual data. Current leading language models, trained on large-scale datasets, possess powerful understanding and prediction capabilities, providing a solid foundation for intelligent language understanding and communication. Two critical aspects of such understanding are semantic structure and pragmatic information, which represent the objective and the subjective information in language, respectively. By studying both, language models can comprehend and generate natural language more accurately, improving the efficiency and quality of human-computer interaction and providing essential support for the advancement of language processing technology.

Building on language-model technology, this dissertation therefore poses the following research questions. (1) How can semantic structure information in language be encoded, represented, and learned effectively? Seemingly unstructured text may contain complex structure, including long-distance dependencies and subordinate relationships, which poses challenges for language models in understanding and generating such sentences. Proper use of this semantic structure can enhance a model's comprehension and processing of natural language. Semantic structure is relatively objective and can be derived from linguistic rules; pragmatic information, by contrast, is often subjective and ambiguous, and varies with cultural background. This motivates the second question: (2) How can language models comprehend cross-cultural pragmatic information? Language customs, social etiquette, and implicit meaning differ across cultural backgrounds and shape how people interpret language. To capture pragmatic information, language models must be able to handle cultural diversity and adapt accordingly.

To address these two questions, this dissertation works within the framework of language models. First, it investigates three aspects of semantic structure: encoding methods, representation forms, and learning approaches, aiming to strengthen language models' understanding of complex semantic structure and thus enable more accurate processing of natural language tasks. Second, building on language models, it explores cross-cultural differences in pragmatic information, including cross-cultural transfer prediction and cross-cultural knowledge detection, to deepen our understanding of language usage patterns across cultures and improve models' applicability in cross-cultural settings. The research contributions are as follows:

(1) This dissertation proposes a novel method for constructing logical adjacency matrices. Combined with an entity-attention mechanism, we design a relation extraction algorithm that breaks the traditional message-passing paradigm of graph neural networks: multi-hop neighborhood information can be acquired directly in a single graph layer, facilitating the use of semantic structural information from sentence dependency trees. Experimental results demonstrate that this model enhances the understanding and learning of semantic representations from multiple perspectives. Building upon this, we further propose a new message-passing paradigm featuring multi-step message sources, node-specific message outputs, and multi-space message interactions; its superiority and applicability are verified by instantiating a dual-perception graph neural network model.

(2) This dissertation proposes a simple and efficient neural architecture that decouples the learning of contextual representations from the propagation of semantic structural information, allowing integration with any pre-trained language model and any textual graph-semantics representation. It establishes a framework for systematically investigating how graph-semantics representations and their parsers affect relation classification. Experiments on English and Chinese relation classification datasets, covering general and literary domains, demonstrate that graph-semantics representations significantly improve relation classification performance in both languages, and that the structural parsing quality of the underlying parsers also affects their effectiveness.

(3) This dissertation proposes a detection framework based on language models to evaluate the effectiveness of multi-layer perceptrons (MLPs) in language tasks. It explores the incremental semantic structural information learned when language models are combined with MLPs and discusses whether explicit structural information is necessary in natural language processing tasks. Experimental results indicate that MLPs enhance pre-trained language models' ability to capture surface, syntactic, and semantic information in text, and that they excel in particular at capturing syntactic and semantic information. These findings provide valuable guidance for designing MLP-based language model variants.

(4) This dissertation proposes two culture-related features at the national and language levels. These features measure the cultural background differences between source and target datasets, improving the prediction accuracy of the optimal transfer dataset in cross-cultural transfer learning. The research highlights the importance of accounting for cultural differences when applying transfer learning to subjective language tasks; doing so helps mitigate data scarcity, improves model performance, and reduces training costs.

(5) This dissertation proposes a fully automated method for generating large-scale cultural knowledge datasets, and develops a cultural knowledge detection system that combines different language models, detection prompts, and evaluation criteria to assess a model's ability to detect diverse cultural knowledge. Experimental results show that language models exhibit a significant bias toward American culture, and that this bias diminishes when detection is performed in other languages. Including cultural cues in detection prompts further enhances a model's acquisition of cultural knowledge. This research highlights the influence of cultural background and language choice on cultural knowledge detection, offering valuable insight into cultural bias in language models.
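To illustrate the idea behind contribution (1), the sketch below shows one plausible reading of a "logical adjacency matrix": a boolean reachability matrix marking every node within K hops, so that a single aggregation layer can draw on multi-hop neighborhood information directly instead of stacking K message-passing layers. This is an illustrative toy, not the dissertation's actual algorithm; the function name and the chain-shaped dependency graph are invented for the example.

```python
# Illustrative sketch only: a "logical adjacency matrix" as K-hop boolean
# reachability over a dependency graph, enabling one-layer multi-hop access.

def logical_adjacency(n, edges, k):
    """Return an n x n boolean matrix: reach[i][j] is True iff node j is
    reachable from node i within k hops (self-loops included)."""
    adj = [[i == j for j in range(n)] for i in range(n)]  # start with self-loops
    for u, v in edges:
        adj[u][v] = adj[v][u] = True                      # undirected edges
    reach = [row[:] for row in adj]
    for _ in range(k - 1):  # boolean matrix "power": extend reach by one hop
        reach = [[any(reach[i][m] and adj[m][j] for m in range(n))
                  for j in range(n)]
                 for i in range(n)]
    return reach

# Toy dependency chain 0-1-2-3: with k=2, node 0 reaches nodes 0, 1, 2 but not 3
L = logical_adjacency(4, [(0, 1), (1, 2), (2, 3)], k=2)
```

A graph layer that aggregates over `L` instead of the one-hop adjacency sees the whole K-hop neighborhood at once, which is the sense in which the paradigm bypasses layer-by-layer message passing.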
Keywords/Search Tags:Natural language processing, Language models, Semantic structures, Graph neural networks, Pragmatic information