| Natural language inference (NLI) studies the relationship between two pieces of text, T and H. The entailment from T to H is confirmed if a person could infer H based on T. Enabling machines to perform such a task has many applications in the field of natural language processing. With more annotated datasets becoming available in recent years, NLI has seen huge progress. However, studies (Gururangan et al., 2018; Poliak et al., 2018) show that most existing datasets are biased and thus unsuitable as evaluation standards. This thesis puts forward the same argument from the perspective of linguistics. Most modern NLI datasets lack data that reflect certain linguistic phenomena. One such phenomenon is that many sentence pairs (T and H) in natural language take the following form: T is a conditional sentence and H is closely related to the conditional's consequent. Even humans find it hard to identify the relationship between the sentences in such pairs. This thesis studies texts with these features and tries to answer two questions. Given a conditional statement in the form of "if p, q" and a question composed of a modal verb and the conditional's consequent, namely q, in the form of "Can (Will) q?", (1) is there an option among "yes", "no" or "unsure" that most people would choose as the answer to the question (modal verb + q)? (2) What factors influence how humans process such texts? 
To answer these questions, this thesis constructs a dataset of such sentence pairs. Seven annotators are invited to annotate this dataset. Quantitative and qualitative methods are used to analyze the annotated dataset. The results show that (1) there is no standard answer for such data, and (2) people's understanding of modal verbs and their familiarity with the topics of the texts may influence their final choices. The work done in this thesis reveals the complexity of this linguistic phenomenon, demonstrating that existing NLI datasets are biased. Theories from logic and cognitive science are employed to explain the reasoning mechanisms at work when humans process such phenomena. |