Improving Scholarly Search By Query Expansion Techniques And Multi-Objective Approach

Posted on: 2023-07-18
Degree: Doctor
Type: Dissertation
Institution: University
Candidate: Shah Khalid
Full Text: PDF
GTID: 1528307025961609
Subject: Computer Application Technology
Abstract/Summary:
Searching for related publications is an integral part of the research process and is carried out very frequently by researchers. Researchers use related publications in their domain of interest for different purposes: to find useful information, to understand current developments, to gain expertise, to evaluate the progress made so far, and to draw conclusions about the current state of a research problem. An academic search engine, also known as a scholarly retrieval system, is a domain-specific Information Retrieval (IR) solution that enables researchers to search, explore, and download research publications across many disciplines. Scholarly retrieval systems such as Google Scholar, Semantic Scholar, Microsoft Academic Search, Xueshu Baidu, PubMed, ArnetMiner, and many more provide access to scholarly publications and serve as resources for academic searchers looking for related papers of interest. Many of these retrieval systems are quite successful in the domain of academic search. However, researchers' specific information needs, the exponential growth in the volume of scholarly literature, synonymy and polysemy, and similar issues remain key challenges for scholarly retrieval systems. It is still often a tedious task for academic searchers to locate the most useful papers in a field of interest, especially when they lack sufficient background knowledge. To further improve academic search performance, this thesis structures its contributions into the following two parts.

The first part of this thesis studies and proposes a novel approach that combines query expansion and citation network analysis to support scholarly search. It is a two-stage interactive academic search process. In the first stage, upon receiving the initial search query, the retrieval system provides a ranked list of results. In the second stage, two styles of relevance feedback are investigated: (i) User Relevance Feedback (URF), in which the user selects a few relevant papers from which useful terms are obtained for query expansion, and (ii) Pseudo Relevance Feedback (PRF), in which valuable terms are obtained from a few top-ranked papers for query expansion. In both stages, citation analysis is used to further improve the quality of the search results. The novelty of the approach lies in the combined exploitation of query expansion and citation network analysis, which may bring the most relevant papers to the top of the search results list. The approach is evaluated on the Association for Computational Linguistics (ACL) and Semantic Scholar Open Research Corpus (S2ORC) datasets. The experimental results demonstrate that the technique is effective and robust under both feedback styles at locating relevant papers, as measured by normalized Discounted Cumulative Gain (nDCG), precision, and recall.

The second part of the thesis investigates multiple features that capture the usefulness of a research publication for the strategic ranking of scholarly retrieval systems. Most previous investigations assume that the user's only concern is the relevance of the paper to the query. This thesis hypothesizes that the usefulness of a paper to a searcher is determined not only by its relevance to the query but also by other aspects, such as the impact factor of the paper and its publication age. This is vital, especially when a large number of papers are relevant to the query. This part makes a twofold contribution. First, it proposes a group of new evaluation metrics to measure the usefulness of scholarly papers. These metrics consider all three factors together: relevance, impact, and freshness. Second, it presents a paper-ranking framework that ranks papers by a linear combination of their relevance score, publication age score, and impact score. The framework is evaluated on the ACL dataset. The results demonstrate that very flexible ranking policies can be applied by assigning different weights to these three factors, so that more diverse end-user requirements can be met.
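
The pseudo relevance feedback step described in the first part can be illustrated with a minimal sketch. The abstract does not state how expansion terms are selected, so the plain term-frequency selection below, the stopword list, and the names `expand_query`, `top_k`, and `num_terms` are all assumptions made for illustration and are not the thesis's method. Citation analysis is not modeled here.

```python
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the thesis).
STOPWORDS = {"a", "an", "the", "of", "for", "and", "in", "to", "with", "on"}


def expand_query(query, ranked_docs, top_k=2, num_terms=2):
    """Append the most frequent new terms from the top_k ranked documents.

    ranked_docs is the first-stage result list, best match first.
    For user relevance feedback, pass the user-selected papers instead.
    """
    query_terms = query.lower().split()
    counts = Counter()
    for doc in ranked_docs[:top_k]:
        for term in doc.lower().split():
            # Skip stopwords and terms already present in the query.
            if term not in STOPWORDS and term not in query_terms:
                counts[term] += 1
    # Sort by descending frequency, then alphabetically for determinism.
    candidates = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    new_terms = [term for term, _ in candidates[:num_terms]]
    return query_terms + new_terms


first_stage_results = [
    "neural machine translation with attention",
    "attention models for machine translation",
    "parsing with grammars",
]
expanded = expand_query("machine translation", first_stage_results)
print(expanded)
```

The expanded term list would then be submitted as the second-stage query.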
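
The linear-combination ranking described in the second part can be sketched as follows. This is a hedged illustration only: the `Paper` fields, the assumption that each score is already normalized to [0, 1], the sample scores, and the weight values are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    relevance: float  # relevance to the query, assumed normalized to [0, 1]
    freshness: float  # publication age score, assumed higher for newer papers
    impact: float     # impact score, assumed normalized to [0, 1]


def combined_score(paper, w_rel, w_fresh, w_imp):
    """Weighted linear combination of the three factor scores."""
    return (w_rel * paper.relevance
            + w_fresh * paper.freshness
            + w_imp * paper.impact)


def rank_papers(papers, w_rel=1.0, w_fresh=0.0, w_imp=0.0):
    """Return papers sorted by combined score, highest first."""
    return sorted(
        papers,
        key=lambda p: combined_score(p, w_rel, w_fresh, w_imp),
        reverse=True,
    )


papers = [
    Paper("A", relevance=0.9, freshness=0.1, impact=0.2),
    Paper("B", relevance=0.7, freshness=0.9, impact=0.3),
    Paper("C", relevance=0.6, freshness=0.2, impact=0.95),
]

by_relevance = [p.title for p in rank_papers(papers)]
favor_recent = [p.title for p in rank_papers(papers, 0.4, 0.5, 0.1)]
favor_impact = [p.title for p in rank_papers(papers, 0.4, 0.1, 0.5)]
print(by_relevance, favor_recent, favor_impact)
```

Changing only the weights reorders the same candidate set, which is the kind of flexible ranking policy the abstract refers to.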
Keywords/Search Tags:Academic Search, Citation Analysis, Query Expansion, Evaluation Metrics, Relevance, Impact, Publication Age