| With the spread of internet-connected mobile devices and rising consumption levels, the volume of user-generated text on online platforms, such as reviews and opinions, continues to grow. Among these, reviews on e-commerce platforms are the most valuable: they give users a reference for purchasing decisions and give businesses genuine suggestions for improvement. Because a business’s own product description tends to flatter the product, while other users’ evaluations reflect its performance, quality, and effectiveness more truthfully, the historical reviews on a product detail page play an important role in user decision-making. In addition, broad user feedback helps users determine whether a product meets their specific needs. Businesses, in turn, can improve product design and service quality based on the suggestions that reviews agree on, thereby increasing sales and revenue. However, reviews also contain a great deal of redundant information, so readers must spend considerable time browsing many reviews to work out the real situation. Users with personalized needs, who care particularly about certain aspects, must carefully pick out the reviews that address those aspects to judge whether a product is suitable. The sheer volume of review content therefore causes information overload for readers. Summarizing these reviews, that is, condensing the consensus reached among users into concise language, is an important way to address this overload. However, manually summarizing multiple reviews is very difficult. Information is repeated across reviews, and reviews differ in importance. While reading reviews, annotators need to filter the information in each review, judge the importance of each review based on the
views of other reviews, and avoid using biased reviews as references for summarization. While summarizing, annotators need to identify the important consensus views, rank them by importance, and select among them according to the specified summary length. While writing the summary, annotators need to integrate the consensus views and restate them in summary language. This process demands strong information filtering and judgment skills as well as the ability to organize and condense language. The high demands on annotators, together with the complex and time-consuming annotation process, make manual summarization of multiple reviews expensive, which in turn leads to a lack of annotated data for multi-review summarization. It is therefore crucial to design algorithms that summarize multiple reviews automatically under unsupervised conditions. However, existing research on unsupervised multi-review summarization is still in its infancy and has many shortcomings. From the perspective of encoding and decoding multiple reviews, on the one hand, existing hierarchical encoding methods leave the reviews invisible to one another during encoding, so no information is exchanged between them and secondary consensus information is easily blurred or even lost; on the other hand, the decoding process does not take a macro-level view of the multiple aspects that the reviews may cover, which can lead to disorganized, redundant, or missing information in the generated summary. From the perspective of unsupervised end-to-end model training, existing research either lacks effective self-supervised training objectives, and so cannot train and optimize extractive summarization models, or does not separate consensus information from biased information in reviews, which may allow biased information to affect the extraction of consensus information. From the perspective of generating summaries that
reflect merchant-specific information, on the one hand, existing research is often constrained by the exposure bias problem of local optimization, so generated summaries tend to repeat their own content; on the other hand, generic, bland summary content goes unpenalized, so summaries lack specific information. From the perspective of meeting different users’ personalized needs, on the one hand, existing aspect-based summarization methods usually rely on predefined aspects, which are difficult to adapt to users’ diverse personalized needs and to differing merchant-specific information; on the other hand, little attention is paid to differences in users’ language styles. To address these problems in current research on multi-review summarization, this paper proposes four main algorithms, each building on the last. From the perspective of the encoding and decoding architecture, this paper proposes an interactive hierarchical sequence-to-sequence generation model for multi-document summarization, which includes an interactive hierarchical encoder to address the loss of secondary information in the encoder and a cross-sequence joint attention mechanism to address the lack of macro-level information in the decoder. From the perspective of unsupervised model optimization, this paper proposes a mutual-information-based, multi-angle self-supervised pre-training scheme for review summarization, which includes a mutual information consensus learning algorithm to address the lack of self-supervised objectives in extractive multi-document summarization and a multi-angle review representation learning algorithm to address the failure to separate consensus from biased information. From the perspective of optimizing summary text features, this paper proposes a feature information generation algorithm based on reinforcement learning, which jointly addresses the issues of text generation being limited by exposure bias and multi-review summaries
lacking feature information. From the perspective of personalized user needs, this paper proposes an unsupervised personalized summary generation algorithm based on user profiles, which addresses the difficulty of adapting predefined aspects to personalized needs and the lack of attention to language style differences. It introduces a user profile extraction method based on personalized aspects and text style modeling, as well as a self-supervised learning mechanism for user profiles and personalized summary generation. These four algorithms build on one another to form the complete personalized multi-review summarization algorithm proposed here. Detailed experiments show that the proposed algorithm can, without annotated training data, generate summaries that reflect the consensus of multiple reviews and capture merchant-specific characteristics. Furthermore, it can generate personalized summaries that cater to users’ interests by adjusting the consensus aspects covered and the summary style according to each user’s profile. Each algorithm is also shown to be effective and necessary in its own experiments through automatic and manual evaluation. Finally, based on the proposed algorithm, this paper designs and implements a visualization system for personalized multi-review summarization. The system integrates the display of multiple reviews, merchant information, and topic words with a side-by-side comparison of summaries generated by different algorithms for different users, presenting the research results of this paper more intuitively. |
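
The idea of judging each review's importance by the views of the other reviews can be illustrated with a toy sketch. This is not the paper's mutual-information algorithm: it swaps in a plain bag-of-words cosine similarity, and the function names (`bag_of_words`, `cosine`, `consensus_scores`) and the sample reviews are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased, whitespace-tokenized term counts (a deliberately crude representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def consensus_scores(reviews):
    """Score each review by its mean similarity to all other reviews.

    Assumption for this sketch: a review that resembles many others carries
    consensus content, while an outlier is more likely to be biased or off-topic.
    """
    vecs = [bag_of_words(r) for r in reviews]
    scores = []
    for i, v in enumerate(vecs):
        others = [cosine(v, w) for j, w in enumerate(vecs) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    return scores


# Hypothetical sample data: two reviews agree, one is an outlier.
reviews = [
    "battery life is great and the screen is sharp",
    "great battery life and a sharp screen",
    "the courier was rude to me",
]
scores = consensus_scores(reviews)
most_consensual = max(range(len(reviews)), key=scores.__getitem__)
```

In this sample the two reviews about battery and screen score higher than the delivery complaint, so an extractive summarizer built on such scores would favor them. A learned model would replace the word-count similarity with trained review representations.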