Download: Graph-based biomedical text summarization: An itemset mining and sentence clustering approach

Persian translation of the article: Graph-based biomedical text summarization
Author/Publisher/Journal: Journal of Biomedical Informatics
Year of publication: 2018
Number of pages (English): 17
English title: Graph-based biomedical text summarization: An itemset mining and sentence clustering approach

Abstract

Objective: Automatic text summarization offers an efficient way to access the ever-growing body of scientific and clinical literature in the biomedical domain by condensing source documents while preserving their most informative content. In this paper, we propose a novel graph-based summarization method that takes advantage of domain-specific knowledge and a well-established data mining technique called frequent itemset mining.

Methods: Our summarizer uses the Unified Medical Language System (UMLS) to build a concept-based model of the source document by mapping the document's text to UMLS concepts. It then discovers frequent itemsets to capture the correlations among multiple concepts. The method uses these correlations to define a similarity function, from which a graph representation of the document is constructed. The summarizer then applies a minimum spanning tree based clustering algorithm to discover the subthemes of the document. Finally, it generates the summary by selecting the most informative and relevant sentences from all subthemes in the text.
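The pipeline described in the Methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the sentences are assumed to be already mapped to concepts (the toy concept labels below are hypothetical stand-ins for UMLS concepts), the similarity function is assumed to be the Jaccard overlap of the frequent itemsets two sentences cover, clustering is assumed to drop the heaviest edges of the minimum spanning tree, and one sentence is picked per cluster by itemset coverage. The abstract does not specify any of these details.

```python
from itertools import combinations

def frequent_itemsets(sentences, min_support=0.4, max_size=2):
    """Brute-force search for concept sets occurring in >= min_support of the sentences."""
    n = len(sentences)
    concepts = sorted(set().union(*sentences))
    frequent = []
    for size in range(1, max_size + 1):
        for combo in combinations(concepts, size):
            items = frozenset(combo)
            if sum(items <= s for s in sentences) / n >= min_support:
                frequent.append(items)
    return frequent

def covered(sentence, itemsets):
    """Frequent itemsets fully contained in one sentence."""
    return {i for i in itemsets if i <= sentence}

def similarity(a, b, itemsets):
    """ASSUMPTION: Jaccard overlap of the frequent itemsets covered by two sentences."""
    ca, cb = covered(a, itemsets), covered(b, itemsets)
    union = ca | cb
    return len(ca & cb) / len(union) if union else 0.0

def mst_clusters(sentences, itemsets, k):
    """Kruskal MST on distance = 1 - similarity; keeping the n-k lightest MST edges leaves k clusters."""
    n = len(sentences)
    edges = sorted(
        (1.0 - similarity(sentences[i], sentences[j], itemsets), i, j)
        for i, j in combinations(range(n), 2)
    )
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, i, j in edges:  # edges arrive in ascending weight order
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            mst.append((w, i, j))

    parent = list(range(n))  # rebuild components from the lightest n-k MST edges
    for _, i, j in mst[: max(n - k, 0)]:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def summarize(sentences, k=2, min_support=0.4):
    """ASSUMPTION: pick, per cluster, the sentence covering the most frequent itemsets."""
    itemsets = frequent_itemsets(sentences, min_support)
    clusters = mst_clusters(sentences, itemsets, k)
    return sorted(
        max(c, key=lambda i: len(covered(sentences[i], itemsets))) for c in clusters
    )

# Hypothetical toy document: each sentence is the set of concepts it mentions.
sentences = [
    {"diabetes", "insulin"},
    {"diabetes", "insulin", "glucose"},
    {"diabetes", "glucose"},
    {"asthma", "inhaler"},
    {"asthma", "inhaler", "steroid"},
]
print(summarize(sentences, k=2))  # indices of the selected sentences
```

The brute-force itemset search is only practical for toy inputs; a real system would use an algorithm such as Apriori or FP-growth, and would choose the number of clusters and the summary length by its own criteria.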

Results: We perform an automatic evaluation over a large number of summaries using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. The results demonstrate that the proposed summarization system outperforms various baselines and benchmark approaches.

Conclusion: This research suggests that incorporating domain-specific knowledge and frequent itemset mining better equips the summarization system to measure the informativeness of sentences. Moreover, clustering the graph nodes (sentences) enables the summarizer to target the different main subthemes of a source document efficiently. The evaluation results show that the proposed approach can significantly improve the performance of summarization systems in the biomedical domain.
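As a rough illustration of the ROUGE metrics used in the evaluation, ROUGE-N recall is the number of reference n-grams that also appear in the candidate summary, divided by the total number of reference n-grams. The sketch below assumes a single reference summary and plain lowercase whitespace tokenization with no stemming or stop-word handling; the abstract does not state which ROUGE variants or settings were used, and the example sentences are invented.

```python
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram matches divided by the reference n-gram count."""
    def ngrams(text):
        tokens = text.lower().split()  # ASSUMPTION: whitespace tokenization
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total if total else 0.0

reference = "the drug lowers blood glucose"
candidate = "the drug reduces blood glucose"
print(rouge_n_recall(candidate, reference, n=1))  # 4 of 5 reference unigrams match: 0.8
print(rouge_n_recall(candidate, reference, n=2))  # 2 of 4 reference bigrams match: 0.5
```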

Keywords: text summarization
