DOCUSAGE: HARNESSING HIERARCHICAL CLUSTERING IN SALIENCE-DRIVEN NARRATIVE SYNTHESIS
dc.contributor.advisor | Belcaid, Mahdi | |
dc.contributor.author | Sadmanee, Akib | |
dc.contributor.department | Computer Science | |
dc.date.accessioned | 2024-10-09T23:45:57Z | |
dc.date.available | 2024-10-09T23:45:57Z | |
dc.date.issued | 2024 | |
dc.description.degree | M.S. | |
dc.identifier.uri | https://hdl.handle.net/10125/108680 | |
dc.subject | Artificial intelligence | |
dc.subject | Computer science | |
dc.subject | Dataset synthesis | |
dc.subject | Narrative synthesis | |
dc.subject | Natural language processing | |
dc.subject | Text summarization | |
dc.title | DOCUSAGE: HARNESSING HIERARCHICAL CLUSTERING IN SALIENCE-DRIVEN NARRATIVE SYNTHESIS | |
dc.type | Thesis | |
dcterms.abstract | Text summarization remains a crucial yet challenging task in natural language processing, especially as the volume of text data grows exponentially. This thesis introduces Sumsage, a new optimization-based text summarization method that synthesizes concise yet informative summaries. The work makes three contributions. First, we developed the Syn-D-sum dataset from the CNN/DailyMail dataset, creating a resource for training and evaluating summarization models. Second, we propose the Sumsage algorithm, which uses hierarchical clustering to extract key sentences and construct coherent summaries, closely emulating human summarizers. Third, we designed two new evaluation methods, the Symphony penalty and the Captured Importance Quantification scores, which assess the quality of generated summaries by considering both narrative structure and sentence order. Sumsage’s dynamic tree structure and hierarchical clustering approach enable efficient and scalable summarization while maintaining contextual relevance and minimizing hallucination. Our experiments show that Sumsage outperforms GPT-3.5-turbo, generating summaries that are closer to those written by humans and that capture more essential information. Sumsage offers a robust and interpretable method for generating high-quality summaries. This approach not only addresses current challenges but also lays the foundation for future work in narrative synthesis and evaluation. | |
dcterms.extent | 67 pages | |
dcterms.language | en | |
dcterms.publisher | University of Hawai'i at Manoa | |
dcterms.rights | All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner. | |
dcterms.type | Text | |
local.identifier.alturi | http://dissertations.umi.com/hawii:12294 | |
Files
Original bundle
- Name: Sadmanee_hawii_0085O_12294.pdf
- Size: 1.46 MB
- Format: Adobe Portable Document Format