Natural Language Processing and Large Language Models Supporting Data Analytics for System Sciences


Recent Submissions

  • Item
    Zero-shot Comparison of Large Language Models (LLMs) Reasoning Abilities on Long-text Analogies
    (2025-01-07) Combs, Kara; Bihl, Trevor; Howlett, Spencer; Adams, Yuki
    In recent years, large language models (LLMs) have made substantial strides in mimicking human language and presenting information coherently. However, researchers continue to debate the accuracy and robustness of LLMs’ reasoning abilities. The reasoning abilities of thirteen LLMs were tested on two long-text analogy datasets, named Rattermann and Wharton, which required them to rank a series of stories from most to least analogous relative to a source story. On the Rattermann dataset, GPT-4 obtained the highest accuracy of 70%. On the whole, LLMs tend to over-emphasize superficially similar story entities (characters and settings) and to overlook higher-order relationships between stories. LLMs struggled more with the Wharton dataset, where the highest accuracy achieved was 46.4% by GPT-4o and all but nine LLMs performed below random-chance accuracy. Although LLMs are improving, they still struggle with higher-order cognitive tasks such as analogical reasoning.
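As an illustration of how such a ranking task can be scored, the sketch below compares a model's predicted story ranking against a gold ranking using exact position-match accuracy. The story labels and rankings are hypothetical, not taken from the Rattermann or Wharton datasets.

```python
def ranking_accuracy(predicted: list[str], gold: list[str]) -> float:
    """Fraction of positions where the predicted rank matches the gold rank."""
    assert len(predicted) == len(gold)
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)

# Hypothetical gold ranking (most to least analogous) and a model prediction
# that swaps the top two stories, a common surface-similarity error.
gold = ["true_analogy", "surface_match", "first_order_only", "unrelated"]
pred = ["surface_match", "true_analogy", "first_order_only", "unrelated"]
print(ranking_accuracy(pred, gold))  # 0.5
```

Stricter or softer scoring (e.g. rank correlation) could be substituted; position-match accuracy is just one plausible choice.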
  • Item
    How Does ESG Reporting Data Affect Operational Efficiency? Does ESG rating matter?
    (2025-01-07) Lui, Gladie
    Several initiatives, including the Global Reporting Initiative in 1997, the Carbon Disclosure Project in 2000, the Sustainability Accounting Standards Board in 2011, the Task Force on Climate-related Financial Disclosures in 2015, and the Workforce Disclosure Initiative in 2016, have contributed to the landscape of sustainability reporting. To harmonize the plethora of guidelines, the European Commission and the International Financial Reporting Standards Foundation are undertaking efforts to improve reporting practices. To understand the value of sustainability reporting, the current status of sustainability reports is investigated in two studies. The first study reveals differences in the emphasis of prominent topics and in the sentiment and readability of sustainability reports over the period 2015 to 2021. The second study finds a positive relationship between sustainability reporting and companies’ operational efficiency scores. This positive association is more pronounced when the firm-level external monitoring effect of Environmental, Social, and Governance (ESG) ratings is above the median.
  • Item
    DiFiLE: A Knowledge-Distillation Longformer Model for Finance with Ensembling
    (2025-01-07) Hristova, Diana; Satani, Nirav
    10-K reports are an important source of information in finance. Unfortunately, due to their length, they can hardly be analyzed by state-of-the-art transformer-based methods. In this paper, we address this by combining efficient attention mechanisms, knowledge distillation (KD), and ensembling. Our five-step approach, DiFiLE, first pre-processes the data and splits it into chunks based on the report items. Then, for each chunk, we train a teacher Longformer model. This is followed by KD and the generation of the corresponding student models. Finally, we aggregate the results from the chunks with ensembling, in particular stacking. We evaluate DiFiLE on the 10-K reports of the DJIA companies. The results show high performance of the teacher model, which is well mimicked by its distilled version while requiring 30% fewer resources.
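The knowledge-distillation step can be illustrated with the classic soft-target loss (Hinton et al.): the student is trained to match the teacher's temperature-softened output distribution. This is a minimal sketch with toy logits, not DiFiLE's actual training code.

```python
import math

def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Softmax over logits, softened by a temperature > 1."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0) -> float:
    """KL divergence from teacher to student soft distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature * temperature * kl

# The loss is zero when the student exactly matches the teacher,
# and positive otherwise.
print(distillation_loss([1.0, 2.0], [1.0, 2.0]))  # 0.0
```

In practice this KD term is usually combined with a standard cross-entropy loss on the hard labels; the weighting between the two is a tunable hyperparameter.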
  • Item
    Undergraduate Pacific Studies Exam Generation and Answering Using Retrieval Augmented Generation and Large Language Models
    (2025-01-07) Tyndall, Erick; Gayheart, Colleen; Some, Alexandre; Genz, Joseph; Langhals, Brent; Wagner, Torrey
    The capabilities of large language models have increased to the point where entire textbooks can be queried using retrieval-augmented generation (RAG). This study evaluates the ability of OpenAI’s ChatGPT-3.5-Turbo and ChatGPT-4-Turbo models to create and answer exam questions based on an undergraduate textbook. Fourteen exams were created with true-false, multiple-choice, and short-answer questions from a textbook available online. The accuracy of the models in answering these questions was assessed both with and without access to the source material. Performance was evaluated using text-similarity metrics, including ROUGE-1 and cosine similarity over word embeddings. Fifty-six exam scores were analyzed, showing that RAG-assisted models outperformed those without access to the textbook and that ChatGPT-4-Turbo was more accurate than ChatGPT-3.5-Turbo on nearly all exams. The findings demonstrate the potential of generative artificial intelligence tools in academic assessments and provide insights into the comparative performance of these models.
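The two similarity metrics named above can be sketched as follows: a ROUGE-1 F1 over unigram overlap, and cosine similarity between embedding vectors. This is an illustrative re-implementation under those definitions, not the study's evaluation code, and the example inputs are made up.

```python
import math
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate answer and a reference answer."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(rouge1_f1("the mitochondria is the powerhouse",
                "the mitochondria is the powerhouse"))  # 1.0
```

ROUGE-1 rewards lexical overlap only; embedding-based cosine similarity can credit paraphrases that share no words, which is why studies often report both.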
  • Item
    Evaluating Topic Models with OpenAI Embeddings: A Comparative Analysis on Variable-Length Texts Using Two Datasets
    (2025-01-07) Wahbeh, Abdullah; Al-Ramahi, Mohammad; El-Gayar, Omar; Elnoshokaty, Ahmed; Nasralah, Tareq
    Topic modeling is a crucial unsupervised machine learning technique for identifying themes within unstructured text. This study compares traditional topic modeling methods, such as Latent Dirichlet Allocation (LDA), with advanced embedding-based models, specifically BERTopic with OpenAI embeddings (BERTopic-OpenAI). The analysis uses two distinct datasets: user reviews from the mental health app Replika and the 20 Newsgroups dataset. For the Replika dataset, both methods identified common themes, but BERTopic-OpenAI uncovered additional nuanced topics, demonstrating its enhanced semantic capabilities. Quantitative evaluation on the 20 Newsgroups dataset further highlighted BERTopic-OpenAI’s advantage, achieving higher topic coherence and diversity than the best-performing LDA model. These results suggest that embedding-based models provide more coherent, interpretable, and diverse topics, making them valuable tools for extracting meaningful insights from extensive and variable-length text corpora. Future research should focus on refining these advanced techniques to improve their applicability and effectiveness in dynamic and varied textual environments.
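Topic diversity, one of the metrics mentioned above, is commonly computed as the fraction of unique words among the top words of all topics. A minimal sketch, with illustrative word lists rather than the study's actual topics:

```python
def topic_diversity(topics: list[list[str]]) -> float:
    """Fraction of unique words among the top words of all topics (1.0 = no overlap)."""
    all_words = [word for topic in topics for word in topic]
    return len(set(all_words)) / len(all_words)

# Fully disjoint topics score 1.0; a repeated word lowers the score.
disjoint = [["therapy", "mood"], ["chat", "avatar"]]
overlapping = [["therapy", "mood"], ["therapy", "avatar"]]
print(topic_diversity(disjoint))     # 1.0
print(topic_diversity(overlapping))  # 0.75
```

Coherence metrics (e.g. NPMI-based) are more involved, as they require word co-occurrence statistics from the corpus; diversity alone cannot distinguish meaningful topics from random disjoint word lists.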
  • Item
    Enhancing Ontologies with Large Language Models: A Semi-Automated Approach
    (2025-01-07) Pham, Tran Van Anh; Huettemann, Sebastian; Mueller, Roland M.
    The process of creating and maintaining domain ontologies is a time- and resource-intensive activity, given the dynamic nature of domain knowledge and the regular introduction of new terms. This study aims to determine the effectiveness of large language models (LLMs) in augmenting the domain ontology authoring process. We fine-tuned state-of-the-art pre-trained LLMs and evaluated their performance on two tasks: synonym identification and parent-child relationship identification. The models achieved 98% accuracy on the first task and 75.4% on the second, demonstrating significant capabilities in automating synonym identification and relationship classification. In addition to providing a methodological basis for further extending and improving these results, we demonstrate that LLMs can be effectively used in ontology development and maintenance, saving time and effort in the process.
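Both tasks can be framed as text-pair classification for fine-tuning. The sketch below shows one plausible way to format training examples; the `[SEP]` separator and the binary label scheme are assumptions for illustration, not the paper's actual data format.

```python
def make_pair_example(term_a: str, term_b: str, is_related: bool) -> dict:
    """Format a term pair as a text-classification example (hypothetical format).

    The same shape works for both tasks: 'is_related' means 'is a synonym of'
    in the first task and 'is a parent of' in the second.
    """
    return {"text": f"{term_a} [SEP] {term_b}", "label": int(is_related)}

example = make_pair_example("automobile", "car", True)
print(example)  # {'text': 'automobile [SEP] car', 'label': 1}
```

A fine-tuned encoder model would then consume the paired text and predict the label; candidate pairs could be drawn from new domain terms against existing ontology concepts.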
  • Item
    Support Ticket Anonymization: Advancing Data Privacy with Transformer-Based Named Entity Recognition
    (2025-01-07) Raffetseder, David; Weilguny, Carl; Haidinger, Patrick; Pichler, Hans-Peter; Narzt, Wolfgang
    Organizations are recognizing the potential of tapping into their existing knowledge base of historical data, employing data-centric and AI-driven systems to improve their customer support processes. However, as is often the case with transformative advancements, this vision poses challenges. Central among them is the concern for privacy and data protection. Before an AI-driven system can be utilized, it is crucial to ensure that the contents of the knowledge base, which often carry sensitive personally identifiable information (PII), are thoroughly anonymized. This paper proposes an anonymization solution tailored to the support ticket data of an industrial automation company. The solution was developed by comparatively evaluating machine-learning approaches based on state-of-the-art transformer architectures. According to the evaluations and experiments in this domain-specific context, the best-performing architecture is an ensemble that combines multiple transformer-based language models trained for Named Entity Recognition with static, pattern-based approaches. This approach, which also involved fine-tuning the language models on domain data, achieved satisfactory results for the use case, with an overall PII-entity recall of more than 97%, approaching state-of-the-art performance reported in other domains.
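Combining model-predicted entity spans with static patterns can be sketched as follows: rule-based matches (here a simple email regex, as one example of a PII type) are unioned with spans from hypothetical NER models, then all spans are replaced from right to left so earlier offsets stay valid. This is illustrative only, not the paper's production system.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def pattern_spans(text: str) -> list[tuple[int, int, str]]:
    """Static, rule-based detection for one PII entity type."""
    return [(m.start(), m.end(), "EMAIL") for m in EMAIL.finditer(text)]

def merge_spans(model_spans, rule_spans):
    """Union of spans from the NER models and the pattern rules, deduped by offsets."""
    unique = {(start, end): (start, end, label)
              for start, end, label in model_spans + rule_spans}
    return sorted(unique.values())

def anonymize(text: str, spans) -> str:
    """Replace spans from right to left so earlier offsets remain valid."""
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text

ticket = "Email alice@corp.com now"
print(anonymize(ticket, merge_spans([], pattern_spans(ticket))))  # Email [EMAIL] now
```

A real ensemble would also need a policy for partially overlapping spans from different models (e.g. keep the longest, or vote by label); the offset-keyed dedup here only handles exact duplicates.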