Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/60186

Detecting Important Terms in Source Code for Program Comprehension

File:0746.pdf
Size:296.74 kB
Format:Adobe PDF

Item Summary

Title: Detecting Important Terms in Source Code for Program Comprehension
Authors: Rodeghero, Paige; McMillan, Collin
Keywords: Software Product Lines and Platform Ecosystems: Engineering, Services, and Management; Software Technology; Source Code Terms, Program Comprehension
Date Issued: 08 Jan 2019
Abstract: Software engineering research has become extremely dependent on terms (words in textual data) extracted from source code. Different techniques have been proposed to extract the most "important" terms from code. These terms are typically used as input to research prototypes, so the quality of a prototype's output depends on the quality of the term extraction technique. At present, no consensus exists about which technique predicts the best terms for code comprehension. We perform a literature review and propose a unified prediction model based on a Naive Bayes algorithm. We evaluate our model in a field study with professional programmers, as well as in a standard 10-fold synthetic study. We found that our model predicts the top quartile of the most important terms with approximately 50% precision and recall, outperforming other popular techniques. We also found that the predictions from our model help programmers to the same degree as the gold set.
Pages/Duration: 10 pages
URI: http://hdl.handle.net/10125/60186
ISBN: 978-0-9981331-2-6
DOI: 10.24251/HICSS.2019.902
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Appears in Collections: Software Product Lines and Platform Ecosystems: Engineering, Services, and Management


Please email libraryada-l@lists.hawaii.edu if you need this content in ADA-compliant format.

This item is licensed under a Creative Commons License.