Technical Reports
Permanent URI for this collection: https://hdl.handle.net/10125/33076
Recent Submissions
Distributed Anonymous Computation of Social Distance (IEEE, 2016-01-10). Biagioni, Edoardo.
In a distributed social network, no single system holds information about all the individuals in the network, and no single system is trusted by all the individuals in the network. It is nonetheless desirable to reliably compute the social distance among individuals. This must be done anonymously, without giving away any identifying information about individuals in the social network, and reliably, without allowing anyone to pretend to be socially closer to someone else than they actually are. The Social Network Connectivity Algorithm, or SoNCA, accomplishes these goals in a distributed manner. This paper describes both the high-level algorithm and a concrete design intended for future use with a network, AllNet, designed to provide secure interpersonal communication utilizing all available means, including the Internet, cellular communications, ad-hoc networking, and delay-tolerant networking.

AllNet: using Social Connections to Inform Traffic Prioritization and Resource Allocation (http://alnt.org/, 2012-10). Biagioni, Edoardo.
AllNet is a new networking protocol designed to provide communication utilizing all available means, including Internet and cellular communications, but also, when these are not available, ad-hoc networking and delay-tolerant networking. These latter mechanisms are best suited to low-bandwidth communications. Effective support of low-bandwidth networking requires message prioritization, which can benefit from knowing whether messages are being sent on behalf of someone to whom the owner of the mobile device is socially connected. By keeping track of the social network of each of the friends of the device's owner, the device can devote more of its resources to supporting better-quality communication among people its owner cares about, and fewer resources to communication among people its owner doesn't know.
AllNet generalizes this notion by anonymously keeping track of friends, friends of friends, friends of friends of friends, and so on. Doing this while using only limited communication and storage is the challenge addressed by the AllNet social network connectivity algorithm described and evaluated in this paper.

Ubiquitous Interpersonal Communication over Ad-Hoc Networks and the Internet (alnt.org, 2013). Biagioni, Edoardo.
The hardware and low-level software in many mobile devices are capable of mobile-to-mobile communication, including ad-hoc mode for 802.11, Bluetooth, and cognitive radios. We have started to leverage this capability to provide interpersonal communication both over infrastructure networks (the Internet), and over ad-hoc and delay-tolerant networks composed of the mobile devices themselves. This network is fully decentralized, so it can function without any infrastructure, but it takes advantage of Internet connections when available. Devices may communicate whenever they are able to exchange packets. All interpersonal communication is encrypted and authenticated, so packets may be carried by devices belonging to untrusted others. One challenge in a fully decentralized network is routing. Our design uses Rendezvous Points (RPs) and Distributed Hash Tables (DHTs) for delivery over the Internet, and hop-limited broadcast and Delay Tolerant Networking (DTN) within the ad-hoc network. Each device has a policy that determines how many packets may be forwarded, and a packet prioritization mechanism that favors packets likely to consume fewer network resources. A goal of this design and implementation is to provide useful interpersonal communication using at most 1% of any given resource on mobile devices.

Mobility and Address Freedom in AllNet (alnt.org, 2014-06-06). Biagioni, Edoardo.
Mobile devices can be addressed through a variety of means.
We propose that each device select its own addresses; we motivate this choice and describe mechanisms for delivering data using these addresses. Hierarchical point-of-attachment addresses are not effective with mobile devices: the network has to maintain a global mapping between addresses and locations whether or not the address is topological. Since this mapping is needed anyway, there is little point in having the structure of the address encode device location. Instead, we have designed a network protocol, AllNet, to support self-selected addressing. When data is transmitted over the Internet, a Distributed Hash Table (DHT) provides a connection between senders and receivers. The advantages of self-selected addresses include the ability of devices to join and form a network without any need for prior agreement, and the ability to choose a personal, memorable address. When multiple devices choose the same address, another mechanism, such as signed and encrypted messages, provides the necessary disambiguation.

Hackystat-SQI: First Progress Report (2005-07-01). Kagawa, A.
This report presents the initial analyses available for Hackystat-SQI and future directions.

Studying Micro-Processes in Software Development Stream (2005-07-01). Kou, H.
In this paper we propose a new streaming technique to study software development. As we have observed, software development consists of a series of activities such as editing, compilation, testing, debugging, and deployment. All these activities contribute to the development stream, which is a collection of software development activities in time order. The development stream can help us replay and reveal the software development process at a later time without much hassle. We developed a system called Zorro to generate and analyze development streams at the Collaborative Software Development Laboratory at the University of Hawaii.
It is built on top of Hackystat, an in-process automatic metric collection system developed in the CSDL. Hackystat sensors continuously collect development activities and send them to a centralized data store for processing. Zorro reads in all the data of a project and constructs a stream from them. Tokenizers are chained together to divide the development stream into episodes (micro-iterations) for classification with a rule engine. In this paper we demonstrate analysis of Test-Driven Development (TDD) with this framework.

Continuous GQM: An automated framework for the Goal-Question-Metric paradigm (2005-08-01). Lofi, C.
Measurement is an important aspect of software engineering, as it is the foundation of predictable and controllable software project execution. Measurement is essential for assessing actual project progress, establishing baselines, and validating the effects of improvement or controlling actions. The work performed in this thesis is based on Hackystat, a fully automated measurement framework for software engineering processes and products. Hackystat is designed to unobtrusively measure a wide range of metrics relevant to software development and collect them in a centralized data repository. Unfortunately, it is not easy to interpret, analyze, and visualize the vast data collected by Hackystat in such a way that it can effectively be used for software project control. A potential solution to this problem is to integrate Hackystat with the GQM (Goal / Question / Metric) paradigm, a popular approach for goal-oriented, systematic definition of measurement programs for software-engineering processes and products. This integration should allow goal-oriented use of the metric data collected by Hackystat and increase its usefulness for project control. During the course of this work, this extension to Hackystat, later called hackyCGQM, is implemented.
As a result, hackyCGQM enables Hackystat to be used as a Software Project Control Center (SPCC) by providing purposeful high-level representations of the measurement data. Another interesting side effect of the combination of Hackystat and hackyCGQM is that the system is able to perform fully automated measurement and analysis cycles. This leads to the development of cGQM, a specialized method for fully automated, GQM-based measurement programs. In summary, hackyCGQM seeks to implement a completely automated GQM-based measurement framework. This high degree of automation is made possible by limiting the implemented measurement programs to metrics which can be measured automatically, thus sacrificing the ability to use arbitrary metrics.

Priority Ranked Inspection: Supporting Effective Inspection in Resource-limited Organizations (2005-08-01). Kagawa, A.
Imagine that your project manager has budgeted 200 person-hours for the next month to inspect newly created source code. Unfortunately, in order to inspect all of the documents adequately, you estimate that it will take 400 person-hours. However, your manager refuses to increase the budgeted resources for the inspections. How do you decide which documents to inspect and which documents to skip? Unfortunately, the classic definition of inspection does not provide any advice on how to handle this situation. For example, the notion of entry criteria used in Software Inspection determines when documents are ready for inspection, not whether inspection is needed at all. My research has investigated how to prioritize inspection resources and apply them to the areas of the system that need them most. It is commonly assumed that defects are not uniformly distributed across all documents in a system: a relatively small subset of a system accounts for a relatively large proportion of defects. If inspection resources are limited, it is more effective to identify and inspect the defect-prone areas.
To accomplish this research, I have created an inspection process called Priority Ranked Inspection (PRI). PRI uses software product and development process measures to distinguish documents that are "more in need of inspection" (MINI) from those "less in need of inspection" (LINI). Some of the product and process measures include user-reported defects, unit test coverage, active time, and number of changes. I hypothesize that the inspection of MINI documents will generate more defects with a higher severity than inspecting LINI documents. My research employed a very simple exploratory study, which included inspecting MINI and LINI software code and checking whether MINI code inspections generate more defects than LINI code inspections. The results of the study provide supporting evidence that MINI documents do contain more high-severity defects than LINI documents. In addition, there is some evidence that PRI can provide developers with more information to help determine which documents they should select for inspection.

Results from the 2006 Classroom Evaluation of Hackystat-UH (2006-12-01). Johnson, P.
This report presents the results from a classroom evaluation of Hackystat by ICS 413 and ICS 613 students at the end of Fall 2006. The students had used Hackystat-UH for approximately six weeks at the time of the evaluation. The survey requested their feedback regarding the installation, configuration, overhead of use, usability, utility, and future use of the Hackystat-UH configuration. This classroom evaluation is a semi-replication of an evaluation performed on Hackystat by ICS 413 and 613 students at the end of Fall 2003, which is reported in "Results from the 2003 Classroom Evaluation of Hackystat-UH". As the Hackystat system has changed significantly since 2003, some of the evaluation questions were changed.
The data from this evaluation, in combination with the data from the 2003 evaluation, provide an interesting perspective on the past, present, and possible future of Hackystat. Hackystat has increased significantly in functionality since 2003, which has enabled the 2006 usage to more closely reflect industrial application, and has resulted in significantly less overhead with respect to client-side installation. On the other hand, the results appear to indicate that this increase in functionality has resulted in a decrease in the usability and utility of the system, due to inadequacies in the server-side user interface. Based upon the data, the report proposes a set of user interface enhancements to address the problems raised by the students, including Ajax-based menus and parameters, workflow-based organization of the user interface, real-time displays for ongoing project monitoring, annotations, and simplified data exploration facilities.

Evaluation of Jupiter: A Lightweight Code Review Framework (2006-12-01). Yamashita, T.
Software engineers generally agree that code reviews reduce development costs and improve software quality by finding defects in the early stages of software development. In addition, code review tools help the code review process by providing a more efficient means of collecting and analyzing code review data. On the other hand, software organizations that conduct code reviews often do not utilize these review tools. Instead, most organizations simply use paper or text editors to support their code review processes. Using paper or a text editor is potentially less useful than using a review tool for collecting and analyzing code review data. In this research, I attempted to address the problems of previous code review tools by creating a lightweight and flexible review tool. This review tool that I have developed, called "Jupiter", is an Eclipse IDE plug-in.
I believe the Jupiter Code Review Tool is more efficient at collecting and analyzing code review data than text-based approaches. To investigate this hypothesis, I constructed a methodology to compare the Jupiter Review Tool to text-based review approaches. I carried out a case study using both approaches in a software engineering course with 19 students. The results provide some supporting evidence that Jupiter is more useful and more usable than text-based code review, requires less overhead, and appears to support long-term adoption. The major contributions of this research are the Jupiter design philosophy, the Jupiter Code Review Tool, and the insights from the case study comparing text-based review to Jupiter-based review.

Improving Software Development Process and Product Management with Software Project Telemetry (2006-12-01). Zhang, Q.
Software development is slow, expensive, and error prone, often resulting in products with a large number of defects which cause serious problems in usability, reliability, and performance. To combat this problem, software measurement provides a systematic and empirically guided approach to controlling and improving software development processes and final products. However, due to the high cost associated with metrics collection and the difficulties of metrics decision-making, measurement is not widely adopted by software organizations. This dissertation proposes a novel metrics-based program called "software project telemetry" to address these problems. It uses software sensors to collect metrics automatically and unobtrusively. It employs a domain-specific language to represent telemetry trends in software product and process metrics. Project management and process improvement decisions are made by detecting changes in telemetry trends and comparing trends between different periods of the same project.
Software project telemetry avoids many problems inherent in traditional metrics models, such as the need to accumulate a historical project database and to ensure that the historical data remain comparable to current and future projects. The claim of this dissertation is that software project telemetry provides an effective approach to (1) automated metrics collection and analysis, and (2) in-process, empirically guided software development process problem detection and diagnosis. Two empirical studies were carried out to evaluate the claim: one in software engineering classes, and the other in the Collaborative Software Development Lab. The results suggested that software project telemetry had acceptably low metrics collection and analysis overhead, and that it provided decision-making value, at least in the exploratory context of the two studies.

Statistical Modeling of Resource Availability in Desktop Grids (2007-11-01). Wingstrom, J.; Casanova, H.
Desktop grids are compute platforms that aggregate and harvest the idle CPU cycles of individually owned personal computers and workstations. A challenge for using these platforms is that the compute resources are volatile. Due to this volatility, the vast majority of desktop grid applications are embarrassingly parallel and high-throughput. A deeper understanding of the nature of resource availability is needed to enable the use of desktop grids for a broader class of applications. In this document we further this understanding through statistical analysis of availability traces collected on real-world desktop grid platforms.

On the NP-Hardness of the RedundantTaskAlloc Problem (2007-11-01). Wingstrom, J.; Casanova, H.
Consider an application that consists of n independent identical tasks to be executed on m computers, with m > n. Assume that each computer can execute a task with a given probability of success (typically < 1).
One can use redundancy to execute replicas of some of the tasks on the m - n remaining computers. The problem is to determine how many replicas should be created for each task, or more precisely the number of task instances that should be created for each task and to which computers these instances should be allocated, in order to maximize the probability of successful application completion. We formally define this problem, which we call RedundantTaskAlloc, and prove that it is NP-hard.

Health Management Information Systems for Resource Allocation and Purchasing in Developing Countries (2007-12-01). Streveler, D.; Sherlock, S.
World Bank, Health Nutrition and Population, Discussion Paper: The paper begins with the premise that it is not possible to implement an efficient, modern RAP strategy today without the effective use of information technology. The paper then leads the architect through the functionality of the system components and environment needed to support RAP, pausing to justify them at each step. The paper can be used as a long-term guide through the systems development process, as it is not necessary (and likely not possible) to implement all functions at once. The paper's intended audience is members of a planning and strategy body, working in conjunction with technical experts, who are charged with designing and implementing a RAP strategy in a developing country.

Multiple-Genome Annotation of Genome Fragments Using Hidden Markov Model Profiles (2008-01-01). Menor, M.; Baek, K.; Poisson, G.
To learn more about microbes and overcome the limitations of standard culture-based methods, microbial communities are being studied in an uncultured state. In such metagenomic studies, genetic material is sampled from the environment and sequenced using the whole-genome shotgun sequencing technique.
This results in thousands of DNA fragments that need to be identified so that the composition and inner workings of the microbial community can begin to be understood. Those fragments are then assembled into longer portions of sequence. However, the high diversity present in an environment and the often low level of genome coverage achieved by the sequencing technology result in a low number of assembled fragments (contigs) and many unassembled fragments (singletons). The identification of contigs and singletons is usually done using BLAST, which finds sequences similar to the contigs and singletons in a database. An expert may then manually read these results and determine whether the function and taxonomic origins of each fragment can be determined. In this report, an automated system called Anacle is developed to annotate the unassembled fragments, following a taxonomy, before the assembly process. Knowledge of which proteins can be found in each taxon is built into Anacle by clustering all known proteins of that taxon. The annotation performance obtained using Markov clustering (MCL) and Self-Organizing Maps (SOM) is investigated and compared. The resulting protein clusters can each be represented by a Hidden Markov Model (HMM) profile; thus a "skeleton" of the taxon is generated, with the profile HMMs providing a summary of the taxon's genetic content. The experiments show that (1) MCL is superior to SOMs in annotation and in running-time performance, (2) Anacle achieves good performance in taxonomic annotation, and (3) Anacle has the ability to generalize, since it can correctly annotate fragments from genomes not present in the training dataset. These results indicate that Anacle can be very useful to metagenomics projects.

Accuracy and Responsiveness of CPU Sharing Using Xen's Cap Values (2008-05-01). Schanzenbach, D.; Casanova, H.
The accuracy and responsiveness of the Xen CPU scheduler is evaluated using the "cap value" mechanism provided by Xen.
The goal of the evaluation is to determine whether state-of-the-art virtualization technology, and in particular Xen, enables CPU sharing that is sufficiently accurate and responsive for the purpose of enabling "flexible resource allocations" in virtualized cluster environments.

Traffic and Navigation Support through an Automobile Heads Up Display (A-HUD) (2008-05-01). Chu, K.; Brewer, R.; Joseph, S.
The automobile industry has produced many cars with new features over the past decade. Taking advantage of advances in technology, cars today have fuel-efficient hybrid engines, proximity sensors, windshield wipers that can detect rain, built-in multimedia entertainment, and all-wheel-drive systems that adjust power in real time. However, the interaction between the driver and the car has not changed significantly. The information being delivered from the car to the driver, both in quantity and method, has not seen the same improvements as there have been "under the hood." This position paper proposes immersing the driver in an additional layer of traffic and navigation data, presenting that data to the driver by embedding display systems into the automobile windows and mirrors. We have developed the initial concepts and ideas for this type of virtual display. Through gaze tracking, the digital information is superimposed and registered with real-world entities such as street signs and traffic intersections.

Note Taking and Note Sharing While Browsing Campaign Information (2008-12-01). Robertson, S.; Vatrapu, R.; Abraham, G.
Participants were observed while searching and browsing the internet for campaign information in a mock-voting situation under three online note-taking conditions: No Notes, Private Notes, and Shared Notes. Note taking significantly influenced the manner in which participants browsed for information about candidates.
Note taking competed for time and cognitive resources and resulted in less thorough browsing. Effects were strongest when participants thought that their notes would be seen by others. Think-aloud comments indicated that participants were more evaluative when taking notes, especially shared notes. Our results suggest that there could be design trade-offs between e-Democracy and e-Participation technologies.

Resource Allocation using Virtual Clusters (2008-09-01). Stillwell, M.; Schanzenbach, D.; Casanova, H.
In this report we demonstrate the utility of resource allocations that use virtual machine technology for sharing parallel computing resources among competing users. We formalize the resource allocation problem with a number of underlying assumptions, determine its complexity, propose several heuristic algorithms to find near-optimal solutions, and evaluate these algorithms in simulation. We find that one of our algorithms is very efficient and also leads to the best resource allocations. We then describe how our approach can be made more general by removing several of the underlying assumptions.

Resolving LR Type Conflicts at Translation or Compile Time (2009-06-01). Pager, D.
The paper considers circumstances in which it is advantageous to resolve reduce-reduce conflicts at compile time, rather than at compiler-construction time. The application considered is that of translating English to one of the Romance languages, such as Italian, where adjectives and nouns have distinctive forms depending on their gender.
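Several of the items above describe concrete algorithmic problems. As one illustration, the replica-allocation problem from "On the NP-Hardness of the RedundantTaskAlloc Problem" can be sketched in a few lines of Python. This is a hedged sketch, not the authors' method: the success-probability model follows the abstract (a task completes if at least one of its replicas succeeds, computers fail independently), but the greedy heuristic shown here is only an illustration; the paper proves the exact problem NP-hard rather than proposing this heuristic.

```python
from math import prod

def app_success_probability(allocation, p):
    """Probability that every task completes.

    allocation[t] lists the computers running replicas of task t;
    p[i] is the independent success probability of computer i.
    A task succeeds if at least one replica succeeds, so the
    application succeeds with the product over tasks of
    1 - prod(1 - p[i]).
    """
    return prod(1.0 - prod(1.0 - p[i] for i in computers)
                for computers in allocation)

def greedy_allocate(n, p):
    """Illustrative greedy heuristic (not the paper's algorithm):
    give each of the n tasks one computer, then hand each of the
    m - n spare computers to the task whose current success
    probability is lowest."""
    m = len(p)
    assert m > n, "need more computers than tasks"
    allocation = [[t] for t in range(n)]  # task t starts on computer t
    for spare in range(n, m):
        def task_success(t):
            return 1.0 - prod(1.0 - p[j] for j in allocation[t])
        weakest = min(range(n), key=task_success)
        allocation[weakest].append(spare)
    return allocation
```

For example, with n = 2 tasks and four computers of success probability 0.5 each, the heuristic pairs each task with one spare, raising per-task success from 0.5 to 0.75 and application success to 0.5625. Finding the optimal allocation in general is exactly what the paper shows to be NP-hard.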
