Assessment, Evaluation and Measurements (AEM)
Recent Submissions
Item Using Students' Screencasts as Alternative to Written Submissions (2022-01-04) Christensen, Henrik
In this paper, we report our experiences of using student-produced screencasts as a medium for students to explain and provide an overview of their solutions to advanced design and programming exercises. In our context, the screencasts have replaced written reports as submissions, and we report both on students' perceptions of the work effort and effectiveness of screencasts and on teaching assistants' experiences in assessing and marking them. Our main conclusions are that screencast submissions are an important tool in the teacher's toolbox for some categories of learning tasks, but there are a number of best practices to follow to gain the full benefits of the approach.

Item Semi-Automatic Assessment of Modeling Exercises using Supervised Machine Learning (2022-01-04) Krusche, Stephan
Motivation: Modeling is an essential skill in software engineering. With rising student numbers, introductory courses with hundreds of students are becoming standard. Grading all students' exercise solutions and providing individual feedback is time-consuming. Objectives: This paper describes a semi-automatic assessment approach based on supervised machine learning. It aims to increase the fairness and efficiency of grading and to improve the quality of the provided feedback. Method: While reviewers manually assess the first submitted models, the system learns which elements are correct or wrong and which feedback is appropriate. In subsequent assessments, the system identifies similar model elements and suggests how to assess them based on the scores and feedback of previous assessments. While reviewing new submissions, reviewers apply the suggestions or adjust them and manually assess the remaining model elements. Results: We empirically evaluated this approach in three modeling exercises in a large software engineering course, each with more than 800 participants, and compared the results with three manually assessed exercises. A quantitative analysis reveals an automatic feedback rate between 65% and 80%. Between 4.6% and 9.6% of the suggestions had to be manually adjusted. Discussion: Qualitative feedback indicates that semi-automatic assessment reduces the effort and improves consistency. A few participants noted that the proposed feedback sometimes does not fit the context of the submission and that the selection of feedback should be further improved.

Item Diagnosability, Adequacy & Size: How Test Suites Impact Autograding (2022-01-04) Clegg, Benjamin; Fraser, Gordon; Mcminn, Phil
Automated grading is now prevalent in software engineering courses, typically assessing the correctness of students' programs using automated test suites. However, deficiencies in test suites could result in inconsistent grading. As such, we investigate how different test suites impact grades, and the extent to which their observable properties influence these grades. We build upon existing work, using students' solution programs and test suites that we constructed using a sampling approach. We find that there is high variation in the grades produced by different test suites, with a standard deviation of ~10.1%. We further investigate how several properties of test suites influence these grades, including the number of tests, coverage, ability to detect other faults, and uniqueness. We use our findings to provide tutors with strategies for building test suites that evaluate students' software consistently. These strategies include constructing test suites with high coverage, writing unique and diverse tests that evaluate solutions' correctness in different ways, and running the tests against artificial faults to determine their quality.

Item Calibrated Peer Reviews in Requirements Engineering Instruction: Application and Experiences (2022-01-04) Tenbergen, Bastian; Daun, Marian
Instructing Requirements Engineering (RE) is a challenging task due to the absence of the single, absolutely correct solutions that computer science students so often strive for. Instead, there is often a variety of compromise solutions for each RE problem. Therefore, it is essential that aspiring software engineers are exposed to as many solution alternatives as possible to experience the implications of RE decisions. To facilitate this, we propose a learning-by-multiple-examples process in which we make use of a calibrated peer review grading model for assignments. Paired with a think-pair-share model of semester-long, industry-realistic, project-based, low-stakes milestones, we were able to generate a rich collaborative learning atmosphere. In this paper, we report the course design and experiences from the application of calibrated peer reviews in an undergraduate RE course. Qualitative and quantitative application results show that calibrated peer reviews significantly improve students' learning outcomes.

Item Introduction to the Minitrack on Assessment, Evaluation and Measurements (AEM) (2022-01-04) Krusche, Stephan
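The semi-automatic assessment item above describes suggesting scores and feedback for model elements that resemble previously graded ones. The following minimal Python sketch illustrates that general idea under stated assumptions: the data structure, the text-based similarity measure, and the threshold are illustrative choices, not the system evaluated in the paper.

```python
# Illustrative sketch of similarity-based assessment suggestions.
# Names, encoding, and threshold are assumptions, not the paper's system.
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Assessment:
    element_text: str  # e.g. a model element rendered as text
    score: float
    feedback: str


def suggest_assessment(element_text, past_assessments, threshold=0.8):
    """Return the assessment of the most similar previously graded element,
    or None if nothing is similar enough and manual review is needed."""
    best, best_sim = None, 0.0
    for past in past_assessments:
        sim = SequenceMatcher(None, element_text, past.element_text).ratio()
        if sim > best_sim:
            best, best_sim = past, sim
    return best if best_sim >= threshold else None


# A reviewer graded one submission manually; a near-identical element in the
# next submission then receives a suggested score and feedback.
history = [Assessment("class Order --> class Customer", 1.0,
                      "Correct association between Order and Customer.")]
suggestion = suggest_assessment("class Order  -->  class Customer", history)
print(suggestion.feedback if suggestion else "No suggestion; assess manually.")
```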
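The autograding item above reports that different test suites can assign noticeably different grades to the same solution. The sketch below shows one simple way to observe that effect, assuming a grade defined as the percentage of passing tests; the test pool, pass/fail outcomes, and sampling scheme are mocked for illustration and are not the authors' tooling.

```python
# Illustrative sketch: grade a solution as the fraction of passing tests,
# then measure how much sampled test suites disagree about that solution.
import random
from statistics import pstdev


def grade(solution_results, suite):
    """Percentage of the sampled suite's tests that the solution passes."""
    passed = sum(1 for test in suite if solution_results[test])
    return 100.0 * passed / len(suite)


# Mocked outcomes of one student's solution against a pool of 20 tests.
random.seed(0)
all_tests = [f"test_{i}" for i in range(20)]
results = {t: random.random() < 0.7 for t in all_tests}

# Sample several smaller test suites and compare the grades they produce.
suites = [random.sample(all_tests, 8) for _ in range(30)]
grades = [grade(results, s) for s in suites]
print(f"mean={sum(grades) / len(grades):.1f}%  stdev={pstdev(grades):.1f}%")
```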
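Finally, the calibrated peer review item above relies on weighting peer-assigned grades by how well each reviewer matches an instructor's reference grading. The sketch below is one possible reading of that calibration idea, not the grading model used in the course; the weighting formula, point scale, and numbers are assumptions.

```python
# Hypothetical sketch of calibrated peer review weighting: reviewers first
# grade calibration artifacts with known instructor scores, and closer
# agreement yields a higher weight for their subsequent peer reviews.
def calibration_weight(reviewer_scores, instructor_scores, max_points=10):
    """Weight in [0, 1] based on mean absolute deviation from the instructor."""
    deviations = [abs(r, ) if False else abs(r - i)
                  for r, i in zip(reviewer_scores, instructor_scores)]
    return max(0.0, 1.0 - sum(deviations) / (len(deviations) * max_points))


def calibrated_grade(peer_reviews):
    """Weighted average of (score, weight) pairs from peer reviewers."""
    total_weight = sum(w for _, w in peer_reviews)
    return sum(score * w for score, w in peer_reviews) / total_weight


# Two reviewers grade three calibration artifacts (instructor scores known),
# then both review the same student submission.
instructor = [8, 6, 9]
w_a = calibration_weight([8, 7, 9], instructor)   # well calibrated
w_b = calibration_weight([3, 10, 4], instructor)  # poorly calibrated
print(round(calibrated_grade([(9, w_a), (5, w_b)]), 2))
```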