Big Data and Analytics: Pathways to Maturity

Exploring Critical Success Factors in Agile Analytics Projects

(2020-01-07) Tsoy, Mikhail; Staples, D. Sandy

Attributes of potential critical success factors (CSFs) for agile analytics projects were identified from the literature by updating Chow and Cao’s list of success factors for agile projects. Ten new attributes were added to Chow and Cao’s original list. Seven new attributes from the general agile project literature address risk appetite, team diversity and availability, engagement, project planning, shared goals, and methods uncertainty. Three attributes specific to analytics projects were added: data quality, model validation, and building customers’ trust in the model solution. The potential validity of the various CSFs and attributes was explored using data from case studies of two analytics projects that varied in deployment success. The more successful project was found to be stronger than the failed project in almost all factors. The findings can help researchers and analytics practitioners understand the environmental conditions and project actions that help generate business value from analytics initiatives.

A New Metric for Lumpy and Intermittent Demand Forecasts: Stock-keeping-oriented Prediction Error Costs

(2020-01-07) Martin, Dominik; Spitzer, Philipp; Kühl, Niklas

Forecasts of product demand are essential for short- and long-term optimization of logistics and production, so the most accurate prediction possible is desirable. To train predictive models optimally, the deviation of the forecast from the actual demand needs to be assessed by a proper metric. If a metric does not represent the actual prediction error, predictive models are insufficiently optimized and, consequently, yield inaccurate predictions. The most common metrics, such as MAPE or RMSE, are not suitable for evaluating forecasting errors, especially for lumpy and intermittent demand patterns, because they do not sufficiently account for, e.g., temporal shifts (prediction before or after actual demand) or cost-related aspects. Therefore, we propose a novel metric that addresses business aspects in addition to statistical considerations. We evaluate the metric on simulated and real demand time series from the automotive aftermarket.
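
The limitation of MAPE and RMSE described in this abstract can be illustrated with a minimal sketch. The demand series and both forecasts below are hypothetical toy data, and the code shows only the two standard metrics, not the metric proposed by the authors. It assumes the usual definitions of RMSE and MAPE over aligned periods.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error over aligned periods."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

def mape(actual, forecast):
    """Mean absolute percentage error; None when any actual value is zero."""
    if any(a == 0 for a in actual):
        return None  # percentage error is undefined for zero demand
    n = len(actual)
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / n

# Hypothetical intermittent demand: mostly zero, occasional orders of 10 units.
actual = [0, 0, 10, 0, 0, 0, 10, 0]
early = [0, 10, 0, 0, 0, 10, 0, 0]   # right quantity, one period too early
never = [0] * len(actual)             # predicts no demand at all

rmse_early = rmse(actual, early)
rmse_never = rmse(actual, never)
mape_early = mape(actual, early)
print(rmse_early, rmse_never, mape_early)
```

In this toy case RMSE ranks the forecast that never predicts demand above the forecast that predicts the right quantity one period early, and MAPE cannot be computed at all because of the zero-demand periods.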

Model Interpretation and Explainability towards Creating Transparency in Prediction Models

(2020-01-07) Dolk, Daniel; Kridel, Donald; Dineen, Jacob; Castillo, David

Explainable AI (XAI) has a counterpart in analytical modeling, which we refer to as model explainability. We tackle the issue of model explainability in the context of prediction models. We analyze a dataset of loans from a credit card company in three stages: we execute and compare four different prediction methods; apply the best-known explainability techniques in the current literature to the model training sets to identify feature importance (FI) (static case); and finally cross-check whether the FI set holds up under “what if” prediction scenarios for continuous and categorical variables (dynamic case). We found inconsistency in FI identification between the static and dynamic cases. We summarize the “state of the art” in model explainability and suggest further research to advance the field.

Understanding Customer Preferences Using Image Classification – A Case Study

(2020-01-07) Brusch, Ines

Today, companies have a large amount of data at their disposal. In addition to classic data in text or table form, the number of images is also increasing enormously. This is particularly the case when customer contact takes place via the Internet, e.g., through social networks, blogs, or forums. If these images can be evaluated, they lead to a better understanding of the customer: improved recommendations can be made and customer satisfaction can be increased. Using support vector machines (SVMs), convolutional neural networks (CNNs), and cluster analyses, this paper shows how companies can evaluate image data on their own and thus understand and classify their customers. Data from users of a travel platform serve as a case study. The advantages, disadvantages, and prerequisites of SVMs and CNNs are pointed out, and the users are segmented on the basis of their images.

Easy and Efficient Hyperparameter Optimization to Address Some Artificial Intelligence “ilities”

(2020-01-07) Bihl, Trevor; Schoenbeck, Joe; Steeneck, Daniel; Jordan, Jeremy

Artificial Intelligence (AI) has many benefits, including the ability to find complex patterns, automation, and meaning making. Through these benefits, AI has revolutionized image processing, among numerous other disciplines. AI further has the potential to revolutionize other domains; however, this will not happen until we can address the “ilities”: repeatability, explain-ability, reliability, use-ability, trust-ability, etc. Notably, many problems with the “ilities” are due to the artistic nature of AI algorithm development, especially hyperparameter determination. AI algorithms are often crafted products with hyperparameters learned experientially. As such, when the same algorithm is applied to new problems, it may not perform well due to inappropriate settings. This research aims to provide a straightforward and reliable approach to automatically determining suitable hyperparameter settings for a given AI algorithm. Results show that reasonable performance is possible, and end-to-end examples are given for three deep learning algorithms and three different data problems.

Introduction to the Minitrack on Big Data and Analytics: Pathways to Maturity

(2020-01-07) Armour, Frank; Kaisler, Stephen; Espinosa, J.
