Big Data and Analytics: Pathways to Maturity

Recent Submissions

  • Item
    Smart Objects: An Active Big Data Approach
    (2018-01-03) Kaisler, Stephen ; Money, William ; Cohen, Stephen
    The world of data and information has been steadily evolving due to the expanding complexity and volume of the data processed by our systems. Big Data has evolved from data that are numbers and characters conceived and collected by individuals to unstructured data types collected by a variety of devices. Recent work has postulated that the Big Data evolutionary process is making a conceptual leap to incorporate intelligence. This paper proposes that Big Data has not yet made a complete evolutionary leap, but rather that a new class of data, at a higher level of abstraction, is needed to understand and integrate this "intelligence" concept. This paper examines previous definitions, and offers a new definition for Smart Objects (SO) that extends this evolutionary path, examines the basic concept of smart data (is it really exhibiting properties associated with or purported to be intelligence?), and identifies issues and challenges associated with understanding Smart Objects as a new software paradigm. It concludes that Smart Objects incorporate new features and have different properties from passive and inert Big Data.
  • Item
    Counting Human Flow with Deep Neural Network
    (2018-01-03) Doong, Shing
    Human flow counting has many applications in space management. This study applied channel state information (CSI) available in IEEE 802.11n networks to characterize the flow count. Raw inputs including mean, standard deviation and five-number summary were extracted from windowed CSI data. Due to the large number of raw inputs, stacked denoising autoencoders were used to extract hierarchical features from raw inputs and a final layer of softmax regression was used to model the flow counting problem. It is found that this deep neural network structure beats other popular classification algorithms including random forest, logistic regression, support vector machine and multilayer perceptron in predicting the flow count with attractive speed performance.
  • Item
    Feature enrichment through multi-gram models
    (2018-01-03) Forss, Thomas
    We introduce a feature enrichment approach based on multi-gram cosine similarity classification models. Our approach combines cosine similarity features of different N-gram word models with unsupervised sentiment features, producing models with a richer feature set than any of the approaches alone can provide. We test the classification models using different machine learning algorithms on categories of hateful and violent web content, and show that our multi-gram models give across-the-board performance improvements, for all categories tested, compared to combinations of baseline unigram, N-gram, and sentiment classification models. Our multi-gram models perform significantly better on highly imbalanced sets than the comparison methods. The enrichment approach also leaves room for further improvement, because it adds optimization options rather than exhausting them.
  • Item
    An Efficient Recommender System Using Locality Sensitive Hashing
    (2018-01-03) Zhang, Kunpeng ; Fan, Shaokun ; Wang, Harry Jiannan
    Recommender systems are widely used for personalized recommendation in many business applications such as online shopping websites and social network platforms. However, with the tremendous growth of the recommendation space (e.g., the number of users and products), traditional systems suffer from time and space complexity issues and cannot make real-time recommendations when dealing with large-scale data. In this paper, we propose an efficient recommender system by incorporating the locality sensitive hashing (LSH) strategy. We show that LSH can approximately preserve similarities of data while significantly reducing data dimensions. We conduct experiments on synthetic and real-world datasets of various sizes and data types. The experimental results show that the proposed LSH-based system outperforms traditional item-based collaborative filtering in most cases in terms of statistical accuracy, decision support accuracy, and efficiency. This paper contributes to the fields of recommender systems and big data analytics by proposing a novel recommendation approach that can handle large-scale data efficiently.
  • Item
    Leveraging Big Data Analytics to Improve Quality of Care In Health Care: A fsQCA Approach
    (2018-01-03) Wang, Yichuan
    Academics across disciplines such as information systems, computer science and healthcare informatics highlight that big data analytics (BDA) have the potential to provide tremendous benefits for healthcare industries. Nevertheless, healthcare organizations continue to struggle to make progress on their BDA initiatives. Drawing on configuration theory, this paper proposes a conceptual framework to explore the impact of BDA on improving quality of care in health care. Specifically, we investigate how BDA capabilities interact with complementary organizational resources and organizational capabilities in multiple configurations to achieve higher quality of care. Fuzzy-set qualitative comparative analysis (fsQCA), a relatively new approach, was employed to identify five different configurations that lead to higher quality of care. These findings offer evidence to suggest that a range of solutions leading to better healthcare performance can indeed be identified through the effective use of BDA and other organizational elements.
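
As a rough illustration of the locality sensitive hashing idea mentioned in the recommender-system abstract above, the following Python sketch hashes item vectors with random hyperplanes, so that vectors separated by a small angle tend to land in the same bucket. The abstract does not say which LSH family the authors use; the random-hyperplane scheme, the function names, and the toy vectors are all assumptions made for illustration, not the paper's method.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def build_buckets(items, hyperplanes):
    """Group item names by signature; items in one bucket are candidate neighbours."""
    buckets = {}
    for name, vector in items.items():
        buckets.setdefault(signature(vector, hyperplanes), []).append(name)
    return buckets


def candidates(query_name, items, hyperplanes, buckets):
    """Return the other items that share the query item's bucket."""
    sig = signature(items[query_name], hyperplanes)
    return [name for name in buckets.get(sig, []) if name != query_name]


# Hypothetical item vectors (e.g. rating profiles), invented for this example.
items = {
    "a": [1.0, 2.0, 3.0],
    "b": [2.0, 4.0, 6.0],     # same direction as "a"
    "c": [-1.0, -2.0, -3.0],  # opposite direction to "a"
}
planes = make_hyperplanes(num_bits=8, dim=3)
buckets = build_buckets(items, planes)
print(candidates("a", items, planes, buckets))
```

Only items that share a bucket need an exact similarity computation, which is where the time savings over comparing every pair of items would come from. In practice several hash tables are usually combined to reduce missed neighbours; that refinement is omitted here.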
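
The human flow counting abstract above describes raw inputs built from the mean, standard deviation and five-number summary of windowed CSI data. A minimal Python sketch of that feature extraction step might look as follows. It assumes a single stream of amplitude values and non-overlapping windows; the window size, the function name, and the sample values are hypothetical, and the autoencoder and softmax stages of the paper are not shown.

```python
import statistics


def window_features(samples, window_size):
    """Summarise each full, non-overlapping window with seven statistics.

    Returns one row per window:
    [mean, standard deviation, min, first quartile, median, third quartile, max].
    A trailing partial window is dropped. window_size must be at least 2.
    """
    features = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        # quantiles(n=4) returns the three quartile cut points.
        q1, median, q3 = statistics.quantiles(window, n=4)
        features.append([
            statistics.mean(window),
            statistics.stdev(window),
            min(window),
            q1,
            median,
            q3,
            max(window),
        ])
    return features


# Hypothetical amplitude readings, invented for this example.
readings = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 10.0, 10.0, 10.0, 10.0, 7.0]
for row in window_features(readings, window_size=5):
    print(row)
```

With many subcarriers and antenna pairs, repeating this per stream would quickly produce the large number of raw inputs that the abstract cites as the motivation for learned hierarchical features.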