Collaboration for Data Science

Item

Unsupervised Deep Learning for Fake Content Detection in Social Media

(2021-01-05) Tao, Jie; Fang, Xing; Zhou, Lina

Fake content is ever increasing in the online environment, driven by various motivations such as gaining commercial and political advantages. The interactive and collaborative nature of social media further fuels the growth of fake content by exerting fast and widespread influence. Despite growing interdisciplinary efforts to detect fake content in social media, some common research challenges remain to be addressed, such as humans’ cognitive bias and the scarcity of labeled data for training supervised machine learning models. This study aims to tackle both challenges by developing unsupervised deep learning models for the detection of fake content in social media. Given that traditional linguistic features fail to capture context information, our proposed method learns feature representations from the context in social media content. The empirical evaluation results with fake comments from YouTube demonstrate that our proposed methods not only outperform baseline models with traditional unsupervised machine learning techniques, but also achieve comparable performance to the state-of-the-art supervised models. The proposed analytical pipeline provides an end-to-end solution for detecting fake social media content, which largely reduces the human labor required in collaborative data science teams (particularly data labeling). The findings of this study can be used to facilitate collaboration in data science by reducing humans’ cognitive bias and improving collaboration efficiency.

Item

Integrating Blockchain for Data Sharing and Collaboration Support in Scientific Ecosystem Platform

(2021-01-05) Coelho, Raiane; Braga, Regina; David, José Maria; Dantas, Mário; Stroele, Victor; Campos, Fernanda

Nowadays, scientific experiments are conducted in a collaborative way. In collaborative scientific experiments, aspects such as interoperability, privacy, and trust in shared data should be considered to allow the reproducibility of the results. A critical aspect associated with a scientific process is its provenance information, which can be defined as the origin or lineage of the data that helps to understand the results of the scientific experiment. Another concern when conducting collaborative experiments is confidentiality, considering that only properly authorized personnel should be able to share or view results. In this paper, we propose BlockFlow, a blockchain-based architecture that aims to bring reliability to collaborative research, considering the capture, storage, and analysis of provenance data related to a scientific ecosystem platform (E-SECO).

Item

Collaboration for Big Data Analytics: Investigating the (Troubled) Relationship between Data Science Experts and Functional Managers

(2021-01-05) Hagen, Janine

The utilization of insights from big data analytics (BDA) in business operations has been identified as a major driver to unlock value from big data. This emphasizes the importance of involving functional business managers in BDA projects and draws attention to their collaboration with BDA experts, such as data scientists. Scholars have identified several challenges that explain why the success rates of BDA projects remain low. However, the relationship between managers and data science experts has not yet been examined as a potential reason for failure. Applying a social capital perspective to the relationship between these groups, we employ a multiple case study to investigate possible obstacles. We find that the relationship is largely troubled due to incongruent cognitive interpretations of BDA applications in the business context and the absence of structural network ties. These findings suggest a previously under-researched reason why BDA projects still frequently fail.

Item

Introduction to the Minitrack on Collaboration for Data Science

(2021-01-05) Zhou, Lina; Paul, Souren; Schwade, Florian
