Data Analytics, Data Mining and Machine Learning for Social Media

Recent Submissions

  • Item
    Identifying Citation Sentiment and its Influence while Indexing Scientific Papers
    (2020-01-07) Ghosh, Souvick; Shah, Chirag
    Sentiment analysis has proven to be a popular research area for analyzing social media texts, newspaper articles, and product reviews. However, sentiment analysis of citation instances is a relatively unexplored area of research. For scientific papers, it is often assumed that the sentiment associated with citation instances is inherently positive. This assumption stems from the hedged nature of sentiment in citations, which makes it difficult to identify and classify. As a result, most existing indexes focus only on the frequency of citation. In this paper, we highlight the importance of considering citation sentiment when preparing ranking indexes for scientific literature. We perform automatic sentiment classification of citation instances on the ACL Anthology collection of papers. Next, we use the sentiment score in addition to the frequency of citation to build a ranking index for this collection. Using various baselines, we highlight the impact of our index on the ACL Anthology collection. Our research contributes toward building a more sentiment-sensitive ranking index that better reflects the influence and usefulness of research papers.
  • Item
    Success Factors of Donation-Based Crowdfunding Campaigns: A Machine Learning Approach
    (2020-01-07) Alazazi, Massara; Wang, Bin; Allan, Tareq
    Crowdfunding has emerged as an alternative to traditional financing mechanisms, in which individuals solicit financial capital or donations from the crowd. The success factors of crowdfunding are not well understood, particularly for donation-based crowdfunding platforms. This study identifies key drivers of donation-based crowdfunding campaign success using a machine learning approach. Based on an analysis of crowdfunding campaigns, we show that our models were able to predict the average daily amount received with a high level of accuracy using variables available at the beginning of the campaign and the number of days it had been posted. In addition, Facebook and Twitter shares and the number of likes improved the accuracy of the models. Among the six machine learning algorithms we used, the support vector machine (SVM) performed best in predicting campaign success.
  • Item
    Evaluation of VI Index Forecasting Model by Machine Learning for Yahoo! Stock BBS Using Volatility Trading Simulation
    (2020-01-07) Sasaki, Kodai; Suwa, Hirohiko; Ogawa, Yuki; Umehara, Eiichi; Yamashita, Tatsuo; Tsubouchi, Kota
    Risk avoidance is crucial in investment and asset management. One commonly used risk index is the VI index. Suwa et al. (2017) analyzed stock bulletin board messages and predicted its rise. In our study, we developed a simulation of trading Nikkei stock index options using intra-day data and verified the validity of the VI index prediction model proposed by Suwa et al. For the period from November 18, 2014, to June 29, 2016, we conducted a simulation using a long straddle strategy. The profit and loss from trading on the instructions of their model was +3,021 yen, while the benchmark's profit and loss was -3,590 yen, an improvement of +6,611 yen with their model. We therefore conclude that Suwa et al.'s VI index prediction model may be effective.
  • Item
    Using Data Analytics to Filter Insincere Posts from Online Social Networks A Case Study: Quora Insincere Questions
    (2020-01-07) Al-Ramahi, Mohammad; Alsmadi, Izzat
    The internet in general and Online Social Networks (OSNs) in particular continue to play a significant role in our lives, with information uploaded and exchanged on a massive scale. Given this importance and attention, abuse of these communication media for different purposes is common. Driven by goals such as marketing and financial gain, some users use OSNs to post misleading or insincere content. In this context, we used a real-world dataset posted by Quora to evaluate different mechanisms and algorithms for filtering insincere and spam content. We evaluated different preprocessing and analysis models. Moreover, we analyzed the cognitive effort users made in writing their posts and whether it can improve prediction accuracy. We report the best models in terms of insincerity prediction accuracy.
  • Item
    The New Window to Athletes’ Soul – What Social Media Tells Us About Athletes’ Performances
    (2020-01-07) Gruettner, Arne; Vitisvorakarn, Min; Wambsganss, Thiemo; Rietsche, Roman; Back, Andrea
    Professional sports have evolved from a game into an organization that has been codified, strategized, and commercialized. One factor shaping the sports industry is the pervasiveness of social media. On the one hand, social media is used as a powerful medium for distributing and getting news, engaging in topical discussions, and empowering brands. On the other hand, social media has become a crucial mouthpiece for athletes to interact with peers and share opinions, thoughts, and feelings. However, millions of followers, tweets, and likes later, researchers, practitioners, and athletes alike ask whether social media has an impact on an athlete’s performance. We conducted a social media usage analysis and a sentiment analysis of 124,341 tweets extracted from 31 tennis athletes. We linked these data to 8,095 corresponding match day performances. The results show that high social media usage has a significant negative impact on athletes’ performance.
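
For readers unfamiliar with the long straddle strategy named in the VI index abstract above, the following is a minimal sketch of how such a position's profit and loss is conventionally computed, together with the improvement arithmetic reported in that abstract. It is not the authors' simulation: the function names, the example strike, and the example premiums are hypothetical and chosen only for illustration. Only the figures +3,021 yen and -3,590 yen come from the abstract.

```python
def long_straddle_pnl(price_at_expiry, strike, call_premium, put_premium):
    """Profit and loss of a long straddle held to expiry.

    A long straddle buys one call and one put at the same strike, so it
    gains when the underlying moves far from the strike in either direction.
    All argument values used below are hypothetical illustrations.
    """
    call_payoff = max(price_at_expiry - strike, 0.0)
    put_payoff = max(strike - price_at_expiry, 0.0)
    return call_payoff + put_payoff - (call_premium + put_premium)


def improvement(model_pnl, benchmark_pnl):
    """Difference between the model-driven result and the benchmark result."""
    return model_pnl - benchmark_pnl


# Hypothetical position: strike 20,000, premiums of 300 and 250.
print(long_straddle_pnl(21000, 20000, 300, 250))  # large move: positive result
print(long_straddle_pnl(20000, 20000, 300, 250))  # no move: both premiums lost

# Figures reported in the abstract: model +3,021 yen, benchmark -3,590 yen.
print(improvement(3021, -3590))
```

The last call reproduces the +6,611 yen improvement stated in the abstract, since 3,021 - (-3,590) = 6,611.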