Event detection and prediction based on social multimedia data

Social media consists of a set of internet-based applications built on the technological and ideological foundations of Web 2.0, which permit user interaction and the creation and sharing of user-generated content (Bartlett & Miller, 2013). It encompasses a range of social networking sites and new media forms, such as Twitter, Google Plus and Facebook. These platforms generate large amounts of data on users and their social interactions, providing rich opportunities for research, including predicting future events in areas such as market demand, finance, health and entertainment (Tsagkias, 2008). Researchers have shown significant interest in using social media to predict future events (Bollena et al., 2011). Between 2006 and 2012, some 18 percent of the research community showed interest in predicting events with social media. Studies have also indicated that a substantial fraction of social media streams relates to events (VanDan, 2012). Based on this premise, the thesis proposes that an accurate and effective method of detecting events from social media is critical in avoiding unintended implications, misinformation and false expectations.

Research design

Because the research problem is exploratory, the study will adopt an exploratory research design (Anon, n.d.; Bryman, 2012). A survey of the literature shows that few related studies are available for reference. The focus will therefore be on gaining familiarity with, and insight into, the most accurate and effective method of detecting events from the Twitter stream, as a basis for subsequent investigations (De Vaus, 2001; Seale, 2004). The objectives of the research design include gaining familiarity with the basic concerns surrounding the use of Twitter in predicting events, obtaining a well-grounded picture of accurate and effective detection methods, refining issues for systematic investigation and setting the direction for future research (Dou et al., 2012; Gilbert, 2008). Consequently, the research design focuses on:

Identifying the research problem and justifying its relevance;

Reviewing previous studies;

Outlining research questions and formulating hypotheses;

Describing the research methodology used to answer the research questions;

Describing the data analysis methods.

Research Purpose

The research aims to develop an accurate and effective method for detecting events from the Twitter stream that will strengthen future research on predicting events, such as disasters, using social media. It also aims to determine whether tweets, analysed through their text and photos, can be synthesized to predict events effectively, and to determine patterns in the manner in which events occur and respond to stimuli. Finally, it aims to determine effective methods of applying social media in organizational risk management strategies and to improve public health preparedness and disaster management.

Problem statement

Essentially, an event is an activity that causes a change in the volume of text data discussing the related topic at a particular time, and event detection is the task of identifying such changes (Schoen et al., 2013). A range of events is detectable from social media, such as disease epidemics, traffic incidents and disasters. Multimedia event detection, in turn, concerns events that are discussed by individuals and portrayed in the multimedia content shared through social networking sites.
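
The idea of an event as a change in text volume can be illustrated with a minimal sketch. It is not the detection method this proposal sets out to develop; it simply flags time bins whose tweet count rises well above the recent average. The function name `detect_bursts`, the window size, the threshold and the sample counts are all assumptions made for illustration.

```python
from statistics import mean, stdev


def detect_bursts(counts, window=5, threshold=3.0):
    """Return indices of time bins whose count exceeds the trailing
    mean by more than `threshold` standard deviations.

    `counts` is a list of tweet counts per time bin for one topic.
    `window` and `threshold` are illustrative defaults, not tuned values.
    """
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        # Use a floor of 1.0 so a perfectly flat history does not
        # turn every small fluctuation into a burst.
        sigma = max(stdev(history), 1.0)
        if counts[i] > mu + threshold * sigma:
            bursts.append(i)
    return bursts


# Hypothetical hourly counts of tweets mentioning one topic.
hourly_counts = [10, 12, 11, 9, 10, 11, 60, 12, 10]
print(detect_bursts(hourly_counts))  # prints [6]
```

A trailing window is used so that each bin is compared only with what came before it, which is the situation a system monitoring a live stream would face.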

Three key challenges have to be addressed in using social networking sites such as Twitter to detect these events. The first is determining which tweets are relevant to an event, since unrelated tweets may share terms with it (VanDan, 2012; Choudhury et al., 2013). The second is determining the time of the event, since defining the precise time of events such as disease outbreaks can be difficult (Anon, 2013). The third is determining where the event happened. Because of these challenges, grouping Twitter data to predict an event accurately is difficult (Liu et al., 2011). To overcome them, this research seeks to develop an accurate and effective method to detect and predict the occurrence of events from the Twitter stream by collecting tweets that contain both text and photos.

Research Significance

The major significance of this study lies in developing an accurate and effective method for detecting events from the Twitter stream, which will strengthen future research on predicting events using social media (Anon, 2012; Kwak et al., 2010). The exploratory results will also illustrate the performance gained by combining text and photo analysis in the process of predicting events, and will help determine patterns in the manner in which events occur and respond to stimuli (Arafat et al., 2013). Given the challenges that limit the effectiveness of Twitter data in the real-time assessment of epidemics, such as influenza activity, the findings of the study should provide an opportunity to significantly improve public health preparedness and disaster management vigilance (Achrekar et al., 2003). The findings are also expected to be of significant interest for current event prediction and organizational risk management strategies.

Research Questions

In investigating an accurate and effective method for detecting events from the Twitter stream, the research will attempt to answer five key research questions.

Does developing an accurate and effective method for detecting events from the Twitter stream depend on the time and place of events and the relevance of tweets?

Can a combination of tweets selected for their text and photos ensure effective event detection?

Does the combination of text and photos in tweets define the precise time of events?

Do the text and photos make it easier to determine which tweets are relevant to certain events?

Can photos, text, textual semantics and their combination lead to the best prediction accuracy for events?

Research Methodology and data analysis

The research method will consist mainly of data mining and predictive analytics. It will comprise statistical analysis of unprecedentedly large datasets derived from Twitter (Gundecha & Liu, 2012; Pang & Lee, 2007). Twitter streams will be monitored to collect tweets containing text with photos, which will then be stored in a database. The text and photo features will subsequently be extracted in the mining tool. The study will use ‘bag of words’ as the central text feature, with terms weighted using the Term Frequency-Inverse Document Frequency (TF-IDF) method. For the visual features, color histograms and the Scale-Invariant Feature Transform (SIFT) will be used. The mining results will afterwards be compared with the results of other studies that used different methods, to determine their consistency.
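
As a rough illustration of two of the features named above, the following sketch computes TF-IDF weights over a bag of words and a normalized color histogram. It uses only the Python standard library and is a simplified stand-in, not the study's mining tool. The function names, the tokenizer, the particular TF-IDF variant (term frequency divided by document length, multiplied by log(N / df)), the bin count and the sample data are all assumptions. SIFT is omitted because it would require a computer vision library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document.

    weight = (count / document length) * log(N / document frequency).
    This is one common TF-IDF variant; others add smoothing.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def color_histogram(pixels, bins=4):
    """Return a normalized joint RGB histogram with bins**3 cells.

    `pixels` is an iterable of (r, g, b) tuples with values in 0..255.
    """
    pixels = list(pixels)
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1
    total = len(pixels) or 1
    return [h / total for h in hist]


# Hypothetical tweets and pixel data, for illustration only.
tweets = [
    "Flood in the city centre",
    "Flood warning for the river",
    "Concert in the park tonight",
]
vectors = tfidf(tweets)
print(vectors[0]["the"])  # prints 0.0, since "the" occurs in every tweet
print(vectors[2]["concert"] > vectors[0]["flood"])  # prints True

photo_pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255)]
print(len(color_histogram(photo_pixels, bins=2)))  # prints 8
```

Terms that occur in every tweet receive a weight of zero, while terms confined to a few tweets are weighted more heavily, which is why TF-IDF is a natural fit for separating event-specific vocabulary from common words.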


References

Achrekar, H., Gandhe, A., Lazarus, R., Yu, S. & Liu, B. (2003). Predicting Flu Trends using Twitter Data. The First International Workshop on Cyber-Physical Networking Systems, 702-707

Anon. (2012). Predicting Information Credibility in Time-Sensitive Social Media. Emerald Group Publishing Limited

Anon. (2013). What can Twitter tell us about the real world? Retrieved: <https://sites.google.com/site/twitterandtherealworld/home>

Anon. (n.d.). What Is Research Design?. Retrieved: <http://www.nyu.edu/classes/bkg/methods/005847ch1.pdf>

Arafat, J., Halimu, C., Habib, M. & Hossain, R. (2013). Emotion Detection and Event Prediction System. Global Journal of Computer Science and Technology Network, Web & Security 13(13), 1-7

Bartlett, J. & Miller, C. (2013). The State of The Art: A Literature Review of Social Media Intelligence Capabilities For Counter-Terrorism. London: Demos

Bollena, J., Maoa, H. & Zengb, X. (2011). Twitter mood predicts the event. Journal of Computational Science 2(1), 1–8

Bryman, A. (2012). Social Research Methods. London: Oxford University Press

Choudhury, M., Gamon, M., Counts, S. & Horvitz, E. (2013). Predicting Depression via Social Media. Association for the Advancement of Artificial Intelligence. Retrieved: <http://research.microsoft.com/pubs/192721/icwsm_13.pdf>

De Vaus, D. (2001). Research Design in Social Research. London: SAGE. Retrieved: <http://libguides.usc.edu/content.php?pid=83009&sid=818072>

Dou, W., Wang, X., Ribarski, W. & Zhou, M. (2012). Event Detection in Social Media Data. Retrieved: <http://coitweb.uncc.edu/~xwang25/pubs/2012/Dou-Event%20Detection%20Tasks-2012.pdf>

Gilbert, N. (2008). Researching Social Life. New York: Sage Publications

Gundecha, P. & Liu, H. (2012). Mining Social Media: A Brief Introduction. Informs.

Kwak, H., Lee, C., Park, H. & Moon (2010). What is Twitter, a social network or a news media? Retrieved: <an.kaist.ac.kr/~haewoon/papers/2010-www-twitter.pdf>

Liu, X., Troncy, R. & Huet, B. (2011). Using Social Media to Identify Events. Retrieved: <http://www.eurecom.fr/~troncy/Publications/Troncy-wsm11.pdf>

Pang, B. & Lee, L. (2007). Opinion mining and sentiment analysis. Foundations and Trends in Information 2(1-2):1–135

Schoen, H., Gayo-Avello, D., Mustaraj, E. & Gloor, P. (2013). The Power of Prediction with Social Media. Retrieved: <https://www.uni-bamberg.de/fileadmin/uni/fakultaeten/sowi_lehrstuehle/politikwissenschaften_2/MANUSKRIPTE_FEB/Schoen_et_al._2013_Predicting_with_Social_Media.pdf>

Seale, C. (2004). Social Research Methods: A Reader. New York: Psychology Press

Tsagkias, M. (2008). Mining Social Media: Tracking Content and Predicting Behaviour. Amsterdam: SIKS

VanDan, C. (2012). A Probabilistic Topic Modeling Approach For Event Detection In Social Media. A thesis submitted to Michigan State University in partial fulfillment of the requirements for the degree of Master of Science, Computer Science
