Institute of Information Technology
Alpen-Adria Universität Klagenfurt
Universitätsstr. 65-67, Klagenfurt, Austria
Social media is becoming increasingly important during crises, especially due to the widespread use of mobile devices (e.g., phones). People can document whatever they want, whenever they want, in several modes (i.e., pictures, videos and/or text). Social media offers a way to communicate the current situation to other (affected) people or to emergency agencies even when mobile phone networks or emergency lines are overloaded. In recent years, studies have examined different aspects of social media in crisis management, underlining its steadily growing importance in this area. In addition, systems, tools and algorithms for social media analysis have been developed and implemented to automate monitoring, classification or aggregation tasks. This contribution summarizes important research work (i.e., work not already covered within a separate contribution in this E-Letter) using social media (analysis) in the context of crisis management. It gives an overview of existing case studies and analysis frameworks developed to support emergency agencies in several crisis management activities. This work should be seen as a starting point for people who are interested in this topic or who come from similar research areas and want to gain an overview of social media analysis in crisis management.
Crisis management consists of three main phases: preparedness, response and recovery. Preparedness covers all actions (e.g., training, advice for citizens, etc.) taken to be prepared in case of a disaster. Response comprises activities that react directly to the impact of a disaster that has already occurred in order to stabilize the situation. Recovery focuses on all steps needed to reestablish the damaged infrastructure. In all these phases, information about the situation at hand is important for making decisions and planning actions.
In all three phases (especially in response), social media is becoming increasingly important for crisis management. People can document their situation, their observations or anything else related to the crisis at hand, even when phone lines are overloaded or emergency agencies have not yet arrived. Several studies, summarized in Section 2, show the importance of social media during a crisis and that it is used in different contexts. Different emergency agencies (e.g., police forces) have already reacted and take this new communication channel into account in their work. Acquiring data from social media allows agencies to see the situation from another perspective, e.g., that of victims, bystanders or other related persons. The incidents and threats identified via social media can be used to plan emergency response more efficiently.
Besides the collaboration between emergency agencies and citizens, social media also offers a collaboration platform between emergency agencies, e.g., for knowledge sharing and information exchange. It also allows the coordination and communication of self-help groups initiated by affected citizens, covering collaboration among members of the public in case of an emergency.
Independent of the range of applications, social media has gained considerable research attention in recent years. Thus, this contribution summarizes important studies and technical work in the area of social media analysis in crisis management. The contribution is structured as follows. Section 2 gives an overview of related studies. Section 3 describes several technical developments for social media analysis. Section 4 presents social media tools that were introduced in a context other than crisis management but are nevertheless interesting candidates for adaptation to crisis management. Section 5 gives an overview of the approaches. Section 6 summarizes and concludes the work.
2 STUDIES ON SOCIAL MEDIA USE IN CRISIS
Several studies have been performed to highlight the importance of social media in emergency management. Palen and Vieweg et al. described the importance of social media during several different events (e.g., 2007 California Wildfires, 2010 Red River floods, 2009 Oklahoma Grassfires). The use of social media sharing tools (i.e., the Flickr platform) in the context of several emergencies (e.g., the Minneapolis Bridge collapse 2007) has also been addressed. It is stated that the pictures posted are significant for emergency response.
Choudhary et al. described the influence of social media during the Egyptian revolution. The authors performed sentiment analysis and topic extraction, and they examined the usage of social media in general. During the revolution, tweets were used for news propagation and for documenting and distributing information on events. Yates and Paquette depicted social media as an important collaboration tool between emergency agencies as well. They studied social media use in the context of the Haiti earthquake in 2010.
Denef et al. described a study focusing on Twitter usage in police forces. They analyzed the tweeting behavior of two large UK police forces during the UK riots 2011. The analysis showed that the number of followers of the forces increased significantly after the riots. The authors uncovered two communication strategies (informal vs. formal) and highlighted their (dis-)advantages. In addition, they stated that several further investigations are needed to support police forces in using Twitter in the future. For example, it is important to consider the different communication strategies when developing monitoring/analysis tools. Organizational policies for using social media are needed as well. Additional studies showed that police forces see social media as a positive development.
Hughes et al. analyzed social platforms for police and fire services in the context of Hurricane Sandy 2012. The study pointed out that tools are needed to make social media more 'listenable' (i.e., easier to listen to and monitor) for both the public and the emergency agencies themselves.
Dugdale et al.  described the use of SMS (short message service) and tweets during the Haiti earthquake. Several information systems (e.g., Sahana, Ushahidi, etc.) were used during the Haiti earthquake for social media involvement. Studies in this context show that information from social media on a higher/aggregate level turns out to be very helpful, especially as indicator of events.
For example, several people from a specific area write about persons trapped in rubble due to collapsed infrastructure and buildings. Regardless of whether all those people are really trapped as documented (e.g., people could have escaped or freed themselves), it remains true that there are many collapsed buildings in this area and, therefore, threats to members of the public.
Reuter et al. analyzed social media from the perspective of emergent groups (i.e., volunteer activities). They examined how social software (e.g., Twitter, wikis, social networks etc.) is used by those groups. They analyzed data from the Tornado Crisis in the USA (2011) and data from German incidents for their studies of emergent groups. They identified several roles of users in emergent groups, e.g., the reporter (creating information), the retweeter (forwarding information), the helper (involved in helping), and the repeater (information broker). The authors state that emergent groups must be taken into account to increase the usability of social media platforms for those groups. Besides the case studies in the USA, they also showed the importance of social media for Germany.
There is consensus that social media is useful for emergency management and the involved agencies. Hence, this topic needs special attention in future research work. The studies also show that two things are important for the future: considering the different stakeholders and their requirements in technical development, and changing the policies for treating and incorporating social media in practice.
3 SOCIAL MEDIA ANALYSIS FRAMEWORKS
The following sections describe frameworks and tools from different research groups to analyze social media or to include new analysis functionalities into the social media usage for crisis management.
Abel et al. describe in their work the functionalities of Twitcident - a system to filter and search tweets. The filtering tasks are performed automatically in the system, whereas the search is performed by the user through the selection of facets. Facets describe detailed entities or properties mentioned in a tweet, i.e., locations, persons, incident classes or keywords. The Twitcident processing chain consists of several sequential steps: incident profiling, social media aggregation, semantic enrichment of the gained tweets, and filtering.
Twitcident is linked with P2000, a communication network in the Netherlands containing incidents of public interest. Whenever a new incident is posted on P2000, the processing chain starts. First, based on the posted P2000-messages, an incident profile is created consisting of a set of facet-value pairs. A facet-value pair is additionally weighted to indicate the importance of this pair (i.e., based on the occurrence).
Second, tweets based on this profile are crawled from the Twitter platform. During the semantic enrichment, the tweets are analyzed by: (i) extracting named entities, (ii) classifying tweets into classes (e.g., damages, risks, casualties etc.) via manually inserted rules, (iii) analyzing referenced web resources, and (iv) extracting metadata (like the user, number of followers etc.). Most of the processing steps result in additional facets describing a profile of the analyzed tweet. The last step in the automatic process comprises filtering. Semantic filtering is performed by comparing the tweet profiles with the incident profile using the Jaccard similarity.
Twitcident also provides faceted search interfaces. They are based on the facets extracted in the previous steps and help the user to browse through the data to gain a better overview.
The whole processing chain of Twitcident allows the examination of tweets in real-time with advanced functionalities in filtering and searching. Terpstra et al. show the application of Twitcident at the Pukkelpop festival after it was hit by a heavy storm. Their work highlights the evolution of tweets related to specific categories, e.g., damages or casualties, during the incident.
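The semantic filtering step can be illustrated with a minimal sketch. It assumes that both the incident profile and each tweet profile are represented as plain sets of facet-value pairs; the facet values, the threshold of 0.2 and the function names are hypothetical, and the weighting of facet-value pairs mentioned above is deliberately left out.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A & B| / |A | B|; defined here as 0.0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def filter_tweets(incident_profile: set, tweet_profiles: dict, threshold: float = 0.2) -> list:
    """Keep the ids of tweets whose profile is similar enough to the incident profile."""
    return [
        tweet_id
        for tweet_id, profile in tweet_profiles.items()
        if jaccard(incident_profile, profile) >= threshold
    ]


# Hypothetical facet-value pairs, for illustration only.
incident = {("location", "Rotterdam"), ("class", "storm"), ("keyword", "festival")}
tweets = {
    "t1": {("location", "Rotterdam"), ("class", "storm"), ("keyword", "tent")},
    "t2": {("location", "Paris"), ("keyword", "concert")},
}

relevant = filter_tweets(incident, tweets)
print(relevant)
```

Here tweet `t1` shares two of four distinct pairs with the incident profile, while `t2` shares none, so only `t1` passes the assumed threshold.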
B. The WeKnowIt-System
Another approach for analyzing social media in the context of crisis management was examined in the EU Project WeKnowIt. The system built in this project is based on interviews conducted with emergency management practitioners. Based on the answers given in the interviews, several requirements for building the response tool were extracted and described by Lanfranchi and Ireson.
The system can be subdivided into two parts: first, it allows users to submit information (e.g., videos, pictures) to the system; second, it offers a visualization that shows the reported information in a structured way. The representation relies on a map-based display, which shows aggregated messages. The message aggregation is performed using an offline text processing method with sample data for training. This results in categories (each described by a bag of words) into which incoming messages are grouped.
Tweak-the-tweet (http://faculty.washington.edu/kstarbi/TtT_Hurricane_Map_byEvent.html) developed at the University of Colorado is a crowdsourcing platform. It defines instructions to incorporate specific hashtags so that the tweets written by users become structured and machine readable. These instructions can be used by volunteers (or directly by the authors) to (re-)structure incoming messages. Afterwards, a simple parsing algorithm that follows this grammar is applied to extract the information given in combination with these hashtags (e.g., #treedown, #road, etc. followed by some detailed information). Later, the extracted information can be used to perform keyword-based filtering. The major difference to other platforms is that this system 'works with the existing social media infrastructure'.
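The hashtag-based parsing idea can be sketched as follows. This is not the Tweak-the-tweet grammar itself, which is not specified here; the sketch simply assumes that each hashtag is followed by the free text it describes, and the example tweet and the function name `parse_structured_tweet` are made up for illustration.

```python
import re

# A hashtag, optional whitespace, then everything up to the next hashtag.
TAG_PATTERN = re.compile(r"#(\w+)\s*([^#]*)")


def parse_structured_tweet(text: str) -> dict:
    """Map each hashtag (lower-cased) to the free text that follows it."""
    fields = {}
    for tag, value in TAG_PATTERN.findall(text):
        fields[tag.lower()] = value.strip()
    return fields


# Hypothetical tweet written according to such hashtag instructions.
tweet = "#sandy #treedown on Main St #road blocked near bridge"
parsed = parse_structured_tweet(tweet)
print(parsed)

# Keyword-based filtering then reduces to a lookup on the extracted fields.
has_road_report = "road" in parsed
```

Text in front of the first hashtag is ignored in this sketch, and a repeated hashtag overwrites its earlier value; a real parser would need a policy for both cases.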
MacEachren et al. describe a system called SensePlace2 for filtering Twitter messages. Their work is based on a survey that gives them hints on how to design their information system. The survey was conducted among several emergency management practitioners to identify the role of social media in emergency management. In addition, the survey includes questions about the first prototype and a user interface screenshot of the SensePlace2 system. The survey shows openness to the use of social media in emergency management, especially for monitoring, information gathering, and situational assessment.
The SensePlace2 system crawls tweets based on keywords entered by the user (i.e., user interest). The gathered and stored tweets are analyzed via named-entity recognition to identify organizations, people and locations given in the text. Additionally, the location is extracted from the tweet either from a given geo-location or by examining the user profile. A color scheme displayed on the map (in gridded form) indicates a high frequency of tweet activity in a specific area. SensePlace2 introduces different search and filtering facilities (i.e., map, location, word clouds etc.) to browse through a huge amount of tweets by considering the extracted information. Depending on the filtering mechanisms selected in the user interface, the set of shown tweets is narrowed down.
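A gridded frequency display of this kind can be approximated by binning tweet coordinates into cells and counting the tweets per cell. The following sketch assumes a simple latitude/longitude grid with a hypothetical cell size of one degree; how SensePlace2 actually builds its grid is not described here.

```python
import math
from collections import Counter


def grid_cell(lat: float, lon: float, cell_deg: float = 1.0) -> tuple:
    """Index of the grid cell that contains the given coordinate."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def tweet_density(points: list, cell_deg: float = 1.0) -> Counter:
    """Count tweets per grid cell; higher counts would map to stronger colors."""
    return Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)


# Hypothetical tweet locations as (latitude, longitude).
points = [(40.7, -74.0), (40.9, -73.9), (34.05, -118.25)]
density = tweet_density(points)
print(density.most_common())
```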
Li et al. present an event detection system for Twitter, called 'Twitter-based Event Detection and Analysis System' (TEDAS). It focuses on detecting high-level events (e.g., all car accidents recognized during January 21 and January 22). The detection process is based on spatial and temporal analysis of tweets.
To this end, the system is separated into an offline and an online analysis part. Tweets are crawled from the Twitter platform by using different rules. A rule describes which keywords should occur, and how, so that the tweet indicates a 'crime and disaster related event' (CDE) with high probability. The crawled tweets are then examined offline by a classifier to remove unrelated tweets (i.e., tweets not describing a CDE). For classification, URLs, the usage of hashtags, the @ operator, the inclusion of numbers in the text, etc. are considered as features. In addition, missing location data is predicted based on tweets a user has written in the past (e.g., mentioning a location) and on the environment the user is in contact with (e.g., location of friends).
To rank tweets and events based on the user query (e.g., all car accidents in the area of New York), different features are considered. These comprise specific keywords (e.g., death), formats (e.g., containing an @), the user profile (e.g., number of followers, verified account etc.), and the usage (e.g., spreading of tweets). All these features are combined in a linear regression model for rank estimation. The user can browse through the identified CDEs by entering keywords combined with location and/or temporal information. The CDEs matching the user-defined query are displayed with the corresponding tweets on a map.
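How a linear model turns such features into a ranking can be sketched as below. The feature names, the weights and the bias are purely hypothetical placeholders; in TEDAS the weights would come from fitting the regression model, which is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class TweetFeatures:
    has_keyword: float    # 1.0 if an important keyword (e.g., "death") occurs
    has_mention: float    # 1.0 if the tweet contains an @
    log_followers: float  # log-scaled follower count of the author
    retweets: float       # how often the tweet was spread


# Hypothetical weights; a fitted regression model would supply real values.
WEIGHTS = {"has_keyword": 2.0, "has_mention": -0.5, "log_followers": 0.5, "retweets": 0.1}
BIAS = 0.0


def score(features: TweetFeatures) -> float:
    """Linear combination of the feature values."""
    return BIAS + sum(weight * getattr(features, name) for name, weight in WEIGHTS.items())


def rank(tweets: dict) -> list:
    """Tweet ids ordered from highest to lowest score."""
    return sorted(tweets, key=lambda tweet_id: score(tweets[tweet_id]), reverse=True)


tweets = {
    "b": TweetFeatures(has_keyword=0.0, has_mention=1.0, log_followers=1.0, retweets=0.0),
    "a": TweetFeatures(has_keyword=1.0, has_mention=0.0, log_followers=3.0, retweets=5.0),
}
ranking = rank(tweets)
print(ranking)
```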
Twitris+ (an extension of Twitris), described by Sheth et al., shows another way to aggregate social media information. The framework considers Twitter messages and SMS (short message service) messages. For analysis, the dataset is divided according to spatial and temporal information. For example, if a user wants to know everything from a continent (e.g., Europe) within the last 2 hours, the dataset is first restricted to tweets from Europe, which are then divided into 2-hour steps. Afterwards, important n-grams (i.e., sequences of n words) from each tweet contained in the set are extracted. These n-grams are called event-descriptors. An event-descriptor is seen as important, if :
G. Emergency Situation Awareness Platform
The emergency situation awareness platform developed by Yin et al.  at CSIRO (http://www.csiro.au/Outcomes/ICT-and-Services/emergency-situation-awareness.aspx) examines tweets. The framework analyzes tweets based on a background-alert model, i.e., tweets gathered on days without any emergency are used to identify 'bursty keywords' . These 'bursty keywords' are indicators for incidents marking an emergency. In addition, the authors perform text classification to identify the impact of the incidents identified. This comprises the identification of tweets related to, for example, road damages, collapsing of buildings or bridges etc. Online clustering helps in grouping similar tweets to topic clusters . The system uses the geo-location from the Twitter user profile, if there is no explicit location given in the tweet. Results are visualized in a map. Several functionalities to support the user are available, e.g., color-coding which indicates the importance of topics, word clouds to navigate, a map representation to see the distribution of topics. Details on the architecture can be found in .
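The background-alert idea can be illustrated with a deliberately simple sketch: word rates in the current window are compared with smoothed rates from a background corpus collected on days without an emergency. The ratio threshold, the minimum count, the add-one smoothing and the example tweets are assumptions made for illustration and are not taken from the platform described above.

```python
from collections import Counter


def term_counts(tweets: list) -> tuple:
    """Word counts over all tweets plus the total number of words."""
    counts = Counter(word for tweet in tweets for word in tweet.lower().split())
    return counts, sum(counts.values())


def bursty_keywords(background: list, current: list, min_count: int = 2, ratio: float = 3.0) -> list:
    """Words whose current rate exceeds the smoothed background rate by `ratio`."""
    bg_counts, bg_total = term_counts(background)
    cur_counts, cur_total = term_counts(current)
    vocab_size = len(set(bg_counts) | set(cur_counts))
    bursty = []
    for word, count in cur_counts.items():
        if count < min_count:
            continue  # ignore words that are too rare to be a reliable signal
        cur_rate = count / cur_total
        # Add-one smoothing keeps the background rate non-zero for unseen words.
        bg_rate = (bg_counts[word] + 1) / (bg_total + vocab_size)
        if cur_rate / bg_rate >= ratio:
            bursty.append(word)
    return sorted(bursty)


# Hypothetical data: tweets from quiet days versus the current time window.
background = ["nice weather today", "traffic is slow today", "coffee time"]
current = [
    "earthquake just now",
    "huge earthquake felt here",
    "earthquake damage on my street today",
]
alerts = bursty_keywords(background, current)
print(alerts)
```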
Crisees, developed by Maxwell et al. at the University of Glasgow, is another monitoring tool for social media streams. In particular, the authors consider YouTube and Twitter in their system. The system retrieves social media items based on queries entered by the user (i.e., search terms). A sentiment analysis is performed on these items. Besides a list representation of the retrieved items, the items are also displayed on a map based on their given location. The authors also report on a survey with 70 first responders from the UK that was conducted to identify further requirements for refining their monitoring tool. The most important finding is that first responders look for reliable users (or information) so that they can track their accounts.
I. System for real and virtual volunteers
Reuter et al. created a prototype implementation for supporting emergent groups (i.e., volunteers) in using social media. These groups are formed through the self-help activities of several persons in crisis situations. The created prototype is interconnected with Twitter and Facebook. It helps members to form and join groups. In addition, it allows users to administer groups and group activities, and it enables the spreading of information between members. The system is also connected to an existing emergency service to allow data to be forwarded to professional first responders (i.e., emergency agencies). Hence, this system allows both volunteers and professionals to gain situational awareness.
J. Classification System for Microblogs
Zhou et al. developed a classification system for microblogs. The idea of the classification is to identify microblogs that are of special importance for specific emergency agencies (i.e., fire department, police, paramedics etc.). The authors tested their system with data from the Yushu earthquake 2010 in China. They obtained microblogs from Sina, a large information service company in China. After removing noisy information, like stopwords, HTML tags etc., the microblogs are converted into unigrams for classifier training. The trained Naive Bayes classifier can be used to identify emergency-related microblogs for specific agencies.
The SocialSensor EU project develops a multimedia indexing and search system supporting the news and infotainment area. Tools developed for the news context to support journalists, news providers, and contributors in browsing through social media are related to the crisis management area. In this context, several mechanisms, like event detection, search & browsing (Tool: News dashboard) and estimation of the trustworthiness of Twitter users (Tool: Alethiometer), are developed. The SocialSensor project also examined a crawling and search framework in the context of the 2013 protests in Turkey ('Occupy Gezi'). The suggested framework facilitates fetching, ranking, indexing, and aggregating of media items (e.g., pictures or videos extracted from social media posts or web pages). The aggregation is performed on both the geo-location (i.e., for map representation) and the visual content of media items. The SocialSensor tools can also be applied to emergency scenarios to crawl for emergency-related information, as the 'Occupy Gezi' study shows.
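The pipeline of stopword removal, unigram extraction and Naive Bayes classification can be sketched in a few lines. This is a generic multinomial Naive Bayes with add-one smoothing, not the authors' implementation; the English stopword list, the two agency labels and the training sentences are made-up stand-ins for the Chinese microblog data.

```python
import math
from collections import Counter, defaultdict

# Tiny illustrative stopword list; a real system would use a full one.
STOPWORDS = {"the", "a", "is", "in", "at", "on", "of"}


def unigrams(text: str) -> list:
    """Lower-cased words with stopwords removed."""
    return [word for word in text.lower().split() if word not in STOPWORDS]


class NaiveBayes:
    """Multinomial Naive Bayes over unigrams with add-one smoothing."""

    def __init__(self):
        self.class_docs = Counter()              # number of training texts per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()

    def fit(self, samples: list) -> "NaiveBayes":
        for text, label in samples:
            self.class_docs[label] += 1
            for word in unigrams(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text: str) -> str:
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            total_words = sum(self.word_counts[label].values())
            log_prob = math.log(n_docs / total_docs)  # class prior
            for word in unigrams(text):
                numerator = self.word_counts[label][word] + 1
                log_prob += math.log(numerator / (total_words + len(self.vocab)))
            if log_prob > best_score:
                best_label, best_score = label, log_prob
        return best_label


# Hypothetical training data with one label per responsible agency.
training = [
    ("fire in the school building", "fire_department"),
    ("smoke and fire at market", "fire_department"),
    ("people injured need doctor", "paramedics"),
    ("injured child needs medical help", "paramedics"),
]
classifier = NaiveBayes().fit(training)
print(classifier.predict("building on fire"))
print(classifier.predict("injured people need help"))
```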
4 NON CRISES-RELATED SOCIAL MEDIA ANALYSIS TOOLS
There are also other social media analysis tools that have not been directly applied or implemented in the crisis management context. This section gives a short overview of such tools that might be useful in the context of crisis management.
Eddi developed by Bernstein et al. aggregates tweets based on the topics they discuss. This gives a user an overview of posted status messages. For example, this tool would allow police forces to examine their own Twitter streams and those of their followers to identify new incidents or aspects of incidents.
Twitinfo developed by Marcus et al. analyzes incoming tweets crawled by keywords through peak detection. The identified peaks indicate specific interesting incidents (e.g., a goal in a soccer game). Tweets related to an identified peak undergo sentiment analysis and are shown to the user for further investigation. This would be another tool for detecting events in social media during a crisis.
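Peak detection on a tweet stream can be illustrated with a simple sliding-window rule: a time bin counts as a peak when its tweet volume clearly exceeds the mean of the preceding bins. This is a simplified stand-in, not the Twitinfo algorithm; the window size, the factor `k`, the floor on the standard deviation and the counts are assumptions.

```python
from statistics import mean, pstdev


def detect_peaks(counts: list, window: int = 5, k: float = 2.0) -> list:
    """Indices of bins whose count exceeds mean + k * std of the previous `window` bins."""
    peaks = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        # Floor the deviation at 1.0 so a perfectly flat history does not
        # turn every small fluctuation into a peak.
        sigma = max(pstdev(history), 1.0)
        if counts[i] > mu + k * sigma:
            peaks.append(i)
    return peaks


# Hypothetical tweets-per-minute counts for one keyword.
counts = [10, 12, 11, 9, 10, 11, 48, 15, 12, 10]
peaks = detect_peaks(counts)
print(peaks)
```

The tweets that fall into a detected bin could then be passed on to sentiment analysis or shown to the user, as described above.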
An extension of the work of Marcus et al.  describes the usage of an SQL-based stream query language to analyze tweets, called TweeQL. Through the definition of queries it is possible, for example, to gain all tweets that contain 'New York' in the text directly from the Twitter stream.
Rizzo et al.  developed a summarization tool for media galleries called MediaFinder. It focuses on visual information, like pictures or tweets that include pictures. The system aggregates the visual items based on their similarity and allows different perspectives on the data (e.g., timeline view, graph view etc.). Visual aggregation is also important in crisis management and thus MediaFinder would be an interesting tool to investigate.
TwitterBeat from Shook et al. describes an approach to analyze huge amounts of textual data to uncover the sentiment. This results in a heatmap describing the 'tone' of the currently available information. Sentiment analysis in general can be used to identify the mood after a disaster, e.g., for crime prevention.
TweetMotif developed by O'Connor et al.  is another tool for gaining an overview on Twitter activities. First, tweets related to keywords describing the user-interest are fetched. Afterwards, the gained tweets are analyzed and grouped into topics. The tweets are analyzed via n-gram extraction from the text and by considering a probability model to form topics. This tool can be used to continuously search for specific keywords and related discussions on Twitter.
Table 1 summarizes the referenced research activities based on common aspects of the performed social media analysis (i.e., independent of an application in crisis management). The categorization was done based on the presented papers (e.g., (project) description, screenshots etc.), which does not necessarily include or highlight the full functional range of the systems.
The categories comprise multi-platform, crowdsourcing, organizational management, sentiment, visual content, and event/topic detection. The category multi-platform indicates whether the system uses two or more social media platforms in its implementation and evaluation. Systems especially designed for crowdsourcing are highlighted, too. Organizational management describes related organizational/management tasks, like group management, routing of information to the corresponding agencies etc. Sentiment summarizes systems performing sentiment analysis on social media data. Systems examining the visual content (i.e., the picture and video material itself) are covered in the visual content category. Additionally, tools performing event/topic detection are emphasized as well.
In addition to these categories, the following aspects are also covered: (1) which social media platforms were used, (2) which visualization components for the user interface are illustrated, (3) which filtering mechanisms are provided in the user interface, and (4) an excerpt of (additional) interesting analysis techniques of the corresponding work.
Independent of the categorization, all of the mentioned research works are fundamental for situational awareness in crisis management and for monitoring social media activities.
This contribution gives an overview of several social media studies and social media analysis tools. It summarizes studies and technical developments from several research groups giving an overview of social media research in crisis management.
In summary, it can be recognized that there is intense interest in including social media in crisis management from several research fields. Studies show that there is a positive response from emergency agencies to incorporating social media in crisis management activities. Additionally, there is ongoing work in finding more specific requirements in defining and creating social media monitoring tools. Currently, there are systems available for:
The list of research work discussed here is not complete with regard to social media analysis, as social media is also used in various areas other than crisis management (e.g., tourism, media search in journalism, network or community analysis, etc.). Nevertheless, this survey should give a first insight into the technical developments made so far.