Spatio-temporal Characteristics of Bursty Words in Twitter Streams
Publication Type
Original research
Authors

Social networking and microblogging services such as Twitter provide a continuous source of data from which useful information can be extracted. The detection and characterization of bursty words play an important role in processing such data, as bursty words might hint to events or trending topics of social importance upon which actions can be triggered. While there are several approaches to extract bursty words from the content of messages, there is only little work that deals with the dynamics of continuous streams of messages, in particular messages that are geo-tagged.

In this paper, we present a framework to identify bursty words from Twitter text streams and to describe such words in terms of their spatio-temporal characteristics. Using a time-aware word usage baseline, a sliding window approach over incoming tweets is proposed to identify words that satisfy some burstiness threshold. For these words then a time-varying, spatial signature is determined, which primarily relies on geo-tagged tweets. In order to deal with the noise and the sparsity of geo-tagged tweets, we propose a novel graph-based regularization procedure that uses spatial cooccurrences of bursty words and allows for computing sound spatial signatures. We evaluate the functionality of our online processing framework using two real-world Twitter datasets. The results show that our framework can efficiently and reliably extract bursty words and describe their spatio-temporal evolution over time.

Journal
Title
Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Publisher
ACM
Publisher Country
Germany
Publication Type
Prtinted only
Volume
--
Year
2013
Pages
10