On Summarization and Timeline Generation for Evolutionary Tweet Streams - Coimbatore

Tuesday, 1 December 2015

Item details

City: Coimbatore, Tamil Nadu
Offer type: Offer

Contacts

Contact name: Lansa
Phone: 9095395333

Item description

Abstract
Short-text messages such as tweets are being created and shared at an unprecedented rate. Tweets in their raw form, while informative, can also be overwhelming. For both end-users and data analysts, it is a nightmare to plow through millions of tweets that contain an enormous amount of noise and redundancy.
In this paper, we propose a novel continuous summarization framework called Sumblr to alleviate the problem. In contrast to traditional document summarization methods, which focus on static, small-scale datasets, Sumblr is designed to deal with dynamic, fast-arriving, and large-scale tweet streams. Our proposed framework consists of three major components. First, we propose an online tweet stream clustering algorithm to cluster tweets and maintain distilled statistics in a data structure called the Tweet Cluster Vector (TCV). Second, we develop a TCV-Rank summarization technique for generating online summaries and historical summaries of arbitrary time durations. Third, we design an effective topic evolution detection method, which monitors summary-based/volume-based variations to produce timelines automatically from tweet streams. Our experiments on large-scale real tweets demonstrate the efficiency and effectiveness of our framework.
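
The first component, online clustering with compact per-cluster statistics, can be illustrated with a minimal sketch. The abstract does not define the fields of a TCV or the clustering rule, so everything below is an assumption made for illustration: the class name ClusterStats, its fields, the whitespace tokenizer, the cosine-similarity test, and the 0.3 threshold are hypothetical and are not taken from Sumblr.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class ClusterStats:
    """Hypothetical stand-in for a TCV: additive statistics only,
    so a cluster is updated without storing its tweets."""
    term_sum: Counter = field(default_factory=Counter)  # summed term counts
    count: int = 0                                      # tweets absorbed
    first_ts: float = math.inf                          # earliest timestamp
    last_ts: float = -math.inf                          # latest timestamp

    def add(self, vec, ts):
        self.term_sum.update(vec)
        self.count += 1
        self.first_ts = min(self.first_ts, ts)
        self.last_ts = max(self.last_ts, ts)


def cluster_stream(tweets, threshold=0.3):
    """Single pass over (timestamp, text) pairs.

    Each tweet joins the most similar existing cluster, or starts a
    new one when no cluster reaches the (assumed) similarity threshold.
    """
    clusters = []
    for ts, text in tweets:
        vec = Counter(tokenize(text))
        best, best_sim = None, 0.0
        for c in clusters:
            sim = cosine(vec, c.term_sum)
            if sim > best_sim:
                best, best_sim = c, sim
        if best is None or best_sim < threshold:
            best = ClusterStats()
            clusters.append(best)
        best.add(vec, ts)
    return clusters
```

Because the statistics are additive, each incoming tweet costs one comparison per cluster and one in-place update. That is the property that makes a single pass over a fast-arriving stream feasible in this sketch.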

For more details, contact:
Phone: 9095395333, 9159115969

Office Address:
LansA Informatics Pvt Ltd,
165, 5th Street, Cross cut road,
Gandhipuram, Coimbatore-641012