IJMTES – EVOLVING STREAM DATA CLUSTERING AND TCV RANK SUMMARIZATION

Journal Title : International Journal of Modern Trends in Engineering and Science

Author’s Name : K. Selvaraj

Volume 03 Issue 08 2016

ISSN no:  2348-3121

Page no: 71-74

Abstract – Tweets are short text messages that are created and shared in large volumes and are of interest to both users and data analysts. Twitter, which receives over 400 million tweets per day, has emerged as an invaluable source of news, blogs, opinions and more. Our proposed work consists of three components. The first is tweet stream clustering, which clusters tweets using the k-means clustering algorithm. The second is a tweet cluster vector (TCV) technique that generates a rank summarization using a greedy algorithm. The third detects and monitors summary-based and volume-based variation to produce a timeline automatically from the tweet stream. Summarizing a continuous tweet stream therefore requires functionality that differs significantly from traditional document summarization. It is not a simple task, since a huge number of tweets are meaningless, irrelevant and noisy, owing to the social nature of tweeting. Further, tweets are strongly correlated with their posting time, and new tweets tend to arrive at a very fast rate. A tweet summarization approach should therefore meet three requirements: (1) Efficiency: tweet streams are always very large in scale, hence the summarization algorithm should be highly efficient; (2) Flexibility: it should provide tweet summaries over arbitrary time durations; (3) Topic evolution: it should automatically detect sub-topic changes and the moments at which they happen.

Keywords – Tweet Stream, Summarization, Timeline, Topic evolution, Summary
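
The first two components named in the abstract (k-means clustering of tweets, then a greedy ranked selection of representative tweets) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words vectors, cosine similarity, seeding with the first k tweets, the redundancy threshold, and all function names (`vectorize`, `kmeans`, `greedy_summary`) are assumptions made only for illustration, and the paper's tweet cluster vector structure is not reproduced here.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-frequency vector for one tweet (hypothetical helper)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Mean term vector of a non-empty list of vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})


def kmeans(vectors, k, iterations=10):
    """k-means with cosine similarity, seeded with the first k vectors (assumption)."""
    centroids = [Counter(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each tweet goes to its most similar centroid.
        labels = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                  for v in vectors]
        # Update step: recompute each centroid from its current members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = centroid(members)
    return labels, centroids


def greedy_summary(tweets, vectors, labels, centroids, size, redundancy=0.5):
    """Rank tweets by similarity to their own centroid, then greedily keep
    the best-ranked ones that are not too similar to tweets already chosen."""
    ranked = sorted(range(len(tweets)),
                    key=lambda i: cosine(vectors[i], centroids[labels[i]]),
                    reverse=True)
    chosen = []
    for i in ranked:
        if len(chosen) == size:
            break
        if all(cosine(vectors[i], vectors[j]) < redundancy for j in chosen):
            chosen.append(i)
    return [tweets[i] for i in chosen]


tweets = [
    "earthquake hits city center",
    "football final tonight",
    "earthquake damage in city",
    "football final score update",
]
vectors = [vectorize(t) for t in tweets]
labels, centroids = kmeans(vectors, k=2)
summary = greedy_summary(tweets, vectors, labels, centroids, size=2)
print(labels)
print(summary)
```

On this toy input the two topics fall into separate clusters, and the redundancy check keeps the summary from selecting two tweets about the same topic. A streaming version would need incremental cluster updates rather than re-running k-means over all tweets, which this sketch does not attempt.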
