Journal Title : International Journal of Modern Trends in Engineering and Science
Paper Title : EXERTION OF MANIFOLD ITEM SET IN DATA ANALYTICS USING EXTEMPORIZED ALGORITHM
Volume 04 Issue 03 2017
ISSN no: 2348-3121
Page no: 168-172
Abstract – An enormous amount of data is being explored through social media as technologies advance, and people use these technologies in their day-to-day activities. Frequent item set mining algorithms aim to discover frequent item sets in a transactional database, but as the data set grows, traditional frequent item set mining can no longer handle it. Frequent item set mining techniques can still be used to extract useful information from such data. Among the many techniques applied to frequent item set mining, clustering is the most popular. K-means is one of the simplest unsupervised learning algorithms used to solve the well-known clustering problem.
Keywords – Apache Hadoop, MapReduce, Big Data, Clustering, K-Means Clustering, Frequent Item Set Mining
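The K-means procedure named in the abstract can be sketched as follows. This is a minimal single-machine illustration in Python, not the paper's implementation: the function names `assign`, `update`, and `kmeans`, the `max_iter` parameter, and the sample points are all assumptions introduced here. The split into an assignment step and an averaging step is only meant to suggest how such a loop might be divided into map and reduce phases.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, available in Python 3.8+


def assign(points, centroids):
    """Map-like step: pair each point with the index of its nearest centroid."""
    return [
        (min(range(len(centroids)), key=lambda i: dist(p, centroids[i])), p)
        for p in points
    ]


def update(assignments, centroids):
    """Reduce-like step: replace each centroid with the mean of its points."""
    groups = defaultdict(list)
    for idx, p in assignments:
        groups[idx].append(p)
    new_centroids = []
    for i, old in enumerate(centroids):
        members = groups.get(i)
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Repeat assign/update until the centroids stop changing."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        new_centroids = update(assign(points, centroids), centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


if __name__ == "__main__":
    sample = [(0, 0), (0, 2), (10, 10), (10, 12)]  # hypothetical data
    print(kmeans(sample, [(0, 0), (10, 10)]))
```

In a distributed setting, the assignment step would typically run independently on each data partition and the averaging step would combine the per-cluster results, but the details depend on the framework and are not specified in the abstract.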