Journal Title : International Journal of Modern Trends in Engineering and Science
Paper Title : ENHANCED A-PRIORI ALGORITHM BASED MAP/REDUCE METHOD TO MINE MEDICAL DATA FROM DATABASES
Volume 03 Issue 12 2016
ISSN no: 2348-3121
Page no: 49-55
Abstract – Association rule mining, also called affinity analysis, is a fundamental data mining technique for finding co-occurrence relationships, such as the purchase behavior of customers. The analysis has traditionally relied on sequential computation, which limits many big data applications, including the mining of medically relevant data from databases where the data cannot be drawn from an existing history. An SQL MapReduce framework is available as a product called Aster, which provides nPath SQL to process big data stored in the database. Market Basket Analysis can be executed on that framework, but it depends on the product's SQL database combined with a MapReduce database. Association rules have been used effectively to mine medically relevant data, such as stock items and products, by analyzing patient (customer) behavior. The approach is based on the Apriori property, under which all subsets of a frequent itemset must also be frequent. The map() and reduce() functions run on distributed nodes in parallel: each map and reduce operation can be processed independently on its own node. Map/Reduce can handle big data sets because the data are distributed on HDFS; here, documents are first searched on the basis of minimum support, according to their relevant support coordinates. The proposed work on Apriori-like algorithms for Spatio-Temporal Pattern Queries presents a way to construct Apriori-like algorithms for mining spatio-temporal patterns. This thesis addresses the problems posed by the different types of comparison functions that can be used to mine frequent patterns. Map-Reduce for Medical Data Mining on Multi-core discusses ways to develop a broadly applicable programming paradigm that applies to different learning algorithms.
Keywords – Association Rule Mining, Big Data Computing, Map Reduce, HDFS, Spatio-Temporal Patterns
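The Apriori property and the map()/reduce() counting described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's enhanced algorithm: it simulates the map and reduce phases in a single process rather than on Hadoop, and all function names, the partition layout, and the sample transactions are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(partition, candidates):
    """Emit (itemset, 1) for every candidate contained in a transaction."""
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:
                yield candidate, 1


def reduce_phase(pairs):
    """Sum the counts emitted for each itemset."""
    counts = defaultdict(int)
    for itemset, count in pairs:
        counts[itemset] += count
    return counts


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori property)."""
    items = sorted({item for itemset in frequent for item in itemset})
    return {
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    }


def apriori(partitions, min_support):
    """Return {itemset: count} for every itemset with count >= min_support."""
    candidates = {frozenset([i]) for part in partitions for t in part for i in t}
    result, k = {}, 1
    while candidates:
        # Each partition is mapped independently; on a cluster these
        # calls would run in parallel on separate nodes.
        pairs = [p for part in partitions for p in map_phase(part, candidates)]
        counts = reduce_phase(pairs)
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
        candidates = next_candidates(set(frequent), k)
    return result


# Hypothetical transactions, split into two partitions (assumed data).
partitions = [
    [frozenset({"aspirin", "statin"}), frozenset({"aspirin", "statin", "insulin"})],
    [frozenset({"aspirin", "insulin"}), frozenset({"statin"})],
]
frequent = apriori(partitions, min_support=2)
```

With this sample data and a minimum support of 2, the pair {statin, insulin} occurs in only one transaction, so it is dropped, and the three-item candidate containing it is pruned without ever being counted.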