Optimized Pattern Mining Of Sample Data

Abstract

Imprecise data is often found in real-world applications, for reasons such as imprecise measurement, outdated sources, or sampling errors, and much research has been published on managing such data in databases. In many cases, only partially aggregated data sets are available because of privacy concerns, so each aggregated record can be represented by a probability distribution. In other privacy-preserving data mining applications, the data is perturbed in order to protect sensitive attribute values, and in some cases probability density functions of the records may be available. Some recent techniques construct privacy models such that the output of the transformation is friendly to data mining and management techniques. Such data is inherent in applications such as sensor monitoring systems, location-based services, and biological databases. There is an increasing desire to use this technology in new application domains, and one domain that is likely to acquire considerable significance in the near future is database mining. An increasing number of organizations are creating ultra-large databases (measured in gigabytes and even terabytes) of business data, such as consumer data, transaction histories, and sales records. Such data forms a potential gold mine of valuable business information. The information in these organizations is classified using EMV, the mean obtained after calculating the profit. A decision tree is then constructed from the profit and EMV values, which prunes the data to the desired extent.
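
To illustrate the EMV-based classification mentioned in the abstract, the following is a minimal sketch, assuming EMV stands for expected monetary value, that is, the probability-weighted mean of the possible profits. The abstract does not define the procedure, so the names `Outcome`, `expected_monetary_value`, and `prune_by_emv`, the sample figures, and the threshold rule are all hypothetical; the threshold filter only loosely stands in for the decision-tree pruning.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Outcome:
    """One possible result for a record (hypothetical structure)."""
    profit: float       # profit if this outcome occurs
    probability: float  # probability of this outcome


def expected_monetary_value(outcomes: List[Outcome]) -> float:
    """Return the probability-weighted mean profit (assumed meaning of EMV)."""
    total_probability = sum(o.probability for o in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(o.profit * o.probability for o in outcomes)


def prune_by_emv(records: Dict[str, List[Outcome]], threshold: float) -> Dict[str, float]:
    """Keep only the records whose EMV reaches the threshold (assumed pruning rule)."""
    kept = {}
    for name, outcomes in records.items():
        emv = expected_monetary_value(outcomes)
        if emv >= threshold:
            kept[name] = emv
    return kept


# Made-up sample records, each described by a small profit distribution.
sample_records = {
    "record_a": [Outcome(profit=100.0, probability=0.5),
                 Outcome(profit=-20.0, probability=0.5)],
    "record_b": [Outcome(profit=10.0, probability=0.9),
                 Outcome(profit=-50.0, probability=0.1)],
}

kept_records = prune_by_emv(sample_records, threshold=10.0)
```

With these made-up figures only `record_a` survives the filter, since its weighted mean profit is above the threshold while that of `record_b` is below it. A real decision tree would split on attribute values rather than apply a single cutoff; the sketch only shows how a per-record probability distribution reduces to one EMV figure that can drive such a decision.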