Papers by Priyanka Trikha

Clustering has become an increasingly important task in modern application domains such as marketing and purchasing assistance, multimedia, and molecular biology. The goal of clustering is to decompose or partition a data set into groups such that both the intra-group similarity and the inter-group dissimilarity are maximized. In many applications, the data that needs to be clustered is far larger than what can be processed at a single site. Further, the data to be clustered may be inherently distributed. The increasing demand to scale to these massive data sets, which are inherently distributed over networks with limited bandwidth and computational resources, has led to methods for parallel and distributed data clustering. In this thesis, we present a cohesive framework for cluster identification and outlier detection for distributed data. The core idea is to generate independent local models and combine them at a central server to obtain global clusters. ...
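As a rough sketch of this local-model-plus-merge idea (not the thesis's actual algorithm), each site below summarizes its data with k-means centroids and cluster sizes, and the server re-clusters the collected centroids; the function names and the choice of k-means as the local learner are illustrative assumptions:

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def local_model(points, k):
    """Summarize one site's data as (centroids, cluster sizes)."""
    centroids, labels = kmeans2(points, k, minit='++')
    return centroids, np.bincount(labels, minlength=k)

def merge_at_server(models, k_global):
    """Re-cluster the collected local centroids into global clusters.
    (A fuller version would weight each centroid by its cluster size.)"""
    all_centroids = np.vstack([c for c, _ in models])
    global_centroids, _ = kmeans2(all_centroids, k_global, minit='++')
    return global_centroids

# Three sites build local models independently; only the small summaries
# travel to the central server, not the raw points.
sites = [np.random.randn(200, 2) + off for off in ([0, 0], [6, 6], [0, 6])]
models = [local_model(p, k=2) for p in sites]
print(merge_at_server(models, k_global=3))
```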

A Fast K-Means Algorithm
In clustering, we are given a set of N points in d-dimensional space R^d and must arrange them into a number of groups (called clusters). In k-means clustering, the groups are identified by a set of points called the cluster centers, and each data point belongs to the cluster whose center is closest. Existing algorithms for k-means clustering suffer from two main drawbacks: (i) they are slow and do not scale to large numbers of data points, and (ii) they converge to different local minima depending on the initialization. We present a fast greedy k-means algorithm that attacks both drawbacks. The algorithm is fast, deterministic, and approximates a global strategy for obtaining the clusters. The method is also simple to implement, requiring only two variations of kd-trees as the major data structures. Assuming a global clustering for k-1 centers, we introduce an efficient method to compute the global clustering for ...
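The incremental step mentioned at the end of the abstract can be illustrated with a toy greedy scheme: given good centers for k-1 clusters, try several data points as the k-th center and keep the placement with the lowest distortion. The sketch below uses scikit-learn's plain Lloyd iterations instead of the paper's kd-tree data structures, and all parameter values are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

def greedy_kmeans(X, K, n_candidates=20, rng=np.random.default_rng(0)):
    """Grow the solution one center at a time: given centers for k-1 clusters,
    try several data points as the k-th center and keep the lowest-inertia fit."""
    centers = X.mean(axis=0, keepdims=True)          # k = 1: the global centroid
    for k in range(2, K + 1):
        candidates = X[rng.choice(len(X), n_candidates, replace=False)]
        best = None
        for c in candidates:
            init = np.vstack([centers, c])           # reuse the k-1 solution
            km = KMeans(n_clusters=k, init=init, n_init=1).fit(X)
            if best is None or km.inertia_ < best.inertia_:
                best = km
        centers = best.cluster_centers_
    return centers

X = np.vstack([np.random.randn(100, 2) + o for o in ([0, 0], [5, 0], [0, 5])])
print(greedy_kmeans(X, K=3))
```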

International Journal of Machine Learning and Computing, 2013
Clustering is an unsupervised learning problem: a procedure that partitions data objects into clusters such that objects in the same cluster are similar to each other and dissimilar to objects in other clusters. Traditional algorithms do not meet the latest multiple requirements for objects simultaneously. Density-based clustering algorithms find clusters based on the density of data points in a region. DBSCAN is one such density-based algorithm; it can discover clusters with arbitrary shapes and requires only two input parameters. In this paper, we propose a new algorithm based on DBSCAN. We design a new method for automatic parameter generation that creates clusters with different densities and generates arbitrarily shaped clusters. A kd-tree is used to increase memory efficiency. The performance of the proposed algorithm is compared with DBSCAN, and experimental results indicate the superiority of the proposed algorithm.
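For reference, a minimal DBSCAN (not the proposed algorithm) shows the two ingredients the paper builds on: density-based eps-neighborhood queries, here answered by a kd-tree, and the eps/min_pts parameters that the proposed method generates automatically. This simplified sketch conflates unvisited points and noise:

```python
import numpy as np
from scipy.spatial import cKDTree

def dbscan(X, eps, min_pts):
    """Minimal DBSCAN: a kd-tree answers the eps-neighborhood queries.
    Returns one label per point; -1 marks noise."""
    tree = cKDTree(X)
    labels = np.full(len(X), -1)
    cluster = 0
    for i in range(len(X)):
        if labels[i] != -1:
            continue
        neighbors = tree.query_ball_point(X[i], eps)
        if len(neighbors) < min_pts:
            continue                          # not a core point (may stay noise)
        labels[i] = cluster
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster
                nbrs = tree.query_ball_point(X[j], eps)
                if len(nbrs) >= min_pts:      # j is also a core point: expand
                    seeds.extend(nbrs)
        cluster += 1
    return labels

X = np.vstack([np.random.randn(100, 2), np.random.randn(100, 2) + 8])
print(np.unique(dbscan(X, eps=0.8, min_pts=5), return_counts=True))
```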
Interpreting Inference Engine for Semantic Web
The Semantic Web is a web of data in which data items are related to one another and knowledge is organized in conceptual spaces according to its meaning. Understanding and using the data and knowledge encoded in Semantic Web documents requires an inference engine. A number of inference engines are used for consistency checking and classification, such as Pellet, FaCT, FaCT++, HermiT, RacerPro, KAON2, and BaseVisor. Some of them are reviewed and tested here on a few prebuilt ontologies. This paper presents an analysis of different inference engines over a set of ontologies; assessment and evaluation are required before selecting an appropriate inference engine for a given application.
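As a brief illustration of how such engines are typically driven programmatically (this is not the evaluation setup of the paper), the owlready2 Python wrapper can load an ontology and invoke HermiT or Pellet for consistency checking and classification; the ontology path below is a placeholder:

```python
from owlready2 import get_ontology, sync_reasoner

# Load a prebuilt ontology (placeholder path) and run a reasoner over it.
# sync_reasoner() uses HermiT by default; sync_reasoner_pellet() uses Pellet.
onto = get_ontology("file:///path/to/ontology.owl").load()
with onto:
    sync_reasoner()   # raises OwlReadyInconsistentOntologyError if inconsistent

# After classification, inferred superclasses are visible on each class.
for cls in onto.classes():
    print(cls, "is_a", list(cls.is_a))
```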

Digital watermarking is a process for modifying physical or electronic media to embed a machine-readable code into the media. The media may be modified so that the embedded code is imperceptible, or nearly so, to the user, yet can be detected through an automated detection process. Watermarking is the art of imperceptibly embedding a message into a work. More than 700 years ago in Fabriano (Italy), paper watermarks appeared in handmade paper in order to identify its provenance, format, and quality. In this context, the watermark is a kind of invisible signature that allows the creator or owner of a document to be identified and possible copyright violations, especially unauthorized copying, to be detected [1]. More recently, different watermarking techniques and strategies have been proposed to solve a number of problems, ranging from the detection of content manipulations to information hiding (steganography) and document usage tracing. In particular, the insertion of multiple watermarks to trace a document during its lifecycle is a very interesting and challenging application [1]. The main property of the proposed method is that it allows the insertion of multiple watermarks by different users, who come into play sequentially, one after the other, and do not need any extra information besides the public keys. This characteristic makes the present approach more attractive than previously available solutions.
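For illustration only, and not the multi-watermark public-key scheme described above, a minimal least-significant-bit embedding shows the basic idea of hiding a machine-readable code imperceptibly in an image:

```python
import numpy as np

def embed_lsb(image, bits):
    """Hide a bit string in the least significant bit of the first len(bits)
    pixels; the change is visually imperceptible for 8-bit images."""
    flat = image.flatten()
    flat[:len(bits)] = (flat[:len(bits)] & 0xFE) | np.asarray(bits, dtype=flat.dtype)
    return flat.reshape(image.shape)

def extract_lsb(image, n_bits):
    """Recover the embedded bits from the marked image."""
    return image.flatten()[:n_bits] & 1

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
code = np.random.randint(0, 2, 128)
marked = embed_lsb(img, code)
assert np.array_equal(extract_lsb(marked, 128), code)
```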
A Framework: Intrusion Detection in Data Mining
Intrusion detection in data mining has gained growing interest among researchers; there are many intrusion detection issues in data mining, such as DoS attacks, R2L, U2R, and probing. In this paper, we introduce a framework for an intrusion detection system that is used to filter a dataset with respect to network attacks. We also discuss the basic data mining techniques used for intrusion detection. Index Terms— Data mining, Data mining techniques, Intrusion detection, Intrusion detection system, Host-based IDS, Network-based IDS, DCS.
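As a hedged sketch of the filtering step such a framework typically relies on (not the exact framework proposed here), a decision-tree classifier over connection-record features can separate normal traffic from attack categories such as DoS, R2L, U2R, and probing; the features, labels, and data below are placeholders:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Hypothetical connection records: [duration, src_bytes, dst_bytes, failed_logins]
rng = np.random.default_rng(0)
X = rng.random((1000, 4))
y = rng.choice(["normal", "dos", "r2l", "u2r", "probe"], size=1000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier(max_depth=5).fit(X_tr, y_tr)

# Records predicted as attacks are filtered out / flagged for inspection.
pred = clf.predict(X_te)
print("flagged:", int(np.sum(pred != "normal")), "of", len(pred))
```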

A New Approach for Discovering Frequent Pattern from Transactional Database
Frequent pattern mining is a heavily researched area in the field of data mining with a wide range of applications. Finding frequent patterns (or itemsets) plays an essential role in data mining, and efficient algorithms to discover them are central to data mining research. A number of works have been published that present new algorithms or improvements to existing algorithms to solve this problem efficiently. Apriori was the first algorithm proposed in this field. Over time, improvements on Apriori have been developed that compress the large database into compact tree data structures such as the FP-tree, CAN-tree, and CP-tree. Like the FP-tree, the CP-tree contains both frequent and non-frequent items at mining time, so more items must be examined to extract the frequent patterns. In this paper I propose a novel tree structure, an extension of the CP-tree, that extracts all frequent patterns from a transactional database using the CP-mine algorithm. So at ...
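To make the terminology concrete, here is a minimal level-wise (Apriori-style) frequent-itemset miner; the CP-tree extension and the CP-mine algorithm proposed in the paper are not reproduced here, and the transaction database is a toy example:

```python
from itertools import combinations
from collections import Counter

def frequent_itemsets(transactions, min_support):
    """Level-wise mining: count k-item candidates, keep the frequent ones,
    and join them to build (k+1)-item candidates."""
    items = sorted({i for t in transactions for i in t})
    frequent, k_sets, k = {}, [frozenset([i]) for i in items], 1
    while k_sets:
        counts = Counter()
        for t in transactions:
            tset = set(t)
            for cand in k_sets:
                if cand <= tset:
                    counts[cand] += 1
        current = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(current)
        # Candidate generation: unions of frequent k-itemsets of size k+1.
        keys = list(current)
        k_sets = list({a | b for a, b in combinations(keys, 2) if len(a | b) == k + 1})
        k += 1
    return frequent

db = [["bread", "milk"], ["bread", "butter"], ["bread", "milk", "butter"], ["milk"]]
print(frequent_itemsets(db, min_support=2))
```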

Frequent pattern mining is a heavily researched area in the field of data mining with a wide range of applications. Finding frequent patterns (or itemsets) plays an essential role in data mining, and efficient algorithms to discover them are central to data mining research. A number of works have been published that present new algorithms or improvements to existing algorithms to solve this problem efficiently. Apriori was the first algorithm proposed in this field. Over time, improvements on Apriori have been developed that compress the large database into compact tree data structures such as the FP-tree, CAN-tree, and CP-tree. Like the FP-tree, the CP-tree contains both frequent and non-frequent items at mining time, so more items must be examined to extract the frequent patterns. In this paper I propose a novel tree structure, an extension of the CP-tree, that extracts all frequent patterns from a transactional database using the CP-mine algorithm. So a...

International Journal of Engineering Research and Technology, 2013
Image processing is a vast field today. Recognizing a small part of an image is an important concept in image processing, and the recognized part can be used for various security purposes. Finding one image within a large image, or within a set of images, is also essential in security systems; locating a criminal or a rule breaker becomes possible using such a system. This is achieved through several steps: capturing the image, converting the image to black and white, segmentation, matching, and identification. The second step, converting a colour image to black and white, is very important: only with a proper conversion can we obtain optimal output. In this paper I use a newly proposed thresholding technique for better conversion, and the correlation coefficient for image matching. One of the big problems with computer images is image matching, i.e., matching the pixels of one image to those of another image, so here I have used this correlat...
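The pipeline outlined above (binarize the image, then match patches by correlation coefficient) can be sketched as follows; the mean-intensity threshold stands in for the proposed thresholding technique, and the images are synthetic placeholders:

```python
import numpy as np

def to_black_and_white(gray, threshold=None):
    """Binarize a grayscale image; if no threshold is given, use the mean
    intensity (a simple stand-in for the proposed thresholding)."""
    t = gray.mean() if threshold is None else threshold
    return (gray >= t).astype(np.uint8)

def correlation_coefficient(a, b):
    """Pearson correlation between two equally sized patches;
    values near 1 indicate a match."""
    a, b = a.astype(float).ravel(), b.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0

face = np.random.randint(0, 256, (32, 32))
scene_patch = face.copy()                    # identical patch: correlation == 1.0
print(correlation_coefficient(to_black_and_white(face),
                              to_black_and_white(scene_patch)))
```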

Modification in 'KNN' Clustering Algorithm for Distributed Data
Clustering has become an increasingly important task in modern application domains such as marketing and purchasing assistance, multimedia, and molecular biology. The goal of clustering is to decompose or partition a data set into groups such that both the intra-group similarity and the inter-group dissimilarity are maximized. In many applications, the data that needs to be clustered is far larger than what can be processed at a single site. Further, the data to be clustered may be inherently distributed. The increasing demand to scale to these massive data sets, which are inherently distributed over networks with limited bandwidth and computational resources, has led to methods for parallel and distributed data clustering. In this thesis, we present a cohesive framework for cluster identification and outlier detection for distributed data. The core idea is to generate independent local models and combine them at a central server to obtain global clusters. ...

Frequent pattern mining is a heavily researched area in the field of data mining with a wide range of applications. Finding frequent patterns (or itemsets) plays an essential role in data mining, and efficient algorithms to discover them are central to data mining research. A number of works have been published that present new algorithms or improvements to existing algorithms to solve this problem efficiently. Apriori was the first algorithm proposed in this field. Over time, improvements on Apriori have been developed that compress the large database into compact tree data structures such as the FP-tree, CAN-tree, and CP-tree. These algorithms are partition-based and use a divide-and-conquer method that decomposes the mining task into a smaller set of tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. In this paper I propose a novel tree structure, an extension of the CP-tree...
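To illustrate the database-compression idea behind FP-tree-like structures (this is a generic prefix tree, not the proposed CP-tree extension, and unlike the CP-tree it keeps only frequent items), a small sketch:

```python
from collections import Counter

class Node:
    """One node of a prefix (FP-style) tree: item, count, children by item."""
    def __init__(self, item=None):
        self.item, self.count, self.children = item, 0, {}

def build_prefix_tree(transactions, min_support):
    """Compress the database: order each transaction by global item frequency
    and insert it as a path, sharing prefixes between transactions."""
    freq = Counter(i for t in transactions for i in t)
    root = Node()
    for t in transactions:
        items = sorted((i for i in t if freq[i] >= min_support),
                       key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root

def dump(node, depth=0):
    for child in node.children.values():
        print("  " * depth + f"{child.item}:{child.count}")
        dump(child, depth + 1)

db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c", "d"]]
dump(build_prefix_tree(db, min_support=2))
```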
ArXiv, 2016
Traditional algorithms do not meet the latest multiple requirements for objects simultaneously. Density-based methods can detect arbitrarily shaped clusters, where clusters are defined as dense regions separated by low-density regions. In this paper, we present a new clustering algorithm that enhances the density-based algorithm DBSCAN. It provides an automatic parameter generation strategy that creates clusters with different densities, recognizes noise, and generates arbitrarily shaped clusters. A kd-tree is used to increase memory efficiency. Experimental results show that the proposed algorithm is capable of handling complex objects with good memory efficiency and accuracy.
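For comparison, the baseline that the paper enhances can be run with a kd-tree index for its region queries using scikit-learn; the eps and min_samples values below are hand-picked placeholders, whereas the proposed method generates its parameters automatically:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two dense blobs plus uniform background noise.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (200, 2)),
               rng.normal(4, 0.3, (200, 2)),
               rng.uniform(-2, 6, (50, 2))])

# Baseline DBSCAN with a kd-tree index answering the neighborhood queries.
labels = DBSCAN(eps=0.3, min_samples=5, algorithm='kd_tree').fit_predict(X)
print("clusters:", len(set(labels)) - (1 if -1 in labels else 0),
      "noise points:", int(np.sum(labels == -1)))
```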
A Critical Review on Security Issues in Cloud Computing
Recognizing Face Image Based on Gabor and DCT Feature Extraction using SVM
SKIT Research Journal