Papers by Ghanshyam Thakur

International Journal of Computer and Communication Technology, 2015
Clustering is an important technique for classifying large datasets and is widely applied in many areas, including the analysis of social networking sites, aircraft accidents, and company performance. Communication and advertising through social networking sites have recently become a popular and interactive strategy among users. This research attempts a large-scale measurement study and analysis of the effectiveness of communication strategies, examining usage information and people's interest in promoting and advertising their brands on social networking sites. The significance of the proposed work is determined with the help of various surveys of people who use these sites. A more specific pre-processing method is then applied to clean the data, and clustering is performed to generate patterns that will serve as heuristics for designing more effective social networking sites.
International Journal of Computer Applications, Oct 31, 2011
In this study we investigate the significance of textual documents, which researchers now commonly recognize as important for better management, smart navigation, well-organized filtering, and finding results. The challenging part is to extract meaning and to determine the "best" mining rule. This study proposes to refine the mining rule from a textual data set using a graph-based approach.
National Academy Science Letters-India, Feb 17, 2017
Social networking services (SNS) are one of the most promising directions for web applications. They let people communicate with one another and exchange messages, including multimedia. In this paper, we propose a practical approach that applies data mining techniques to SNS. Three parameters are considered: user, time, and image. The approach generates strong patterns among these parameters and is an efficient technique for identifying user behavior in an SNS environment. The most frequently used patterns can be identified from these three parameters, which makes advertisement recommendation possible in SNS.
ROAD SIDE UNIT PREVENTION MECHANISM FOR DDoS ATTACK USING ARTIFICIAL BEE COLONY

Framework for finding maximal association rules in mobile web service environment using soft set
International Journal of Data Science, 2018
Electronic commerce is very popular nowadays. It is a fast and convenient way to transfer information and communicate with people. E-commerce uses various web services to perform specific tasks. When a user accesses web services, the accesses are stored sequentially in a database as web service sequences. Association rules are used to correlate different web services for knowledge prediction. In this paper, we design a framework for generating maximal association rules from accessed web service sequences using soft sets. Soft sets use binary values in their standard representation. The framework converts web service sequences into a Boolean-valued information system using the concept of coexisting attributes in a sequence. We define maximal association rules between attribute sets, and we also define maximal support and confidence using soft sets. Experimental results show that the proposed soft-set-based framework produces rules identical to those of other maximal association rule and rough-set-based approaches.
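The conversion of service sequences into a Boolean-valued table, followed by support and confidence counting, can be sketched as below. This is a minimal illustration of the standard (not maximal) support and confidence measures, which are the starting point for the paper's definitions; the function names and the sample service names are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set


def to_boolean_table(sequences: List[List[str]]) -> List[Dict[str, int]]:
    """Turn each accessed-service sequence into a 0/1 row (1 = service occurs)."""
    services = sorted({s for seq in sequences for s in seq})
    return [{s: int(s in seq) for s in services} for seq in sequences]


def support(table: List[Dict[str, int]], items: Set[str]) -> int:
    """Count the rows in which every service in `items` has value 1."""
    return sum(all(row[i] == 1 for i in items) for row in table)


def confidence(table: List[Dict[str, int]],
               antecedent: Set[str], consequent: Set[str]) -> float:
    """Standard confidence: support(antecedent and consequent) / support(antecedent)."""
    base = support(table, antecedent)
    if base == 0:
        return 0.0
    return support(table, antecedent | consequent) / base


# Hypothetical web service sequences, one per user session.
sequences = [
    ["login", "search", "pay"],
    ["login", "search"],
    ["search", "pay"],
    ["login", "pay"],
]
table = to_boolean_table(sequences)
print(support(table, {"login", "search"}))
print(confidence(table, {"login"}, {"search"}))
```

The maximal variants described in the abstract restrict which rows count toward support; that restriction is not reproduced here.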
Hesitant Fuzzy Linguistic Term Set Based Document Classification
2013 International Conference on Communication Systems and Network Technologies, 2013
This paper presents hesitant fuzzy information about data sets. The Hesitant Fuzzy Linguistic Term Set (HFLTS) is based on the fuzzy linguistic approach and serves as a basis for increasing the flexibility with which linguistic information is elicited. The experimental classification results are analyzed using the SAS 9.0 analytical software. The results show that the proposed approach performs best.
Numerical Result Analysis of Document Classification for Large Data Sets
2013 International Conference on Communication Systems and Network Technologies, 2013
This paper presents a new approach, Numerical Result Analysis of Classification Based Unique Visitor for Social Networking, which is based on unique visitors. Numerical result analysis is applied to document classification results. The steps used are document collection, text pre-processing, feature selection, indexing, the classification process, and results analysis. Twenty Newsgroups data sets [4] are used in the experiments. The experimental results are evaluated using the MATLAB 7.14 numerical computing software and show that the proposed approach performs better.

Pattern and Data Analysis in Healthcare Settings
Decision making is a crucial task in every business. Profit pattern mining addresses it by narrowing the gap between statistics-based pattern generation and value-based decision making. The task becomes very difficult, however, when it depends on large, imprecise, and vague data, which has become common in recent years. Combining soft computing with data mining is a novel way to address this difficulty. General approaches to association rule mining focus on inducing rules from correlations among data and on finding frequently occurring patterns. The main technique uses support and confidence to generate rules, which is no longer adequate as a measure of interest. Because data have become more multifaceted, solutions are needed that handle such problems and use new measures such as profit and significance. In this chapter, the authors apply the concept of pattern mining with vague set theory, Genetic algorithm theory and r...

Journal of Information Science Theory and Practice, 2014
Mobile devices are the most important equipment for accessing various kinds of services. These services are accessed using wireless signals, the same used for mobile calls. Today mobile services provide a fast and excellent way to access all kinds of information via mobile phones. Mobile service providers are interested in knowing the access behavior patterns of users from different locations at different times. In this paper, we introduce an associated tree for analyzing user behavior patterns while moving from one location to another. We use four parameters: user, location, dwell time, and services. These parameters provide stronger frequent access patterns by matching joins. The generated patterns are valuable for improving web services, recommending new services, and predicting useful services for individuals or groups of users. In addition, an experimental evaluation has been conducted on simulated data. Finally, the performance of the proposed approach has been measured in terms of efficiency and scalability. The proposed approach produces excellent results.

Large Document Set Clustering: an Integrated Approach
Data Mining and Knowledge Engineering, 2013
Document clustering is an important mining task used by different people for different purposes. It is generally used to find similar documents in a large collection. The document set may be a collection of blogs, website access patterns, or transaction files. Document clustering can reveal similar habits among different people, which can play a large role in future trend analysis and decision making. Most clustering methods use distance calculations as the similarity measure. They scan the documents multiple times to determine classes and then build the clusters, so when the documents are large these methods take more time. We propose an advanced environment for document clustering in which each document is scanned only once and immediately assigned to the appropriate cluster. Experiments are conducted on the 20 Newsgroups datasets using MATLAB. Experimental results show the effectiveness of the proposed environment for large document sets.
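The abstract does not spell out how a document is assigned after a single scan. One common way to get that behavior is the single-pass (leader) scheme sketched below, which is an assumption for illustration and not the authors' method: each document is compared with the running term-count centroid of every existing cluster and either joins the most similar one or starts a new cluster. The cosine measure, the `threshold` value, and all names are hypothetical.

```python
import math
from collections import Counter
from typing import List


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def single_pass_cluster(docs: List[str], threshold: float = 0.5) -> List[int]:
    """Assign each document to a cluster while reading it exactly once."""
    centroids: List[Counter] = []  # summed term counts per cluster
    labels: List[int] = []
    for text in docs:
        vec = Counter(text.lower().split())
        best, best_sim = -1, threshold
        for k, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = k, sim
        if best == -1:
            # No cluster is similar enough: this document starts a new one.
            centroids.append(Counter(vec))
            labels.append(len(centroids) - 1)
        else:
            # Fold the document into the chosen cluster's centroid.
            centroids[best].update(vec)
            labels.append(best)
    return labels


docs = [
    "cheap flights to paris",
    "cheap flights to rome",
    "python clustering tutorial",
    "clustering tutorial in python",
]
print(single_pass_cluster(docs))
```

The result of a single-pass scheme depends on document order and on the threshold, which is the usual price for avoiding repeated scans.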

Document Clustering Using Message Passing between Data Points
2013 International Conference on Communication Systems and Network Technologies, 2013
This paper presents an efficient approach to document classification based on a new Framework for Document Classification and Knowledge Extraction (FDCKE). FDCKE integrates the phases of document classification: document collection from heterogeneous sources, text pre-processing, feature selection, indexing, the classification process, results analysis, and performance measures. The proposed FDCKE is a unified interface that can be used for classification, association, and clustering. Twenty Newsgroups data sets [23] are used in the experiments. The experimental results are evaluated using SAS software and show that the proposed approach performs better.

International Journal of Computer Applications, 2013
Clusters are useful for identifying required objects in huge datasets. Many clustering methods exist. Single linkage clustering is a hierarchical agglomerative method that merges objects into clusters based on minimum distance. In this paper we perform an experiment in two-dimensional space, where multiple objects are combined into clusters using Euclidean distance. MATLAB is used to calculate the distance between objects and to construct the distance matrix. After the single linkage clustering is complete, a dendrogram is prepared. This dendrogram is similar to a minimum spanning tree because it is built from the minimum distances between objects. The resulting clusters and dendrogram are useful for extracting knowledge from huge data.
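The procedure described above (distance matrix, then repeated merging of the two closest clusters) can be sketched in a few lines. The paper used MATLAB; the version below is an illustrative Python sketch with hypothetical function names and sample points, and it uses a naive search that is only suitable for small inputs.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Merge = Tuple[List[int], List[int], float]


def distance_matrix(points: List[Point]) -> List[List[float]]:
    """Pairwise Euclidean distances between 2-D points."""
    return [[math.dist(p, q) for q in points] for p in points]


def single_linkage(points: List[Point]) -> List[Merge]:
    """Return the merges (cluster_a, cluster_b, distance) in merge order."""
    dist = distance_matrix(points)
    clusters: List[List[int]] = [[i] for i in range(len(points))]
    merges: List[Merge] = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance is the closest pair of members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5)]
for left, right, d in single_linkage(points):
    print(left, right, round(d, 3))
```

The merge list is the information a dendrogram plots: each entry is one join and its height. The merge distances coincide with the edge weights of a minimum spanning tree over the same points, which is the relationship the abstract points to.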

Vedic knowledge states that yajñas have effects on the environment and people. Yajñas of various kinds are elaborated in the Vedic literature. Among Soma yajñas, the Apthoryama Yajña is the chief and the largest. From 17-26 April 2007, an Apthoryama yajña was organized in Bangalore and evaluation of its effects on the environment, society and human beings was encouraged. Our study used a Random Event Generator (REG) placed 12 m from the site of the yajña to evaluate its effects on the consciousness-field. Significant increases in REG values were found on several occasions on all days, compared to control days with no yajña. Particularly significant changes occurred on the following days: the second day during Vedic chanting (p<0.001); the third day on four occasions, Agni Prasthapana (p<0.05), Soma Kriya (p<0.05), Shainchitti (p<0.001) and Pravargya (p<0.01); the fifth day during Pravargya (p<0.05); the sixth day during Subramanyam Ahwan (p<0.001) and Garun Chayana (p<0.05); the eighth day during Garun Cayana (p<0.05), Agni udbhava (p<0.05) and Subramanyam Ahwan (p<0.05); and the ninth day during Soma yajña (p<0.05) and the final Puja (p<0.05).

IETE Technical Review, 2014
Among the broad assortment of machine learning approaches, deep learning has recently attracted attention, particularly in the domain of user behavior analysis. Studying user behavior from the unstructured tweets shared on social media is an interesting yet challenging task. A social platform such as Twitter yields access to the unprompted views of a wide range of users on particular events such as elections. These views help governments and corporations reshape strategies, assess the areas where better measures are needed, and monitor public opinion. With the advent of a general election in India, the largest democracy, people tend to articulate their views and issues. Tweets related to India's 2019 general election are used as the data corpus for the study. Multi-class classification with a novel deep learning approach is implemented to analyze user opinion. We use nine classes, which represent the larger national issues on the election agenda. Moreover, a comparative analysis is presented between traditional approaches (Naïve Bayes, SVM, decision tree, and logistic regression) and the employed deep learning approach. Experimental results reveal that the proposed model can reach up to 98.70% accuracy on multi-class prediction. The results help the government and businesses learn about grave issues, offering a chance to revise strategic policy and design welfare schemes.
International Journal of Computer Applications, 2014
Today's world is a social world. Recommending resources in social networks is very common, and various methods are available for recommending friends, music, videos, and items. Users see the web as a place where they can find individuals or groups with the same or similar interests, or even find new friends, and the recommendation systems used in social networks often suggest these resources to them. We want to apply Markov models and their variations to address this problem. It is generally found that higher-order Markov models display high predictive accuracy.
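How a Markov model of a given order can drive such a recommendation is sketched below: transition counts are collected from past sessions, and the most frequent successor of the last `order` resources is suggested. This is a generic illustration, not the authors' implementation; the function names and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List, Optional, Sequence, Tuple

Counts = DefaultDict[Tuple[str, ...], Counter]


def fit_markov(sessions: List[Sequence[str]], order: int = 1) -> Counts:
    """Count which resource follows each run of `order` consecutive resources."""
    counts: Counts = defaultdict(Counter)
    for session in sessions:
        for i in range(order, len(session)):
            state = tuple(session[i - order:i])
            counts[state][session[i]] += 1
    return counts


def predict_next(counts: Counts, history: Sequence[str],
                 order: int = 1) -> Optional[str]:
    """Recommend the most frequent successor of the last `order` resources."""
    state = tuple(history[-order:])
    if len(state) < order or state not in counts:
        return None  # this context was never observed
    return counts[state].most_common(1)[0][0]


# Hypothetical browsing sessions over resource types.
sessions = [
    ["music", "video", "friend"],
    ["music", "video", "friend"],
    ["item", "video", "music"],
]
first_order = fit_markov(sessions, order=1)
second_order = fit_markov(sessions, order=2)
print(predict_next(first_order, ["item", "video"], order=1))
print(predict_next(second_order, ["item", "video"], order=2))
```

In this toy data the first-order model only sees `video` and suggests the globally most common successor, while the second-order model also uses the preceding `item` and suggests a different resource. The extra context is what gives higher-order models their accuracy, at the cost of more states and more unseen contexts.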
Hesitant Distance Similarity Measures for Document Clustering
2011 World Congress on Information and Communication Technologies, 2011
This paper presents a new approach, Hesitant Distance Similarity Measures for Document Clustering. The proposed approach is based on fuzzy hesitant sets. We use fifty similarity measures, f1 to f50. The steps used are document collection, text pre-processing, feature selection, indexing, the clustering process, and results analysis. Twenty News group data sets [27]
International Journal of Computer and Communication Technology, 2014
This paper presents clustering-based document classification and analysis of data. The proposed approach combines unsupervised and supervised document classification. The steps used are document collection, text pre-processing, feature selection, indexing, the clustering process, and results analysis. Twenty Newsgroups data sets [20] are used in the experiments. The experimental results are analyzed using the SAS 9.0 analytical software and show that the proposed approach performs better.
International Journal of Intelligent Engineering and Systems, 2017

A new framework for sex prediction based on text categorization
In this paper we address sex prediction based on a binary matrix model that we developed (08). We built a classifier from training sets that produces better results than other algorithms such as centroid (09), naïve Bayes, and k-NN. Sex classification uses the trained model to classify cases of unknown sex. A considerable part of the work was data collection, for which the data set was gathered through door-to-door visits. The matrix model was developed on the basis of these data. The model works very accurately, and its results matched sonography reports. The resulting expert system is intended to determine the sex during pregnancy without the help of sonography. The intelligent model is built from training data sets and used to predict the sex. This intelligent system doesn't have any disadvantage but

An Efficient Algorithm Distance Calculation of Page Sequences Using Dynamic Programming
Web data is growing rapidly, but the information residing on the web is inconsistent and heterogeneous because it combines many different types of information. This heterogeneity makes it difficult to extract relevant information from the web. Web usage mining extracts relevant information from the huge amount of data available in web logs, which record intrinsic information about the web pages accessed. Because web log data is so large, it is better to deal with a small set of data at a time instead of handling the complete data. We then need to find the distance between two user sessions using a distance or similarity function suited to this kind of task. Clustering users establishes groups of users who exhibit similar browsing patterns. In this paper we propose an efficient algorithm for calculating the similarity between two user sessions based on sequence alignment that uses...
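The abstract is cut off before it describes the algorithm, so the sketch below is only a generic illustration of sequence-alignment similarity computed by dynamic programming (a Needleman-Wunsch style global alignment), not the authors' method. The scoring values (`match`, `mismatch`, `gap`), the normalization, and the page names are all assumptions.

```python
from typing import List, Sequence


def alignment_score(a: Sequence[str], b: Sequence[str],
                    match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Best global alignment score of two page sequences (dynamic programming)."""
    rows, cols = len(a) + 1, len(b) + 1
    score: List[List[int]] = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # a[:i] aligned entirely against gaps
    for j in range(1, cols):
        score[0][j] = j * gap  # b[:j] aligned entirely against gaps
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


def session_similarity(a: Sequence[str], b: Sequence[str], match: int = 2) -> float:
    """Scale the score to [0, 1] by the best score the longer session could reach."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return max(0.0, alignment_score(a, b, match=match) / (match * longest))


# Hypothetical user sessions taken from a web log.
s1 = ["home", "products", "cart", "checkout"]
s2 = ["home", "products", "checkout"]
print(alignment_score(s1, s2))
print(session_similarity(s1, s2))
```

Unlike a set-based overlap measure, the alignment score respects the order in which pages were visited, and `1 - session_similarity(a, b)` can serve as a distance when clustering sessions.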