Papers by Christos Makris
The paper is concerned with the effective and efficient processing of temporal selection queries in Video Databases and, more generally, in Temporal Database Management Systems (TDBMS). Based on both a general spatio-temporal retrieval framework ([3]) and recent versions of internal-external Priority Search Trees, we present an algorithm for the problem, optimal in time and space, that answers certain temporal content queries invoking video

Journal of Systems and Software
Service Oriented Computing and its most famous implementation technology, Web Services (WS), are becoming an important enabler of networked business models. Discovery mechanisms are a critical factor in the overall utility of Web Services. So far, discovery mechanisms based on the UDDI standard have relied on many centralized and area-specific directories, which poses problems such as performance bottlenecks and poor fault tolerance. In this context, decentralized approaches based on Peer-to-Peer overlay networks have been proposed by many researchers as a solution. In this paper, we propose a new structured P2P overlay network infrastructure designed for Web Service Discovery. We present theoretical analysis, backed up by experimental results, showing that the proposed solution outperforms popular decentralized infrastructures for web discovery: Chord (and some of its successors), BATON (and its successor), and Skip Graphs.
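For intuition about the baselines in this comparison, the lookup behavior of Chord, the best-known of them, can be sketched in a few lines: each node keeps a finger table of exponentially spaced successors, and greedy routing over those fingers resolves a key in O(log n) hops. The sketch below is standard Chord with a hand-placed toy ring, not the proposed infrastructure.

```python
# Toy sketch of Chord-style greedy finger routing (not the paper's system).
# Node positions on the ring are fixed by hand for illustration.

M = 8  # identifier bits; the ring has 2**M positions

def in_interval(x, a, b):
    # True iff x lies in the half-open ring interval (a, b].
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps around zero

def build_ring(node_ids):
    ring = sorted(node_ids)
    def successor(x):
        for n in ring:
            if n >= x:
                return n
        return ring[0]  # wrap around
    fingers = {}
    for ident in ring:
        # finger[i] = successor(ident + 2**i), as in the Chord protocol
        fingers[ident] = [successor((ident + 2 ** i) % 2 ** M) for i in range(M)]
    return fingers

def lookup(fingers, start, key_id):
    # Greedily jump to the closest preceding finger of key_id.
    hops, current = 0, start
    while True:
        succ = fingers[current][0]
        if in_interval(key_id, current, succ):
            return succ, hops + 1  # succ is responsible for key_id
        nxt = succ
        for f in reversed(fingers[current]):
            if f != key_id and in_interval(f, current, key_id):
                nxt = f
                break
        current, hops = nxt, hops + 1

fingers = build_ring([0, 32, 64, 96, 128, 160, 192, 224])
owner, hops = lookup(fingers, start=0, key_id=100)
print(owner, hops)  # node 128 owns key 100; hop count stays O(log n)
```

Structures like BATON and Skip Graphs replace the hashed ring with ordered tree or skip-list topologies, which is what enables the range queries the hashed Chord ring cannot support.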
ASSESSING THE MICROECONOMIC FACET OF ASSOCIATION RULES VIA AN EFFICIENT WEIGHTING SCHEME
People, Knowledge and Technology: What have we Learnt so Far? - Proceedings of the First iKMS International Conference on Knowledge Management, 2004
Towards Intelligent Information Retrieval Engines: A Multi-agent Approach
Lecture Notes in Computer Science, 2000
Abstract: The amount of information available in on-line shops and catalogues is rapidly increasing. A single on-line catalogue may contain thousands of products, posing an increasing need for fast and efficient methods for information filtering and retrieval. Intelligent agent ...
Reducing Redundant Information in Search Results Employing Approximation Algorithms
Lecture Notes in Computer Science, 2014

Improved text annotation with Wikipedia entities
Proceedings of the 28th Annual ACM Symposium on Applied Computing - SAC '13, 2013
ABSTRACT Text annotation is the procedure of initially identifying, in a segment of text, a set of words dominant in meaning and then attaching extra information to them (usually drawn from a concept ontology, implemented as a catalog) that expresses their conceptual content in the current context. Attaching additional semantic information and structure helps represent, in a machine-interpretable way, the topic of the text and is a fundamental preprocessing step for many Information Retrieval tasks such as indexing, clustering, classification, text summarization, and cross-referencing content on web pages, posts, tweets, etc. In this paper, we deal with the automatic annotation of text documents with entities of Wikipedia, the largest online knowledge base; a process commonly known as Wikification. As in previous approaches, the cross-referencing of words in the text to Wikipedia articles is based on local compatibility between the text around the term and textual information embedded in the article. The main contribution of this paper is a set of disambiguation techniques that enhance previously published approaches by employing both the WordNet lexical database and Wikipedia articles' PageRank scores in the disambiguation process. The experimental evaluation shows that exploiting these additional semantic information sources leads to more accurate text annotation.
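The shape of such a disambiguation step can be sketched as follows: each candidate Wikipedia article is scored by mixing local context compatibility with a global article prior (here a stand-in PageRank value). All data, scores, and the mixing weight `alpha` are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of mention disambiguation combining local context
# overlap with a global PageRank-style prior. Data and weights are made up.

def context_overlap(context_words, article_words):
    # Fraction of context words that also appear in the article text.
    ctx, art = set(context_words), set(article_words)
    return len(ctx & art) / len(ctx) if ctx else 0.0

def disambiguate(context_words, candidates, alpha=0.7):
    # candidates: {article_title: (article_words, pagerank_score)}
    # alpha is an assumed weighting between local and global evidence.
    def score(item):
        title, (words, pr) = item
        return alpha * context_overlap(context_words, words) + (1 - alpha) * pr
    return max(candidates.items(), key=score)[0]

mention_context = ["the", "river", "bank", "flooded", "after", "rain"]
candidates = {
    "Bank (geography)": (["river", "bank", "water", "flooded"], 0.2),
    "Bank (finance)":   (["money", "loan", "deposit", "bank"], 0.9),
}
result = disambiguate(mention_context, candidates)
print(result)  # local context outweighs the finance article's higher prior
```

The point of the mixed score is visible in the example: the finance article has the higher global prior, but the river-related context pulls the decision to the geographic sense.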
Reciprocal Rank Using Web Page Popularity
IFIP Advances in Information and Communication Technology, 2014

Extracting Knowledge from Web Search Engine Using Wikipedia
Communications in Computer and Information Science, 2013
ABSTRACT Nowadays, search engines are the dominant web tool for finding information on the web. However, web search engines usually return web page references in a single global ranking, making it difficult for users to browse the different topics captured in the result set. Recently, meta-search engine systems have appeared that discover knowledge in these web search results, giving the user the possibility to browse different topics contained in the result set. In this paper, we focus on the problem of determining the different thematic groups present in the results that existing web search engines provide. We propose a novel system that exploits semantic entities of Wikipedia for grouping the result set into topic groups according to the various meanings of the provided query. The proposed method applies a number of semantic annotation techniques using knowledge bases such as WordNet and Wikipedia in order to perceive the different senses of each query term, and finally annotates the extracted topics using information derived from the clusters, which are then presented to the end user.
International Journal on Artificial Intelligence Tools, 2015
Nowadays, people frequently use search engines in order to find the information they need on the Web. In particular, Web search is a basic tool used by millions of researchers in their everyday work. A very popular indexing engine for life sciences and biomedical research is PubMed, a free database accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. Present search engines usually return search results in a single global ranking, making it difficult for users to browse different topics or subtopics.

On Topic Categorization of PubMed Query Results
IFIP Advances in Information and Communication Technology, 2012
ABSTRACT Nowadays, people frequently use search engines in order to find the information they need on the Web. In particular, Web search is a basic tool used by millions of researchers in their everyday work. A very popular indexing engine for life sciences and biomedical research is PubMed, a free database accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. Present search engines usually return search results in a single global ranking, making it difficult for users to browse the different topics or subtopics they query. Because results belonging to different topics are mixed together, average users spend a lot of time finding the Web pages best matching their query. In this paper, we propose a novel system to address this problem. We present and evaluate a methodology that exploits semantic text clustering techniques in order to group biomedical document collections into homogeneous topics. To provide more accurate clustering results, we utilize various biomedical ontologies, such as MeSH and Gene Ontology. Finally, we embed the proposed methodology in an online system that post-processes the PubMed online database in order to present users with the retrieved results organized into well-formed topics.
Lecture Notes in Computer Science, 2002
A molecular sequence "model" is a (structured) sequence of distinct or identical strings separated by gaps; here we design and analyze efficient algorithms for variations of the "Model Matching" and "Model Identification" problems.
Lecture Notes in Computer Science, 2006
We present a purely functional implementation of search trees that requires O(log n) time for search and update operations and supports the join of two trees in worst case constant time. Hence, we solve an open problem posed by Kaplan and Tarjan as to whether it is possible to envisage a data structure supporting simultaneously the join operation in O(1) time and the search and update operations in O(log n) time.
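The O(1)-join construction itself is intricate, but the purely functional setting it lives in can be illustrated with the standard path-copying technique: an update copies only the O(log n) nodes on the search path and shares everything else, so every earlier version of the tree remains intact and queryable. The sketch below is generic background (an unbalanced persistent BST), not the paper's structure.

```python
# Path-copying insertion in a persistent BST (unbalanced, for brevity).
# Each insert returns a new root that shares untouched subtrees with the
# old version. Generic background, not the paper's O(1)-join structure.

class Node:
    __slots__ = ("key", "left", "right")
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(root, key):
    # Copy only the nodes on the search path; share the rest.
    if root is None:
        return Node(key)
    if key < root.key:
        return Node(root.key, insert(root.left, key), root.right)
    if key > root.key:
        return Node(root.key, root.left, insert(root.right, key))
    return root  # key already present: reuse the old version as-is

def contains(root, key):
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False

v0 = None
v1 = insert(v0, 5)
v2 = insert(v1, 3)
v3 = insert(v2, 8)
# Every version remains queryable after later updates:
print(contains(v1, 3), contains(v3, 3))  # False True
```

The difficulty the paper addresses is doing this while also joining two such trees in worst-case O(1) time, which naive path copying does not provide.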
International Conference on Information Technology: Coding and Computing (ITCC'05) - Volume II, 2005
Web services are becoming an important enabler of the Semantic Web. Besides the need for a rich description mechanism, Web Service information should be made available in an accessible way for machine processing. In this paper, we propose a new P2P-based approach to Web Service discovery. Peers that store Web Service information, such as data item descriptions, are efficiently located using NIPPERS, a scalable and robust data indexing structure for Peer-to-Peer data networks. We present a theoretical analysis which shows that the communication cost of the query and update operations scales double-logarithmically with the number of NIPPERS nodes. Furthermore, we show that the network is robust with respect to failures, fulfilling quality-of-service requirements for web services.
In this paper we present a simple and efficient implementation of a min-max priority queue, the reflected min-max priority queue. The main merits of our construction are threefold. First, the space utilization of reflected min-max heaps is much better than the naive solution of putting two heaps back-to-back. Second, the methods applied in this structure can easily be used to transform ordinary priority queues into min-max priority queues. Third, when considering only the setting of min-max priority queues, we support merging in constant worst-case time, a clear improvement over the best worst-case bounds achieved by Høyer.
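The "two heaps back-to-back" baseline that the abstract contrasts against can be sketched concretely: keep the same items in both a min-heap and a max-heap, and lazily discard an item's stale twin when it surfaces in the other heap. This roughly doubles the space, which is the overhead the reflected structure avoids. The class below is that naive baseline, not the paper's data structure.

```python
import heapq
from collections import Counter

# Naive min-max priority queue: a min-heap and a max-heap over the same
# items, with lazy deletion of stale twins. Space is roughly double that
# of the reflected min-max heap the paper proposes.

class NaiveMinMaxPQ:
    def __init__(self):
        self._min, self._max = [], []
        self._stale_min = Counter()  # values to skip when popping the min-heap
        self._stale_max = Counter()  # values to skip when popping the max-heap

    def push(self, x):
        heapq.heappush(self._min, x)
        heapq.heappush(self._max, -x)  # negate to simulate a max-heap

    def pop_min(self):
        while self._stale_min[self._min[0]] > 0:
            self._stale_min[self._min[0]] -= 1
            heapq.heappop(self._min)
        x = heapq.heappop(self._min)
        self._stale_max[x] += 1  # its twin in the max-heap is now stale
        return x

    def pop_max(self):
        while self._stale_max[-self._max[0]] > 0:
            self._stale_max[-self._max[0]] -= 1
            heapq.heappop(self._max)
        x = -heapq.heappop(self._max)
        self._stale_min[x] += 1  # its twin in the min-heap is now stale
        return x

pq = NaiveMinMaxPQ()
for v in [5, 1, 9, 3, 7]:
    pq.push(v)
results = [pq.pop_min(), pq.pop_max(), pq.pop_min()]
print(results)  # [1, 9, 3]
```

Merging two such structures also costs linear time in the worst case, which is the operation the paper improves to constant worst-case time.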
Clustering is a classic problem in machine learning and pattern recognition; however, a few complications arise when we try to transfer proposed solutions to the data stream model. New algorithms have recently been proposed for the basic clustering problem on massive data sets that produce an approximate solution while using memory efficiently, memory being the most critical resource in streaming computation. In this paper, building on these solutions, we present a new model for clustering clickstream data that applies three different phases in the data processing, and we validate it through a set of experiments.
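The memory constraint the abstract refers to can be made concrete with the simplest streaming clusterer: keep only k centroids and their counts, and fold each arriving point into the nearest centroid as a running mean. This is generic online k-means under a fixed memory budget, shown for intuition; it is not the paper's three-phase clickstream model.

```python
import math

# Minimal memory-bounded streaming clustering: state is k centroids plus
# k counters, independent of stream length. Generic online k-means, not
# the paper's three-phase model.

def stream_kmeans(points, k):
    centroids, counts = [], []
    for p in points:
        if len(centroids) < k:
            centroids.append(list(p))  # seed centroids from the first k points
            counts.append(1)
            continue
        # Assign the point to the nearest centroid ...
        i = min(range(k), key=lambda j: math.dist(p, centroids[j]))
        counts[i] += 1
        # ... and shift that centroid toward it by 1/count (running mean).
        for d in range(len(p)):
            centroids[i][d] += (p[d] - centroids[i][d]) / counts[i]
    return centroids, counts

stream = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10), (1, 1)]
cents, counts = stream_kmeans(stream, k=2)
print(counts)  # the two natural groups of the 7 points
```

Approximation algorithms for streaming clustering refine this idea with provable quality guarantees, typically by maintaining a weighted summary (a coreset) instead of bare centroids.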
Data & Knowledge Engineering, 2007
This paper presents a methodology for knowledge acquisition from source code. We use data mining to support semi-automated software maintenance and comprehension and to provide practical insights into system specifics, assuming limited prior familiarity with the systems under study. We propose a methodology and an associated model for extracting information from object-oriented code by applying clustering and association rules
Improving Search Engines’ Document Ranking Employing Semantics and an Inference Network
Lecture Notes in Business Information Processing, 2014
Identifying Personality-Based Communities in Social Networks
Lecture Notes in Computer Science, 2014
ABSTRACT In this paper we present a novel algorithm for forming communities in a graph representing social relations as they emerge from the use of services like Twitter. The main idea centers on the careful use of features to characterize the members of a community, and on the hypothesis that well-formed communities are those whose participating members exhibit diversity in their features.
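One natural way to make the diversity hypothesis operational is to score a candidate community by the Shannon entropy of a categorical member feature: a community whose members all share one value scores 0, while an evenly mixed community scores highest. The feature values below are illustrative placeholders, not the paper's actual features or scoring function.

```python
import math
from collections import Counter

# Hypothetical diversity score for a community: Shannon entropy over a
# categorical member feature. Feature values are illustrative only.

def diversity(feature_values):
    total = len(feature_values)
    counts = Counter(feature_values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

mixed_community = ["extrovert", "introvert", "analyst", "diplomat"]
uniform_community = ["extrovert", "extrovert", "extrovert", "extrovert"]
print(diversity(mixed_community), diversity(uniform_community))  # 2.0 0.0
```

A community-formation algorithm built on this hypothesis would then prefer assignments that raise this score, rather than only maximizing edge density as classic modularity-based methods do.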

Predicting Information Diffusion Patterns in Twitter
IFIP Advances in Information and Communication Technology, 2014
ABSTRACT The prediction of social media information propagation is a problem that has attracted a lot of interest in recent years, especially because such predictions can drive effective marketing campaigns. Existing approaches have shown that information cascades in social media have small depth and large width. We validate these results for tree-shaped tweet cascades created by the ReTweet action. The main contribution of our work is a methodology for predicting the information diffusion that will occur given a user's tweet. We base our prediction on the linguistic features of the tweet as well as the profile of the user who created the initial tweet. Our results show that we can predict the Tweet-Pattern with good accuracy. Moreover, we show that influential networks within the Twitter graph tend to use different Tweet-Patterns.
Modeling ReTweet Diffusion Using Emotional Content
IFIP Advances in Information and Communication Technology, 2014
ABSTRACT In this paper we present a prediction model for forecasting the depth and width of ReTweeting using data mining techniques. The proposed model utilizes analyzers of tweet emotional content based on the Ekman emotional model, as well as the behavior of users on Twitter, and then predicts the category of ReTweet diffusion. The model was trained and validated with real data crawled from Twitter. Its aim is to estimate the spread of a new post that could be retweeted by users in a particular network. The classification model is intended as a tool for sponsors and marketing professionals to identify the tweets that spread the most in the Twitter network.
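The first step such a model needs is mapping a tweet to the six Ekman emotion categories. A minimal sketch of that step is shown below, using a tiny hand-made lexicon; the paper's actual emotional-content analyzers, lexicon, and classifier are not reproduced here, and the classifier that consumes these features is omitted entirely.

```python
# Hypothetical sketch of Ekman-category feature extraction for a tweet.
# The lexicon is a tiny illustrative stand-in, not the paper's analyzer.

EKMAN_LEXICON = {
    "joy":      {"great", "happy", "love", "win"},
    "sadness":  {"sad", "loss", "cry"},
    "anger":    {"angry", "hate", "outrage"},
    "fear":     {"afraid", "scared", "threat"},
    "surprise": {"wow", "unexpected", "shocking"},
    "disgust":  {"gross", "disgusting"},
}

def emotion_features(tweet):
    # Count, per Ekman category, how many lexicon words the tweet contains.
    words = set(tweet.lower().split())
    return {emotion: len(words & lexicon)
            for emotion, lexicon in EKMAN_LEXICON.items()}

feats = emotion_features("Wow what a great win love this team")
print(feats["joy"], feats["surprise"])  # 3 1
```

A downstream classifier would combine this six-dimensional vector with user-behavior features to predict the ReTweet diffusion category.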