Abstract: Access to Web data has become an integral part of many applications and services. In the past, such data has usually been accessed through human-tailored HTML interfaces. Nowadays, rich client interfaces in desktop applications or, increasingly, in browser-based clients ease data access and allow more complex client processing based on XML or RDF data retrieved through Web service interfaces. Convenient specifications of the data processing on the client and flexible, expressive service interfaces for data access ...
XML is now a dominant standard for storing and exchanging information. With its increasing use in areas such as data warehousing and e-commerce, there is a rapidly growing need for rule-based technology to support reactive functionality on XML repositories. Event-condition-action (ECA) rules automatically perform actions in response to events and are a natural facility to support such functionality. In this paper, we study ECA rules in the context of XML data. We define a simple language for specifying ECA rules on XML repositories. The language is illustrated by means of some examples, and its syntax and semantics are then specified more formally. We then investigate methods for analysing and optimising these ECA rules, a task which has added complexity in this XML setting compared with conventional active databases.
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, 2010
Discovery of alternative clusterings is an important method for exploring complex datasets. It provides the capability for the user to view clustering behaviour from different perspectives and thus explore new hypotheses. However, current algorithms for alternative clustering have focused mainly on linear scenarios and may not perform as desired for datasets containing clusters with non-linear shapes. Our goal in this paper is to address this challenge of non-linearity. In particular, we propose a novel algorithm to uncover an alternative clustering that is distinctively different from an existing, reference clustering. Our technique is information-theoretic and aims to ensure alternative clustering quality by maximizing the mutual information between clustering labels and data observations, whilst at the same time ensuring alternative clustering distinctiveness by minimizing the information sharing between the two clusterings. We perform experiments to assess our method against a large range of alternative clustering algorithms in the literature. We show that our technique's performance is generally better for non-linear scenarios and, furthermore, is highly competitive even for simpler, linear scenarios.
Papers by James Bailey