2009
Networks of sensors are used in many different fields, from industrial applications to surveillance applications. A common feature of these applications is the necessity of a monitoring infrastructure that analyzes a large number of data streams and outputs values that satisfy certain constraints.
2008 IEEE 24th International Conference on Data Engineering, 2008
In this paper we present algorithms for building and maintaining efficient collection trees that provide the conduit to disseminate data required for processing monitoring queries in a wireless sensor network. While prior techniques base their operation on the assumption that the sensor nodes that collect data relevant to a specified query need to include their measurements in the query result at every query epoch, in many event monitoring applications such an assumption is not valid. We introduce and formalize the notion of event monitoring queries and demonstrate that they can capture a large class of monitoring applications. We then show techniques which, using a small set of intuitive statistics, can compute collection trees that minimize important resources such as the number of messages exchanged among the nodes or the overall energy consumption. Our experiments demonstrate that our techniques can organize the data collection process while utilizing significantly lower resources than prior approaches.
2007
Sensor networks are widely used in many applications for collecting information from the physical environment. In these applications, it is usually necessary to track the relationships between sensor data readings within a time window to detect events of interest. However, it is difficult to detect such events by using the common aggregate or selection queries. We address the problem of processing window self-join in order to detect events of interest. Self-joins are useful in tracking correlations between different sensor readings, which can indicate an event of interest. We propose the Two-Phase Self-Join (TPSJ) scheme to efficiently evaluate self-join queries for event detection in sensor networks. Our TPSJ scheme takes advantage of the properties of the events and carries out data filtering during in-network processing. We discuss TPSJ execution with one window and we extend it for continuous event monitoring. Our experimental evaluation results indicate that the TPSJ scheme is effective in reducing the amount of radio transmissions during event detection.
Proceedings of the VLDB Endowment, 2010
Prediction is emerging as an essential ingredient for real-time monitoring, planning and decision support applications such as intrusion detection, e-commerce pricing and automated resource management. This paper presents a system that efficiently supports continuous prediction queries (CPQs) over streaming data using seamlessly integrated probabilistic models. Specifically, we describe how to execute and optimize CPQs using discrete (Dynamic) Bayesian Networks as the underlying predictive model. Our primary contribution is a novel cost-based optimization framework that employs materialization, sharing, and model-specific optimization techniques to enable highly efficient point- and range-based CPQ execution. Furthermore, we support efficient execution of top-k and threshold-based high probability queries. We characterize the behavior of our system and demonstrate significant performance gains using a prototype implementation operating on real-world network intrusion data and deployed as part of a real-time software-performance monitoring system.
2003
Sensor networks represent a non-traditional source of information, as readings generated by sensors flow continuously, leading to an infinite stream of data. Traditional DBMSs, which are based on an exact and detailed representation of information, are not suitable in this context, as all the information carried by a data stream cannot be stored within a bounded storage space. Thus, compressing data (possibly losing less relevant information) and storing their compressed representation, rather than the original one, becomes mandatory. This approach aims to store as much of the information carried by the stream as possible, but makes it infeasible to provide exact answers to queries on the stream content. However, exact answers to queries are often not necessary, as approximate ones usually suffice to get useful reports on the world monitored by the sensors. In this paper we propose a technique for providing fast approximate answers to aggregate queries on sensor data streams. Our proposal is based on a hierarchical summarization of the data stream embedded into a flexible indexing structure, which permits us to both access and update compressed data efficiently. The compressed representation of data is updated continuously, as new sensor readings arrive. When the available storage space is not enough to store new data, some space is released by progressively compressing the "oldest" stored data, so that recent information (which is usually the most relevant to retrieve) is represented in more detail than older information.
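The space-release idea in the abstract above (compress the oldest data first so that recent readings keep more detail) can be illustrated with a minimal sketch. This is not the paper's hierarchical index: the class `AgingSummary`, its bucket layout, integer timestamps, and the uniform-spread estimate inside a bucket are all assumptions made for illustration.

```python
from collections import deque


class AgingSummary:
    """Bounded summary of a stream (hypothetical sketch).

    Each bucket is [start, end, total] over integer timestamps.
    When the bucket budget is exceeded, the two oldest buckets are
    merged, so old data becomes coarser while recent data stays exact.
    """

    def __init__(self, max_buckets):
        if max_buckets < 1:
            raise ValueError("max_buckets must be at least 1")
        self.max_buckets = max_buckets
        self.buckets = deque()  # oldest bucket on the left

    def append(self, timestamp, value):
        self.buckets.append([timestamp, timestamp, value])
        if len(self.buckets) > self.max_buckets:
            # Release space: fold the two oldest buckets into one.
            first = self.buckets.popleft()
            second = self.buckets.popleft()
            merged = [first[0], second[1], first[2] + second[2]]
            self.buckets.appendleft(merged)

    def approx_sum(self, t_from, t_to):
        """Estimate SUM over [t_from, t_to].

        Assumes values are spread uniformly inside each bucket, so the
        answer is exact only when the range aligns with bucket bounds.
        """
        total = 0.0
        for start, end, bucket_total in self.buckets:
            lo, hi = max(start, t_from), min(end, t_to)
            if lo > hi:
                continue  # bucket does not intersect the query range
            span = end - start + 1
            total += bucket_total * (hi - lo + 1) / span
        return total


summary = AgingSummary(max_buckets=3)
for t in range(1, 7):
    summary.append(t, t * 10)
print(summary.approx_sum(5, 6))  # recent range, answered exactly
print(summary.approx_sum(1, 2))  # old range, answered approximately
```

In this sketch a query over recent timestamps hits single-reading buckets and is exact, while a query over old timestamps falls inside a merged bucket and is only an estimate, which mirrors the trade-off the abstract describes.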
Real-Time Systems, 2006
One approach to improving the reliability of query results based on error-prone sensors is to use redundant sensors. However, this approach is expensive; moreover, some sensors may malfunction and their readings need to be discarded. In this paper, we propose a statistical approach to decide which sensors to use to answer a query. In particular, we propose to solve the problem with the aid of the continuous probabilistic query (CPQ), which was originally used to manage uncertain data and is associated with a probabilistic guarantee on the query result. Based on the historical data values from the sensors, the query type, and the requirement on the query, we present methods to select an appropriate set of sensors and provide reliable answers for aggregate queries. Simulation experiments demonstrate that our algorithm provides accurate and robust query results.
World Wide Web, 2011
A sliding-window k-NN query (k-NN/w query) continuously monitors incoming data stream objects within a sliding window to identify the k closest objects to a query. It enables effective filtering of data objects streaming in at high rates from potentially distributed sources, and offers means to control the rate of object insertions into result streams. Therefore k-NN/w processing systems may be regarded as one of the prospective solutions for the information overload problem in applications that require processing of structured data in real time, such as the Sensor Web. Existing k-NN/w processing systems are mainly centralized and cannot cope with multiple data streams, where data sources are scattered over the Internet. In this paper, we propose a solution for distributed continuous k-NN/w processing of structured data from distributed streams. We define a k-NN/w processing model for such a setting, and design a distributed k-NN/w processing system on top of the Content-Addressable Network (CAN) overlay. An extensive evaluation using both real and synthetic data sets demonstrates the feasibility of the proposed solution because it balances the load among the peers, while the messaging overhead within the P2P network remains reasonable. Moreover, our results clearly show the solution is scalable for an increasing number of queries and peers.
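To make the k-NN/w query semantics concrete, here is a minimal centralized sketch. It does not model the distributed CAN-based system from the abstract; the class name `SlidingWindowKNN`, the count-based window, and Euclidean distance are assumptions chosen for illustration.

```python
import heapq
import math
from collections import deque


class SlidingWindowKNN:
    """Centralized k-NN over a count-based sliding window (hypothetical sketch)."""

    def __init__(self, query_point, k, window_size):
        self.query_point = query_point
        self.k = k
        # deque with maxlen evicts the oldest object automatically.
        self.window = deque(maxlen=window_size)

    def insert(self, obj):
        """Add a new stream object and return the current k nearest ones."""
        self.window.append(obj)
        return self.result()

    def result(self):
        # Recomputed from scratch on each call; fine for a small window.
        return heapq.nsmallest(
            self.k,
            self.window,
            key=lambda p: math.dist(p, self.query_point),
        )


knn = SlidingWindowKNN(query_point=(0.0, 0.0), k=2, window_size=3)
for point in [(5.0, 5.0), (1.0, 0.0), (0.0, 2.0), (3.0, 0.0)]:
    nearest = knn.insert(point)
print(nearest)  # the two objects closest to the query among the last three
```

Recomputing the result on every insertion keeps the sketch short; a system handling high-rate streams would maintain the result incrementally instead.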
2005
Although advances in sensor node hardware, which comprises sensors, embedded processors, and communication components, have made the large-scale deployment of sensor networks a reality, sensor networks are quite limited in resources. A sensor network includes numerous battery-operated wireless sensor nodes, which have limited processing and storage capabilities, and a few base stations, which are powered PCs that are possibly connected to the Internet. Furthermore, typical sensor network applications, ranging from monitoring to military and health care, generate various complex continuous queries. The querying of sensor networks requires a rich set of abstractions, techniques, and heuristics.
Information Fusion, 2008
Data processing applications for sensor streams have to deal with multiple continuous data streams with inputs arriving at highly variable and unpredictable rates from various sources. These applications perform various operations (e.g., filter, aggregate, join) on incoming data streams in real time according to predefined queries or rules. Since the data rate and data distribution fluctuate over time, an appropriate join tree for processing join queries must be adaptively maintained in response to dynamic changes to prevent rapid degradation of the system performance. In this paper, we address the problem of finding an optimal join tree that maximizes throughput for sliding-window-based multi-join queries over continuous data streams and prove its NP-hardness. We present a dynamic programming algorithm, OptDP, which produces the optimal tree but runs in time exponential in the number of input streams. We then present a polynomial-time greedy algorithm, XGreedyJoin. We tested these algorithms in ARES, an adaptively re-optimizing engine for stream queries, which we developed by extending Jess.
2006 International Conference on Collaborative Computing: Networking, Applications and Worksharing, 2006
Sensors are envisioned to be at the center of distributed collaborative computing services involving time-critical decision support. Sensors are small devices with limited communication and computational capabilities that collect data on their neighboring physical world and send the data periodically to server machines. Sensors form a collaborative network with these servers, where the sensors gather information and the servers perform various operations (e.g., filter, aggregate, join) on the information streams in real time according to predefined queries or rules. Sensor data streams are continuous, unending and have highly volatile characteristics. As a result, traditional database systems are inappropriate for handling queries for sensor streams, and several stream data management systems have been proposed in the literature. In this paper we focus on a special type of query, namely join queries, since join is the most expensive query operator. Here, we address the problem of finding an optimal join tree that maximizes throughput for sliding-window-based multi-join queries over continuous sensor data streams. We present a polynomial-time algorithm, Fodp, and three variants of Fodp. Our experiments in ARES show that for almost all instances, trees from Fodp and its variants perform close to the optimal trees from our exponential-time algorithm OptDP [15], and significantly better than existing XJoin-based heuristic algorithms.
2002
In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives in multiple, continuous, rapid, time-varying data streams. In addition to reviewing past work relevant to data stream systems and current projects in the area, the paper explores topics in stream query languages, new requirements and challenges in query processing, and algorithmic issues. [...] systems, view management, sequence databases, and others. Although much of this work clearly has applications to data stream processing, we hope to show in this paper that there are many new problems to address in realizing a complete DSMS.
2009
Top-k monitoring queries are useful in many wireless sensor network applications. There is a well-known approach called FILA for processing this kind of query. Its basic idea is to install a filter at each sensor node to avoid unnecessary transmissions of sensor readings. FILA uses two algorithms to ensure the correctness and efficiency of the approach: a query reevaluation algorithm and a filter setting algorithm. In this paper, we propose improvements to each of these two algorithms. First, we propose a decentralized query reevaluation algorithm to reduce the communication cost of sending probe messages. Second, we propose a linear regression-based filter setting algorithm to improve the effectiveness of filters. Experimental results on real data traces show that our proposed improvements further enhance the performance of FILA in terms of network lifetime.
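The basic filtering idea described above (a per-node filter that suppresses unnecessary transmissions) can be sketched as follows. This is not FILA's query reevaluation or filter setting algorithm; the interval-shaped filter, the class `FilteredSensor`, and the method names are assumptions for illustration only.

```python
class FilteredSensor:
    """Sensor node with a local filter interval [low, high) (hypothetical sketch)."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.sent = 0  # number of readings actually transmitted

    def observe(self, reading):
        """Return the reading if it must be reported, otherwise None."""
        if self.low <= reading < self.high:
            return None  # still inside the filter: stay silent
        self.sent += 1
        return reading

    def set_filter(self, low, high):
        # In a real deployment the base station would choose new bounds
        # after reevaluating the query; here they are simply assigned.
        self.low = low
        self.high = high


node = FilteredSensor(low=20, high=30)
readings = [21, 25, 29, 31, 22]
reports = [r for r in readings if node.observe(r) is not None]
print(reports, node.sent)  # only the out-of-filter reading is transmitted
```

The sketch shows why filter quality matters: the tighter a filter tracks where the readings actually stay, the fewer messages a node has to send.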
The VLDB Journal, 2008
Sensor networks consist of battery-powered wireless devices that are required to operate unattended for long periods of time. Thus, reducing energy drain is of utmost importance when designing algorithms and applications for such networks. Aggregate queries are often used by monitoring applications to assess the status of the network and detect abnormal behavior. Since radio transmission often constitutes the biggest factor of energy drain in a node, in this paper we propose novel algorithms for the evaluation of bandwidth-constrained queries over sensor networks. The goal of our techniques is, given a target bandwidth utilization factor, to program the sensor nodes in a way that seeks to maximize the accuracy of the produced query results at the monitoring node, while always providing strong error guarantees to the monitoring application. This is a distinct difference of our framework from previous techniques that only provide probabilistic guarantees on the accuracy of the query result. Our algorithms are equally applicable when the nodes have ample power resources, but bandwidth consumption needs to be minimized, for instance in densely distributed networks, to ensure proper operation of the nodes. Our experiments with real sensor data show that bandwidth-constrained queries can substantially reduce the number of messages in the network while providing very tight error bounds on the query result.
Distributed and Parallel Databases, 2010
Hardware for sensor nodes that combine physical sensors, actuators, embedded processors, and communication components has advanced significantly over the last decade, and made the large-scale deployment of such sensors a reality. Applications range from monitoring applications such as inventory maintenance and health care to military applications. In this paper, we evaluate the design of a query layer for sensor networks. The query layer accepts queries in a declarative language; these are then optimized to generate efficient query execution plans with in-network processing, which can significantly reduce resource requirements. We examine the main architectural components of such a query layer, concentrating on in-network aggregation, interaction of in-network aggregation with the wireless routing protocol, and distributed query processing. Initial simulation experiments with the ns-2 network simulator show the tradeoffs of our system.
Signal Processing, 2007
2009
One of the most important input providers for data stream management systems (DSMSs) is a sensor network. Such a network can have query functionality offered as a sensor network query processor (SNQP). Some of the data stream operators can then be executed in the DSMS as well as in the SNQP. This paper addresses the problem of finding the optimal assignment of these operators to the two systems. It presents first steps, such as moving operators and modifying the epoch duration. The primary goal is to prolong the lifetime of the sensor network. A QoS-based goal function is introduced, and the optimization process is explained. It has been implemented with Borealis and TinyDB. Some preliminary results from the ongoing evaluation are given.
Advances in Database Systems
ACM Transactions on Database Systems, 2004
Continuous queries often require significant run-time state over arbitrary data streams. However, streams may exhibit certain data or arrival patterns, or constraints, that can be detected and exploited to reduce state considerably without compromising correctness. Rather than requiring constraints to be satisfied precisely, which can be unrealistic in a data streams environment, we introduce k-constraints, where k is an adherence parameter specifying how closely a stream adheres to the constraint. (Smaller k's are closer to strict adherence and offer better memory reduction.) We present a query processing architecture, called k-Mon, that detects useful k-constraints automatically and exploits the constraints to reduce run-time state for a wide range of continuous queries. Experimental results showed dramatic state reduction, while only modest computational overhead was incurred for our constraint monitoring and query execution algorithms.
2006
Networks of sensors arise naturally in many different fields, from industrial applications (e.g., monitoring of environmental parameters in a chemical plant) to surveillance applications (e.g., sensors that detect the presence of intruders on private property). The common feature of these applications is the necessity of a monitoring infrastructure that analyzes continuous supplies of data streams and outputs the values that satisfy certain constraints.
Proceedings of the 2006 ACM SIGMOD international conference on Management of data - SIGMOD '06, 2006
Recent data stream systems such as TelegraphCQ have employed the well-known property of duality between data and queries. In these systems, query processing methods are classified into two dual categories, data-initiative and query-initiative, depending on whether query processing is initiated by selecting a data element or a query. Although the duality property has been widely recognized, previous data stream systems do not take full advantage of this property since they use the two dual methods independently: data-initiative methods only for continuous queries and query-initiative methods only for ad-hoc queries. We contend that continuous query processing can be better optimized by adopting an approach that integrates the two dual methods. Our primary contribution is based on the observation that spatial join is a powerful tool for achieving this objective. In this paper, we first present a new viewpoint of transforming the continuous query processing problem to a multi-dimensional spatial join problem. We then present a continuous query processing algorithm based on spatial join, which we name Spatial Join CQ. This algorithm processes continuous queries by finding the pairs of overlapping regions from a set of data elements and a set of queries, both defined as regions in the multi-dimensional space. The algorithm achieves the advantages of the two dual methods simultaneously. Experimental results show that the proposed algorithm outperforms earlier algorithms by up to 36 times for simple selection continuous queries and by up to 7 times for sliding window join queries.
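The viewpoint above, in which both data elements and queries are regions and matching means finding overlapping pairs, can be illustrated with a naive sketch. It uses a plain nested loop rather than the Spatial Join CQ algorithm, and the function names and the region encoding (one closed interval per dimension) are assumptions for illustration.

```python
def overlaps(region_a, region_b):
    """True if two regions intersect in every dimension.

    A region is a list of closed intervals (low, high), one per dimension.
    A single data point is the degenerate region with low == high.
    """
    return all(
        a_low <= b_high and b_low <= a_high
        for (a_low, a_high), (b_low, b_high) in zip(region_a, region_b)
    )


def region_join(data_regions, query_regions):
    """Return every (data_id, query_id) pair whose regions overlap."""
    return [
        (data_id, query_id)
        for data_id, data_region in data_regions.items()
        for query_id, query_region in query_regions.items()
        if overlaps(data_region, query_region)
    ]


# Two-dimensional example: each data element is a point (x, y),
# each query is a range selection on x and y.
data = {"d1": [(5, 5), (10, 10)], "d2": [(50, 50), (1, 1)]}
queries = {
    "q1": [(0, 10), (0, 20)],
    "q2": [(40, 60), (0, 5)],
    "q3": [(0, 100), (10, 10)],
}
print(region_join(data, queries))
```

The nested loop costs one overlap test per data-query pair; a spatial join algorithm would typically use an index or partitioning to avoid testing pairs that cannot overlap.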
Distributed and Parallel Databases, 2010
Wireless sensor networks are powerful, distributed, self-organizing systems used for event and environmental monitoring. In-network query processors like TinyDB offer user-friendly, SQL-like application development. Due to sensor nodes' resource limitations, monolithic approaches often support only a restricted number of operators. For this reason, complex processing is typically outsourced to the base station as a part of processing tasks. Nevertheless, previous work has shown that complete or partial in-network processing can be more efficient than the base station approach. In this paper, we introduce AnduIN, a system for developing, deploying, and running complex in-network processing tasks. In particular, we present the query planning and execution strategies used in AnduIN, which combines sensor-local in-network processing and a data stream engine. Query planning employs a multi-dimensional cost model that takes energy consumption into account and decides autonomously which query parts will be processed within the sensor network and which parts will be processed at the central instance. Keywords: sensor networks • data streams • power awareness • distributed computation • in-network query processing • query planning. This work was in part supported by the BMBF under grant 03WKBD2B and by the Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Lion-2) and 08/SRC/I1407 (Clique).