2003
Sensor networks represent a non-traditional source of information, as readings generated by sensors flow continuously, leading to an infinite stream of data. Traditional DBMSs, which are based on an exact and detailed representation of information, are not suitable in this context, as all the information carried by a data stream cannot be stored within a bounded storage space. Thus, compressing data (possibly losing less relevant information) and storing the compressed representation, rather than the original one, becomes mandatory. This approach aims to store as much of the information carried by the stream as possible, but makes it infeasible to provide exact answers to queries on the stream content. However, exact answers to queries are often not necessary, as approximate ones usually suffice to get useful reports on the world monitored by the sensors. In this paper we propose a technique for providing fast approximate answers to aggregate queries on sensor data streams. Our proposal is based on a hierarchical summarization of the data stream embedded into a flexible indexing structure, which permits us to both access and update compressed data efficiently. The compressed representation of data is updated continuously, as new sensor readings arrive. When the available storage space is not enough to store new data, some space is released by progressively compressing the "oldest" stored data, so that recent information (which is usually the most relevant to retrieve) is represented in more detail than older information.
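The aging idea in this abstract (keep recent readings in detail, compress the oldest ones when space runs out) can be illustrated with a minimal sketch. This is not the paper's hierarchical index: the class `AgingSummary`, its `max_buckets` parameter, and the uniform-spread assumption used to answer range-sum queries are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    start: int    # first timestamp covered (inclusive)
    end: int      # last timestamp covered (inclusive)
    total: float  # sum of the readings in [start, end]


class AgingSummary:
    """Bounded-space stream summary (hypothetical sketch, not the paper's structure)."""

    def __init__(self, max_buckets):
        if max_buckets < 1:
            raise ValueError("max_buckets must be at least 1")
        self.max_buckets = max_buckets
        self.buckets = []   # ordered from oldest to newest
        self.next_t = 0     # timestamp assigned to the next reading

    def append(self, value):
        # Each new reading starts as its own full-detail bucket.
        self.buckets.append(Bucket(self.next_t, self.next_t, float(value)))
        self.next_t += 1
        # Out of space: merge the two oldest buckets into one coarser bucket.
        if len(self.buckets) > self.max_buckets:
            a = self.buckets.pop(0)
            b = self.buckets.pop(0)
            self.buckets.insert(0, Bucket(a.start, b.end, a.total + b.total))

    def approx_sum(self, lo, hi):
        # Assumption: readings are spread uniformly inside a merged bucket.
        estimate = 0.0
        for b in self.buckets:
            overlap = min(hi, b.end) - max(lo, b.start) + 1
            if overlap > 0:
                estimate += b.total * overlap / (b.end - b.start + 1)
        return estimate
```

With `max_buckets=4` and the readings 1 through 6, the three oldest readings end up merged into one bucket, so a sum over the two most recent timestamps is exact while a sum over timestamp 0 alone is only an estimate.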
On the Move to Meaningful …, 2004
The problem of representing and querying sensor-network data poses new research challenges, as traditional techniques and architectures used for managing relational and object-oriented databases are not suitable in this context. In this paper we present a Grid-based architecture that supports aggregate query answering on sensor network data, and uses a summarization technique to efficiently accomplish this task. In particular, grid nodes are used both to collect, compress and store sensor readings, and to extract information from stored data. Grid nodes can exchange information with each other, so that the same piece of information can be stored (with a different degree of accuracy) in several nodes. Queries are evaluated by locating the grid nodes containing the needed information, and choosing (among these nodes) the most convenient ones, according to a cost model.
Computer Communications, 2006
Providing efficient data services has been required by many sensor network applications. While most existing work in this area focuses on data aggregation, not much attention has been paid to query aggregation. For many applications, especially ones with high query rates, query aggregation is very important. In this paper, we study a query aggregation-based approach to provide efficient data services. In particular: (1) we propose a multi-layer overlay-based framework consisting of a query manager and access points (nodes), where the former provides the query aggregation plan and the latter executes the plan; (2) we design an effective query aggregation algorithm to reduce the number of duplicate/overlapping queries and save overall energy consumption in the sensor network. We also design protocols to effectively deliver aggregated queries and query results in the sensor network. Our performance evaluations show that by applying our query aggregation algorithm, the overall energy consumption can be significantly reduced and the sensor network lifetime can be prolonged correspondingly.
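To make the notion of reducing duplicate and overlapping queries concrete, here is a minimal sketch. It assumes, purely for illustration, that each query asks for readings over a closed one-dimensional range `(low, high)`; the function name `merge_queries` is hypothetical, and the paper's actual aggregation algorithm and overlay framework are not reproduced here.

```python
def merge_queries(queries):
    """Collapse duplicate or overlapping range queries into fewer queries.

    `queries` is a list of (low, high) tuples with low <= high.
    Hypothetical sketch: a plain interval merge, not the paper's algorithm.
    """
    merged = []
    for low, high in sorted(queries):
        if merged and low <= merged[-1][1]:
            # Overlaps (or duplicates) the previous range: extend it.
            prev_low, prev_high = merged[-1]
            merged[-1] = (prev_low, max(prev_high, high))
        else:
            merged.append((low, high))
    return merged


def split_result(merged_readings, query):
    """Extract one original query's answer from a merged query's readings.

    `merged_readings` maps a position to its reading (assumed layout).
    """
    low, high = query
    return {pos: v for pos, v in merged_readings.items() if low <= pos <= high}
```

The network then executes only the merged ranges, and a query manager (in the abstract's terminology) would use something like `split_result` to hand each original query its own slice of the shared answer.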
2010
Data streams constitute the core of many traditional (e.g. financial) and emerging (e.g. environmental) applications. The sources of streams are ubiquitous in daily life (e.g. web clicks). One feature of these data is their high arrival rate, which places a special constraint on their processing. Despite the exponential growth in the capacity of storage devices, it is very expensive, or even impossible, to store a data stream in its entirety. Consequently, queries are evaluated only on the recent data of the stream, and older data are expired. However, some applications need to query the whole data stream. Therefore, the inability to store a complete stream suggests storing compact representations of its data, called summaries. These structures allow users to query the past without an explosion of the required storage space, to provide historical aggregated information, to perform data mining tasks or to detect anomalous behavior in computer systems. The side effect of using summaries is that queries over historical data may not return exact answers, but only approximate ones. This paper introduces a new approach that offers a trade-off between the accuracy of query results and the time consumed in building summaries.
ArXiv, 2017
With the recent proliferation of sensor data, there is an increasing need for the efficient evaluation of analytical queries over multiple sensor datasets. The magnitude of such datasets makes exact query answering infeasible, leading researchers into the development of approximate query answering approaches. However, existing approximate query answering algorithms are not suited for the efficient processing of queries over sensor data, as they exhibit at least one of the following shortcomings: (a) They do not provide deterministic error guarantees, resorting to weaker probabilistic error guarantees that are in many cases not acceptable, (b) they allow queries only over a single dataset, thus not supporting the multitude of queries over multiple datasets that appear in practice, such as correlation or cross-correlation and (c) they support relational data in general and thus miss speedup opportunities created by the special nature of sensor data, which are not random but follow a t...
Information Systems, 2006
In-network data aggregation has been recently proposed as an effective means to reduce the number of messages exchanged in wireless sensor networks. Nodes of the network form an aggregation tree, in which parent nodes aggregate the values received from their children and propagate the result to their own parents. However, this schema provides little flexibility for the end-user to control the operation of the nodes in a data sensitive manner. For large sensor networks with severe energy constraints, the reduction (in the number of messages exchanged) obtained through the aggregation tree might not be sufficient. In this paper we present new algorithms for obtaining approximate aggregate statistics from large sensor networks. The user specifies the maximum error that he is willing to tolerate and, in turn, our algorithms program the nodes in a way that seeks to minimize the number of messages exchanged in the network, while always guaranteeing that the produced estimate lies within the specified error from the exact answer. A key ingredient to our framework is the notion of the residual mode of operation that is used to eliminate messages from sibling nodes when their cumulative change to the computed aggregate is small. We introduce two new algorithms, based on potential gains, which adaptively redistribute the error thresholds to those nodes that benefit the most and try to minimize the total number of transmitted messages in the network. Our techniques significantly reduce the number of messages, often by a factor of 10 for a modest 2% relative error bound, and consistently outperform previous techniques for computing approximate aggregates, which we have adapted for sensor networks.
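The basic mechanism behind error-bounded aggregation can be sketched as follows. This is deliberately simpler than the abstract's residual mode of operation and potential-gains redistribution: each node just has a fixed threshold and stays silent while its reading remains within that threshold of the last value it sent. The names `FilteredNode` and `simulate` are hypothetical. For a SUM aggregate, the estimate at the base station then differs from the exact answer by at most the sum of the thresholds.

```python
class FilteredNode:
    """Sensor node that suppresses small changes (illustrative sketch only)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.last_sent = None

    def observe(self, value):
        """Return the value if it must be transmitted, otherwise None."""
        if self.last_sent is None or abs(value - self.last_sent) > self.threshold:
            self.last_sent = value
            return value
        return None


def simulate(epochs, thresholds):
    """Run the filters over `epochs` (one list of readings per epoch).

    Returns the per-epoch SUM estimates seen by the base station and the
    total number of messages sent.
    """
    nodes = [FilteredNode(t) for t in thresholds]
    known = [0.0] * len(nodes)  # last value received from each node
    messages = 0
    estimates = []
    for readings in epochs:
        for i, (node, value) in enumerate(zip(nodes, readings)):
            sent = node.observe(value)
            if sent is not None:
                known[i] = sent
                messages += 1
        estimates.append(sum(known))
    return estimates, messages
```

Choosing how to split a user's total error budget across the nodes is the part the abstract addresses with its adaptive algorithms; the sketch above keeps the split fixed.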
2002
We present a framework for stream data processing that incorporates a stream database server as a fundamental component. The server operates as the stream control interface between arrays of distributed data stream sources and end-user clients that access and analyze the streams. The underlying framework provides novel stream management and query processing mechanisms to support the online acquisition, management, storage, non-blocking query, and integration of data streams for distributed multi-sensor networks. In this paper, we define our stream model and stream representation for the stream database, and we describe the functionality and implementation of key components of the stream processing framework, including the query processing interface for source streams, the stream manager, the stream buffer manager, non-blocking query execution, and a new class of join algorithms for joining multiple data streams constrained by a sliding time window. We conduct experiments using real data streams to evaluate the performance of the new algorithms against traditional stream join algorithms. The experiments show significant performance improvements and also demonstrate the flexibility of our system in handling data streams. A multi-sensor network application for the intelligent detection of hazardous materials is presented to illustrate the capabilities of our framework.
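For readers unfamiliar with joins constrained by a sliding time window, the following sketch shows a generic symmetric hash join over two streams. It is a textbook-style baseline and not the new class of join algorithms the abstract describes. The event layout `(timestamp, side, key, payload)`, the side labels `"L"` and `"R"`, and the function name `window_join` are assumptions made for this example; events are assumed to arrive in nondecreasing timestamp order.

```python
from collections import defaultdict, deque


def window_join(events, window):
    """Join two streams on `key` when timestamps differ by at most `window`.

    Generic symmetric hash join (illustrative sketch). Results are produced
    as soon as a matching pair exists, so the join never blocks on either input.
    """
    # One hash table per stream: key -> deque of (timestamp, payload).
    buffers = {"L": defaultdict(deque), "R": defaultdict(deque)}
    output = []
    for ts, side, key, payload in events:
        other = "R" if side == "L" else "L"
        candidates = buffers[other][key]
        # Expire tuples of the other stream that fell out of the window.
        # Simplification: expiry only happens for keys that get probed.
        while candidates and candidates[0][0] < ts - window:
            candidates.popleft()
        for _, other_payload in candidates:
            if side == "L":
                output.append((key, payload, other_payload))
            else:
                output.append((key, other_payload, payload))
        buffers[side][key].append((ts, payload))
    return output
```

Each output tuple is `(key, left_payload, right_payload)`. A production system would also expire buffered tuples for keys that are never probed again, which this sketch omits for brevity.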
Lecture Notes in Computer Science, 2009
In this paper we present algorithms for building and maintaining efficient aggregation trees that provide the conduit to disseminate data required for processing monitoring queries in a wireless sensor network. While prior techniques base their operation on the assumption that the sensor nodes that collect data relevant to a specified query need to include their measurements in the query result at every query epoch, in many event monitoring applications such an assumption is not valid. We introduce and formalize the notion of event monitoring queries and demonstrate that they can capture a large class of monitoring applications. We then show techniques which, using a small set of intuitive statistics, can compute aggregation trees that minimize important resources such as the number of messages exchanged among the nodes or the overall energy consumption. Our experiments demonstrate that our techniques can organize the data aggregation process while utilizing significantly lower resources than prior approaches.
2002
Recent years have witnessed an increasing interest in designing algorithms for querying and analyzing streaming data (i.e., data that is seen only once in a fixed order) with only limited memory. Providing (perhaps approximate) answers to queries over such continuous data streams is a crucial requirement for many application environments; examples include large telecom and IP network installations where performance data from different parts of the network needs to be continuously collected and analyzed.
2002
In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives in multiple, continuous, rapid, time-varying data streams. In addition to reviewing past work relevant to data stream systems and current projects in the area, the paper explores topics in stream query languages, new requirements and challenges in query processing, and algorithmic issues. ... systems, view management, sequence databases, and others. Although much of this work clearly has applications to data stream processing, we hope to show in this paper that there are many new problems to address in realizing a complete DSMS.
2006
We assume a sensor network with data-centric storage, where sensor data is stored within the sensor network and ad hoc queries are disseminated and processed inside the network. In such an environment, there are often similarities among submitted queries. Using current solutions, similar queries may have to go through the same expensive query processing steps thus wasting energy. In this paper, we propose a similarity-aware query processing scheme (SAQP) that materializes previous query results within the sensor network and utilizes these materialized results to answer future similar queries. Through simulation, we demonstrate that our SAQP scheme reduces energy consumption on queries with negligible increase in response time, and without compromising the quality of data.
2006
To process aggregation queries issued through different sensors as access points in sensor networks, existing algorithms handle queries independently and perform in-network aggregation only at the query time. As a result of ad-hoc and independent execution of queries, no partial result is sharable and reusable among the queries. Consequently, scarce sensor network resources can be easily overconsumed, particularly, those sensors commonly accessed by queries. In this paper, we address this issue by examining strategies to maintain Materialized In-Network Views (MINVs) that pre-compute and store commonly used aggregation results in the sensor network. With MINVs, aggregated sensed results for some spatial regions are available and sharable to queries. Thus, the number of sensor accesses is greatly reduced. Through simulations, we validate the effectiveness of proposed strategies.
Proceedings. 16th International Conference on Scientific and Statistical Database Management, 2004.
We present a novel approach to approximate evaluation of standing aggregate queries over streaming data, subject to user-specified error bounds. Our method models the behavior of aggregates as Brownian motions, and adaptively updates the model according to stream characteristics. This approach has two advantages. First, it greatly improves system scalability since we can defer query evaluation as long as the difference between the returned and true aggregate values remains within user-specified bounds. Second, we are able to provide approximate answers during stream interruptions by estimating the rate at which the streams and the aggregate drift during the blackout periods. We also study processor allocation issues in such approximate aggregate evaluation systems. Our experiments show that our model captures the behavior of real-world streams such as sensor data and stock traces with excellent fidelity, and scales very well for large numbers of standing queries.
2005
In many data streaming applications, streams may contain data tuples that are either redundant, repetitive, or that are not "interesting" to any of the standing continuous queries. Processing such tuples may waste system resources without producing useful answers. To the contrary, some other tuples can be categorized as promising. This paper proposes that stream query engines can have the option to execute on promising tuples only and not on all tuples. We propose to maintain intermediate stream summaries and indices that can direct the stream query engine to detect and operate on promising tuples. As an illustration, the proposed intermediate stream summaries are tuned towards capturing promising tuples that (1) maximize the number of output tuples, (2) contribute to producing a faithful representative sample of the output tuples (compared to the output produced when assuming infinite resources), or (3) produce the outlier or deviant results. Experiments are conducted in the context of Nile [24], a prototype stream query processing engine developed at Purdue University.
Statistical and Scientific Database Management, 2005
The widespread use of sensor networks in scientific and engineering applications leads to increased demand on the efficient computation of the collected sensor data. Recent research in sensor and stream data systems adopts the notion of sliding windows to process continuous queries over infinite sensor readings. Ordered processing of input data is essential during query execution for many application
2004
Recent sensor networks research has produced a class of data storage and query processing techniques called Data-Centric Storage that leverages locality-preserving distributed indexes like DIM, DIFS, and GHT to efficiently answer multi-dimensional range and range-aggregate queries. These distributed indexes offer a rich design space of a) logical decompositions of sensor relation schema into indexes, as well as b) physical mappings of these indexes onto sensors. In this poster, we explore this space for energy- ...
Journal of Systems and Software, 2008
A wireless sensor network (WSN) is composed of tens or hundreds of spatially distributed autonomous nodes, called sensors. Sensors are devices used to collect data from the environment related to the detection or measurement of physical phenomena. In fact, a WSN consists of groups of sensors where each group is responsible for providing information about one or more physical phenomena (e.g., a group for collecting temperature data). Sensors are limited in power, computational capacity, and memory. Therefore, a query engine and query operators for processing queries in WSNs should be able to handle resource limitations such as memory and battery life. Adaptability has been explored as an alternative approach when dealing with these conditions. Adaptive query operators (algorithms) can adjust their behavior in response to specific events that take place during data processing. In this paper, we propose an adaptive in-network aggregation operator for query processing in sensor nodes of a WSN, called ADAGA (ADaptive AGgregation Algorithm for sensor networks). ADAGA adapts its behavior according to memory and energy usage by dynamically adjusting data-collection and data-sending time intervals. ADAGA can correctly aggregate data in WSNs with packet replication. Moreover, ADAGA is able to predict non-performed detection values by analyzing collected values. Thus, ADAGA is able to produce results as close as possible to real results (obtained when no resource constraint is faced). The results obtained through experiments prove the efficiency of ADAGA.
2010
Wireless sensor networks (WSN) are composed of several sensors having limited memory, processing power, communication bandwidth, and energy, which cooperate in performing a given task. The use of the database paradigm has emerged in the last few years as a viable solution to manage data in such a context. In this paper we present the MaD-WiSe system, a distributed query processing framework that moves the processing of the query into the network. MaD-WiSe reconsiders various aspects related to database system design and it reinterprets them according to the WSN constraints and requirements. In particular it considers the aspects related to the definition of a query language to formalize the queries, a stream model to manage data acquired by the sensors, a query algebra to define the operators that actually perform the query, and energy efficiency and query optimization strategies for saving energy.
tls.wydd.free.fr
The use of sensor-based applications is expanding in many contexts. Sensors are involved at several scales ranging from the individual (e.g. personal monitoring, smart homes) to regional and even worldwide contexts (e.g. logistics, natural resource monitoring and forecasting). Easy and efficient management of data streams produced by a large number of heterogeneous sensors is a key issue in supporting such applications. Numerous solutions for query processing on data streams have been proposed by the scientific community. Several query processors have been implemented and offer heterogeneous querying capabilities and semantics.