International Journal of Parallel, Emergent and Distributed Systems, 2015
In many systems, due to the lack of an adequate positioning capability or the need for energy saving, it is infeasible to track the location of a mobile device as it is moving. Its trajectory, however, may be reconstructed from the real-time fingerprint data obtained by the sensors built into the device. For this purpose, we investigate a regularization framework that aims to maximize the localization accuracy by taking into account the spatiotemporal properties of the fingerprint space in relation to the location space. The viability of this framework is demonstrated in an evaluation using real-world datasets, which shows its potential to outperform conventional approaches to location fingerprinting.
Online social networking has become ubiquitous. For a social storage system to keep pace with increasing amounts of user data and activities, a natural solution is to deploy more servers. An important design problem then is how to partition the data across the servers so that server efficiency and load balancing can both be maximized. Although data partitioning is well studied in the literature of distributed data systems, social data storage presents a unique challenge because of the social locality in data access: we need to factor in not only how actively users read and write their own data but also how often socially connected users read one another's data. We investigate the socially aware data partitioning problem by modeling it as a multi-objective optimization problem and exploring the applicability of evolutionary algorithms to achieve highly efficient and well-balanced data partitions. In particular, we propose a solution framework that is closer to optimal than existing techniques, which is substantiated in our evaluation study.
2008 Proceedings of 17th International Conference on Computer Communications and Networks, 2008
CAN is a well-known DHT technique for content-based P2P networks, where each node is assigned a zone in a virtual coordinate space to store the index of the data hashed into this zone. The dimension of this space is usually lower than the data dimension, which gives rise to the problem of dimension mismatch. This problem is widely addressed in the context of data retrieval that follows the traditional request/response model. However, little has been done for the publish/subscribe model, which is the focus of our paper. We show that dimension mismatch in CAN-based publish/subscribe applications poses new challenges. We furthermore investigate how a random projection approach can help reduce the negative effects of dimension mismatch. Our theoretical findings are complemented by a simulation-based evaluation.
2007 International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2007), 2007
Subscription covering detection is useful for improving the performance of any publish/subscribe system. However, an exact solution to querying coverings among a large set of subscriptions in high dimensions is computationally too expensive to be practicable. Therefore, we are interested in an approximate approach. We focus on spherical subscriptions and propose a solution based on random projections. The complexities of our solution are substantially better than those of the exact approach. The proposed solution can potentially find exact coverings with a success probability that asymptotically approaches 100%.
2009 Proceedings of 18th International Conference on Computer Communications and Networks, 2009
Target tracking techniques usually assume that the target is within the detection range of one or more sensor nodes at known locations, so that the target's location can be determined based on these locations. Instead, we are interested in networks where most sensor nodes do not know their location; hence, it is most likely that the target is many hops away from the closest location-known sensors. For such networks, we propose a target localization system that can estimate the target location based on hop-count information only. We evaluate the performance of this system via simulation.
2006 First International Conference on Communications and Electronics, 2006
We consider the problem of estimating the geographic locations of nodes in a wireless sensor network where most sensors are without an effective self-positioning functionality. A solution to this localization problem is proposed, which uses Support Vector Machines (SVM) and connectivity information only. We investigate two versions of this solution, each employing a different multi-class SVM strategy. They are shown to perform well in various aspects such as localization error, processing efficiency, and effectiveness in addressing the border issue.
IEEE Transactions on Parallel and Distributed Systems, 2008
We consider the problem of estimating the geographic locations of nodes in a wireless sensor network where most sensors are without an effective self-positioning functionality. We propose LSVM, a novel solution with the following merits. First, LSVM localizes the network based on mere connectivity information (i.e., hop counts only) and, therefore, is simple and does not require specialized ranging hardware or assisting mobile devices as in most existing techniques. Second, LSVM is based on Support Vector Machine (SVM) learning. Although SVM is a classification method, we show its applicability to the localization problem and prove that the localization error can be upper-bounded by any small threshold given an appropriate training data size. Third, LSVM addresses the border and coverage-hole problems effectively. Last but not least, LSVM offers fast localization in a distributed manner with efficient use of processing and communication resources. We also propose a modified version of mass-spring optimization to further improve the location estimation in LSVM. The promising performance of LSVM is exhibited by our simulation study.
We propose a P2P search solution, called EZSearch, that enables efficient multidimensional search for remotely located contents that best match the search criteria. EZSearch is a hierarchical approach; it organizes the network into a hierarchy in a way fundamentally different from existing search techniques. EZSearch is based on Zigzag, a P2P overlay architecture known for its scalability and robustness under network growth and dynamics. The indexing architecture of EZSearch is built on top of the Zigzag hierarchy, which allows both k-nearest-neighbor and range queries to be answered with low search overhead and a worst-case search time logarithmic in the network size. The indices are fairly distributed over a small number of nodes at a modest cost for index storage and update. The results of our performance study of EZSearch are encouraging.
Location fingerprinting is an approach to GPS-free localization. This approach requires a prior training set of fingerprints sampled at known locations, against which future fingerprints are compared to determine their locations. For good accuracy, the training set should be large enough to appropriately cover the area. However, in practice, a quality training set is not easy to obtain, and as such recent studies have resorted to utilizing fingerprints that are available without location information; these are called unlabeled fingerprints. This chapter presents several ways one can use regularization to learn from unlabeled fingerprints. Regularization is a mathematical framework to learn a function from data by enforcing regularizers to improve generalizability. The following scenarios are discussed: (1) how the training set can be enriched with unlabeled fingerprints, (2) how a trajectory of a moving device can be computed given its sequential fingerprints, labeled or unlabeled, ...
Papers by Đức Trần