Academia.edu

Distributed Databases

Distributed databases are databases that store data across multiple physical locations, which can be on different servers or networks. They enable data to be accessed and managed concurrently by multiple users, ensuring data consistency and availability while allowing for scalability and fault tolerance in data management.
Domain Name System (DNS) is the system for the mapping between easily memorizable host names and their IP addresses. Due to its criticality, security extensions to DNS have been proposed in an Internet Engineering Task Force (IETF)... more
Most of the existing nonlinear data analysis and modelling techniques including neural networks become computationally prohibitively expensive when the available data set exceeds the capacity of the computer main memory due to the slow... more
Due to the dramatic increase of data volumes in different applications, it is becoming infeasible to keep these data in one centralized machine. It is becoming more and more natural to deal with distributed databases and networks. That is... more
Peer-to-peer file-sharing networks are currently receiving much attention as a means of sharing and distributing information. However, as recent experience shows, the anonymous, open nature of these networks offers an almost ideal... more
The healthcare industry is undergoing a significant transformation in data management, spurred by the integration of artificial intelligence (AI) and cloud technologies in data warehousing. This paper investigates the transformative... more
The insurance and risk management sectors are experiencing a significant transformation driven by the need for more sophisticated data analysis and predictive modeling. This paper explores the critical role of artificial intelligence... more
The healthcare industry is at a pivotal moment in terms of data management. Legacy data warehouses, which once served the sector’s needs, are now proving inefficient in an era of rapid technological advancements. This article... more
Scientific workflow design is usually complex and demands integration of numerous resources. Geographical distribution and semantic heterogeneity of resources add to this complexity. The cost effectiveness of such workflow design thus... more
Apache Hadoop, written in Java, is an open-source framework that supports processing of large data sets. It can store a large volume of structured, semistructured, and unstructured data in a distributed file system and process them in... more
In big data storage architecture, data reaches users through multiple organizational data structures. The big data revolution provides significant improvements to the data storage architecture. New tools such as Hadoop, an open-source... more
My heartfelt appreciation cannot be conveyed in words to all the people who, through their support and encouragement, helped me carry out the research reported in this dissertation, and to all those who have made my stay in France,... more
Self-organizing maps (SOM) are a type of neural network widely known for their ability to preserve the topology of the input data, making it possible to map an n-dimensional space onto another of two... more
Interoperability plays an important role in a variety of applications. One such application is Peer Data Management Systems, where autonomous data sources (peers) interact with each other based on semantic mappings between their schemas. The... more
To join two subqueries involving data from multiple sites, data has to be transmitted from one site to another. When transmitting data within a network, the factors involved in distributed databases are communication cost and... more
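To make the communication-cost point concrete, here is a minimal sketch comparing two ways of joining relations held at different sites. It is an illustration under stated assumptions, not the method of the paper above: the relations, the site names, and the use of a semi-join (a standard textbook technique) are all assumptions, and cost is counted crudely as the number of items crossing the network.

```python
# Hypothetical relations held at two sites; names and data are illustrative only.
SITE_A_ORDERS = [  # (order_id, customer_id), stored at site A
    (1, "c1"), (2, "c2"), (3, "c1"), (4, "c9"),
]
SITE_B_CUSTOMERS = [  # (customer_id, name), stored at site B
    ("c1", "Ada"), ("c2", "Bo"), ("c3", "Cy"), ("c4", "Di"),
    ("c5", "Ed"), ("c6", "Flo"), ("c7", "Gus"), ("c8", "Hal"),
]


def ship_whole_relation(orders, customers):
    """Naive plan: ship every customer row from B to A, then join at A."""
    shipped = len(customers)  # items crossing the network
    by_id = dict(customers)
    result = [(o, c, by_id[c]) for o, c in orders if c in by_id]
    return result, shipped


def semi_join(orders, customers):
    """Semi-join plan: ship distinct join keys A -> B, then matching rows B -> A."""
    keys = {c for _, c in orders}  # projection on the join attribute
    matching = [row for row in customers if row[0] in keys]
    shipped = len(keys) + len(matching)  # keys out, matching rows back
    by_id = dict(matching)
    result = [(o, c, by_id[c]) for o, c in orders if c in by_id]
    return result, shipped


naive_result, naive_shipped = ship_whole_relation(SITE_A_ORDERS, SITE_B_CUSTOMERS)
semi_result, semi_shipped = semi_join(SITE_A_ORDERS, SITE_B_CUSTOMERS)
print(naive_shipped, semi_shipped)
```

Both plans return the same join result. The semi-join only pays off when few rows at the remote site match, and it can ship more than the naive plan when most rows match.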
The vision of upcoming 6G technologies, with fast data rates, low latency, and ultra-dense networks, draws great attention to the Internet of Vehicles (IoV) and Vehicle-to-Everything (V2X) communication for intelligent... more
The success of the World Wide Web has led to a steep increase in the user population and the amount of traffic on the Internet. Popular Web pages create "hot spots" of network load due to their great demand for bandwidth and increase the... more
Similarity search has been proved suitable for searching in very large collections of unstructured data objects. We are interested in efficient parallel query processing under situations of continuous streams of queries as in search... more
We describe a case-study in which formal methods were used to verify an important responsiveness property of a distributed database system which is used heavily at Ericsson in a number of recent products. One of the aims of the project... more
Despite increasingly distributed internet information sources with diverse storage formats and access-control constraints, most of the end applications (e.g., filters and media players) that view and manipulate data from these sources... more
It is well-known that fine tuning in database physical design is an important strategy for speeding up data access. In this paper, we introduce a new approach, denoted HypoPlans, to make relational database systems able to execute... more
Whereas many natural language researchers have identified ways in which natural language systems need to be augmented in order to make current natural language interactions more natural, most evaluations of natural language interfaces have... more
Data fragment allocation is an important issue in designing a distributed database system. This problem is NP-complete and thus requires fast heuristics and randomized algorithms to generate efficient solutions, so many algorithms for solving... more
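As a rough illustration of what a fast allocation heuristic can look like, the sketch below places each fragment on the site that accesses it most, subject to a per-site capacity. It is a generic greedy heuristic, not the algorithm of the paper above; the access frequencies, site capacities, and cost measure (number of non-local accesses) are all hypothetical.

```python
# Hypothetical access frequencies: ACCESS[fragment][site] = requests per hour.
ACCESS = {
    "F1": {"S1": 40, "S2": 5, "S3": 10},
    "F2": {"S1": 3, "S2": 30, "S3": 25},
    "F3": {"S1": 8, "S2": 9, "S3": 50},
}
CAPACITY = {"S1": 1, "S2": 1, "S3": 2}  # assumed max fragments per site


def greedy_allocate(access, capacity):
    """Place each fragment on its most frequent accessor that still has room."""
    free = dict(capacity)
    placement = {}
    # Handle fragments with the largest single-site demand first.
    order = sorted(access, key=lambda f: max(access[f].values()), reverse=True)
    for frag in order:
        # Try sites from most to least frequent accessor of this fragment.
        for site in sorted(access[frag], key=access[frag].get, reverse=True):
            if free[site] > 0:
                placement[frag] = site
                free[site] -= 1
                break
    return placement


def remote_cost(access, placement):
    """Total accesses that must cross the network (requests from other sites)."""
    return sum(
        count
        for frag, per_site in access.items()
        for site, count in per_site.items()
        if site != placement[frag]
    )


placement = greedy_allocate(ACCESS, CAPACITY)
print(placement, remote_cost(ACCESS, placement))
```

A greedy pass like this runs quickly but carries no optimality guarantee, and it assumes total capacity is enough to place every fragment.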
The purpose of this study is to improve web server performance. Bottlenecks such as network traffic overload and congested web servers have yet to be solved because of increasing internet usage. Caching is one of the popular... more
As more data management software is designed for deployment in public and private clouds, or on a cluster of commodity servers, new distributed storage systems increasingly achieve high data access throughput via partitioning and... more
Storage Area Networks (SAN) are based on direct interaction between clients and storage servers. This unmediated access exposes the storage server to network attacks, necessitating a verification, by the server, that the client requests... more
Workload consolidation, sharing physical resources among multiple workloads, is a promising technique to save cost and energy in cluster computing systems. This paper highlights a few challenges of workload consolidation for Hadoop as one... more
Remote data integrity checking is of crucial importance in cloud storage. It lets clients verify whether their outsourced data is kept intact without downloading the whole data. In some application scenarios, the clients have to... more
A real-time database is a database in which both the data and the operations upon the data may have timing constraints. The design of this kind of database requires the introduction of new concepts to model both data structures and the... more
While simulation is a valuable research and design tool, the time and difficulty required to create new simulations (or re-use existing simulations) often limit their application. This report describes the design of the software... more
Lafayette, IN, a position he has held since 1976. His research interests include algorithms for data storage and retrieval, programming languages, and software engineering. Dr. Comer is a member of the Association for Computing Machinery... more
Laboratorio de Investigación y Desarrollo en Inteligencia Artificial (LIDIA) Dep. de Ciencias e Ingeniería de la Computación, Universidad Nacional del Sur (UNS), Instituto de Ciencias e Ingeniería de la Computación (UNS-CONICET), Consejo... more
1 Laboratorio de Investigación y Desarrollo en Inteligencia Artificial (LIDIA), Depto. de Ciencias e Ingeniería de la Computación, Universidad Nacional del Sur, Bahía Blanca, Argentina. 2 Consejo Nacional de Investigaciones Científicas... more
We describe ongoing efforts towards developing language resources for a transnational digital government project aimed at applying information technology (IT) to a problem of international concern: detecting and monitoring activities... more
OLAP tools for distributed data warehouses generally assume underlying replicated tables are up to date. Unfortunately, maintaining updated replicas is difficult due to the inherent tradeoff between consistency and availability. In this... more
Data warehouses are nowadays an important component of every competitive system; they are one of the main components on which business intelligence is based. We can even say that many companies are climbing to the next level and use a set of... more
The design of distributed databases is an optimization problem that involves solving issues such as data fragmentation and data placement. Typically, the criteria that determine whether the fragmentation and... more
by David Mankins and 1 more
Current DoD enterprise networks routinely face targeted cyber attacks, and even though attack-related information is recorded in various places, this information is often left unexamined until after attacker objectives have been achieved.... more
Supply chains (SC) span many geographies, modes and industries and involve several phases where data flows in both directions from suppliers, manufacturers, distributors, retailers, to customers. This data flow is necessary to support... more