International Journal of Modern Education and Computer Science, 2013
In this paper, a detailed overview of database replication is presented. Thereafter, the recently published PDDRA (pre-fetching based dynamic data replication algorithm) is described. Modifications to this algorithm are then suggested to minimize the delay in data replication. Finally, a mathematical framework is presented to evaluate the mean waiting time before data can be replicated on the requesting site.
International Journal of Modern Education and Computer Science, 2013
Recently, PDDRA, a pre-fetching based dynamic data replication algorithm, was published. In our previous work, modifications to the algorithm were suggested to minimize the delay in data replication. In this paper a mathematical framework is presented to evaluate the mean waiting time before data can be replicated on the requesting site. The idea is further investigated and simulation results are presented to estimate the throughput and average delay.
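The "mean waiting time before replication" quantity can be illustrated with a simple queueing sketch. Assuming (our assumption, not necessarily the paper's model) that replication requests form an M/M/1 queue with arrival rate λ and service rate μ, the mean queueing wait is Wq = ρ/(μ − λ):

```python
# Hedged sketch: mean waiting time before a replication request is served,
# assuming an M/M/1 queue. The paper's actual mathematical framework may
# differ; rates and names here are illustrative.

def mean_waiting_time(lam: float, mu: float) -> float:
    """M/M/1 mean queueing wait: Wq = rho / (mu - lam), with rho = lam / mu."""
    if lam >= mu:
        raise ValueError("queue is unstable when lam >= mu")
    rho = lam / mu
    return rho / (mu - lam)

# Example: requests arrive at 4/s and are served at 5/s -> Wq = 0.8 s
print(mean_waiting_time(4.0, 5.0))
```

As the load ρ approaches 1, the waiting time grows without bound, which is consistent with the load-dependent delay behaviour reported in the related abstracts below.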
International Journal of Computer Applications, 2013
In this paper, a database replication algorithm is presented. The main idea is to reduce the latency of database replication while maintaining very high throughput. The main points of the algorithm are detailed, and simulation results are presented to evaluate throughput and average delay. For an in-depth analysis of the algorithm, various cases are considered; it is found that when more servers are available to serve requests, throughput is very high and average delay is very low. Throughput and average delay also depend heavily on the load: when the load is comparatively low (< 0.8), throughput is very high and average delay is nearly zero.
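The reported trend — more servers giving lower average delay at the same load — can be sketched with the Erlang-C formula for an M/M/c queue. This is an illustration of the trend, not the paper's simulation model; all names and rates are assumptions:

```python
# Hedged sketch: average queueing delay in an M/M/c system via Erlang C.
# Shows why adding servers drops the delay sharply at moderate load.
from math import factorial

def erlang_c_wait(lam: float, mu: float, c: int) -> float:
    """Mean queueing wait Wq for an M/M/c queue with c servers."""
    a = lam / mu                       # offered load in Erlangs
    rho = a / c                        # per-server utilization
    if rho >= 1:
        raise ValueError("unstable: need lam < c * mu")
    # Erlang-C probability that an arriving request must wait
    num = (a ** c) / (factorial(c) * (1 - rho))
    den = sum((a ** k) / factorial(k) for k in range(c)) + num
    p_wait = num / den
    return p_wait / (c * mu - lam)

# Same arrival rate, one server vs two: delay collapses with the second server
print(erlang_c_wait(4.0, 5.0, 1))
print(erlang_c_wait(4.0, 5.0, 2))
```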
The distribution of information systems over multiple nodes ensures maximum availability of resources and business continuity in case of disasters by using data replication. There are two basic parameters to select when designing a replication strategy: "where" and "when" the updates are propagated. In this paper we present the advantages and disadvantages of data replication. In addition, we analyze the available methods and strategies of data replication, depending on the needs, for improving performance and increasing data availability.
International Journal of Integrated Engineering, 2019
Database replication maintains the contents of a database, such as tables, in multiple geographical locations, forming a distributed database. The requirements on database replication grow over time as internet usage increases. To meet these requirements, priorities among requests can be introduced; in this work two priority classes, low and high, are considered. High priority requests are more important than low priority requests, so low priority requests may be delayed or dropped. The rate of request loss can be reduced through load balancing, in which a portion of the contending requests are transferred to other nodes. In this paper, the internet is modelled as a small-world network, and the performance of A-PDDRA (Advanced Pre-fetching Based Dynamic Data Replication Algorithm) is assessed on that network, considering both priorities and load balancing, using computer simulation.
Procedia Computer Science, 2016
A data grid is an architecture or cluster of services that gives individuals or groups of users the ability to access, modify, and transfer very large amounts of geographically distributed data. Storing such massive data files demands massive storage resources. To address this drawback, dynamic data replication is applied to reduce data access time and to utilize network and storage resources efficiently. Dynamic data replication creates multiple replicas at different sites. Here, by improving the modified BHR (MBHR) method, we propose a dynamic algorithm for data replication in data grid systems. The algorithm uses a number of parameters to find a suitable replication site where a file is likely to be needed in the future with high probability; these predictions of future needs are based on the file access history.
The Journal of Supercomputing, 2012
Data replication is the creation and maintenance of multiple copies of the same data. Replication is used in Data Grid to enhance data availability and fault tolerance. One of the main objectives of replication strategies is reducing response time and bandwidth consumption. In this paper, a dynamic replication strategy that is based on Fast Spread but superior to it in terms of total response time and total bandwidth consumption is proposed. This is achieved by storing only the important replicas on the storage of the node. The main idea of this strategy is using a threshold to determine if the requested replica needs to be copied to the node. The simulation results show that the proposed strategy achieved better performance compared with Fast Spread with Least Recently Used (LRU), and Fast Spread with Least Frequently Used (LFU).
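The threshold idea above — keep a passing replica only if it is "important" enough — can be sketched as follows. The threshold rule (a minimum access count) and all names are illustrative assumptions, not the paper's exact criterion:

```python
# Hedged sketch: a node along the transfer path stores a replica only when
# the file's observed popularity exceeds a threshold, instead of caching
# everything as plain Fast Spread would.
from collections import Counter

class Node:
    def __init__(self, capacity: int, threshold: int):
        self.capacity = capacity      # max number of replicas stored
        self.threshold = threshold    # min accesses before we keep a copy
        self.accesses = Counter()     # file -> observed request count
        self.store = set()            # replicas currently held

    def on_request_passes_through(self, file_id: str) -> bool:
        """Record the request; return True if the replica is stored here."""
        self.accesses[file_id] += 1
        important = self.accesses[file_id] >= self.threshold
        if important and len(self.store) < self.capacity:
            self.store.add(file_id)
            return True
        return False

node = Node(capacity=10, threshold=3)
for _ in range(3):
    kept = node.on_request_passes_through("fileA")
print(kept, sorted(node.store))   # replica kept only after the 3rd request
```

The point of the threshold is that one-off requests never consume node storage, which is how the strategy saves bandwidth and space relative to unconditional caching.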
Proceedings 19th IEEE Symposium on Reliable Distributed Systems SRDS-2000, 2000
Data replication is an increasingly important topic as databases are more and more deployed over clusters of workstations. One of the challenges in database replication is to introduce replication without severely affecting performance. Because of this difficulty, current database products use lazy replication, which is very efficient but can compromise consistency. As an alternative, eager replication guarantees consistency but most existing protocols have a prohibitive cost. In order to clarify the current state of the art and open up new avenues for research, this paper analyses existing eager techniques using three key parameters. In our analysis, we distinguish eight classes of eager replication protocols and, for each category, discuss its requirements, capabilities, and cost. The contribution lies in showing when eager replication is feasible and in spelling out the different aspects a database replication protocol must account for.
2011
Distributed databases are an optimal solution for managing large amounts of data. We present an algorithm for replication in distributed databases that improves query performance. This algorithm can move information between databases, replicating pieces of data across databases.
IEEE Transactions on Parallel and Distributed Systems, 2016
Data Grid allows many organizations to share data across large geographical area. The idea behind data replication is to store copies of the same file at different locations. Therefore, if a copy at a location is lost or not available, it can be brought from another location. Additionally, data replication results in a reduced time and bandwidth because of bringing the file from a closer location. However, the files that need to be replicated have to be selected wisely. In this paper, a round-based data replication strategy is proposed to select the most appropriate files for replication at the end of each round based on a number of factors. The proposed strategy is based on Popular File Replicate First (PFRF) strategy, and it overcomes the drawbacks of PFRF. The simulation results show that the proposed strategy yields better performance in terms of average file delay per request, average file bandwidth consumption per request, and percentage of files found.
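The round-based selection described above can be sketched with a simple end-of-round scoring pass. The scoring formula (access count divided by file size, so small popular files win first) is an assumed heuristic for illustration, not PFRF's exact rule:

```python
# Hedged sketch: at the end of each round, score the files seen in that
# round's access log and pick the top candidates for replication.
from collections import Counter

def select_files_for_round(access_log, file_sizes, top_k=2):
    """Return the top_k most replication-worthy files of this round.

    Score = access count / file size (an illustrative assumption), so
    frequently requested, cheap-to-copy files are replicated first."""
    counts = Counter(access_log)
    ranked = sorted(counts, key=lambda f: counts[f] / file_sizes[f],
                    reverse=True)
    return ranked[:top_k]

log = ["a", "b", "a", "c", "a", "b"]
sizes = {"a": 100, "b": 10, "c": 50}
print(select_files_for_round(log, sizes))   # "b" is small and popular
```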
Journal of Advanced Computer Science & Technology, 2014
Nowadays, scientific applications generate a huge amount of data in terabytes or petabytes. Data grids currently proposed solutions to large scale data management problems including efficient file transfer and replication. Data is typically replicated in a Data Grid to improve the job response time and data availability. A reasonable number and right locations for replicas has become a challenge in the Data Grid. In this paper, a four-phase dynamic data replication algorithm based on Temporal and Geographical locality is proposed. It includes: 1) evaluating and identifying the popular data and triggering a replication operation when the popularity data passes a dynamic threshold; 2) analyzing and modeling the relationship between system availability and the number of replicas, and calculating a suitable number of new replicas; 3) evaluating and identifying the popular data in each site, and placing replicas among them; 4) removing files with least cost of average access time when encountering insufficient space for replication. The algorithm was tested using a grid simulator, OptorSim developed by European Data Grid Projects. The simulation results show that the proposed algorithm has better performance in comparison with other algorithms in terms of job execution time, effective network usage and percentage of storage filled.
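Phase 1 of the algorithm above — trigger replication when a file's popularity passes a dynamic threshold — can be sketched briefly. The choice of threshold (the mean access count across files) is an illustrative assumption:

```python
# Hedged sketch of a dynamic popularity threshold: a file becomes a
# replication candidate when its access count exceeds the current mean.
from collections import Counter

def popular_files(access_log):
    """Files whose access count exceeds the mean count over all files."""
    counts = Counter(access_log)
    threshold = sum(counts.values()) / len(counts)  # dynamic: tracks the log
    return sorted(f for f, c in counts.items() if c > threshold)

log = ["x", "y", "x", "z", "x", "y"]
print(popular_files(log))   # mean count is 2.0; only "x" (3 hits) exceeds it
```

Because the threshold is recomputed from the current log, it adapts as the workload shifts, which is what makes the trigger "dynamic" rather than a fixed cutoff.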
2011
In recent years, grid technology has had such a fast growth that it has been used in many scientific experiments and research centers. A large number of storage elements and computational resources are combined to generate a grid which gives us shared access to extra computing power. In particular, data grid deals with data intensive applications and provides intensive resources across widely distributed communities. Data replication is an efficient way for distributing replicas among the data grids, making it possible to access similar data in different locations of the data grid. Replication reduces data access time and improves the performance of the system. In this paper, we propose a new dynamic data replication algorithm named PDDRA that optimizes the traditional algorithms. Our proposed algorithm is based on an assumption: members in a VO (Virtual Organization) have similar interests in files. Based on this assumption and also file access history, PDDRA predicts future needs of grid sites and pre-fetches a sequence of files to the requester grid site, so the next time that this site needs a file, it will be locally available. This will considerably reduce access latency, response time and bandwidth consumption. PDDRA consists of three phases: storing file access patterns, requesting a file and performing replication and pre-fetching and replacement. The algorithm was tested using a grid simulator, OptorSim developed by European Data Grid projects. The simulation results show that our proposed algorithm has better performance in comparison with other algorithms in terms of job execution time, effective network usage, total number of replications, hit ratio and percentage of storage filled.
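The pre-fetching idea in the abstract above — use file access history to predict and pre-stage the files a site will need next — can be sketched with a simple successor model. The bigram predictor here is an illustrative stand-in for the paper's prediction method, not PDDRA itself:

```python
# Hedged sketch: learn which file tends to follow each file in the VO's
# access history, then pre-fetch the likely successors of a request.
from collections import defaultdict, Counter

def build_successor_model(history):
    """Count which file is accessed immediately after each file."""
    succ = defaultdict(Counter)
    for prev, nxt in zip(history, history[1:]):
        succ[prev][nxt] += 1
    return succ

def prefetch_candidates(succ, just_requested, k=2):
    """Files most likely to be needed next; pre-fetch these to the site."""
    return [f for f, _ in succ[just_requested].most_common(k)]

history = ["cfg", "data1", "data2", "cfg", "data1", "data3", "cfg", "data1"]
model = build_successor_model(history)
print(prefetch_candidates(model, "cfg"))   # "data1" always follows "cfg"
```

If the prediction is right, the next request hits a local replica, which is exactly the latency and bandwidth saving the abstract claims.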
2011
The necessity of the ever-increasing use of distributed data in computer networks is obvious to all. One technique applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine some dynamic data replication algorithms. We examine their characteristics under various usage scenarios and then propose some suggestions for their improvement.
Database and Expert Systems …, 1996
In this paper we investigate the performance issues of data replication in a loosely coupled distributed database system, where a set of database servers are connected via a network. A database replication scheme, Replication with Divergence, which allows some degree of divergence between the primary and the secondary copies of the same data object, is compared with two other schemes that, respectively, disallow replication and keep all replicated copies consistent at all times. The impact of some tunable factors, such as cache size and the update propagation probability, on the performance of Replication with Divergence is also investigated. These results shed light on performance issues that were not addressed in previous studies on replication in distributed database systems.
International Journal of Computer and Communication Engineering, 2013
Production databases are transaction intensive; transactions can insert into or update tables. A failover is a replica of the production server: any change implemented on the production database is automatically applied to the failover or standby database. Nowadays the data on the production server keeps growing, so extra storage space is needed on the production server, and the same is required on the failover. Generating reports from that data increases the load on the production server and can affect its performance. There are also threats that can cause data loss, such as hardware failure or loss of the machine, against which the database must be protected. Replication is one method for backing up a running database and providing immediate failover during failure. This paper presents replication techniques, failover, and transaction states.
International Journal of Intelligent Systems and Applications, 2015
Due to the emergence of more data-centric applications, the replication of data has become a more common phenomenon. In this context, PDDRA, a pre-fetching based dynamic data replication algorithm, was recently developed. Its main idea is to pre-fetch some data using a heuristic algorithm before the actual replication starts, in order to reduce latency. Further modifications to the algorithm (M-PDDRA) have been suggested to minimize the delay in data replication. In this paper, the M-PDDRA algorithm is tested under shared and output buffering schemes. Simulation results are presented to estimate the packet loss rate and average delay for both schemes. The results clearly reveal that shared buffering with load balancing is as good as the output buffered scheme while using far fewer buffering resources.
International Journal on Information Theory, 2014
In this paper, the consistency of data replication protocols is reviewed. A brief discussion of consistency models in data replication is given, and propagation techniques such as eager and lazy propagation are debated. Differences between replication protocols from a consistency viewpoint are studied, and the advantages and disadvantages of the replication protocols are shown. We go into the essential technical details and make careful comparisons in order to determine their respective contributions as well as their restrictions. Finally, some research strategies in the replication and consistency literature are reviewed.
Data grids provide large-scale geographically distributed data resources for data intensive applications. These applications handle large data sets that need to be transferred and replicated among different grid sites, so availability and efficient access are the most important factors affecting performance. Managing the volume of data is clearly very important. Data replication is an important technique to reduce data access time, improving the performance of the system by creating identical replicas of data files and distributing them over grid sites. In this paper, we propose a novel dynamic data replication strategy called DRPF (Dynamic Replication of Popular File), which is based on access history and file popularity. As grid sites within a virtual organization (VO) have similar interests in files, the basic idea of DRPF is to improve locality of access by increasing the number of replicas in the VO. DRPF first selects the popular files that need to be copied to other nodes, then tries to find the best places for the new replicas by taking into account parameters such as the number of demands per site for files and the bandwidth between replication sites. The algorithm is simulated using a data grid simulator, OptorSim. The simulation results show that our proposed algorithm has better performance in comparison with other algorithms in terms of job execution time and effective network usage.
Replication is a well-known solution used to improve data availability. Data replication is one of the key components of data grid architectures, clustering and distributed systems, as it enhances data access and reliability and minimizes the cost of data transmission. This paper proposes a persistence layer synchronous replication technique (PSR). The persistence layer is a synchronous layer based on a computer threading technique. The whole system is a service-oriented application. All replication is done through service-oriented practices, and with the new technique adding more replicated database servers is easier than ever.
Journal of Parallel and Distributed Computing, 2008
This paper compares and analyzes 10 heuristics to solve the fine-grained data replication problem over the Internet. In fine-grained replication, frequently accessed data objects (as opposed to the entire website contents) are replicated onto a set of selected sites so as to minimize the average access time perceived by the end users. The paper presents a unified cost model that captures the minimization of the total object transfer cost in the system, which in turn leads to effective utilization of storage space, replica consistency, fault-tolerance, and load-balancing. The set of heuristics include six A-Star based algorithms, two bin packing algorithms, one greedy and one genetic algorithm. The heuristics are extensively simulated and compared using an experimental test-bed that closely mimics the Internet infrastructure and user access patterns. GT-ITM and Inet topology generators are used to obtain 80 well-defined network topologies based on flat, link distance, power-law and hierarchical transit-stub models. The user access patterns are derived from real access logs collected at the websites of Soccer World Cup 1998 and NASA Kennedy Space Center. The heuristics are evaluated by analyzing the communication cost incurred due to object transfers under the variance of server capacity, object size, read access, write access, number of objects and sites. The main benefit of this study is to facilitate readers with the choice of algorithms that guarantee fast or optimal or both types of solutions. This allows the selection of a particular algorithm to be used in a given scenario.
2013
A Data Grid consists of a collection of geographically distributed computer and storage resources located in different places, and enables users to share data and other resources. Data replication is used in Data Grids to enhance data availability, fault tolerance, load balancing and reliability. Although replication is a key technique, the problem of selecting proper locations for placing replicas, i.e. replica placement in a Data Grid, has not yet been widely studied. In this paper an efficient replica selection strategy is proposed to select the best replica location from among the many replicas. Due to limited storage capacity, a good replica replacement algorithm is also needed. We present a novel replacement strategy which deletes files in two steps when free space is not enough for the new replica. A grid simulator, OptorSim, is used to evaluate the performance of this dynamic replication strategy. The simulation results show that the proposed algorithm outperforms comparing t...
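A two-step replacement policy of the kind described above can be sketched as follows. The concrete criteria — delete files unused for a long time first, then evict the lowest-value files — are illustrative assumptions, not the paper's exact strategy:

```python
# Hedged sketch: free space for a new replica in two steps.
# store maps file -> (size, last_access_time, value). Step 1 drops stale
# files; step 2, if space is still short, evicts lowest-value files.
def free_space_for(new_size, store, now, max_age=100, capacity=1000):
    def used():
        return sum(size for size, _, _ in store.values())

    # Step 1: delete files not accessed for more than max_age time units
    stale = [f for f, (_, last, _) in store.items() if now - last > max_age]
    for f in stale:
        if capacity - used() >= new_size:
            break
        del store[f]

    # Step 2: still not enough? evict files in order of ascending value
    for f in sorted(store, key=lambda f: store[f][2]):
        if capacity - used() >= new_size:
            break
        del store[f]

    return capacity - used() >= new_size

store = {"old": (600, 0, 1), "hot": (300, 190, 9)}
ok = free_space_for(500, store, now=200)
print(ok, sorted(store))   # stale "old" is deleted, valuable "hot" survives
```

Splitting the decision into two passes keeps recently used, high-value replicas safe unless the stale files alone cannot make room.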