MOSIX is a cluster computing enhancement of Linux that supports preemptive process migration. This paper presents the MOSIX Direct File System Access (DFSA), a provision that can improve the performance of cluster file systems by migrating the process to the file, rather than the traditional way of bringing the file's data to the process. DFSA is suitable for clusters that manage a pool of shared disks among multiple machines. With DFSA, it is possible to migrate parallel processes from a client node to file servers, enabling parallel access to different files. DFSA can work with any file system that maintains cache consistency. Since no such file system is currently available for Linux, we implemented the MOSIX File-System (MFS) as a first prototype using DFSA. The paper describes DFSA and presents the performance of MFS with and without DFSA.
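The DFSA trade-off can be illustrated with a toy cost model (a hedged sketch, not MOSIX code): migrate the process to the node holding the file only when the data it expects to move outweighs the cost of shipping the process image. The function name, constants, and cost model below are illustrative assumptions.

/* Toy illustration of the DFSA trade-off: move the process to the file,
 * or move the file's data to the process.  Not MOSIX source code; the
 * cost model and constants are illustrative assumptions only. */
#include <stdio.h>
#include <stdbool.h>

/* Decide whether migrating the process to the file server is cheaper
 * than transferring the file data to the process's current node. */
static bool migrate_to_file_server(size_t expected_io_bytes,
                                   size_t process_image_bytes)
{
    /* Hypothetical per-byte costs; a real system would measure these and
     * also account for residency and load-balancing constraints. */
    const double cost_per_data_byte    = 1.0;  /* remote read/write */
    const double cost_per_process_byte = 1.5;  /* migration overhead */

    return expected_io_bytes * cost_per_data_byte >
           process_image_bytes * cost_per_process_byte;
}

int main(void)
{
    size_t io  = 64UL << 20;   /* process expects ~64 MB of file I/O */
    size_t img = 8UL  << 20;   /* ~8 MB process image */
    printf("migrate to server: %s\n",
           migrate_to_file_server(io, img) ? "yes" : "no");
    return 0;
}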
2002
As Linux clusters have matured as platforms for low-cost, high-performance parallel computing, software packages to provide many key services have emerged, especially in areas such as message passing and networking. One area devoid of support, however, has been parallel file systems, which are critical for high-performance I/O on such clusters. We have developed a parallel file system for Linux clusters, called the Parallel Virtual File System (PVFS). PVFS is intended both as a high-performance parallel file system that anyone can download and use and as a tool for pursuing further research in parallel I/O and parallel file systems for Linux clusters.
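A hedged sketch of the core mechanism of a striped parallel file system such as PVFS: round-robin striping maps a logical file offset to an I/O node and an offset within that node's local file. The stripe size and node count below are illustrative choices, not PVFS defaults.

/* Round-robin striping: map a logical file offset to an I/O node and an
 * offset within that node's local file.  Illustrative sketch only. */
#include <stdio.h>
#include <stdint.h>

#define STRIPE_SIZE (64 * 1024)   /* assumed stripe unit, in bytes */
#define NUM_IO_NODES 4            /* assumed number of I/O servers */

static void locate(uint64_t offset, int *node, uint64_t *local_offset)
{
    uint64_t stripe = offset / STRIPE_SIZE;
    *node = (int)(stripe % NUM_IO_NODES);
    *local_offset = (stripe / NUM_IO_NODES) * STRIPE_SIZE
                    + offset % STRIPE_SIZE;
}

int main(void)
{
    int node;
    uint64_t local;
    for (uint64_t off = 0; off < 5 * STRIPE_SIZE; off += STRIPE_SIZE) {
        locate(off, &node, &local);
        printf("offset %8llu -> node %d, local offset %llu\n",
               (unsigned long long)off, node, (unsigned long long)local);
    }
    return 0;
}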
2009
Today, clusters are the de facto cost-effective platform both for high-performance computing (HPC) and for IT environments. HPC and IT are quite different environments, and the differences include, among others, their choices of file systems and storage: HPC favours parallel file systems geared towards maximum I/O bandwidth, which are not fully POSIX-compliant and were devised to run on top of (fault-prone) partitioned storage; conversely, IT data centres favour both external disk arrays (to provide highly available storage) and POSIX-compliant file systems, either general-purpose or shared-disk cluster file systems (CFSs). These specialised file systems perform very well in their target environments provided that applications do not require features they lack, e.g., file locking on parallel file systems, or high-performance writes over cluster-wide shared files on CFSs. In brief, none of the above approaches provides high levels of both reliability and performance to both worlds. Our pCFS proposal is a contribution towards changing this situation: the rationale is to take advantage of the best of both, the reliability of cluster file systems and the high performance of parallel file systems. We do not claim to provide the absolute best of each, but we aim at full POSIX compliance, a rich feature set, and levels of reliability and performance good enough for broad usage, e.g., traditional as well as HPC applications, support of clustered DBMS engines that may run over regular files, and video streaming. pCFS' main ideas include:
This paper proposes the design of a prototype scalable distributed file system, called SDFS. SDFS stripes pieces of a file across I/O nodes and provides a global large file system image that can logically span a workstation cluster. An application is not aware of which machine physically stores each piece of a file. SDFS also distributes its components, such as I/O daemons, control daemons, and cache daemons, over the workstation cluster. We balance functionality across all hosts of the same type by placing file system components equally on each of them; scalability also follows from this equal component assignment (see the sketch below). SDFS also provides a Unix-compatible C library interface to applications. A prototype implementation of SDFS is currently built on top of the EXT2 Linux file system.
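A hedged sketch of the equal component assignment idea: instances of each daemon type are spread round-robin over the hosts so that hosts of the same type carry the same load. The host names, daemon types, and counts are made up for illustration; this is not SDFS source code.

/* Spread SDFS-style component daemons (I/O, control, cache) evenly over
 * a set of hosts.  Purely illustrative; not SDFS source code. */
#include <stdio.h>

int main(void)
{
    const char *hosts[]   = { "node0", "node1", "node2", "node3" };
    const char *daemons[] = { "io", "control", "cache" };
    const int nhosts = 4, ndaemons = 3, instances_per_type = 4;

    for (int d = 0; d < ndaemons; d++)
        for (int i = 0; i < instances_per_type; i++)
            printf("%s-daemon #%d -> %s\n",
                   daemons[d], i, hosts[(d + i) % nhosts]);
    return 0;
}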
Cluster Computing and the …, 2002
In this paper, we report on our experiences in designing a portable parallel file system for clusters. The file system offers applications an interface compliant with MPI-IO, the I/O interface of the MPI-2 standard. The file system implementation relies upon MPI for internal coordination and communication, which guarantees high performance and portability over a wide range of hardware and software cluster platforms. The internal architecture of the file system has been designed to allow rapid prototyping and experimentation with novel strategies for managing parallel I/O in a cluster environment. The discussion of the file system design and early implementation is completed with basic performance measurements confirming the potential of the approach.
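Because the file system exposes the standard MPI-IO interface of MPI-2, an ordinary MPI-IO program such as the following minimal example (written here for illustration, not taken from the paper) would run on it unchanged; each rank writes its own block of a shared file at a rank-dependent offset.

/* Minimal MPI-IO example: every rank writes one block of integers to a
 * shared file at a rank-dependent offset. */
#include <mpi.h>
#include <stdio.h>

#define COUNT 1024

int main(int argc, char **argv)
{
    int rank, buf[COUNT];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int i = 0; i < COUNT; i++)
        buf[i] = rank;

    MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at(fh, (MPI_Offset)rank * COUNT * sizeof(int),
                      buf, COUNT, MPI_INT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}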
2002
GPFS is IBM's parallel, shared-disk file system for cluster computers, available on the RS/6000 SP parallel supercomputer and on Linux clusters. GPFS is used on many of the largest supercomputers in the world. GPFS was built on many of the ideas that were developed in the academic community over the last several years, particularly distributed locking and recovery technology. To date it has been a matter of conjecture how well these ideas scale. We have had the opportunity to test those limits in the context of a product that runs on the largest systems in existence. While in many cases existing ideas scaled well, new approaches were necessary in many key areas. This paper describes GPFS, and discusses how distributed locking and recovery techniques were extended to scale to large clusters.
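A hedged toy model of the byte-range locking that underlies such a design: a lock (or token) manager grants a requested range only if it does not conflict with ranges already granted. This shows only the overlap test at the heart of the idea; GPFS's actual distributed token protocol, revocation, and recovery machinery are far more involved.

/* Toy byte-range lock table: grant a requested range only if it does not
 * overlap an already granted range.  Illustrates the conflict check a
 * distributed lock/token manager performs; not GPFS code. */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

struct range { uint64_t start, len; };

static bool overlaps(struct range a, struct range b)
{
    return a.start < b.start + b.len && b.start < a.start + a.len;
}

static bool try_grant(struct range *granted, int *n, struct range req)
{
    for (int i = 0; i < *n; i++)
        if (overlaps(granted[i], req))
            return false;          /* conflict: caller must wait or revoke */
    granted[(*n)++] = req;
    return true;
}

int main(void)
{
    struct range granted[16];
    int n = 0;

    printf("grant [0,4096):    %d\n", try_grant(granted, &n, (struct range){0, 4096}));
    printf("grant [4096,8192): %d\n", try_grant(granted, &n, (struct range){4096, 4096}));
    printf("grant [1000,2000): %d\n", try_grant(granted, &n, (struct range){1000, 1000}));
    return 0;
}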
There are currently no high-performance file systems that allow sharing of data between different clusters. In this paper, we address this issue by developing FICUS (File System for Inter Cluster Unified Storage). FICUS provides the convenience of file-system-level access to remote data while preserving the performance of striped file systems such as PVFS. We achieve a bandwidth of 80 MB/s for local access using four I/O nodes, and 75 MB/s when accessing the same number of I/O nodes on a remote storage cluster. A parallel data-intensive application achieves comparable performance when accessing 6 GB of data in local and remote storage. We show that careful pipelining of data transfers and proper integration with the underlying file system and communication layers are crucial for preserving the performance of remote access.
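The pipelining stressed above can be sketched as a double-buffered producer/consumer: one thread reads chunks from the local file system while another drains the previously filled slot, so disk I/O and transfer overlap. This is a generic sketch under that assumption, not FICUS code; the "send" step here only counts bytes instead of going over the network.

/* Generic double-buffered pipeline: a reader thread fills two slots from a
 * file while the main thread drains them, overlapping read and "transfer".
 * Illustrative only; compile with -pthread. */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define SLOTS 2
#define CHUNK (256 * 1024)

static char bufs[SLOTS][CHUNK];
static size_t lens[SLOTS];
static sem_t empty, full;
static FILE *in;

static void *reader(void *arg)
{
    for (int s = 0;; s = (s + 1) % SLOTS) {
        sem_wait(&empty);                       /* wait for a free slot */
        lens[s] = fread(bufs[s], 1, CHUNK, in);
        sem_post(&full);                        /* hand the slot over */
        if (lens[s] == 0)                       /* EOF marker for the sender */
            return NULL;
    }
}

int main(int argc, char **argv)
{
    if (argc != 2 || !(in = fopen(argv[1], "rb"))) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    sem_init(&empty, 0, SLOTS);
    sem_init(&full, 0, 0);

    pthread_t t;
    pthread_create(&t, NULL, reader, NULL);

    size_t total = 0;
    for (int s = 0;; s = (s + 1) % SLOTS) {
        sem_wait(&full);                        /* wait for a filled slot */
        if (lens[s] == 0)
            break;
        total += lens[s];                       /* "send": only count bytes */
        sem_post(&empty);                       /* release the slot */
    }
    pthread_join(t, NULL);
    printf("pipelined %zu bytes\n", total);
    return 0;
}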
Future Gener. Comput. Syst., 2002
PODOS is a performance-oriented distributed operating system being developed to harness the performance capabilities of a cluster computing environment. To address the growing demand for performance, we are designing a distributed operating system (DOS) that can utilize the computing potential of a number of systems. Earlier clustering approaches have traditionally stressed resource sharing or reliability and have given lower priority to performance. PODOS adds just four new components to the existing Linux operating system to make it distributed: a Communication Manager (CM), a PODOS Distributed File System (PDFS), a Resource Manager (RM), and Global Interprocess Communication (GIPC). This paper addresses the design and implementation of the various components of the PODOS system.
2008
Sysman is a system management infrastructure for clusters and data centers, similar to the /proc file system. It provides a familiar yet powerful interface for the management of servers, storage systems, and other devices. In the Sysman virtual file system, each managed entity (e.g., the power on/off button of a server, the CPU utilization of a server, a LUN on a storage device) is represented by a file. Reading from Sysman files obtains live information from the devices being managed by Sysman. Writing to Sysman files initiates tasks such as turning server blades on or off, discovering new devices, and changing the boot order of a blade. The combination of the file access semantics and existing UNIX utilities such as grep and find that operate on multiple files allows the creation of very short but powerful system management procedures for large clusters. Sysman is an extensible framework with a simple interface through which new system management procedures can easily be added. We show that with a few lines of Linux commands, system management operations can be issued to more than 2000 servers in one second, and the results can be collected at a rate of more than seven servers per second. We have been using Sysman (and its earlier implementations) in a cluster of Intel and PowerPC blade servers containing hundreds of blades with various software configurations.
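Because every managed entity is a file, ordinary file I/O is the whole client API. The sketch below reads /proc/loadavg, which exists on any Linux host, in exactly the idiom a Sysman client would use; the commented-out write shows how a management action would look against a hypothetical Sysman path (the path and value are assumptions, not documented Sysman names).

/* File-as-interface idiom used by /proc and, per the paper, by Sysman:
 * read a virtual file for status, write one to trigger an action. */
#include <stdio.h>

int main(void)
{
    char line[256];

    /* Status query: /proc/loadavg exists on any Linux box. */
    FILE *f = fopen("/proc/loadavg", "r");
    if (f && fgets(line, sizeof line, f))
        printf("load average: %s", line);
    if (f)
        fclose(f);

    /* Management action against a *hypothetical* Sysman path; the real
     * file names are defined by the Sysman tree on a given installation.
     *
     * FILE *p = fopen("/sysman/blade17/power", "w");
     * if (p) { fputs("off\n", p); fclose(p); }
     */
    return 0;
}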
Lecture Notes in Computer Science, 2004
NFSG aims at providing a solution for file access within a cluster of clusters. Our development has been guided by criteria of ease of installation, administration, and usage, but also by efficiency and by minimal hardware and software intrusiveness. By building on several existing facilities, a distributed file system (NFSP) and a high-performance data transfer utility (GXfer), we aim to offer a software architecture fully compatible with the ubiquitous NFS protocol. Thanks to distributed storage (in particular, the multiple I/O servers provided by NFSP), several parallel streams may be used when copying a file from one cluster to another within the same grid. This technique improves data transfers by connecting the distributed file systems at both ends; the GXfer component implements this functionality. Thus, performance levels otherwise reachable only with dedicated and expensive hardware may be achieved. This work is supported by APACHE, a joint project funded by CNRS, INPG, INRIA and UJF. GXfer is a software component developed for the RNTL E-Toile.
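A hedged sketch of how a multi-stream copy such as GXfer's can split the work: each parallel stream is assigned one contiguous byte range, and the ranges cover the file exactly. The partitioning below is a generic scheme chosen for illustration, not the actual GXfer protocol; the file size and stream count are example values.

/* Split a file of `size` bytes into `nstreams` contiguous byte ranges so
 * each parallel stream copies one range.  Generic sketch, not GXfer code. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t size = 1000000007ULL;   /* example file size in bytes */
    int nstreams = 4;                /* e.g. one stream per NFSP I/O server */

    uint64_t base = size / nstreams, rem = size % nstreams, offset = 0;
    for (int i = 0; i < nstreams; i++) {
        uint64_t len = base + (i < (int)rem ? 1 : 0);
        printf("stream %d: offset %llu, length %llu\n",
               i, (unsigned long long)offset, (unsigned long long)len);
        offset += len;
    }
    return 0;
}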