This paper examines synchronization of computer clocks connected via a data network and proposes a skewless algorithm to synchronize them. Unlike existing solutions, which either estimate and compensate for the frequency difference (skew) among clocks or introduce offset corrections that can generate jitter and possibly even backward jumps, our solution achieves synchronization without these problems. We first analyze the convergence property of the algorithm and provide explicit necessary and sufficient conditions on the parameters to guarantee synchronization. We then study the effect of noisy measurements (jitter) and frequency drift (wander) on the offsets and synchronization frequency, and further optimize the parameter values to minimize their variance. Our study reveals a few insights. For example, we show that our algorithm can converge even in the presence of timing loops and noise, provided that there is a well-defined leader. This marks a clear contrast with current standards such as NTP and PTP, where timing loops are specifically avoided. Furthermore, timing loops can even be beneficial in our scheme: we demonstrate that highly connected subnetworks can collectively outperform individual clients when the time source has large jitter. The results are supported by experiments running on a cluster of IBM BladeCenter servers with Linux.
A Tool for Scalable Profiling and Tracing of Java and Native Code Interactions
Java workloads have two different execution spaces: one in the JVM and the other in the native environment. Understanding workload activity in native and non-native (Java) spaces and its impact on the overall resource consumption of Java workloads can be very useful. For example, this knowledge can be exploited in code optimization and for efficient process-level scheduling, especially in emerging hybrid systems. Existing Java runtime tracing tools are quite heavyweight and/or offer limited useful information for understanding Java and native space interactions. We developed an extremely lightweight tracing tool for enterprise Java workloads. The tool captures detailed per-thread statistics related to resource usage and activity in the JVM and native spheres. An efficient design based on innovative thread and memory management principles enables us to achieve scalable monitoring with our tool on multi-core systems running enterprise workloads. The information captured by the tool is used to build workload profiles, which can then be used for performance prediction of Java workloads in emerging systems and architectures.
This document contains a detailed proposal for a future IEEE 1788 standard on interval arithmetic. It is written in a form that should not be too difficult to transform into a formal, complete and fully precise document specifying the standard to be. Part 1 contains a concise summary of the basic assumptions (some of which may be controversial) upon which the remaining document is based. The items in Part 1 are grouped such that separate voting on each issue is meaningful. Parts 2-5 specify the internal representation of intervals and the operations defined on them. Part 6 specifies the external representation of intervals and the conversion between internal and external representations. Part 7 is optional and discusses useful modifications of the directed rounding behavior specified in the IEEE 754-2008 standard that would simplify an implementation of the proposed standard. This final version incorporates many suggestions and corrections by the people mentioned above. In particular, comprehensive discussions with Michel Hack, who read and commented in detail on many intermediate versions, had a strong influence on this proposal.
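To illustrate why rounding direction matters for interval arithmetic, here is a minimal sketch of an enclosing interval addition. It is not the representation or the operation set of the proposal: the `Interval` class and its `contains` method are hypothetical names introduced for this example. Pure Python does not expose the IEEE 754 directed rounding modes, so the sketch substitutes a cruder technique: it computes each bound with round-to-nearest and then widens it outward by one unit in the last place using `math.nextafter` (Python 3.9 or later).

```python
import math
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] with float bounds (hypothetical illustration type)."""
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        # Each sum is rounded to nearest, so the exact bound lies within half
        # an ulp of it. Stepping one float outward therefore encloses the
        # exact result, at the cost of a slightly wider interval than true
        # directed rounding would give.
        lo = math.nextafter(self.lo + other.lo, -math.inf)
        hi = math.nextafter(self.hi + other.hi, math.inf)
        return Interval(lo, hi)

    def contains(self, x: Fraction) -> bool:
        # Compare exactly, using rationals, to avoid further rounding.
        return Fraction(self.lo) <= x <= Fraction(self.hi)


if __name__ == "__main__":
    a = Interval(0.1, 0.1)
    b = Interval(0.2, 0.2)
    exact = Fraction(0.1) + Fraction(0.2)  # exact sum of the two stored floats
    print((a + b).contains(exact))
```

With hardware directed rounding, the lower bound would be computed rounding toward negative infinity and the upper bound rounding toward positive infinity, which yields the tightest enclosure without the extra widening step.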
As byte-addressable, high-density, non-volatile memory (NVM) is about to be deployed alongside DRAM, issues in enabling important key-value cache services, such as memcached, on the new storage medium must be addressed. While NVM allows data in a KV cache to survive power outages and system crashes, in practice the data's integrity and accessibility depend on data consistency enforced during writes to NVM. Though techniques for enforcing consistency, such as journaling, COW, or checkpointing, are available, they are often too expensive because they frequently use CPU cache flushes to ensure crash consistency, leading to much reduced performance and an excessively compromised NVM lifetime. In this paper we design and evaluate NVMcached, a KV cache for non-volatile byte-addressable memory that can significantly reduce the use of flushes and minimize data loss by leveraging consistency-friendly data structures and batched space allocation and reclamation. Experiments show that NVMcached can improve its system throughput by up to 2.8× for write-intensive real-world workloads, compared to a non-volatile memcached.

1. Introduction

With the recent announcement of 3D XPoint technology [13], non-volatile memory (NVM) becomes a reality and will change how major system components are designed and built. The emerging byte-addressable, high-density NVMs also include PCM [30], STT-RAM [23], and RRAM [28]. They enable alternatives to DRAM as main memory with much higher energy efficiency and larger capacity. When computer servers configured with NVM become commonly available, porting popular key-value (KV) caches, whose designs assume DRAM memory, onto NVM-equipped servers would allow their data to survive power outage and system
This standard specifies interchange and arithmetic formats and methods for binary and decimal floating-point arithmetic in computer programming environments. This standard specifies exception conditions and their default handling. An implementation of a floating-point system conforming to this standard may be realized entirely in software, entirely in hardware, or in any combination of software and hardware. For operations specified in the normative part of this standard, numerical results and exceptions are uniquely determined by the values of the input data, sequence of operations, and destination formats, all under user control.
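The last sentence of this abstract can be illustrated with a small sketch. It assumes that Python's `float` is an IEEE 754 binary64 value, which holds on common platforms but is an assumption here, and it uses the standard `struct` module to round a value to binary32. The helper name `to_binary32` is hypothetical. The example shows that the same inputs give different results when the sequence of operations or the destination format changes, while repeating the same sequence reproduces the same result.

```python
import struct


def to_binary32(x: float) -> float:
    """Round a binary64 value to binary32, then widen it back (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Same inputs, different sequence of operations: the rounded results differ.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)

# Same input, different destination format: binary32 keeps fewer significand bits.
narrowed = to_binary32(0.1)

if __name__ == "__main__":
    print(left_first == right_first)          # operation order matters
    print(narrowed == 0.1)                    # destination format matters
    print((0.1 + 0.2) + 0.3 == left_first)    # same sequence, same result
```

Values that are exactly representable in both formats, such as 0.5, survive the round trip through `to_binary32` unchanged.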
Papers by Michel Hack