Access Registers in Java
[Figure 1 (diagram not reproduced). Three software stacks, each on hardware consisting of CPU, memory, and Ethernet. Labels recoverable from the diagram: Library (JDK), java.net, Native, JVM, TCP/IP, Ethernet, OS (Linux), Java processor (JVM), Hardware.]

Figure 1. (a) Standard layers for embedded Java with an operating system, (b) a JVM on the bare metal, and (c) a JVM as a Java processor
corresponding structure is bound to a location using the Address attribute.

More recently, the RTSJ [1] does not give much support. Essentially, one has to use RawMemoryAccess at the level of primitive data types. A similar approach is used in the Ravenscar Java profile [8]. Although the solution is efficient, this representation of physical memory is not object oriented and there are some safety issues: when one raw memory area represents an address range to which several devices are mapped, there is no protection between them.

The aJile processor [6] uses native functions to access devices through I/O pins. The Squawk VM [15], which is a JVM mostly written in Java that runs without an operating system, uses device drivers written in Java. These device drivers use a form of peek and poke interface to access the device's memory. The JX Operating System [3] uses memory objects to provide read-only memory and device access, which are both required by an OS. Memory objects represent a region of the main address space, and accesses to the regions are handled via normal method invocations on the memory objects representing the different regions.

The distinctive feature of our proposal is that it maps a hardware object onto the OO address space and provides, if desired, access methods for individual fields, such that it lifts the facilities of Ada into the object oriented world of Java.

The remainder of the paper is structured as follows: in Section 2 we motivate hardware objects and present the idea. Section 3 provides details on the integration of hardware objects into three different JVMs: a Java processor, a Just-in-time (JIT) compiling JVM, and an interpreting JVM. We conclude and evaluate the proposal in Section 4.

2 Hardware Objects

Let us consider a simple parallel input/output (PIO) device. The PIO provides an interface between I/O registers and I/O pins. The example PIO contains two registers: the data register and the control register. Writing to the data register stores the value into a register that drives the output pins. Reading from the data register returns the value that is present at the input pins.

The control register configures the direction for each PIO pin. When bit n in the control register is set to 1, pin n drives out the value of bit n of the data register. A 0 at bit n in the control register configures pin n as an input pin. At reset the port is usually configured as an input port (see footnote 2), a safe default configuration.

When the I/O address space is memory mapped, such a parallel port is represented in C as a structure and a constant for the address. This definition is part of the board level configuration. Figure 2 shows the parallel port example. The parallel port is directly accessed via a pointer in C. For a system with a distinct I/O address space, access to the device registers is performed via distinct machine instructions. Those instructions are represented by C functions that take the address as argument, which is not a type-safe solution.

    /* register layout of the parallel port */
    typedef struct {
        int data;
        int control;
    } parallel_port;

    /* base address of the device (no trailing semicolon in a macro) */
    #define PORT_ADDRESS 0x10000

    int inval, outval;
    parallel_port *mypp;
    mypp = (parallel_port *) PORT_ADDRESS;
    ...
    inval = mypp->data;     /* read the input pins */
    mypp->data = outval;    /* drive the output pins */

Figure 2. Definition and usage of a parallel port in C

Footnote 2. Output can result in a short circuit between the I/O pin and the external device when the logic levels are different.
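The direction semantics of the control register described above can be illustrated with a small software model. This is only a sketch that runs on a standard JVM and does not touch real hardware; the class names PioModel and PioDemo, the method names, and the simulation hook driveExternally are hypothetical. The sketch also assumes that reading a pin configured as output returns the value driven on it, which the text does not specify.

```java
// Software model of the example PIO; a simulation, not a device driver.
final class PioModel {
    private int control;   // bit n = 1: pin n is an output; 0: input (reset default)
    private int outLatch;  // last value written to the data register
    private int external;  // levels applied to the pins from outside (simulated)

    /** Write the control register: selects the direction of every pin. */
    void writeControl(int value) { control = value; }

    /** Write the data register: stores the value that output pins drive. */
    void writeData(int value) { outLatch = value; }

    /** Simulation hook: levels that external devices apply to the pins. */
    void driveExternally(int levels) { external = levels; }

    /** Read the data register: outputs show the latch (assumption), inputs the external level. */
    int readData() { return (outLatch & control) | (external & ~control); }
}

final class PioDemo {
    public static void main(String[] args) {
        PioModel pio = new PioModel();
        pio.driveExternally(0xA5);   // external levels on pins 0..7
        pio.writeData(0x0F);         // latched, but all pins are still inputs
        System.out.println(Integer.toHexString(pio.readData()));  // prints "a5"
        pio.writeControl(0x0F);      // pins 0..3 become outputs
        System.out.println(Integer.toHexString(pio.readData()));  // prints "af"
    }
}
```

After reset the model behaves as an input port: a write to the data register only becomes visible on a pin once the corresponding control bit is set.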
This simple representation of memory mapped I/O devices in C is efficient but unsafe. On a standard JVM, native functions, written in C or C++, allow low-level access to devices from Java. This approach is neither safe nor object-oriented (OO) and incurs a lot of overhead; parameters and return values have to be converted between Java and C.

In an OO language the most natural way to represent an I/O device is as an object. Figure 3 shows a class definition and object instantiation for our simple parallel port. The class ParallelPort is equivalent to the structure definition for C in Figure 2. Reference myport points to the hardware object. The device register access is similar to the C version. The main difference to the C structure is that the access requires no pointers. To provide this convenient representation of I/O devices as objects we just need some magic in the JVM and a mechanism to create the device object and receive a reference to it. Representing I/O devices as first class objects has the following benefits:

Safe: The safety of Java is not compromised. We can access only those device registers that are represented by the class definition.

Efficient: For the most common case of memory mapped I/O, device access is through the bytecodes getfield and putfield; for a separate I/O address space the I/O instructions can be included in the JVM as variants of these bytecodes for hardware objects. Both solutions avoid expensive native calls.

    public final class ParallelPort {
        public volatile int data;
        public volatile int control;
    }

    int inval, outval;
    myport = JVMMagic.getParallelPort();
    ...
    inval = myport.data;
    myport.data = outval;

Figure 3. The parallel port device as a simple Java class

2.1 Hardware Object Creation

Representing the registers of each I/O device by an object or an array is clearly a good idea; but how are those objects created? An object that represents an I/O device is a typical Singleton [4]. Only one object should map to a single device. Therefore, hardware objects cannot be instantiated by a simple new: (1) they have to be mapped by some JVM magic to the device registers; (2) each device is represented by a single object.

One may assume that the board manufacturer provides the classes for the hardware objects and the configuration class for the board. This configuration class provides the Factory [4] methods (a common design pattern to create Singletons) to instantiate hardware objects.

Each I/O device object is created by its own Factory method. The collection of those methods is the board configuration, which itself is also a Singleton (we have only one board). The configuration Singleton property is enforced by a class that contains only static methods. Figure 4 shows an example of such a class. The class IOSystem represents a minimal system with two devices: a parallel port, as discussed before, to interact with the environment and a serial port for program download and debugging.

    package com.board-vendor.io;

    public class IOSystem {

        // do some JVM magic to create the PP object
        private static ParallelPort pp = JVMPPMagic();
        private static SerialPort sp = JVMSPMagic();

        public static ParallelPort getParallelPort() {
            return pp;
        }
        public static SerialPort getSerialPort() {..}
    }

Figure 4. A Factory with static methods for Singleton hardware objects

    public class IOFactory {

        private final static int SYS_ADDRESS = ...;
        private final static int SERIAL_ADDRESS = ...;
        private SysDevice sys;
        private SerialPort sp;

        IOFactory() {
            sys = (SysDevice) JVMIOMagic(SYS_ADDRESS);
            sp = (SerialPort) JVMIOMagic(SERIAL_ADDRESS);
        }

        private static IOFactory single = new IOFactory();

        public static IOFactory getFactory() {
            return single;
        }
        public SerialPort getSerialPort() { return sp; }
        public SysDevice getSysDevice() { return sys; }

        // here comes the magic!
        Object JVMIOMagic(int address) {...}
    }

    public class DspioFactory extends IOFactory {

        private final static int USB_ADDRESS = ...;
        private SerialPort usb;

        DspioFactory() {
            usb = (SerialPort) JVMIOMagic(USB_ADDRESS);
        }

        static DspioFactory single = new DspioFactory();

        public static DspioFactory getDspioFactory() {
            return single;
        }
        public SerialPort getUsbPort() { return usb; }
    }

Figure 5. A base class of a hardware object Factory and a Factory subclass
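The Singleton and Factory structure of Figures 4 and 5 can be tried out on a standard JVM with a small simulation. This is a sketch under stated assumptions, not the implementation used in the paper: the classes SimSerialPort, SimParallelPort, SimIOFactory, and FactoryDemo are hypothetical, and the JVM-specific mapping step is replaced by an ordinary constructor call, so the fields are plain memory rather than device registers.

```java
// Hypothetical stand-ins: on a standard JVM these fields are ordinary memory.
// A JVM with hardware-object support would map them to device registers.
final class SimSerialPort {
    public volatile int status;
    public volatile int data;
    SimSerialPort() { }   // package-private: only the factory should create it
}

final class SimParallelPort {
    public volatile int data;
    public volatile int control;
    SimParallelPort() { }
}

final class SimIOFactory {
    // The board configuration is itself a Singleton.
    private static final SimIOFactory SINGLE = new SimIOFactory();

    private final SimSerialPort sp;
    private final SimParallelPort pp;

    private SimIOFactory() {
        // Stand-in for the JVM-specific mapping step ("JVM magic").
        sp = new SimSerialPort();
        pp = new SimParallelPort();
    }

    static SimIOFactory getFactory() { return SINGLE; }
    SimSerialPort getSerialPort() { return sp; }
    SimParallelPort getParallelPort() { return pp; }
}

final class FactoryDemo {
    public static void main(String[] args) {
        SimParallelPort a = SimIOFactory.getFactory().getParallelPort();
        SimParallelPort b = SimIOFactory.getFactory().getParallelPort();
        a.control = 0x01;   // visible through b: both references name one object
        System.out.println(a == b && b.control == 0x01);   // prints "true"
    }
}
```

Because every caller obtains the same object from the factory, each simulated device is represented by exactly one object, which is the property the Factory methods are meant to enforce.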
[UML class diagram (not reproduced; the caption is missing in the source). Classes and members recoverable from the diagram:

IODevice: #IODevice()
SerialPort: +data : int, +status : int, +read() : char, +write()
ParallelPort: +data : int, +control : int
IOFactory: -single : IOFactory, -serial : SerialPort, -parallel : ParallelPort, -IOFactory(), +getFactory() : IOFactory, +getSerialPort() : SerialPort, +getParallelPort() : ParallelPort
DspioFactory: -single : DspioFactory, -usb : SerialPort, -DspioFactory(), +getDspioFactory() : DspioFactory, +getUsbPort() : SerialPort

The factory classes are connected to the device classes by «creates» relations.]
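[Figure 8 (diagram not reproduced). Memory layout in three regions. Runtime structures: class info (entries M0, M1, M2, ...) and constant pool. Handle area: handles, each holding a reference to the data, a class reference (for the array, the length 4), and GC info. Heap: an object with fields a and b, and an integer array with elements [0] to [3].]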
3.1 HW Objects on JOP

We have implemented the proposed hardware objects in the JVM for the Java processor JOP [11, 13]. No changes inside the JVM (the microcode in JOP) were necessary. The tricky part is the creation of hardware objects (the Factory classes).

3.1.1 Object Layout

In JOP objects and arrays are referenced through an indirection, called the handle. This indirection is a lightweight read barrier for the compacting real-time garbage collector (GC) [12, 14]. All handles for objects in the heap are located in a distinct memory region, the handle area. Besides the indirection to the real object, the handle contains auxiliary data, such as a reference to the class information, the array length, and GC related data. Figure 8 shows an example with a small object that contains two fields and an integer array of length 4. We can see that the object and the array on the heap just contain the data and no additional hidden fields. This object layout greatly simplifies our object to I/O device mapping. We just need a handle where the indirection points to the memory mapped device registers. This configuration is shown in the upper part of Figure 8. Note that we do not need the GC information for the HW object handles.

3.1.2 The Hardware Object Factory

As described in Section 2.1 we do not allow applications to create hardware objects; the constructor is private. Two static fields are used to store the handle to the hardware object. The first field is initialized with the base address of the I/O device; the second field contains a pointer to the class information. The address of the first static field is returned as the reference to the serial port object. We have to solve two issues: (1) obtain the class reference for the HW object; (2) return the address of a static field as a reference to the hardware object.

We have two options to get a pointer to the class information of a hardware object, such as SerialPort, in a method of IOFactory:

1. Create a normal instance of SerialPort with new on the heap and copy the pointer to the class information.

2. Invoke a static method of SerialPort. The method executes in the context of the class SerialPort and has access to the constant pool of that class and the rest of the class information.

Option 1 is simple and results in the following code for the object factory:

    SerialPort s = new SerialPort();
    int ref = Native.toInt(s);
    SP_MTAB = Native.rdMem(ref+1);

All methods in class Native, a JOP system class, are native (see footnote 3) methods for low-level functions, the code we want to avoid in application code. Method toInt(Object o) defeats Java's type safety and returns a reference as an int. Method rdMem(int addr) performs a memory read. In our case it reads the second word of the handle, the pointer to the class information. The main drawback of option 1 is the creation of normal instances of the hardware class. With option 1 the visibility of the constructor has to be relaxed to package.

Footnote 3. There are no native functions in JOP; bytecode is the native instruction set. The very few native functions in class Native are replaced by a special, unused bytecode during class linking.

For option 2 we have to extend each hardware object by a class method to retrieve the address of the class information. Figure 9 shows the version of SerialPort with this method. We again use native functions to access JVM internal information. In this case rdIntMem(1) loads one word from the
on-chip memory onto the top-of-stack. The on-chip memory contains the stack cache and some JVM internal registers. At address 1 the pointer to the constant pool of the actual class is located. From that address we can calculate the address of the class information. The main drawback of option 2 is the repetitive copy of getClassRef() in each hardware class. As this method has to be static (we need it before we have an actual instance of the class) we cannot move it to a common superclass.

    public final class SerialPort {

        public volatile int status;
        public volatile int data;

        static int getClassRef() {
            // we can access the constant pool pointer
            // and therefore get the class reference
            int cp = Native.rdIntMem(1);
            ...
            return ref;
        }
    }

Figure 9. A static method to retrieve the address of the class information

We decided to use option 1 to avoid the code duplication. The resulting package visibility of the hardware object constructor is a minor issue.

All I/O device classes and the Factory classes are grouped into a single package, in our case com.jopdesign.io. To avoid exposing the native functions (class Native) that reside in a system package we use delegation. The Factory constructor delegates all low-level work to a helper method from the system package.

3.2 HW Objects in CACAO

As a second experiment we have implemented the hardware objects in the CACAO VM [5]. The CACAO VM is a research JVM developed at the Vienna University of Technology and has a Just-In-Time (JIT) compiler for various architectures.

3.2.1 Object layout

As in most other JVMs, CACAO's Java object layout includes an object header which is part of the object itself and resides on the garbage collected heap (GC heap). This fact makes the idea of having a real hardware-object impossible without changing the CACAO VM radically. Thus we have to use an indirection for accessing hardware-fields and hardware-arrays. Having an indirection obviously adds an overhead for accesses to hardware-fields or hardware-arrays. On the other hand, CACAO widens primitive fields of the types boolean, byte, char, and short to int, which would make it impossible to access hardware-fields smaller than int directly in a Java object. With indirection we can solve this issue. We store the address of the hardware-field in a Java object field and access the correct data size in JIT code.

When it comes to storing the hardware address in a Java object field, we hit another problem. CACAO supports 32-bit and 64-bit architectures, and obviously a hardware address of a byte-field on a 64-bit architecture won't fit into a widened 32-bit object field. To get around this problem we widen all object fields of sub-classes of org.cacaovm.io.IODevice to the pointer size on 64-bit machines. To be able to widen these fields and to generate the correct code later on in the JIT compiler, we add a VM internal flag ACC_CLASS_HARDWARE_FIELDS and set it for the class org.cacaovm.io.IODevice and all its subclasses, so the JIT compiler can generate the correct code without the need to do super-class tests during the JIT compiler run. For hardware-arrays we have to implement a similar approach. The object layout of an array in CACAO looks like this:

    typedef struct java_array_t {
        java_object_t objheader;
        int32_t       size;
    } java_array_t;

    typedef struct java_intarray_t {
        java_array_t header;
        int32_t      data[1];
    } java_intarray_t;

The data field of the array structure is expanded to the actual size when the array object is allocated on the Java heap. This is a common practice in C.

When we want to access a hardware array we have the same problem as for fields: the array header. We cannot put the array directly on the hardware addresses. Therefore we add a union to the java_xxxarray_t structures:

    typedef struct java_intarray_t {
        java_array_t header;
        union {
            int32_t  array[1];
            intptr_t address;
        } data;
    } java_intarray_t;

Now we can allocate the required memory for Java arrays or store the hardware address for hardware arrays in the array object.

3.2.2 Implementation

CACAO's JIT compiler generates widened loads and stores for getfield and putfield instructions. But when we want to load byte or short fields from a hardware object we need to generate 8-bit or 16-bit loads and stores, respectively. To get these instructions generated we implement additional cases in the JIT compiler for the various primitive types.

Whether the JIT compiler needs to generate 8-bit or 16-bit loads and stores for boolean, byte, char, or short fields is decided on the flags of the declared class.
Contrary to hardware fields, when accessing hardware arrays we have to generate some dedicated code for array accesses to distinguish between Java arrays and hardware arrays at runtime and generate two different code paths, one to access Java arrays and the other to access hardware arrays.

3.3 HW Objects in SimpleRTJ

In a third experiment we have implemented hardware objects for the SimpleRTJ interpreter [10]. The SimpleRTJ VM is described in more detail in [9]. To support direct reading from and writing to raw memory we introduced an additional version of the put/get-field bytecodes. We changed the VM locally to use these versions at bytecode addresses where access to hardware objects is performed. The original versions of put/get-field are not changed and are still used to access normal Java object fields.

The new versions of put/get-field to handle hardware objects are different. An object is identified as a hardware object if it inherits from the base class IODevice. This base class defines one 32-bit integer field called address. During initialization of the hardware object the address field is set to the absolute address of the device register range that this hardware object accesses.

The hardware object specific versions of put/get-field calculate the offset of the field being accessed as the sum of the widths of all fields preceding it. In the following example control has an offset of 0, data an offset of 1, status an offset of 3, and finally reset an offset of 7.

    class DummyDevice extends IODevice {
        public byte control;
        public short data;
        public int status;
        public int reset;
    }

The field offset is added to the base address as stored in the super class instance variable address to get the absolute address of the device register or raw memory to access. The width (or number of bytes) of the data to access is derived from the type of the field.

To ensure that the speed by which normal objects are accessed does not suffer from the presence of hardware objects we use the following strategy: the first time a put/get-field bytecode is executed, a check is made whether the object accessed is a hardware object. If so, the bytecode is substituted with the hardware object specific version of put/get-field. If not, the bytecode is substituted with the normal version of put/get-field.

For this to be sound, a specific put/get-field instruction is never allowed to access both normal and hardware objects. In a polymorphic language like Java this is in general not a sound assumption. However, with the inheritance hierarchy of hardware object types this is a safe assumption.

3.4 Summary

We have described the implementation of hardware objects on JOP in great detail and outlined the implementation in CACAO and in SimpleRTJ. Other JVMs use different structures for their class and object representations, and the presented solutions cannot be applied directly. However, the provided details give guidelines for changing other JVMs to implement hardware objects.

On JOP all the code could be written in Java (see footnote 4); it was not necessary to change the microcode (the low-level implementation of the JVM bytecodes in JOP). Only a single change in the runtime representation of classes proved necessary. The implementation in CACAO was straightforward. Adding a new internal flag to mark classes which contain hardware-fields and generating slightly more code for array accesses was enough to get hardware objects working in CACAO.

Footnote 4. Except the already available primitive native functions.

4 Conclusion

We have introduced the notion of hardware objects. They provide an object-oriented abstraction of low-level devices. They are first class objects providing safe and efficient access to device registers from Java.

To show that the concept is practical we have implemented it in three different JVMs: in the Java processor JOP, in the research VM CACAO, and in the embedded JVM SimpleRTJ. The implementation on JOP was surprisingly simple: the coding took about a single day. The changes in the JIT JVM and in the interpreter JVM have been slightly more complex.

The proposed hardware objects are an important step for embedded Java systems without a middleware layer. Device drivers can be efficiently programmed in Java and benefit from the same safety aspects as Java application code.

4.1 Performance

Our main objective for hardware objects is a clean OO interface to I/O devices. Performance of the access to device registers is an important secondary goal, because short access time is important on relatively slow embedded processors while it matters less on general purpose processors, where the slow I/O bus essentially limits the access time. In Table 1 we compare the access time to a device register with native functions to the access via hardware objects.

On JOP the native access is faster than using hardware objects. This is due to the fact that a native access is a special bytecode and not a function call. The special bytecode accesses memory directly, whereas the bytecodes putfield and getfield perform a null pointer check and an indirection through the handle.

The performance evaluation with the CACAO JVM has been performed on a 2 GHz x86_64 machine under Linux
with reads and writes to the serial port. The access via hardware objects is slightly faster (6% for read and 9% for write, respectively). The kernel trap and the access time on the I/O bus dominate the cost of the access in both versions. In an experiment with shared memory instead of a real I/O device the cost of the native function call was considerable.

                  JOP            CACAO          SimpleRTJ
              read  write    read   write     read  write
    native       8      9   24004   23683     2588   1123
    HWO         21     24   22630   21668     3956   3418

Table 1. Access time to a device register in clock cycles

On the SimpleRTJ VM the native access is slightly faster than access to hardware objects. The reason is that the SimpleRTJ VM does not implement JNI, but has its own proprietary, more efficient way to invoke native methods. It does this very efficiently using a pre-linking phase where the invokestatic bytecode is instrumented with information to allow an immediate invocation of the target native function. On the other hand, hardware object field access needs a field lookup that is more time consuming than invoking a static method.

4.2 Safety and Portability Aspects

Hardware objects map object fields to the device registers. When the class that represents an I/O device is correct, access to the low-level device is safe: it is not possible to read from or write to an arbitrary memory address. A memory area represented by an array is protected by Java's array bounds check.

It is obvious that hardware objects are platform dependent; after all, the idea is to have an interface to the bare metal. Nevertheless, hardware objects give device manufacturers an opportunity to supply supporting software that fits into Java's object-oriented framework and thus cater for developers of embedded software.

4.3 Interrupts

Hardware objects are a vehicle to write device drivers in Java and benefit from the safe language. However, most device drivers also need to handle interrupts. We have not covered the topic of writing interrupt handlers in Java. This topic is covered by a companion paper [7], where we discuss interrupt handlers implemented in Java. Jointly, Java hardware objects and interrupt handlers make it attractive to develop platform dependent middleware fully within an object-oriented framework with excellent structuring facilities and fine grained control over the unavoidable unsafe facilities.

Acknowledgement

We thank the anonymous reviewers for their detailed and insightful comments that helped to improve the paper.

References

[1] G. Bollella, J. Gosling, B. Brosgol, P. Dibble, S. Furr, and M. Turnbull. The Real-Time Specification for Java. Java Series. Addison-Wesley, June 2000.
[2] A. Burns and A. J. Wellings. Real-Time Systems and Programming Languages: ADA 95, Real-Time Java, and Real-Time POSIX. Addison-Wesley Longman Publishing Co., Inc., 3rd edition, 2001.
[3] M. Felser, M. Golm, C. Wawersich, and J. Kleinöder. The JX operating system. In Proceedings of the USENIX Annual Technical Conference, pages 45–58, 2002.
[4] E. Gamma, R. Helm, R. Johnson, and J. M. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison Wesley Professional, 1994.
[5] R. Grafl. CACAO: A 64-Bit JavaVM Just-in-Time Compiler. Master's thesis, Vienna University of Technology, 1997.
[6] D. S. Hardin. Real-time objects on the bare metal: An efficient hardware realization of the Java virtual machine. In Proceedings of the Fourth International Symposium on Object-Oriented Real-Time Distributed Computing, page 53. IEEE Computer Society, 2001.
[7] S. Korsholm, M. Schoeberl, and A. P. Ravn. Java interrupt handling. In Proceedings of the 11th IEEE International Symposium on Object/component/service-oriented Real-time distributed Computing (ISORC 2008), Orlando, Florida, USA, May 2008. IEEE Computer Society.
[8] J. Kwon, A. Wellings, and S. King. Ravenscar-Java: A high integrity profile for real-time Java. In Proceedings of the 2002 joint ACM-ISCOPE conference on Java Grande, pages 131–140. ACM Press, 2002.
[9] E. Potratz. A practical comparison between Java and Ada in implementing a real-time embedded system. In SigAda '03: Proceedings of the 2003 annual ACM SIGAda international conference on Ada, pages 71–83. ACM Press, 2003.
[10] RTJComputing. http://www.rtjcom.com. Visited June 2007.
[11] M. Schoeberl. JOP: A Java Optimized Processor for Embedded Real-Time Systems. PhD thesis, Vienna University of Technology, 2005.
[12] M. Schoeberl. Real-time garbage collection for Java. In Proceedings of the 9th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC 2006), pages 424–432, Gyeongju, Korea, April 2006.
[13] M. Schoeberl. A Java processor architecture for embedded real-time systems. Article in press and online: Journal of Systems Architecture, doi:10.1016/j.sysarc.2007.06.001, 2007.
[14] M. Schoeberl and J. Vitek. Garbage collection for safety critical Java. In Proceedings of the 5th International Workshop on Java Technologies for Real-time and Embedded Systems (JTRES 2007), pages 85–93, Vienna, Austria, September 2007. ACM Press.
[15] D. Simon, C. Cifuentes, D. Cleal, J. Daniels, and D. White. Java on the bare metal of wireless sensor devices: the Squawk Java virtual machine. In VEE '06: Proceedings of the 2nd international conference on Virtual execution environments, pages 78–88. ACM Press, 2006.