
Java Class File - Wikipedia

A Java class file, with a .class extension, contains Java bytecode executable on the Java Virtual Machine (JVM) and is generated from Java source files. The structure of a class file includes sections such as a magic number, version information, constant pool, access flags, class and superclass names, interfaces, fields, methods, and attributes. The format is platform-independent, allowing class files compiled on one platform to run on another, and it has undergone modifications as per Java Specification Request (JSR) 202 since its inception.

Uploaded by

Elavarasan V

Java class file

A Java class file is a file (with the .class filename extension) containing Java bytecode that
can be executed on the Java Virtual Machine (JVM). A Java class file is usually produced by a
Java compiler from Java programming language source files (.java files) containing Java
classes (alternatively, other JVM languages can also be used to create class files). If a source
file has more than one class, each class is compiled into a separate class file. Thus, it is called a
.class file because it contains the bytecode for a single class.

JVMs are available for many platforms, and a class file compiled on one platform will
execute on a JVM of another platform. This makes Java applications platform-independent.

Java class file
Internet media type: application/java-vm, application/x-httpd-java, application/x-java, application/java, application/java-byte-code, application/x-java-class, application/x-java-vm
Developed by: Sun Microsystems

History

On 11 December 2006, the class file format was modified under Java Specification
Request (JSR) 202.[1]

File layout and structure

Sections

There are 10 basic sections to the Java class file structure:

Magic Number: 0xCAFEBABE

Version of Class File Format: the minor and major versions of the class file

Constant Pool: Pool of constants for the class

Access Flags: for example whether the class is abstract, static, etc.

This Class: The name of the current class

Super Class: The name of the super class

Interfaces: Any interfaces in the class

Fields: Any fields in the class

Methods: Any methods in the class

Attributes: Any attributes of the class (for example the name of the sourcefile, etc.)

Magic Number

Class files are identified by the following 4-byte header (in hexadecimal): CA FE BA BE (the
first 4 entries in the table below). The history of this magic number was explained by James
Gosling referring to a restaurant in Palo Alto:[2]

"We used to go to lunch at a place called St Michael's Alley. According to


local legend, in the deep dark past, the Grateful Dead used to perform
there before they made it big. It was a pretty funky place that was
definitely a Grateful Dead Kinda Place. When Jerry died, they even put
up a little Buddhist-esque shrine. When we used to go there, we
referred to the place as Cafe Dead. Somewhere along the line it was
noticed that this was a HEX number. I was re-vamping some file format
code and needed a couple of magic numbers: one for the persistent
object file, and one for classes. I used CAFEDEAD for the object file
format, and in grepping for 4 character hex words that fit after "CAFE"
(it seemed to be a good theme) I hit on BABE and decided to use it. At
that time, it didn't seem terribly important or destined to go anywhere
but the trash-can of history. So CAFEBABE became the class file format,
and CAFEDEAD was the persistent object format. But the persistent
object facility went away, and along with it went the use of CAFEDEAD -
it was eventually replaced by RMI."

General layout

Because the class file contains variable-sized items and does not contain embedded file
offsets (or pointers), it is typically parsed sequentially, from the first byte toward the end. At the
lowest level the file format is described in terms of a few fundamental data types:

u1: an unsigned 8-bit integer

u2: an unsigned 16-bit integer in big-endian byte order

u4: an unsigned 32-bit integer in big-endian byte order

table: an array of variable-length items of some type. The number of items in the table is
identified by a preceding count number (the count is a u2), but the size in bytes of the table
can only be determined by examining each of its items.
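
As a rough illustration of the fundamental types listed above, the following Python sketch reads u1, u2, and u4 values and a count-prefixed table from a byte buffer. The helper names (read_u1, read_u2, read_u4, read_u2_table) are hypothetical, and the table reader assumes the simplest case where every item is itself a u2, as in the interface table; tables with variable-sized items need a per-item parser instead.

```python
import struct

def read_u1(buf, pos):
    """Return (value, next_pos) for an unsigned 8-bit integer."""
    return buf[pos], pos + 1

def read_u2(buf, pos):
    """Return (value, next_pos) for a big-endian unsigned 16-bit integer."""
    return struct.unpack_from(">H", buf, pos)[0], pos + 2

def read_u4(buf, pos):
    """Return (value, next_pos) for a big-endian unsigned 32-bit integer."""
    return struct.unpack_from(">I", buf, pos)[0], pos + 4

def read_u2_table(buf, pos):
    """Read a table whose items are u2 values, preceded by a u2 count."""
    count, pos = read_u2(buf, pos)
    items = []
    for _ in range(count):
        item, pos = read_u2(buf, pos)
        items.append(item)
    return items, pos

# Synthetic bytes for illustration only: a u4, then a table of two u2 items.
data = bytes([0xCA, 0xFE, 0xBA, 0xBE, 0x00, 0x02, 0x00, 0x05, 0x00, 0x07])
magic, pos = read_u4(data, 0)
table, pos = read_u2_table(data, pos)
```

Each reader returns the next position along with the value, which matches the sequential, offset-free parsing style the format calls for.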

Some of these fundamental types are then re-interpreted as higher-level values (such as strings
or floating-point numbers), depending on context. There is no enforcement of word alignment,
and so no padding bytes are ever used. The overall layout of the class file is as shown in the
following table.

Byte offset | Size | Type or value | Description
------------|------|---------------|------------
0 | 4 bytes | u1 = 0xCA hex | magic number (CAFEBABE) used to identify file as conforming to the class file format
1 | | u1 = 0xFE hex |
2 | | u1 = 0xBA hex |
3 | | u1 = 0xBE hex |
4, 5 | 2 bytes | u2 | minor version number of the class file format being used
6, 7 | 2 bytes | u2 | major version number of the class file format being used.[3] The known values are listed after this table.
8, 9 | 2 bytes | u2 | constant pool count, number of entries in the following constant pool table. This count is at least one greater than the actual number of entries; see following discussion.
10 ... | cpsize (variable) | table | constant pool table, an array of variable-sized constant pool entries, containing items such as literal numbers, strings, and references to classes or methods. Indexed starting at 1, containing (constant pool count - 1) number of entries in total (see note).
10+cpsize, 11+cpsize | 2 bytes | u2 | access flags, a bitmask
12+cpsize, 13+cpsize | 2 bytes | u2 | identifies this class, index into the constant pool to a "Class"-type entry
14+cpsize, 15+cpsize | 2 bytes | u2 | identifies super class, index into the constant pool to a "Class"-type entry
16+cpsize, 17+cpsize | 2 bytes | u2 | interface count, number of entries in the following interface table
18+cpsize ... | isize (variable) | table | interface table: a variable-length array of constant pool indexes describing the interfaces implemented by this class
18+cpsize+isize, 19+cpsize+isize | 2 bytes | u2 | field count, number of entries in the following field table
20+cpsize+isize ... | fsize (variable) | table | field table, variable length array of fields; each element is a field_info structure defined in https://docs.oracle.com/javase/specs/jvms/se8/html/jv[…]4.html#jvms-4.5
20+cpsize+isize+fsize, 21+cpsize+isize+fsize | 2 bytes | u2 | method count, number of entries in the following method table
22+cpsize+isize+fsize ... | msize (variable) | table | method table, variable length array of methods; each element is a method_info structure defined in https://docs.oracle.com/javase/specs/jvms/se8/html/jv[…]4.html#jvms-4.6
22+cpsize+isize+fsize+msize, 23+cpsize+isize+fsize+msize | 2 bytes | u2 | attribute count, number of entries in the following attribute table
24+cpsize+isize+fsize+msize ... | asize (variable) | table | attribute table, variable length array of attributes; each element is an attribute_info structure defined in https://docs.oracle.com/javase/specs/jvms/se8/html/jv[…]4.html#jvms-4.7

Major version numbers (bytes 6 and 7):

Java SE 25 = 69 (0x45 hex),
Java SE 24 = 68 (0x44 hex),
Java SE 23 = 67 (0x43 hex),
Java SE 22 = 66 (0x42 hex),
Java SE 21 = 65 (0x41 hex),
Java SE 20 = 64 (0x40 hex),
Java SE 19 = 63 (0x3F hex),
Java SE 18 = 62 (0x3E hex),
Java SE 17 = 61 (0x3D hex),
Java SE 16 = 60 (0x3C hex),
Java SE 15 = 59 (0x3B hex),
Java SE 14 = 58 (0x3A hex),
Java SE 13 = 57 (0x39 hex),
Java SE 12 = 56 (0x38 hex),
Java SE 11 = 55 (0x37 hex),
Java SE 10 = 54 (0x36 hex),[4]
Java SE 9 = 53 (0x35 hex),[5]
Java SE 8 = 52 (0x34 hex),
Java SE 7 = 51 (0x33 hex),
Java SE 6.0 = 50 (0x32 hex),
Java SE 5.0 = 49 (0x31 hex),
JDK 1.4 = 48 (0x30 hex),
JDK 1.3 = 47 (0x2F hex),
JDK 1.2 = 46 (0x2E hex),
JDK 1.1 = 45 (0x2D hex).

For details of earlier version numbers see footnote 1 at Java™ Virtual Machine Specification 2nd edition (https[…]s.oracle.com/javase/specs/jvms/se6/html/ClassFile.doc[…]l)
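
The fixed-size fields at the start of the layout above can be read in one step. The sketch below is a minimal illustration, not a full parser: parse_class_header and java_release are hypothetical names, the sample bytes are synthetic, and the release labels are an assumption generalized from the version list (45 to 48 map to JDK 1.1 to 1.4, and 49 onward map to Java SE 5 onward), so values newer than those listed are extrapolated.

```python
import struct

def parse_class_header(data):
    """Parse the magic number, versions, and constant pool count (bytes 0-9)."""
    if len(data) < 10:
        raise ValueError("too short to be a class file")
    # ">IHHH" = big-endian u4, u2, u2, u2 with no padding (10 bytes in total).
    magic, minor, major, cp_count = struct.unpack_from(">IHHH", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("bad magic number: 0x%08X" % magic)
    return {"minor": minor, "major": major, "constant_pool_count": cp_count}

def java_release(major):
    """Map a major version to a release label (assumed pattern, see lead-in)."""
    if 45 <= major <= 48:
        return "JDK 1.%d" % (major - 44)
    if major >= 49:
        return "Java SE %d" % (major - 44)
    raise ValueError("unknown major version %d" % major)

# Synthetic header: magic, minor 0, major 65 (0x41), constant pool count 16.
sample = bytes.fromhex("CAFEBABE" "0000" "0041" "0010")
header = parse_class_header(sample)
release = java_release(header["major"])
```

With a real file, the same function could be applied to the first ten bytes read in binary mode.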

Representation in a C-like programming language

Since C does not support multiple variable-length arrays within a struct, the code below won't
compile and only serves as a demonstration.

struct Class_File_Format {
    u4 magic_number;                 /* 0xCAFEBABE */

    u2 minor_version;
    u2 major_version;

    u2 constant_pool_count;          /* one greater than the highest valid index */
    cp_info constant_pool[constant_pool_count - 1];

    u2 access_flags;

    u2 this_class;                   /* constant pool index */
    u2 super_class;                  /* constant pool index */

    u2 interfaces_count;
    u2 interfaces[interfaces_count]; /* constant pool indexes */

    u2 fields_count;
    field_info fields[fields_count];

    u2 methods_count;
    method_info methods[methods_count];

    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

The constant pool

The constant pool table is where most of the literal constant values are stored. This includes
values such as numbers of all sorts, strings, identifier names, references to classes and methods,
and type descriptors. All indexes, or references, to specific constants in the constant pool table
are given by 16-bit (type u2) numbers, where index value 1 refers to the first constant in the table
(index value 0 is invalid).

Due to historic choices made during the file format development, the number of constants in the
constant pool table is not actually the same as the constant pool count which precedes the
table. First, the table is indexed starting at 1 (rather than 0), but the count should actually be
interpreted as the maximum index plus one.[6] Additionally, two types of constants (longs and
doubles) take up two consecutive slots in the table, although the second such slot is a phantom
index that is never directly used.

The type of each item (constant) in the constant pool is identified by an initial byte tag. The
number of bytes following this tag and their interpretation are then dependent upon the tag
value. The valid constant types and their tag values are:

Tag byte | Additional bytes | Description of constant | Version introduced
---------|------------------|-------------------------|-------------------
1 | 2+x bytes (variable) | UTF-8 (Unicode) string: a character string prefixed by a 16-bit number (type u2) indicating the number of bytes in the encoded string which immediately follows (which may be different than the number of characters). Note that the encoding used is not actually UTF-8, but involves a slight modification of the Unicode standard encoding form. | 1.0.2
3 | 4 bytes | Integer: a signed 32-bit two's complement number in big-endian format | 1.0.2
4 | 4 bytes | Float: a 32-bit single-precision IEEE 754 floating-point number | 1.0.2
5 | 8 bytes | Long: a signed 64-bit two's complement number in big-endian format (takes two slots in the constant pool table) | 1.0.2
6 | 8 bytes | Double: a 64-bit double-precision IEEE 754 floating-point number (takes two slots in the constant pool table) | 1.0.2
7 | 2 bytes | Class reference: an index within the constant pool to a UTF-8 string containing the fully qualified class name (in internal format) (big-endian) | 1.0.2
8 | 2 bytes | String reference: an index within the constant pool to a UTF-8 string (big-endian too) | 1.0.2
9 | 4 bytes | Field reference: two indexes within the constant pool, the first pointing to a Class reference, the second to a Name and Type descriptor. (big-endian) | 1.0.2
10 | 4 bytes | Method reference: two indexes within the constant pool, the first pointing to a Class reference, the second to a Name and Type descriptor. (big-endian) | 1.0.2
11 | 4 bytes | Interface method reference: two indexes within the constant pool, the first pointing to a Class reference, the second to a Name and Type descriptor. (big-endian) | 1.0.2
12 | 4 bytes | Name and type descriptor: two indexes to UTF-8 strings within the constant pool, the first representing a name (identifier) and the second a specially encoded type descriptor. | 1.0.2
15 | 3 bytes | Method handle: this structure is used to represent a method handle and consists of one byte of type descriptor, followed by an index within the constant pool.[6] | 7
16 | 2 bytes | Method type: this structure is used to represent a method type, and consists of an index within the constant pool.[6] | 7
17 | 4 bytes | Dynamic: this is used to specify a dynamically computed constant produced by invocation of a bootstrap method.[6] | 11
18 | 4 bytes | InvokeDynamic: this is used by an invokedynamic instruction to specify a bootstrap method, the dynamic invocation name, the argument and return types of the call, and optionally, a sequence of additional constants called static arguments to the bootstrap method.[6] | 7
19 | 2 bytes | Module: this is used to identify a module.[6] | 9
20 | 2 bytes | Package: this is used to identify a package exported or opened by a module.[6] | 9
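
To show how the tag byte, the per-tag sizes, and the two-slot rule for longs and doubles fit together, here is a hedged Python sketch that steps through a constant pool. The function name walk_constant_pool and the sample bytes are illustrative assumptions; the size table is copied from the "Additional bytes" column above, and payloads are returned as raw bytes rather than decoded.

```python
import struct

# Payload sizes in bytes after the tag, for every fixed-size tag in the table.
FIXED_SIZES = {3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4, 11: 4,
               12: 4, 15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2}

def walk_constant_pool(buf, pos, cp_count):
    """Return ({index: (tag, payload)}, next_pos) for a pool starting at pos."""
    entries = {}
    index = 1  # index 0 is invalid; valid indexes run from 1 to cp_count - 1
    while index < cp_count:
        tag = buf[pos]
        pos += 1
        if tag == 1:
            # UTF-8 entry: a u2 byte length, then that many encoded bytes.
            (size,) = struct.unpack_from(">H", buf, pos)
            pos += 2
        elif tag in FIXED_SIZES:
            size = FIXED_SIZES[tag]
        else:
            raise ValueError("unknown tag %d at index %d" % (tag, index))
        entries[index] = (tag, bytes(buf[pos:pos + size]))
        pos += size
        # Long (5) and Double (6) occupy two slots; the second is never used.
        index += 2 if tag in (5, 6) else 1
    return entries, pos

# Synthetic pool with count 5: Utf8 "Hi", Long 1 (slots 2 and 3), Class -> #1.
pool = (bytes([1, 0, 2]) + b"Hi"
        + bytes([5]) + (1).to_bytes(8, "big")
        + bytes([7, 0, 1]))
entries, end = walk_constant_pool(pool, 0, 5)
```

In this sample the pool count is 5 although only three entries are stored, because index 3 is the phantom slot that follows the long at index 2.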


There are only two integral constant types, integer and long. Other integral types appearing in the
high-level language, such as boolean, byte, and short must be represented as an integer constant.

Class names in Java, when fully qualified, are traditionally dot-separated, such as
"java.lang.Object". However within the low-level Class reference constants, an internal form
appears which uses slashes instead, such as "java/lang/Object".
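
A small sketch of the conversion between the two forms, assuming plain top-level class names; the function names are hypothetical and array types or type descriptors are not handled:

```python
def to_internal_name(dotted_name):
    """Convert a dot-separated class name to the slash-separated internal form."""
    return dotted_name.replace(".", "/")

def to_dotted_name(internal_name):
    """Convert an internal (slash-separated) class name back to dotted form."""
    return internal_name.replace("/", ".")

internal = to_internal_name("java.lang.Object")
```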

The Unicode strings, despite the moniker "UTF-8 string", are not actually encoded according to
the Unicode standard, although it is similar. There are two differences (see UTF-8 for a complete
discussion). The first is that the code point U+0000 is encoded as the two-byte sequence C0
80 (in hex) instead of the standard single-byte encoding 00. The second difference is that
supplementary characters (those outside the BMP at U+10000 and above) are encoded using a
surrogate-pair construction similar to UTF-16 rather than being directly encoded using UTF-8. In
this case each of the two surrogates is encoded separately in UTF-8. For example, U+1D11E is
encoded as the 6-byte sequence ED A0 B4 ED B4 9E, rather than the correct 4-byte UTF-8
encoding of F0 9D 84 9E.
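
The two differences can be reproduced with a short encoder. This is a sketch written for illustration, not code taken from any JVM: encode_modified_utf8 is a hypothetical name, and the approach (split the text into UTF-16 code units, then encode each unit on its own) is one straightforward way to obtain the byte sequences described above.

```python
def encode_modified_utf8(text):
    """Encode text the way class file string constants are described above."""
    out = bytearray()
    # Split into UTF-16 code units so supplementary characters become
    # surrogate pairs; "surrogatepass" also tolerates lone surrogates.
    utf16 = text.encode("utf-16-be", "surrogatepass")
    for i in range(0, len(utf16), 2):
        unit = (utf16[i] << 8) | utf16[i + 1]
        if unit == 0:
            out += b"\xC0\x80"            # U+0000 uses two bytes, never 00
        elif unit < 0x80:
            out.append(unit)              # one byte, same as standard UTF-8
        elif unit < 0x800:
            out.append(0xC0 | (unit >> 6))
            out.append(0x80 | (unit & 0x3F))
        else:
            # Three bytes; each surrogate half is encoded separately here.
            out.append(0xE0 | (unit >> 12))
            out.append(0x80 | ((unit >> 6) & 0x3F))
            out.append(0x80 | (unit & 0x3F))
    return bytes(out)

clef = encode_modified_utf8("\U0001D11E")   # the U+1D11E example from the text
nul = encode_modified_utf8("\x00")
```

For text without U+0000 or supplementary characters, the output is identical to standard UTF-8.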

See also

Java bytecode

References

1. JSR 202 (http://www.jcp.org/en/jsr/detail?id=202) Java Class File Specification Update

2. James Gosling private communication to Bill Bumgarner (http://radio-weblogs.com/0100490/2003/01/28.html)

3. "Table 4.1-A. class file format major versions" (https://docs.oracle.com/javase/specs/jvms/se23/html/jvms-4.html#jvms-4.1-200-B.2).

4. "JDK 10 Release Notes" (http://www.oracle.com/technetwork/java/javase/10-relnote-issues-4108729.html#Remaining).

5. "[JDK-8148785] Update class file version to 53 for JDK-9 - Java Bug System" (https://bugs.openjdk.java.net/browse/JDK-8148785).

6. "Chapter 4. The class File Format" (https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4).

Further reading

Tim Lindholm, Frank Yellin (1999). The Java Virtual Machine Specification
(https://docs.oracle.com/javase/specs/jvms/se6/html/ClassFile.doc.html) (Second ed.). Prentice Hall.
ISBN 0-201-43294-3. Retrieved 2008-10-13. The official defining document of the Java Virtual
Machine, which includes the class file format. Both the first and second editions of the book
are freely available online for viewing and/or download (https://docs.oracle.com/javase/specs/).
