Writing An LLVM Compiler Backend

The document provides instructions for writing a compiler backend for LLVM that converts the LLVM intermediate representation (IR) to machine code for a specified target. It outlines the basic steps which include creating a TargetMachine subclass, describing the register set and instruction set using TableGen, implementing instruction selection and code generation passes, and writing an assembly printer.

Uploaded by

nevdull
Copyright
© Attribution Non-Commercial (BY-NC)

http://www.llvm.org/docs/WritingAnLLVMBackend.html#register-set-and-register-classes

Writing an LLVM Compiler Backend


Introduction
This document describes techniques for writing compiler backends that convert the LLVM Intermediate Representation (IR) to code for a specified machine or other languages. Code intended for a specific machine can take the form of either assembly code or binary code (usable for a JIT compiler).

The backend of LLVM features a target-independent code generator that may create output for several types of target CPUs, including X86, PowerPC, ARM, and SPARC. The backend may also be used to generate code targeted at SPUs of the Cell processor or GPUs to support the execution of compute kernels.

The document focuses on existing examples found in subdirectories of llvm/lib/Target in a downloaded LLVM release. In particular, this document focuses on the example of creating a static compiler (one that emits text assembly) for a SPARC target, because SPARC has fairly standard characteristics, such as a RISC instruction set and straightforward calling conventions.

Audience
The audience for this document is anyone who needs to write an LLVM backend to generate code for a specific hardware or software target.

Prerequisite Reading
These essential documents must be read before reading this document:

- LLVM Language Reference Manual: a reference manual for the LLVM assembly language.
- The LLVM Target-Independent Code Generator: a guide to the components (classes and code generation algorithms) for translating the LLVM internal representation into machine code for a specified target. Pay particular attention to the descriptions of code generation stages: Instruction Selection, Scheduling and Formation, SSA-based Optimization, Register Allocation, Prolog/Epilog Code Insertion, Late Machine Code Optimizations, and Code Emission.
- TableGen Fundamentals: a document that describes the TableGen (tblgen) application that manages domain-specific information to support LLVM code generation. TableGen processes input from a target description file (.td suffix) and generates C++ code that can be used for code generation.
- Writing an LLVM Pass: the assembly printer is a FunctionPass, as are several SelectionDAG processing steps.

To follow the SPARC examples in this document, have a copy of The SPARC Architecture Manual, Version 8 for reference. For details about the ARM instruction set, refer to the ARM Architecture Reference Manual. For more about the GNU Assembler format (GAS), see Using As, especially for the assembly printer. Using As contains a list of target machine dependent features.

Basic Steps

To write a compiler backend for LLVM that converts the LLVM IR to code for a specified target (machine or other language), follow these steps:

- Create a subclass of the TargetMachine class that describes characteristics of your target machine. Copy existing examples of specific TargetMachine class and header files; for example, start with SparcTargetMachine.cpp and SparcTargetMachine.h, but change the file names for your target. Similarly, change code that references Sparc to reference your target.
- Describe the register set of the target. Use TableGen to generate code for register definition, register aliases, and register classes from a target-specific RegisterInfo.td input file. You should also write additional code for a subclass of the TargetRegisterInfo class that represents the class register file data used for register allocation and also describes the interactions between registers.
- Describe the instruction set of the target. Use TableGen to generate code for target-specific instructions from target-specific versions of TargetInstrFormats.td and TargetInstrInfo.td. You should write additional code for a subclass of the TargetInstrInfo class to represent machine instructions supported by the target machine.
- Describe the selection and conversion of the LLVM IR from a Directed Acyclic Graph (DAG) representation of instructions to native target-specific instructions. Use TableGen to generate code that matches patterns and selects instructions based on additional information in a target-specific version of TargetInstrInfo.td. Write code for XXXISelDAGToDAG.cpp, where XXX identifies the specific target, to perform pattern matching and DAG-to-DAG instruction selection. Also write code in XXXISelLowering.cpp to replace or remove operations and data types that are not supported natively in a SelectionDAG.
- Write code for an assembly printer that converts LLVM IR to a GAS format for your target machine. You should add assembly strings to the instructions defined in your target-specific version of TargetInstrInfo.td. You should also write code for a subclass of AsmPrinter that performs the LLVM-to-assembly conversion and a trivial subclass of TargetAsmInfo.
- Optionally, add support for subtargets (i.e., variants with different capabilities). You should also write code for a subclass of the TargetSubtarget class, which allows you to use the -mcpu= and -mattr= command-line options.
- Optionally, add JIT support and create a machine code emitter (subclass of TargetJITInfo) that is used to emit binary code directly into memory.

In the .cpp and .h files, initially stub up these methods and then implement them later. Initially, you may not know which private members the class will need and which components will need to be subclassed.

Preliminaries
To actually create your compiler backend, you need to create and modify a few files. The absolute minimum is discussed here. But to actually use the LLVM target-independent code generator, you must perform the steps described in the LLVM Target-Independent Code Generator document.

First, you should create a subdirectory under lib/Target to hold all the files related to your target. If your target is called Dummy, create the directory lib/Target/Dummy.

In this new directory, create a Makefile. It is easiest to copy a Makefile of another target and modify it. It should at least contain the LEVEL, LIBRARYNAME, and TARGET variables, and then include $(LEVEL)/Makefile.common. The library can be named LLVMDummy (for example, see the MIPS target).

Alternatively, you can split the library into LLVMDummyCodeGen and LLVMDummyAsmPrinter, the latter of which should be implemented in a subdirectory below lib/Target/Dummy (for example, see the PowerPC target). Note that these two naming schemes are hardcoded into llvm-config. Using any other naming scheme will confuse llvm-config and produce a lot of (seemingly unrelated) linker errors when linking llc.

To make your target actually do something, you need to implement a subclass of TargetMachine. This implementation should typically be in the file lib/Target/DummyTargetMachine.cpp, but any file in the lib/Target directory will be built and should work. To use LLVM's target-independent code generator, you should do what all current machine backends do: create a subclass of LLVMTargetMachine. (To create a target from scratch, create a subclass of TargetMachine.)

To get LLVM to actually build and link your target, you need to add it to the TARGETS_TO_BUILD variable. To do this, you modify the configure script to know about your target when parsing the --enable-targets option. Search the configure script for TARGETS_TO_BUILD, add your target to the lists there (some creativity required), and then reconfigure. Alternatively, you can change autoconf/configure.ac and regenerate configure by running ./autoconf/AutoRegen.sh.

Target Machine
LLVMTargetMachine is designed as a base class for targets implemented with the LLVM target-independent code generator. The LLVMTargetMachine class should be specialized by a concrete target class that implements the various virtual methods. LLVMTargetMachine is defined as a subclass of TargetMachine in include/llvm/Target/TargetMachine.h. The TargetMachine class implementation (TargetMachine.cpp) also processes numerous command-line options.

To create a concrete target-specific subclass of LLVMTargetMachine, start by copying an existing TargetMachine class and header. You should name the files that you create to reflect your specific target. For instance, for the SPARC target, name the files SparcTargetMachine.h and SparcTargetMachine.cpp.

For a target machine XXX, the implementation of XXXTargetMachine must have access methods to obtain objects that represent target components. These methods are named get*Info, and are intended to obtain the instruction set (getInstrInfo), register set (getRegisterInfo), stack frame layout (getFrameInfo), and similar information. XXXTargetMachine must also implement the getDataLayout method to access an object with target-specific data characteristics, such as data type size and alignment requirements.

For instance, for the SPARC target, the header file SparcTargetMachine.h declares prototypes for several get*Info and getDataLayout methods that simply return a class member.

    namespace llvm {

    class Module;

    class SparcTargetMachine : public LLVMTargetMachine {
      const DataLayout DataLayout;       // Calculates type size & alignment
      SparcSubtarget Subtarget;
      SparcInstrInfo InstrInfo;
      TargetFrameInfo FrameInfo;

    protected:
      virtual const TargetAsmInfo *createTargetAsmInfo() const;

    public:
      SparcTargetMachine(const Module &M, const std::string &FS);

      virtual const SparcInstrInfo *getInstrInfo() const { return &InstrInfo; }
      virtual const TargetFrameInfo *getFrameInfo() const { return &FrameInfo; }
      virtual const TargetSubtarget *getSubtargetImpl() const { return &Subtarget; }
      virtual const TargetRegisterInfo *getRegisterInfo() const {
        return &InstrInfo.getRegisterInfo();
      }
      virtual const DataLayout *getDataLayout() const { return &DataLayout; }
      static unsigned getModuleMatchQuality(const Module &M);

      // Pass Pipeline Configuration
      virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
      virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
    };

    } // end namespace llvm

The header above declares these accessors:

- getInstrInfo()
- getRegisterInfo()
- getFrameInfo()
- getDataLayout()
- getSubtargetImpl()

For some targets, you also need to support the following methods:

- getTargetLowering()
- getJITInfo()

In addition, the XXXTargetMachine constructor should specify a TargetDescription string that determines the data layout for the target machine, including characteristics such as pointer size, alignment, and endianness. For example, the constructor for SparcTargetMachine contains the following:

    SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
      : DataLayout("E-p:32:32-f128:128:128"),
        Subtarget(M, FS), InstrInfo(Subtarget),
        FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
    }

Hyphens separate portions of the TargetDescription string.

- An upper-case E in the string indicates a big-endian target data model. A lower-case e indicates little-endian.
- p: is followed by pointer information: size, ABI alignment, and preferred alignment. If only two figures follow p:, then the first value is pointer size, and the second value is both ABI and preferred alignment.
- Then a letter for numeric type alignment: i, f, v, or a (corresponding to integer, floating point, vector, or aggregate). i, v, or a are followed by ABI alignment and preferred alignment. f is followed by three values: the first indicates the size of a long double, then ABI alignment, and then ABI preferred alignment.
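
The rules for the layout string can be illustrated with a small standalone parser. This is only a sketch under stated assumptions, not LLVM's DataLayout implementation: the names LayoutSketch, splitOn, and parseLayoutSketch are hypothetical, and the code handles only the endianness letter and the p: pointer field.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified view of a layout string such as
// "E-p:32:32-f128:128:128". This is NOT LLVM's DataLayout class.
struct LayoutSketch {
  bool bigEndian = false;
  unsigned pointerSize = 0;      // in bits
  unsigned pointerABIAlign = 0;  // in bits
  unsigned pointerPrefAlign = 0; // in bits
};

// Split a string on a single-character delimiter.
static std::vector<std::string> splitOn(const std::string &s, char delim) {
  std::vector<std::string> parts;
  std::stringstream stream(s);
  std::string item;
  while (std::getline(stream, item, delim))
    parts.push_back(item);
  return parts;
}

LayoutSketch parseLayoutSketch(const std::string &desc) {
  LayoutSketch result;
  // Hyphens separate the portions of the string.
  for (const std::string &field : splitOn(desc, '-')) {
    if (field == "E") {
      result.bigEndian = true;
    } else if (field == "e") {
      result.bigEndian = false;
    } else if (field.size() > 2 && field.compare(0, 2, "p:") == 0) {
      std::vector<std::string> nums = splitOn(field.substr(2), ':');
      assert(nums.size() >= 2 && "p: needs a size and an ABI alignment");
      result.pointerSize = std::stoul(nums[0]);
      result.pointerABIAlign = std::stoul(nums[1]);
      // With only two figures, the second is both ABI and preferred alignment.
      result.pointerPrefAlign =
          nums.size() >= 3 ? std::stoul(nums[2]) : result.pointerABIAlign;
    }
    // The i, f, v, and a fields are ignored in this sketch.
  }
  return result;
}
```

Calling parseLayoutSketch("E-p:32:32-f128:128:128") on the SPARC string yields a big-endian layout with 32-bit pointers whose ABI and preferred alignment are both 32 bits.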

Target Registration
You must also register your target with the TargetRegistry, which is what other LLVM tools use to look up and use your target at runtime. The TargetRegistry can be used directly, but for most targets there are helper templates which should take care of the work for you.

All targets should declare a global Target object which is used to represent the target during registration. Then, in the target's TargetInfo library, the target should define that object and use the RegisterTarget template to register the target. For example, the Sparc registration code looks like this:

    Target llvm::TheSparcTarget;

    extern "C" void LLVMInitializeSparcTargetInfo() {
      RegisterTarget<Triple::sparc, /*HasJIT=*/false>
        X(TheSparcTarget, "sparc", "Sparc");
    }

This allows the TargetRegistry to look up the target by name or by target triple. In addition, most targets will also register additional features which are available in separate libraries. These registration steps are separate because some clients may wish to link in only some parts of the target; the JIT code generator does not require the use of the assembly printer, for example. Here is an example of registering the Sparc assembly printer:

    extern "C" void LLVMInitializeSparcAsmPrinter() {
      RegisterAsmPrinter<SparcAsmPrinter> X(TheSparcTarget);
    }

For more information, see llvm/Target/TargetRegistry.h.
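
The registration idiom can be mimicked in a few lines of standalone C++: an object whose constructor records the target in a lookup table. This is a sketch of the general pattern only. Every name ending in Sketch is hypothetical, and none of this reproduces the real TargetRegistry or RegisterTarget interfaces.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins; these are NOT the LLVM Target or TargetRegistry types.
struct TargetSketch {
  std::string name;
  std::string description;
  bool hasJIT = false;
};

// A function-local static avoids static initialization-order problems.
static std::map<std::string, TargetSketch *> &registrySketch() {
  static std::map<std::string, TargetSketch *> table;
  return table;
}

// Constructing one of these registers the target, mirroring the
// "declare an object X(...)" idiom used by the registration code.
struct RegisterTargetSketch {
  RegisterTargetSketch(TargetSketch &target, const std::string &name,
                       const std::string &desc, bool hasJIT) {
    assert(!name.empty() && "a target needs a name to be looked up");
    target.name = name;
    target.description = desc;
    target.hasJIT = hasJIT;
    registrySketch()[name] = &target;
  }
};

// Look a target up by name; returns nullptr when it was never registered.
const TargetSketch *lookupTargetSketch(const std::string &name) {
  auto it = registrySketch().find(name);
  return it == registrySketch().end() ? nullptr : it->second;
}

// Entry point analogous in shape to LLVMInitializeSparcTargetInfo.
TargetSketch theSparcTargetSketch;
void initializeSparcTargetInfoSketch() {
  static RegisterTargetSketch X(theSparcTargetSketch, "sparc", "Sparc",
                                /*hasJIT=*/false);
}
```

A client that never calls initializeSparcTargetInfoSketch() gets nullptr from lookupTargetSketch("sparc"). That is the point of keeping registration steps separate: only the pieces a client initializes become visible.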

Register Set and Register Classes


You should describe a concrete target-specific class that represents the register file of a target machine. This class is called XXXRegisterInfo (where XXX identifies the target) and represents the class register file data that is used for register allocation. It also describes the interactions between registers.

You also need to define register classes to categorize related registers. A register class should be added for groups of registers that are all treated the same way for some instruction. Typical examples are register classes for integer, floating-point, or vector registers. A register allocator allows an instruction to use any register in a specified register class to perform the instruction in a similar manner. Register classes allocate virtual registers to instructions from these sets, and register classes let the target-independent register allocator automatically choose the actual registers.

Much of the code for registers, including register definition, register aliases, and register classes, is generated by TableGen from XXXRegisterInfo.td input files and placed in XXXGenRegisterInfo.h.inc and XXXGenRegisterInfo.inc output files. Some of the code in the implementation of XXXRegisterInfo requires hand-coding.

Defining a Register


The XXXRegisterInfo.td file typically starts with register definitions for a target machine. The Register class (specified in Target.td) is used to define an object for each register. The specified string n becomes the Name of the register. The basic Register object does not have any subregisters and does not specify any aliases.

    class Register<string n> {
      string Namespace = "";
      string AsmName = n;
      string Name = n;
      int SpillSize = 0;
      int SpillAlignment = 0;
      list<Register> Aliases = [];
      list<Register> SubRegs = [];
      list<int> DwarfNumbers = [];
    }

For example, in the X86RegisterInfo.td file, there are register definitions that utilize the Register class, such as:

    def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;

This defines the register AL and assigns it values (with DwarfRegNum) that are used by gcc, gdb, or a debug information writer to identify a register. For register AL, DwarfRegNum takes an array of 3 values representing 3 different modes: the first element is for X86-64, the second for exception handling (EH) on X86-32, and the third is generic. -1 is a special Dwarf number that indicates the gcc number is undefined, and -2 indicates the register number is invalid for this mode.

From the previously described line in the X86RegisterInfo.td file, TableGen generates this code in the X86GenRegisterInfo.inc file:

    static const unsigned GR8[] = { X86::AL, ... };

    const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };

    const TargetRegisterDesc RegisterDescriptors[] = {
      ...
      { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...

From the register info file, TableGen generates a TargetRegisterDesc object for each register. TargetRegisterDesc is defined in include/llvm/Target/TargetRegisterInfo.h with the following fields:

    struct TargetRegisterDesc {
      const char     *AsmName;      // Assembly language name for the register
      const char     *Name;         // Printable name for the reg (for debugging)
      const unsigned *AliasSet;     // Register Alias Set
      const unsigned *SubRegs;      // Sub-register set
      const unsigned *ImmSubRegs;   // Immediate sub-register set
      const unsigned *SuperRegs;    // Super-register set
    };

TableGen uses the entire target description file (.td) to determine text names for the register (in the AsmName and Name fields of TargetRegisterDesc) and the relationships of other registers to the defined register (in the other TargetRegisterDesc fields). In this example, other definitions establish the registers AX, EAX, and RAX as aliases for one another, so TableGen generates a null-terminated array (AL_AliasSet) for this register alias set.

The Register class is commonly used as a base class for more complex classes. In Target.td, the Register class is the base for the RegisterWithSubRegs class that is used to define registers that need to specify subregisters in the SubRegs list, as shown here:

    class RegisterWithSubRegs<string n, list<Register> subregs> : Register<n> {
      let SubRegs = subregs;
    }

In SparcRegisterInfo.td, additional register classes are defined for SPARC: a Register subclass, SparcReg, and further subclasses: Ri, Rf, and Rd. SPARC registers are identified by 5-bit ID numbers, which is a feature common to these subclasses. Note the use of let expressions to override values that are initially defined in a superclass (such as the SubRegs field in the Rd class).

    class SparcReg<string n> : Register<n> {
      field bits<5> Num;
      let Namespace = "SP";
    }
    // Ri - 32-bit integer registers
    class Ri<bits<5> num, string n> : SparcReg<n> {
      let Num = num;
    }
    // Rf - 32-bit floating-point registers
    class Rf<bits<5> num, string n> : SparcReg<n> {
      let Num = num;
    }
    // Rd - Slots in the FP register file for 64-bit floating-point values.
    class Rd<bits<5> num, string n, list<Register> subregs> : SparcReg<n> {
      let Num = num;
      let SubRegs = subregs;
    }

In the SparcRegisterInfo.td file, there are register definitions that utilize these subclasses of Register, such as:

    def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
    def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
    ...
    def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
    def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
    ...
    def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
    def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;

The last two registers shown above (D0 and D1) are double-precision floating-point registers that are aliases for pairs of single-precision floating-point sub-registers. In addition to aliases, the sub-register and super-register relationships of the defined register are in fields of a register's TargetRegisterDesc.
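
The pairing visible in the D0 and D1 definitions can be written out as a small helper. This is a sketch that extrapolates from those two definitions, assuming that every Dn overlays F(2n) and F(2n+1) and takes its assembly name and ID number from the even register. DoubleRegSketch and describeDoubleReg are hypothetical names, not TableGen output.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper, not generated by TableGen. It extrapolates the pattern
// visible in the D0 and D1 definitions: Dn overlays F(2n) and F(2n+1).
struct DoubleRegSketch {
  unsigned num;                                // 5-bit ID number (the Num field)
  std::string asmName;                         // name used in assembly
  std::pair<std::string, std::string> subRegs; // the two single-precision halves
};

DoubleRegSketch describeDoubleReg(unsigned n) {
  assert(n < 16 && "this sketch covers D0 through D15 only");
  const unsigned even = 2 * n;
  DoubleRegSketch d;
  d.num = even;
  d.asmName = "F" + std::to_string(even);
  d.subRegs = std::make_pair("F" + std::to_string(even),
                             "F" + std::to_string(even + 1));
  return d;
}
```

For n = 1 the helper reports ID number 2, assembly name F2, and sub-registers F2 and F3, which matches the D1 definition.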

Defining a Register Class


The RegisterClass class (specified in Target.td) is used to define an object that represents a group of related registers and also defines the default allocation order of the registers. A target description file XXXRegisterInfo.td that uses Target.td can construct register classes using the following class:

    class RegisterClass<string namespace,
                        list<ValueType> regTypes, int alignment, dag regList> {
      string Namespace = namespace;
      list<ValueType> RegTypes = regTypes;
      int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
      int Alignment = alignment;

      // CopyCost is the cost of copying a value between two registers
      // default value 1 means a single instruction
      // A negative value means copying is extremely expensive or impossible
      int CopyCost = 1;
      dag MemberList = regList;

      // for register classes that are subregisters of this class
      list<RegisterClass> SubRegClassList = [];

      code MethodProtos = [{}];  // to insert arbitrary code
      code MethodBodies = [{}];
    }

To define a RegisterClass, use the following 4 arguments:

The first argument of the definition is the name of the namespace.

The second argument is a list of ValueType register type values that are defined in include/llvm/CodeGen/ValueTypes.td. Defined values include integer types (such as i16, i32, and i1 for Boolean), floating-point types (f32, f64), and vector types (for example, v8i16 for an 8 x i16 vector). All registers in a RegisterClass must have the same ValueType, but some registers may store vector data in different configurations. For example, a register that can process a 128-bit vector may be able to handle 16 8-bit integer elements, 8 16-bit integers, 4 32-bit integers, and so on.

The third argument of the RegisterClass definition specifies the alignment required of the registers when they are stored or loaded to memory.

The final argument, regList, specifies which registers are in this class. If an alternative allocation order method is not specified, then regList also defines the order of allocation used by the register allocator. Besides simply listing registers with (add R0, R1, ...), more advanced set operators are available. See include/llvm/Target/Target.td for more information.

In SparcRegisterInfo.td, three RegisterClass objects are defined: FPRegs, DFPRegs, and IntRegs. For all three register classes, the first argument defines the namespace with the string SP. FPRegs defines a group of 32 single-precision floating-point registers (F0 to F31); DFPRegs defines a group of 16 double-precision registers (D0-D15).

    // F0, F1, F2, ..., F31
    def FPRegs : RegisterClass<"SP", [f32], 32, (sequence "F%u", 0, 31)>;

    def DFPRegs : RegisterClass<"SP", [f64], 64,
                                (add D0, D1, D2, D3, D4, D5, D6, D7, D8,
                                     D9, D10, D11, D12, D13, D14, D15)>;

    def IntRegs : RegisterClass<"SP", [i32], 32,
        (add L0, L1, L2, L3, L4, L5, L6, L7,
             I0, I1, I2, I3, I4, I5,
             O0, O1, O2, O3, O4, O5, O7,
             G1,
             // Non-allocatable regs:
             G2, G3, G4,
             O6,        // stack ptr
             I6,        // frame ptr
             I7,        // return address
             G0,        // constant zero
             G5, G6, G7 // reserved for kernel
        )>;

Using SparcRegisterInfo.td with TableGen generates several output files that are intended for inclusion in other source code that you write. SparcRegisterInfo.td generates SparcGenRegisterInfo.h.inc, which should be included in the header file for the SPARC register implementation that you write (SparcRegisterInfo.h). In SparcGenRegisterInfo.h.inc a new structure is defined called SparcGenRegisterInfo that uses TargetRegisterInfo as its base. It also specifies types, based upon the defined register classes: DFPRegsClass, FPRegsClass, and IntRegsClass.

SparcRegisterInfo.td also generates SparcGenRegisterInfo.inc, which is included at the bottom of SparcRegisterInfo.cpp, the SPARC register implementation. The code below shows only the generated integer registers and associated register classes. The order of registers in IntRegs reflects the order in the definition of IntRegs in the target description file.

    // IntRegs Register Class...
    static const unsigned IntRegs[] = {
      SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
      SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
      SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
      SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
      SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
      SP::G6, SP::G7,
    };

    // IntRegsVTs Register Class Value Types...
    static const MVT::ValueType IntRegsVTs[] = {
      MVT::i32, MVT::Other
    };

    namespace SP {   // Register class instances
      DFPRegsClass    DFPRegsRegClass;
      FPRegsClass     FPRegsRegClass;
      IntRegsClass    IntRegsRegClass;
    ...
      // IntRegs Sub-register Classess...
      static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
        NULL
      };
    ...
      // IntRegs Super-register Classess...
      static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
        NULL
      };
    ...
      // IntRegs Register Class sub-classes...
      static const TargetRegisterClass* const IntRegsSubclasses [] = {
        NULL
      };
    ...
      // IntRegs Register Class super-classes...
      static const TargetRegisterClass* const IntRegsSuperclasses [] = {
        NULL
      };

      IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
        IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
        IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
    }

The register allocators will avoid using reserved registers, and callee-saved registers are not used until all the volatile registers have been used. That is usually good enough, but in some cases it may be necessary to provide custom allocation orders.
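
The default policy in the last paragraph (skip reserved registers, prefer volatile registers, and fall back to callee-saved ones) can be sketched as a two-pass scan over the allocation order. This is a simplified illustration with hypothetical names and string register names; real LLVM register allocators are considerably more involved.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical illustration of the default policy; not LLVM code.
// Returns the chosen register name, or an empty string if none is free.
std::string pickRegisterSketch(const std::vector<std::string> &allocationOrder,
                               const std::set<std::string> &reserved,
                               const std::set<std::string> &calleeSaved,
                               const std::set<std::string> &inUse) {
  assert(!allocationOrder.empty() && "a register class needs members");
  // Pass 0 considers volatile (caller-saved) registers only;
  // pass 1 falls back to callee-saved registers.
  for (int pass = 0; pass < 2; ++pass) {
    const bool wantCalleeSaved = (pass == 1);
    for (const std::string &reg : allocationOrder) {
      if (reserved.count(reg) || inUse.count(reg))
        continue; // never hand out reserved or occupied registers
      if ((calleeSaved.count(reg) != 0) == wantCalleeSaved)
        return reg;
    }
  }
  return "";
}
```

Within each pass the scan follows the order of the list, which is why the order of the members in a register class definition matters when no alternative allocation order is given.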

Implement a subclass of TargetRegisterInfo


The final step is to hand code portions of XXXRegisterInfo, which implements the interface described in TargetRegisterInfo.h (see The TargetRegisterInfo class). These functions return 0, NULL, or false, unless overridden. Here is a list of functions that are overridden for the SPARC implementation in SparcRegisterInfo.cpp:

- getCalleeSavedRegs: Returns a list of callee-saved registers in the order of the desired callee-save stack frame offset.
- getReservedRegs: Returns a bitset indexed by physical register numbers, indicating if a particular register is unavailable.
- hasFP: Returns a Boolean indicating if a function should have a dedicated frame pointer register.
- eliminateCallFramePseudoInstr: If call frame setup or destroy pseudo instructions are used, this can be called to eliminate them.
- eliminateFrameIndex: Eliminates abstract frame indices from instructions that may use them.
- emitPrologue: Inserts prologue code into the function.
- emitEpilogue: Inserts epilogue code into the function.
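
As an illustration of the getReservedRegs contract (a bitset indexed by physical register number), the following standalone sketch marks a handful of registers as unavailable. The enum values and the names ending in Sketch are hypothetical; a real backend uses the register numbers from the TableGen-generated enum and LLVM's own bit vector type.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical register numbering for illustration only; real numbers come
// from the TableGen-generated enum.
enum SketchReg : std::size_t { G0, G1, O6, I6, I7, L0, NumSketchRegs };

using ReservedSetSketch = std::bitset<NumSketchRegs>;

// Build a bitset indexed by physical register number in which a set bit
// means "unavailable to the register allocator".
ReservedSetSketch getReservedRegsSketch() {
  ReservedSetSketch reserved;
  // Constant zero, stack pointer, frame pointer, return address.
  for (SketchReg r : {G0, O6, I6, I7}) {
    assert(r < NumSketchRegs);
    reserved.set(r);
  }
  return reserved;
}

// A register may be allocated only when its bit is clear.
bool isAllocatableSketch(const ReservedSetSketch &reserved, SketchReg r) {
  return !reserved.test(r);
}
```

With this table, isAllocatableSketch reports false for G0 and O6 and true for G1 and L0.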

Instruction Set
During the early stages of code generation, the LLVM IR code is converted to a SelectionDAG with nodes that are instances of the SDNode class containing target instructions. An SDNode has an opcode, operands, type requirements, and operation properties (for example, whether an operation is commutative, or whether it loads from memory). The various operation node types are described in the include/llvm/CodeGen/SelectionDAGNodes.h file (values of the NodeType enum in the ISD namespace).

TableGen uses the following target description (.td) input files to generate much of the code for instruction definition:

- Target.td: Where the Instruction, Operand, InstrInfo, and other fundamental classes are defined.
- TargetSelectionDAG.td: Used by SelectionDAG instruction selection generators, contains SDTC* classes (selection DAG type constraint), definitions of SelectionDAG nodes (such as imm, cond, bb, add, fadd, sub), and pattern support (Pattern, Pat, PatFrag, PatLeaf, ComplexPattern).
- XXXInstrFormats.td: Patterns for definitions of target-specific instructions.
- XXXInstrInfo.td: Target-specific definitions of instruction templates, condition codes, and instructions of an instruction set. For architecture modifications, a different file name may be used. For example, for Pentium with SSE instruction, this file is X86InstrSSE.td, and for Pentium with MMX, this file is X86InstrMMX.td.

There is also a target-specific XXX.td file, where XXX is the name of the target. The XXX.td file includes the other .td input files, but its contents are only directly important for subtargets.

You should describe a concrete target-specific class XXXInstrInfo that represents machine instructions supported by a target machine. XXXInstrInfo contains an array of XXXInstrDescriptor objects, each of which describes one instruction. An instruction descriptor defines:

- Opcode mnemonic
- Number of operands
- List of implicit register definitions and uses
- Target-independent properties (such as memory access, is commutable)
- Target-specific flags
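
To make the descriptor contents concrete, here is a minimal C++ sketch of a record holding those five pieces of information. The field names, the flag names, and getDescSketch are all assumptions made for illustration; they do not reproduce LLVM's actual descriptor type.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flag bits for target-independent properties.
enum InstrFlagSketch : std::uint32_t {
  FlagCommutable = 1u << 0, // operands may be swapped
  FlagMayLoad    = 1u << 1, // reads memory
  FlagMayStore   = 1u << 2  // writes memory
};

// Hypothetical layout for illustration; not LLVM's descriptor type.
struct InstrDescSketch {
  std::string mnemonic;                  // opcode mnemonic
  unsigned numOperands;                  // number of operands
  std::vector<std::string> implicitDefs; // registers written implicitly
  std::vector<std::string> implicitUses; // registers read implicitly
  std::uint32_t flags;                   // target-independent properties
  std::uint32_t targetFlags;             // target-specific flags

  bool isCommutable() const { return (flags & FlagCommutable) != 0; }
  bool accessesMemory() const {
    return (flags & (FlagMayLoad | FlagMayStore)) != 0;
  }
};

// Look up a descriptor by opcode number in a descriptor table.
const InstrDescSketch &getDescSketch(const std::vector<InstrDescSketch> &table,
                                     unsigned opcode) {
  assert(opcode < table.size() && "opcode out of range");
  return table[opcode];
}
```

Indexing the table by opcode number reflects the "array of descriptor objects" arrangement: later passes can ask about an instruction's properties without knowing anything else about the target.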

The Instruction class (defined in Target.td) is mostly used as a base for more complex instruction classes.

    class Instruction {
      string Namespace = "";
      dag OutOperandList;    // A dag containing the MI def operand list.
      dag InOperandList;     // A dag containing the MI use operand list.
      string AsmString = ""; // The .s format to print the instruction with.
      list<dag> Pattern;     // Set to the DAG pattern for this instruction.
      list<Register> Uses = [];
      list<Register> Defs = [];
      list<Predicate> Predicates = [];  // predicates turned into isel match code
      ... remainder not shown for space ...
    }

A SelectionDAG node (SDNode) should contain an object representing a target-specific instruction that is defined in XXXInstrInfo.td. The instruction objects should represent instructions from the architecture manual of the target machine (such as the SPARC Architecture Manual for the SPARC target).

A single instruction from the architecture manual is often modeled as multiple target instructions, depending upon its operands. For example, a manual might describe an add instruction that takes a register or an immediate operand. An LLVM target could model this with two instructions named ADDri and ADDrr.

You should define a class for each instruction category and define each opcode as a subclass of the category with appropriate parameters such as the fixed binary encoding of opcodes and extended opcodes. You should map the register bits to the bits of the instruction in which they are encoded (for the JIT). Also you should specify how the instruction should be printed when the automatic assembly printer is used.

As is described in the SPARC Architecture Manual, Version 8, there are three major 32-bit formats for instructions. Format 1 is only for the CALL instruction. Format 2 is for branch on condition codes and SETHI (set high bits of a register) instructions. Format 3 is for other instructions.
Each of these formats has corresponding classes in SparcInstrFormats.td. InstSP is a base class for other instruction classes. Additional base classes are specified for more precise formats: for example in SparcInstrFormats.td, F2_1 is for SETHI, and F2_2 is for branches. There are three other base classes: F3_1 for register/register operations, F3_2 for register/immediate operations, and F3_3 for floating-point operations. SparcInstrInfo.td also adds the base class Pseudo for synthetic SPARC instructions.

SparcInstrInfo.td largely consists of operand and instruction definitions for the SPARC target. In SparcInstrInfo.td, the following target description file entry, LDrr, defines the Load Integer instruction for a Word (the LD SPARC opcode) from a memory address to a register. The first parameter, the value 3 (binary 11), is the operation value for this category of operation. The second parameter (binary 000000) is the specific operation value for LD/Load Word. The third parameter is the output destination, which is a register operand and defined in the Register target description file (IntRegs).

    def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
                     "ld [$addr], $dst",
                     [(set i32:$dst, (load ADDRrr:$addr))]>;

The fourth parameter is the input source, which uses the address operand MEMrr that is defined earlier in SparcInstrInfo.td:

    def MEMrr : Operand<i32> {
      let PrintMethod = "printMemOperand";
      let MIOperandInfo = (ops IntRegs, IntRegs);
    }

The fifth parameter is a string that is used by the assembly printer and can be left as an empty string until the assembly printer interface is implemented. The sixth and final parameter is the pattern used to match the instruction during the SelectionDAG Select Phase described in The LLVM Target-Independent Code Generator. This parameter is detailed in the next section, Instruction Selector.

Instruction class definitions are not overloaded for different operand types, so separate versions of instructions are needed for register, memory, or immediate value operands. For example, to perform a Load Integer instruction for a Word from an immediate operand to a register, the following instruction class is defined:

    def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
                     "ld [$addr], $dst",
                     [(set i32:$dst, (load ADDRri:$addr))]>;

Writing these definitions for so many similar instructions can involve a lot of cut and paste. In .td files, the multiclass directive enables the creation of templates to define several instruction classes at once (using the defm directive). For example in SparcInstrInfo.td, the multiclass pattern F3_12 is defined to create 2 instruction classes each time F3_12 is invoked:

    multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
      def rr  : F3_1 <2, Op3Val,
                      (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                      !strconcat(OpcStr, " $b, $c, $dst"),
                      [(set i32:$dst, (OpNode i32:$b, i32:$c))]>;
      def ri  : F3_2 <2, Op3Val,
                      (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
                      !strconcat(OpcStr, " $b, $c, $dst"),
                      [(set i32:$dst, (OpNode i32:$b, simm13:$c))]>;
    }

So when the defm directive is used for the XOR and ADD instructions, as seen below, it creates four instruction objects: XORrr, XORri, ADDrr, and ADDri.
    defm XOR : F3_12<"xor", 0b000011, xor>;
    defm ADD : F3_12<"add", 0b000000, add>;

SparcInstrInfo.td also includes definitions for condition codes that are referenced by branch instructions. The following definitions in SparcInstrInfo.td indicate the bit location of the SPARC condition code. For example, the 10th bit represents the greater-than condition for integers, and the 22nd bit represents the greater-than condition for floats.

    def ICC_NE  : ICC_VAL< 9>;  // Not Equal
    def ICC_E   : ICC_VAL< 1>;  // Equal
    def ICC_G   : ICC_VAL<10>;  // Greater
    ...
    def FCC_U   : FCC_VAL<23>;  // Unordered
    def FCC_G   : FCC_VAL<22>;  // Greater
    def FCC_UG  : FCC_VAL<21>;  // Unordered or Greater
    ...

(Note that Sparc.h also defines enums that correspond to the same SPARC condition codes. Care must be taken to ensure the values in Sparc.h correspond to the values in SparcInstrInfo.td, i.e., SPCC::ICC_NE = 9, SPCC::FCC_U = 23, and so on.)

Instruction Operand Mapping


The code generator backend maps instruction operands to fields in the instruction. Operands are assigned to unbound fields in the instruction in the order they are defined. Fields are bound when they are assigned a value. For example, the Sparc target defines the XNORrr instruction as a F3_1 format instruction having three operands.

    def XNORrr  : F3_1<2, 0b000111,
                       (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                       "xnor $b, $c, $dst",
                       [(set i32:$dst, (not (xor i32:$b, i32:$c)))]>;

The instruction templates in SparcInstrFormats.td show the base class for F3_1 is InstSP.

    class InstSP<dag outs, dag ins, string asmstr, list<dag> pattern> : Instruction {
      field bits<32> Inst;
      let Namespace = "SP";
      bits<2> op;
      let Inst{31-30} = op;
      dag OutOperandList = outs;
      dag InOperandList = ins;
      let AsmString   = asmstr;
      let Pattern = pattern;
    }

InstSP leaves the op field unbound.

    class F3<dag outs, dag ins, string asmstr, list<dag> pattern>
        : InstSP<outs, ins, asmstr, pattern> {
      bits<5> rd;
      bits<6> op3;
      bits<5> rs1;
      let op{1} = 1;   // Op = 2 or 3
      let Inst{29-25} = rd;
      let Inst{24-19} = op3;
      let Inst{18-14} = rs1;
    }

F3 binds the op field (it sets the high bit, so op is 2 or 3) and defines the rd, op3, and rs1 fields. F3 format instructions will bind operands to the rd, op3, and rs1 fields.

    class F3_1<bits<2> opVal, bits<6> op3val, dag outs, dag ins,
               string asmstr, list<dag> pattern> : F3<outs, ins, asmstr, pattern> {
      bits<8> asi = 0; // asi not currently used
      bits<5> rs2;
      let op         = opVal;
      let op3        = op3val;
      let Inst{13}   = 0;     // i field = 0
      let Inst{12-5} = asi;   // address space identifier
      let Inst{4-0}  = rs2;
    }

F3_1 binds the op and op3 fields and defines the rs2 field. F3_1 format instructions will bind operands to the rd, rs1, and rs2 fields. This results in the XNORrr instruction binding the $dst, $b, and $c operands to the rd, rs1, and rs2 fields respectively.
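The bit ranges assigned by InstSP, F3, and F3_1 can be checked by hand with a small standalone sketch. The helper encodeF3_1 below is hypothetical (it is not an LLVM function, and TableGen generates the real encoder); it only packs the fields into a 32-bit word using the Inst{...} ranges from the class definitions. The register numbers in the usage note are arbitrary illustration values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of LLVM: packs the F3_1 fields into a
// 32-bit instruction word using the bit ranges from InstSP, F3, and F3_1.
std::uint32_t encodeF3_1(std::uint32_t op, std::uint32_t op3,
                         std::uint32_t rd, std::uint32_t rs1,
                         std::uint32_t rs2, std::uint32_t asi = 0) {
  std::uint32_t inst = 0;
  inst |= (op  & 0x3u)  << 30;  // Inst{31-30} = op
  inst |= (rd  & 0x1Fu) << 25;  // Inst{29-25} = rd
  inst |= (op3 & 0x3Fu) << 19;  // Inst{24-19} = op3
  inst |= (rs1 & 0x1Fu) << 14;  // Inst{18-14} = rs1
  // Inst{13} stays 0: the i field is 0 for the register/register form.
  inst |= (asi & 0xFFu) << 5;   // Inst{12-5}  = asi
  inst |= (rs2 & 0x1Fu);        // Inst{4-0}   = rs2
  return inst;
}
```

For XNORrr (op = 2, op3 = 0b000111), calling encodeF3_1(2, 0x07, 1, 2, 3) places $dst = 1 in rd, $b = 2 in rs1, and $c = 3 in rs2, following the operand order described above.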

Instruction Relation Mapping


This TableGen feature is used to relate instructions with each other. It is particularly useful when you have multiple instruction formats and need to switch between them after instruction selection. This entire feature is driven by relation models which can be defined in XXXInstrInfo.td files according to the target-specific instruction set. Relation models are defined using the InstrMapping class as a base. TableGen parses all the models and generates instruction relation maps using the specified information. Relation maps are emitted as tables in the XXXGenInstrInfo.inc file along with the functions to query them. For detailed information on how to use this feature, please refer to How To Use Instruction Mappings.

Implement a subclass of TargetInstrInfo


The final step is to hand code portions of XXXInstrInfo, which implements the interface described in TargetInstrInfo.h (see The TargetInstrInfo class). These functions return 0 or a Boolean or they assert, unless overridden. Here's a list of functions that are overridden for the SPARC implementation in SparcInstrInfo.cpp:

isLoadFromStackSlot
    If the specified machine instruction is a direct load from a stack slot, return the register number of the destination and the FrameIndex of the stack slot.
isStoreToStackSlot
    If the specified machine instruction is a direct store to a stack slot, return the register number of the source and the FrameIndex of the stack slot.
copyPhysReg
    Copy values between a pair of physical registers.
storeRegToStackSlot
    Store a register value to a stack slot.
loadRegFromStackSlot
    Load a register value from a stack slot.
storeRegToAddr
    Store a register value to memory.
loadRegFromAddr
    Load a register value from memory.
foldMemoryOperand
    Attempt to combine instructions of any load or store instruction for the specified operand(s).

Branch Folding and If Conversion


Performance can be improved by combining instructions or by eliminating instructions that are never reached. The AnalyzeBranch method in XXXInstrInfo may be implemented to examine conditional instructions and remove unnecessary instructions. AnalyzeBranch looks at the end of a machine basic block (MBB) for opportunities for improvement, such as branch folding and if conversion. The BranchFolder and IfConverter machine function passes (see the source files BranchFolding.cpp and IfConversion.cpp in the lib/CodeGen directory) call AnalyzeBranch to improve the control flow graph that represents the instructions.

Several implementations of AnalyzeBranch (for ARM, Alpha, and X86) can be examined as models for your own AnalyzeBranch implementation. Since SPARC does not implement a useful AnalyzeBranch, the ARM target implementation is shown below.

AnalyzeBranch returns a Boolean value and takes four parameters:

MachineBasicBlock &MBB
    The incoming block to be examined.
MachineBasicBlock *&TBB
    A destination block that is returned. For a conditional branch that evaluates to true, TBB is the destination.
MachineBasicBlock *&FBB
    For a conditional branch that evaluates to false, FBB is returned as the destination.
std::vector<MachineOperand> &Cond
    List of operands to evaluate a condition for a conditional branch.

In the simplest case, if a block ends without a branch, then it falls through to the successor block. No destination blocks are specified for either TBB or FBB, so both parameters return NULL. The start of AnalyzeBranch (see code below for the ARM target) shows the function parameters and the code for the simplest case.
    bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
                                     MachineBasicBlock *&TBB,
                                     MachineBasicBlock *&FBB,
                                     std::vector<MachineOperand> &Cond) const
    {
      MachineBasicBlock::iterator I = MBB.end();
      if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
        return false;

If a block ends with a single unconditional branch instruction, then AnalyzeBranch (shown below) should return the destination of that branch in the TBB parameter.

      if (LastOpc == ARM::B || LastOpc == ARM::tB) {
        TBB = LastInst->getOperand(0).getMBB();
        return false;
      }

If a block ends with two unconditional branches, then the second branch is never reached. In that situation, as shown below, remove the last branch instruction and return the penultimate branch in the TBB parameter.

      if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
          (LastOpc == ARM::B || LastOpc == ARM::tB)) {
        TBB = SecondLastInst->getOperand(0).getMBB();
        I = LastInst;
        I->eraseFromParent();
        return false;
      }

A block may end with a single conditional branch instruction that falls through to the successor block if the condition evaluates to false. In that case, AnalyzeBranch (shown below) should return the destination of that conditional branch in the TBB parameter and a list of operands in the Cond parameter to evaluate the condition.

      if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
        // Block ends with fall-through condbranch.
        TBB = LastInst->getOperand(0).getMBB();
        Cond.push_back(LastInst->getOperand(1));
        Cond.push_back(LastInst->getOperand(2));
        return false;
      }

If a block ends with both a conditional branch and an ensuing unconditional branch, then AnalyzeBranch (shown below) should return the conditional branch destination (assuming it corresponds to a conditional evaluation of true) in the TBB parameter and the unconditional branch destination in the FBB (corresponding to a conditional evaluation of false). A list of operands to evaluate the condition should be returned in the Cond parameter.

      unsigned SecondLastOpc = SecondLastInst->getOpcode();

      if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
          (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
        TBB = SecondLastInst->getOperand(0).getMBB();
        Cond.push_back(SecondLastInst->getOperand(1));
        Cond.push_back(SecondLastInst->getOperand(2));
        FBB = LastInst->getOperand(0).getMBB();
        return false;
      }

For the last two cases (ending with a single conditional branch or ending with one conditional and one unconditional branch), the operands returned in the Cond parameter can be passed to methods of other instructions to create new branches or perform other operations.
An implementation of AnalyzeBranch requires the helper methods RemoveBranch and InsertBranch to manage subsequent operations.

AnalyzeBranch should return false, indicating success, in most circumstances. AnalyzeBranch should only return true when the method is stumped about what to do, for example, if a block has three terminating branches. AnalyzeBranch may return true if it encounters a terminator it cannot handle, such as an indirect branch.
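The four cases and the return-value convention described in this section can be modeled without any LLVM headers. The following is a simplified sketch, not the ARM implementation: the types Opc, Inst, and Block, the constant NoBlock, and the function analyzeBranch are all hypothetical stand-ins, integer block ids replace MachineBasicBlock pointers, and Cond carries a single condition code instead of MachineOperand values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are LLVM classes.
enum class Opc { Other, B, Bcc, IndirectBr };

struct Inst {
  Opc opcode;
  int target;  // destination block id (ignored for Other and IndirectBr)
  int cond;    // condition code (only meaningful for Bcc)
};

struct Block {
  std::vector<Inst> insts;
};

const int NoBlock = -1;  // plays the role of a NULL destination

static bool isTerminator(const Inst &I) { return I.opcode != Opc::Other; }

// Returns false on success and true when the block's terminators are not
// understood, mirroring the convention described in the text.
bool analyzeBranch(Block &MBB, int &TBB, int &FBB, std::vector<int> &Cond) {
  TBB = FBB = NoBlock;
  Cond.clear();

  // Count the terminators at the end of the block.
  const std::size_t n = MBB.insts.size();
  std::size_t numTerm = 0;
  while (numTerm < n && isTerminator(MBB.insts[n - 1 - numTerm]))
    ++numTerm;

  if (numTerm == 0)
    return false;  // no branch: the block falls through
  if (numTerm > 2)
    return true;   // e.g. three terminating branches: give up

  const Inst last = MBB.insts[n - 1];
  if (numTerm == 1) {
    if (last.opcode == Opc::B) {    // single unconditional branch
      TBB = last.target;
      return false;
    }
    if (last.opcode == Opc::Bcc) {  // conditional branch with fall-through
      TBB = last.target;
      Cond.push_back(last.cond);
      return false;
    }
    return true;                    // e.g. an indirect branch
  }

  const Inst secondLast = MBB.insts[n - 2];
  if (secondLast.opcode == Opc::B && last.opcode == Opc::B) {
    TBB = secondLast.target;
    MBB.insts.pop_back();           // the second branch is never reached
    return false;
  }
  if (secondLast.opcode == Opc::Bcc && last.opcode == Opc::B) {
    TBB = secondLast.target;        // taken when the condition is true
    Cond.push_back(secondLast.cond);
    FBB = last.target;              // taken when the condition is false
    return false;
  }
  return true;
}
```

Counting the trailing terminators first keeps the "stumped" cases (more than two terminators, or an unrecognized one) separate from the four recognized shapes, which is one possible way to organize the checks.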

Instruction Selector
LLVM uses a SelectionDAG to represent LLVM IR instructions, and nodes of the SelectionDAG ideally represent native target instructions. During code generation, instruction selection passes are performed to convert non-native DAG instructions into native target-specific instructions. The pass described in XXXISelDAGToDAG.cpp is used to match patterns and perform DAG-to-DAG instruction selection. Optionally, a pass may be defined (in XXXBranchSelector.cpp) to perform similar DAG-to-DAG operations for branch instructions. Later, the code in XXXISelLowering.cpp replaces or removes operations and data types not supported natively (legalizes) in a SelectionDAG.

TableGen generates code for instruction selection using the following target description input files:

XXXInstrInfo.td
    Contains definitions of instructions in a target-specific instruction set, and generates XXXGenDAGISel.inc, which is included in XXXISelDAGToDAG.cpp.
XXXCallingConv.td
    Contains the calling and return value conventions for the target architecture, and generates XXXGenCallingConv.inc, which is included in XXXISelLowering.cpp.

The implementation of an instruction selection pass must include a header that declares the FunctionPass class or a subclass of FunctionPass. In XXXTargetMachine.cpp, a Pass Manager (PM) should add each instruction selection pass into the queue of passes to run.

The LLVM static compiler (llc) is an excellent tool for visualizing the contents of DAGs. To display the SelectionDAG before or after specific processing phases, use the command line options for llc, described at SelectionDAG Instruction Selection Process.

To describe instruction selector behavior, you should add patterns for lowering LLVM code into a SelectionDAG as the last parameter of the instruction definitions in XXXInstrInfo.td. For example, in SparcInstrInfo.td, this entry defines a register store operation, and the last parameter describes a pattern with the store DAG operator.
    def STrr  : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
                     "st $src, [$addr]", [(store i32:$src, ADDRrr:$addr)]>;

ADDRrr is a memory mode that is also defined in SparcInstrInfo.td:

    def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;

The definition of ADDRrr refers to SelectADDRrr, which is a function defined in an implementation of the Instruction Selector (such as SparcISelDAGToDAG.cpp).

In lib/Target/TargetSelectionDAG.td, the DAG operator for store is defined below:

    def store : PatFrag<(ops node:$val, node:$ptr),
                        (st node:$val, node:$ptr), [{
      if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
        return !ST->isTruncatingStore() &&
               ST->getAddressingMode() == ISD::UNINDEXED;
      return false;
    }]>;

XXXInstrInfo.td also generates (in XXXGenDAGISel.inc) the SelectCode method that is used to call the appropriate processing method for an instruction. In this example, SelectCode calls Select_ISD_STORE for the ISD::STORE opcode.

    SDNode *SelectCode(SDValue N) {
      ...
      MVT::ValueType NVT = N.getNode()->getValueType(0);
      switch (N.getOpcode()) {
      case ISD::STORE: {
        switch (NVT) {
        default:
          return Select_ISD_STORE(N);
          break;
        }
        break;
      }
      ...

The pattern for STrr is matched, so elsewhere in XXXGenDAGISel.inc, code for STrr is created for Select_ISD_STORE. The Emit_22 method is also generated in XXXGenDAGISel.inc to complete the processing of this instruction.

    SDNode *Select_ISD_STORE(const SDValue &N) {
      SDValue Chain = N.getOperand(0);
      if (Predicate_store(N.getNode())) {
        SDValue N1 = N.getOperand(1);
        SDValue N2 = N.getOperand(2);
        SDValue CPTmp0;
        SDValue CPTmp1;

        // Pattern: (st:void i32:i32:$src,
        //           ADDRrr:i32:$addr)<<P:Predicate_store>>
        // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
        // Pattern complexity = 13  cost = 1  size = 0
        if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
            N1.getNode()->getValueType(0) == MVT::i32 &&
            N2.getNode()->getValueType(0) == MVT::i32) {
          return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
        }

...

The SelectionDAG Legalize Phase


The Legalize phase converts a DAG to use types and operations that are natively supported by the target. For natively unsupported types and operations, you need to add code to the target-specific XXXTargetLowering implementation to convert unsupported types and operations to supported ones.

In the constructor for the XXXTargetLowering class, first use the addRegisterClass method to specify which types are supported and which register classes are associated with them. The code for the register classes is generated by TableGen from XXXRegisterInfo.td and placed in XXXGenRegisterInfo.h.inc. For example, the implementation of the constructor for the SparcTargetLowering class (in SparcISelLowering.cpp) starts with the following code:

    addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
    addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
    addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);

You should examine the node types in the ISD namespace (include/llvm/CodeGen/SelectionDAGNodes.h) and determine which operations the target natively supports. For operations that do not have native support, add a callback to the constructor for the XXXTargetLowering class, so the instruction selection process knows what to do. The TargetLowering class callback methods (declared in llvm/Target/TargetLowering.h) are:

setOperationAction
    General operation.
setLoadExtAction
    Load with extension.
setTruncStoreAction
    Truncating store.
setIndexedLoadAction
    Indexed load.
setIndexedStoreAction
    Indexed store.
setConvertAction
    Type conversion.
setCondCodeAction
    Support for a given condition code.

Note: on older releases, setLoadXAction is used instead of setLoadExtAction. Also, on older releases, setCondCodeAction may not be supported. Examine your release to see what methods are specifically supported.

These callbacks are used to determine whether an operation does or does not work with a specified type (or types). In all cases, the third parameter is a LegalizeAction enum value: Promote, Expand, Custom, or Legal. SparcISelLowering.cpp contains examples of all four LegalizeAction values.

Expand
For a type without native support, a value may need to be broken down further, rather than promoted. For an operation without native support, a combination of other operations may be used to similar effect. In SPARC, the floating-point sine and cosine trig operations are supported by expansion to other operations, as indicated by the third parameter, Expand, to setOperationAction:

    setOperationAction(ISD::FSIN, MVT::f32, Expand);
    setOperationAction(ISD::FCOS, MVT::f32, Expand);

Custom
For some operations, simple type promotion or operation expansion may be insufficient. In some cases, a special intrinsic function must be implemented. For example, a constant value may require special treatment, or an operation may require spilling and restoring registers in the stack and working with register allocators.

As seen in the SparcISelLowering.cpp code below, to perform a type conversion from a floating-point value to a signed integer, first setOperationAction should be called with Custom as the third parameter:

    setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);

In the LowerOperation method, for each Custom operation, a case statement should be added to indicate what function to call. In the following code, an FP_TO_SINT opcode will call the LowerFP_TO_SINT method:

    SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
      switch (Op.getOpcode()) {
      case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
      ...
      }
    }

Finally, the LowerFP_TO_SINT method is implemented, using an FP register to convert the floating-point value to an integer.

    static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &DAG) {
      assert(Op.getValueType() == MVT::i32);
      Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
      return DAG.getNode(ISD::BITCAST, MVT::i32, Op);
    }

Legal
The Legal LegalizeAction enum value simply indicates that an operation is natively supported. Legal represents the default condition, so it is rarely used. In SparcISelLowering.cpp, the action for CTPOP (an operation to count the bits set in an integer) is natively supported only for SPARC v9. The following code enables the Expand conversion technique for non-v9 SPARC implementations.

    setOperationAction(ISD::CTPOP, MVT::i32, Expand);
    ...
    if (TM.getSubtarget<SparcSubtarget>().isV9())
      setOperationAction(ISD::CTPOP, MVT::i32, Legal);
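The way these callbacks accumulate into a lookup table, with Legal as the default, can be illustrated with a standalone model. This is only a sketch under stated assumptions: ActionTable, makeSparcLikeActions, and the reduced Opcode and VT enums are hypothetical and are not the TargetLowering API; only the opcode, type, and action combinations are taken from the SPARC examples in this section.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical, reduced enums for illustration (not the ISD or MVT enums).
enum class Opcode { ADD, FSIN, FCOS, FP_TO_SINT, CTPOP };
enum class VT { i32, f32, f64 };
enum class LegalizeAction { Legal, Promote, Expand, Custom };

// A toy stand-in for the per-(operation, type) table kept by a lowering class.
class ActionTable {
public:
  void setOperationAction(Opcode Op, VT Ty, LegalizeAction A) {
    Actions[std::make_pair(Op, Ty)] = A;  // a later call overrides an earlier one
  }

  LegalizeAction getOperationAction(Opcode Op, VT Ty) const {
    auto It = Actions.find(std::make_pair(Op, Ty));
    // Anything never mentioned is treated as natively supported.
    return It == Actions.end() ? LegalizeAction::Legal : It->second;
  }

private:
  std::map<std::pair<Opcode, VT>, LegalizeAction> Actions;
};

// Replays the setOperationAction calls quoted in this section.
ActionTable makeSparcLikeActions(bool IsV9) {
  ActionTable T;
  T.setOperationAction(Opcode::FSIN, VT::f32, LegalizeAction::Expand);
  T.setOperationAction(Opcode::FCOS, VT::f32, LegalizeAction::Expand);
  T.setOperationAction(Opcode::FP_TO_SINT, VT::i32, LegalizeAction::Custom);
  T.setOperationAction(Opcode::CTPOP, VT::i32, LegalizeAction::Expand);
  if (IsV9)
    T.setOperationAction(Opcode::CTPOP, VT::i32, LegalizeAction::Legal);
  return T;
}
```

In this model, makeSparcLikeActions(false) reports Expand for CTPOP on i32 while makeSparcLikeActions(true) reports Legal, which is the effect of the isV9() check above.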

Calling Conventions


To support target-specific calling conventions, XXXCallingConv.td uses interfaces (such as CCIfType and CCAssignToReg) that are defined in lib/Target/TargetCallingConv.td. TableGen can take the target descriptor file XXXCallingConv.td and generate the header file XXXGenCallingConv.inc, which is typically included in XXXISelLowering.cpp. You can use the interfaces in TargetCallingConv.td to specify:

- The order of parameter allocation.
- Where parameters and return values are placed (that is, on the stack or in registers).
- Which registers may be used.
- Whether the caller or callee unwinds the stack.

The following example demonstrates the use of the CCIfType and CCAssignToReg interfaces. If the CCIfType predicate is true (that is, if the current argument is of type f32 or f64), then the action is performed. In this case, the CCAssignToReg action assigns the argument value to the first available register: either R0 or R1.

    CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>

SparcCallingConv.td contains definitions for a target-specific return-value calling convention (RetCC_Sparc32) and a basic 32-bit C calling convention (CC_Sparc32). The definition of RetCC_Sparc32 (shown below) indicates which registers are used for specified scalar return types. A single-precision float is returned in register F0, and a double-precision float in register D0. A 32-bit integer is returned in register I0 or I1.

    def RetCC_Sparc32 : CallingConv<[
      CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
      CCIfType<[f32], CCAssignToReg<[F0]>>,
      CCIfType<[f64], CCAssignToReg<[D0]>>
    ]>;

The definition of CC_Sparc32 in SparcCallingConv.td introduces CCAssignToStack, which assigns the value to a stack slot with the specified size and alignment. In the example below, the first parameter, 4, indicates the size of the slot, and the second parameter, also 4, indicates the stack alignment along 4-byte units. (Special cases: if size is zero, then the ABI size is used; if alignment is zero, then the ABI alignment is used.)

    def CC_Sparc32 : CallingConv<[
      // All arguments get passed in integer registers if there is space.
      CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
      CCAssignToStack<4, 4>
    ]>;

CCDelegateTo is another commonly used interface, which tries to find a specified sub-calling convention, and, if a match is found, it is invoked. In the following example (in X86CallingConv.td), the definition of RetCC_X86_32_C ends with CCDelegateTo.
After the current value is assigned to the register ST0 or ST1, the RetCC_X86Common is invoked.

    def RetCC_X86_32_C : CallingConv<[
      CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
      CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
      CCDelegateTo<RetCC_X86Common>
    ]>;

CCIfCC is an interface that attempts to match the given name to the current calling convention. If the name identifies the current calling convention, then a specified action is invoked. In the following example (in X86CallingConv.td), if the Fast calling convention is in use, then RetCC_X86_32_Fast is invoked. If the SSECall calling convention is in use, then RetCC_X86_32_SSE is invoked.

    def RetCC_X86_32 : CallingConv<[
      CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
      CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
      CCDelegateTo<RetCC_X86_32_C>
    ]>;

Other calling convention interfaces include:

CCIf <predicate, action>
    If the predicate matches, apply the action.
CCIfInReg <action>
    If the argument is marked with the inreg attribute, then apply the action.
CCIfNest <action>
    If the argument is marked with the nest attribute, then apply the action.
CCIfNotVarArg <action>
    If the current function does not take a variable number of arguments, apply the action.
CCAssignToRegWithShadow <registerList, shadowList>
    Similar to CCAssignToReg, but with a shadow list of registers.
CCPassByVal <size, align>
    Assign value to a stack slot with the minimum specified size and alignment.
CCPromoteToType <type>
    Promote the current value to the specified type.
CallingConv <[actions]>
    Define each calling convention that is supported.
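The first-match behavior of a CallingConv action list can be approximated in ordinary code. The sketch below models only the CC_Sparc32 definition quoted earlier in this section: an argument whose type is in the CCIfType list takes the next free register from I0 to I5, and everything else falls through to 4-byte stack slots. The names ArgLoc and assignCCSparc32Like and the VT::Other enumerator are hypothetical, and the model ignores details a real convention would handle (such as values wider than one register).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reduced type enum; Other stands for any type that is not in
// the CCIfType<[i32, f32, f64], ...> list.
enum class VT { i32, f32, f64, Other };

struct ArgLoc {
  bool inReg;       // true: passed in a register; false: passed on the stack
  std::string reg;  // register name, valid when inReg is true
  unsigned offset;  // stack offset in bytes, valid when inReg is false
};

// Walks the argument list once, applying the two actions in order:
//   CCIfType<[i32, f32, f64], CCAssignToReg<[I0..I5]>>, then CCAssignToStack<4, 4>.
std::vector<ArgLoc> assignCCSparc32Like(const std::vector<VT> &Args) {
  static const char *const Regs[] = {"I0", "I1", "I2", "I3", "I4", "I5"};
  const std::size_t NumRegs = sizeof(Regs) / sizeof(Regs[0]);

  std::size_t NextReg = 0;   // first register not yet allocated
  unsigned StackOffset = 0;  // next free stack slot
  std::vector<ArgLoc> Locs;

  for (VT Ty : Args) {
    const bool MatchesTypeList =
        (Ty == VT::i32 || Ty == VT::f32 || Ty == VT::f64);
    if (MatchesTypeList && NextReg < NumRegs) {
      // CCAssignToReg succeeds: take the first available register.
      Locs.push_back({true, Regs[NextReg++], 0u});
    } else {
      // Fall through to CCAssignToStack<4, 4>: 4-byte slot, 4-byte alignment.
      Locs.push_back({false, "", StackOffset});
      StackOffset += 4;
    }
  }
  return Locs;
}
```

With eight i32 arguments, this model places the first six in I0 through I5 and the remaining two at stack offsets 0 and 4.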

Assembly Printer
During the code emission stage, the code generator may utilize an LLVM pass to produce assembly output. To do this, you want to implement the code for a printer that converts LLVM IR to a GAS-format assembly language for your target machine, using the following steps:

- Define all the assembly strings for your target, adding them to the instructions defined in the XXXInstrInfo.td file. (See Instruction Set.) TableGen will produce an output file (XXXGenAsmWriter.inc) with an implementation of the printInstruction method for the XXXAsmPrinter class.
- Write XXXTargetAsmInfo.h, which contains the bare-bones declaration of the XXXTargetAsmInfo class (a subclass of TargetAsmInfo).
- Write XXXTargetAsmInfo.cpp, which contains target-specific values for TargetAsmInfo properties and sometimes new implementations for methods.
- Write XXXAsmPrinter.cpp, which implements the AsmPrinter class that performs the LLVM-to-assembly conversion.

The code in XXXTargetAsmInfo.h is usually a trivial declaration of the XXXTargetAsmInfo class for use in XXXTargetAsmInfo.cpp. Similarly, XXXTargetAsmInfo.cpp usually has a few declarations of XXXTargetAsmInfo replacement values that override the default values in TargetAsmInfo.cpp. For example in SparcTargetAsmInfo.cpp:

    SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
      Data16bitsDirective = "\t.half\t";
      Data32bitsDirective = "\t.word\t";
      Data64bitsDirective = 0;  // .xword is only supported by V9.
      ZeroDirective = "\t.skip\t";
      CommentString = "!";
      ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
    }

The X86 assembly printer implementation (X86TargetAsmInfo) is an example where the target-specific TargetAsmInfo class uses an overridden method: ExpandInlineAsm.

A target-specific implementation of AsmPrinter is written in XXXAsmPrinter.cpp, which implements the AsmPrinter class that converts the LLVM to printable assembly. The implementation must include the following headers that have declarations for the AsmPrinter and MachineFunctionPass classes. The MachineFunctionPass is a subclass of FunctionPass.

    #include "llvm/CodeGen/AsmPrinter.h"
    #include "llvm/CodeGen/MachineFunctionPass.h"

As a FunctionPass, AsmPrinter first calls doInitialization to set up the AsmPrinter. In SparcAsmPrinter, a Mangler object is instantiated to process variable names.

In XXXAsmPrinter.cpp, the runOnMachineFunction method (declared in MachineFunctionPass) must be implemented for XXXAsmPrinter. In MachineFunctionPass, the runOnFunction method invokes runOnMachineFunction. Target-specific implementations of runOnMachineFunction differ, but generally do the following to process each machine function:

- Call SetupMachineFunction to perform initialization.
- Call EmitConstantPool to print out (to the output stream) constants which have been spilled to memory.
- Call EmitJumpTableInfo to print out jump tables used by the current function.
- Print out the label for the current function.
- Print out the code for the function, including basic block labels and the assembly for the instruction (using printInstruction).

The XXXAsmPrinter implementation must also include the code generated by TableGen that is output in the XXXGenAsmWriter.inc file.
The code in XXXGenAsmWriter.inc contains an implementation of the printInstruction method that may call these methods:

- printOperand
- printMemOperand
- printCCOperand (for conditional statements)
- printDataDirective
- printDeclare
- printImplicitDef
- printInlineAsm

The implementations of printDeclare, printImplicitDef, printInlineAsm, and printLabel in AsmPrinter.cpp are generally adequate for printing assembly and do not need to be overridden.

The printOperand method is implemented with a long switch/case statement for the type of operand: register, immediate, basic block, external symbol, global address, constant pool index, or jump table index. For an instruction with a memory address operand, the printMemOperand method should be implemented to generate the proper output. Similarly, printCCOperand should be used to print a conditional operand.

doFinalization should be overridden in XXXAsmPrinter, and it should be called to shut down the assembly printer. During doFinalization, global variables and constants are printed to output.
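Conceptually, printInstruction combines an instruction's assembly string (such as "ld [$addr], $dst") with the text produced for each operand. The sketch below models that effect with run-time string substitution. This is an assumption made for illustration: the generated XXXGenAsmWriter.inc code is produced by TableGen ahead of time and does not work this way internally. The function expandAsmString and the operand texts in the usage note are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: replaces each $name in an assembly string with the
// already-printed text of that operand. Unknown names are copied unchanged.
std::string expandAsmString(const std::string &AsmStr,
                            const std::map<std::string, std::string> &Ops) {
  std::string Out;
  std::size_t i = 0;
  while (i < AsmStr.size()) {
    if (AsmStr[i] != '$') {
      Out += AsmStr[i++];  // ordinary character: copy it through
      continue;
    }
    // Scan the operand name that follows the '$'.
    std::size_t j = i + 1;
    while (j < AsmStr.size() &&
           (std::isalnum(static_cast<unsigned char>(AsmStr[j])) ||
            AsmStr[j] == '_'))
      ++j;
    const std::string Name = AsmStr.substr(i + 1, j - i - 1);
    auto It = Ops.find(Name);
    if (It != Ops.end())
      Out += It->second;               // text a printOperand-style method produced
    else
      Out += AsmStr.substr(i, j - i);  // leave "$name" as written
    i = j;
  }
  return Out;
}
```

For example, with the operand map {"addr" -> "%g1+%g2", "dst" -> "%o0"}, the LDrr string "ld [$addr], $dst" expands to "ld [%g1+%g2], %o0". In a real printer, the memory operand text would come from printMemOperand and the register text from printOperand.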

Subtarget Support
Subtarget support is used to inform the code generation process of instruction set variations for a given chip set. For example, the LLVM SPARC implementation provided covers three major versions of the SPARC microprocessor architecture: Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a 64-bit architecture), and the UltraSPARC architecture. V8 has 16 double-precision floating-point registers that are also usable as either 32 single-precision or 8 quad-precision registers. V8 is also purely big-endian. V9 has 32 double-precision floating-point registers that are also usable as 16 quad-precision registers, but cannot be used as single-precision registers. The UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set extensions.

If subtarget support is needed, you should implement a target-specific XXXSubtarget class for your architecture. This class should process the command-line options -mcpu= and -mattr=.

TableGen uses definitions in the Target.td and Sparc.td files to generate code in SparcGenSubtarget.inc. In Target.td, shown below, the SubtargetFeature interface is defined. The first 4 string parameters of the SubtargetFeature interface are a feature name, an attribute set by the feature, the value of the attribute, and a description of the feature. (The fifth parameter is a list of features whose presence is implied, and its default value is an empty array.)

    class SubtargetFeature<string n, string a, string v, string d,
                           list<SubtargetFeature> i = []> {
      string Name = n;
      string Attribute = a;
      string Value = v;
      string Desc = d;
      list<SubtargetFeature> Implies = i;
    }

In the Sparc.td file, the SubtargetFeature is used to define the following features.
    def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
                         "Enable SPARC-V9 instructions">;
    def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
                         "V8DeprecatedInsts", "true",
                         "Enable deprecated V8 instructions in V9 mode">;
    def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
                         "Enable UltraSPARC Visual Instruction Set extensions">;

Elsewhere in Sparc.td, the Proc class is defined and then is used to define particular SPARC processor subtypes that may have the previously described features.

    class Proc<string Name, list<SubtargetFeature> Features>
      : Processor<Name, NoItineraries, Features>;

    def : Proc<"generic",         []>;
    def : Proc<"v8",              []>;
    def : Proc<"supersparc",      []>;
    def : Proc<"sparclite",       []>;
    def : Proc<"f934",            []>;
    def : Proc<"hypersparc",      []>;
    def : Proc<"sparclite86x",    []>;
    def : Proc<"sparclet",        []>;
    def : Proc<"tsc701",          []>;
    def : Proc<"v9",              [FeatureV9]>;
    def : Proc<"ultrasparc",      [FeatureV9, FeatureV8Deprecated]>;
    def : Proc<"ultrasparc3",     [FeatureV9, FeatureV8Deprecated]>;
    def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;

From the Target.td and Sparc.td files, the resulting SparcGenSubtarget.inc specifies enum values to identify the features, arrays of constants to represent the CPU features and CPU subtypes, and the ParseSubtargetFeatures method that parses the features string and sets the specified subtarget options. The generated SparcGenSubtarget.inc file should be included in SparcSubtarget.cpp. The target-specific implementation of the XXXSubtarget constructor should follow this pseudocode:

    XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
      // Set the default features
      // Determine default and user-specified characteristics of the CPU
      // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
      // Perform any additional operations
    }
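The pseudocode above can be made concrete with a standalone model. The sketch below is not the generated ParseSubtargetFeatures code: SubtargetFlags and computeSubtarget are hypothetical names, the CPU defaults are copied from the Proc definitions in Sparc.td shown earlier, and the feature string is assumed to be a comma-separated list of names prefixed with + (enable) or - (disable), as accepted by -mattr=.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical flag set; the member names follow the attribute names used
// in the SubtargetFeature definitions (IsV9, V8DeprecatedInsts, IsVIS).
struct SubtargetFlags {
  bool IsV9 = false;
  bool V8DeprecatedInsts = false;
  bool IsVIS = false;
};

// Step 1: start from the CPU's default features.
// Step 2: apply "+name" / "-name" overrides from the feature string.
SubtargetFlags computeSubtarget(const std::string &CPU, const std::string &FS) {
  SubtargetFlags F;

  // Defaults taken from the Proc<...> list; unlisted CPUs enable nothing.
  if (CPU == "v9") {
    F.IsV9 = true;
  } else if (CPU == "ultrasparc" || CPU == "ultrasparc3") {
    F.IsV9 = F.V8DeprecatedInsts = true;
  } else if (CPU == "ultrasparc3-vis") {
    F.IsV9 = F.V8DeprecatedInsts = F.IsVIS = true;
  }

  std::istringstream In(FS);
  std::string Tok;
  while (std::getline(In, Tok, ',')) {
    if (Tok.size() < 2 || (Tok[0] != '+' && Tok[0] != '-'))
      continue;  // ignore empty or malformed entries in this sketch
    const bool Enable = (Tok[0] == '+');
    const std::string Name = Tok.substr(1);
    if (Name == "v9")
      F.IsV9 = Enable;
    else if (Name == "deprecated-v8")
      F.V8DeprecatedInsts = Enable;
    else if (Name == "vis")
      F.IsVIS = Enable;
  }
  return F;
}
```

For instance, computeSubtarget("ultrasparc3-vis", "-vis") keeps the V9 and deprecated-V8 defaults but clears IsVIS, while computeSubtarget("generic", "+v9") enables only IsV9.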

JIT Support
The implementation of a target machine optionally includes a Just-In-Time (JIT) code generator that emits machine code and auxiliary structures as binary output that can be written directly to memory. To do this, implement JIT code generation by performing the following steps:

- Write an XXXCodeEmitter.cpp file that contains a machine function pass that transforms target-machine instructions into relocatable machine code.
- Write an XXXJITInfo.cpp file that implements the JIT interfaces for target-specific code-generation activities, such as emitting machine code and stubs.
- Modify XXXTargetMachine so that it provides a TargetJITInfo object through its getJITInfo method.

There are several different approaches to writing the JIT support code. For instance, TableGen and target descriptor files may be used for creating a JIT code generator, but are not mandatory. For the Alpha and PowerPC target machines, TableGen is used to generate XXXGenCodeEmitter.inc, which contains the binary coding of machine instructions and the getBinaryCodeForInstr method to access those codes. Other JIT implementations do not.

Both XXXJITInfo.cpp and XXXCodeEmitter.cpp must include the llvm/CodeGen/MachineCodeEmitter.h header file that defines the MachineCodeEmitter class containing code for several callback functions that write data (in bytes, words, strings, etc.) to the output stream.
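To make the idea of callbacks that write bytes and words to an output stream concrete, here is a minimal sketch of a buffer-backed emitter. SketchEmitter and its method names are assumptions chosen for illustration; this is not the MachineCodeEmitter class from the header mentioned above, and it only models appending data to an in-memory buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical emitter: appends raw bytes to an in-memory code buffer.
class SketchEmitter {
  std::vector<std::uint8_t> Buf;

public:
  // Append a single byte.
  void emitByte(std::uint8_t B) { Buf.push_back(B); }

  // Append a 32-bit word, least significant byte first.
  void emitWordLE(std::uint32_t W) {
    for (int I = 0; I < 4; ++I)
      emitByte(static_cast<std::uint8_t>(W >> (8 * I)));
  }

  // Append a 32-bit word, most significant byte first.
  void emitWordBE(std::uint32_t W) {
    for (int I = 3; I >= 0; --I)
      emitByte(static_cast<std::uint8_t>(W >> (8 * I)));
  }

  const std::vector<std::uint8_t> &bytes() const { return Buf; }
};
```

A little-endian target such as X86 would use the LE form for immediates, while a big-endian target such as SPARC V8 would use the BE form; keeping both in the emitter lets the target-specific pass pick the byte order.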

Machine Code Emitter


In XXXCodeEmitter.cpp, a target-specific version of the Emitter class is implemented as a function pass (subclass of MachineFunctionPass). The target-specific implementation of runOnMachineFunction (invoked by runOnFunction in MachineFunctionPass) iterates through each MachineBasicBlock and calls emitInstruction to process each instruction and emit binary code. emitInstruction is largely implemented with case statements on the instruction types defined in XXXInstrInfo.h. For example, in X86CodeEmitter.cpp, the emitInstruction method is built around the following switch/case statements:

    switch (Desc->TSFlags & X86::FormMask) {
    case X86II::Pseudo:  // for not yet implemented instructions
       ...               // or pseudo-instructions
       break;
    case X86II::RawFrm:  // for instructions with a fixed opcode value
       ...
       break;
    case X86II::AddRegFrm: // for instructions that have one register operand
       ...                 // added to their opcode
       break;
    case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
       ...                 // to specify a destination (register)
       break;
    case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
       ...                 // to specify a destination (memory)
       break;
    case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
       ...                 // to specify a source (register)
       break;
    case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
       ...                 // to specify a source (memory)
       break;
    case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
    case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
    case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
    case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
       ...
       break;
    case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
    case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
    case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
    case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
       ...
       break;
    case X86II::MRMInitReg: // for instructions whose source and
       ...                  // destination are the same register
       break;
    }
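The dispatch pattern in the switch above (mask the target-specific flags, then branch on the instruction form) can be sketched in isolation. Everything in the sketch namespace is hypothetical: the enumerator values, the width of FormMask, and the emit helper are assumptions for illustration and do not match the real X86II encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

namespace sketch {

// Hypothetical form codes stored in the low bits of a flags word.
enum : std::uint64_t { Pseudo = 0, RawFrm = 1, AddRegFrm = 2, FormMask = 0x3F };

// Returns the bytes emitted for one instruction, chosen by its form.
inline std::vector<std::uint8_t> emit(std::uint64_t TSFlags,
                                      std::uint8_t BaseOpcode,
                                      std::uint8_t RegNum) {
  std::vector<std::uint8_t> Out;
  switch (TSFlags & FormMask) { // bits above the mask are ignored here
  case Pseudo: // nothing is emitted for pseudo-instructions
    break;
  case RawFrm: // fixed opcode value
    Out.push_back(BaseOpcode);
    break;
  case AddRegFrm: // register number is added to the opcode byte
    Out.push_back(static_cast<std::uint8_t>(BaseOpcode + RegNum));
    break;
  default:
    assert(false && "form not handled in this sketch");
  }
  return Out;
}

} // namespace sketch
```

Masking before the switch matters because the flags word typically carries other information in its upper bits; without the mask, an instruction with any extra flag set would miss every case.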

The implementations of these case statements often first emit the opcode and then get the operand(s). Then depending upon the operand, helper methods may be called to process the operand(s). For example, in X86CodeEmitter.cpp, for the X86II::AddRegFrm case, the first data emitted (by emitByte) is the opcode added to the register operand. Then an object representing the machine operand, MO1, is extracted. The helper methods such as isImmediate, isGlobalAddress, isExternalSymbol, isConstantPoolIndex, and isJumpTableIndex determine the operand type. (X86CodeEmitter.cpp also has private methods such as emitConstant, emitGlobalAddress, emitExternalSymbolAddress, emitConstPoolAddress, and emitJumpTableAddress that emit the data into the output stream.)

    case X86II::AddRegFrm:
      MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

      if (CurOp != NumOps) {
        const MachineOperand &MO1 = MI.getOperand(CurOp++);
        unsigned Size = X86InstrInfo::sizeOfImm(Desc);
        if (MO1.isImmediate())
          emitConstant(MO1.getImm(), Size);
        else {
          unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
            : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
          if (Opcode == X86::MOV64ri)
            rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
          if (MO1.isGlobalAddress()) {
            bool NeedStub = isa<Function>(MO1.getGlobal());
            bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
            emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                              NeedStub, isLazy);
          } else if (MO1.isExternalSymbol())
            emitExternalSymbolAddress(MO1.getSymbolName(), rt);
          else if (MO1.isConstantPoolIndex())
            emitConstPoolAddress(MO1.getIndex(), rt);
          else if (MO1.isJumpTableIndex())
            emitJumpTableAddress(MO1.getIndex(), rt);
        }
      }
      break;

In the previous example, XXXCodeEmitter.cpp uses the variable rt, which is a RelocationType enum that may be used to relocate addresses (for example, a global address with a PIC base offset). The RelocationType enum for that target is defined in the short target-specific XXXRelocations.h file. The RelocationType is used by the relocate method defined in XXXJITInfo.cpp to rewrite addresses for referenced global symbols.

For example, X86Relocations.h specifies the following relocation types for the X86 addresses. In all four cases, the relocated value is added to the value already in memory. For reloc_pcrel_word and reloc_picrel_word, there is an additional initial adjustment.

    enum RelocationType {
      reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
      reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
      reloc_absolute_word = 2, // absolute relocation; no additional adjustment
      reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
    };
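The rule described above (add the relocated value to what is already in memory, after an initial adjustment for the PC-relative and PIC-relative kinds) can be sketched on a plain byte buffer. This is a simplified model, not the relocate method from X86JITInfo.cpp: the names SketchRelocType, applyReloc, and readWord are hypothetical, addresses are simulated as 32-bit integers, and the PC adjustment assumes the reference point is the end of the 4-byte field being patched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical relocation kinds for this sketch (32-bit words only).
enum SketchRelocType { sk_pcrel_word, sk_picrel_word, sk_absolute_word };

// Read the 32-bit word stored at Offset in host byte order.
inline std::uint32_t readWord(const std::vector<std::uint8_t> &Code,
                              std::size_t Offset) {
  std::uint32_t W;
  std::memcpy(&W, &Code[Offset], sizeof(W));
  return W;
}

// Patch the word at Offset: adjust the target as the kind requires, then add
// it to the value already in memory (which acts as an addend).
inline void applyReloc(std::vector<std::uint8_t> &Code, std::size_t Offset,
                       SketchRelocType RT, std::uint32_t CodeBase,
                       std::uint32_t Target, std::uint32_t PICBase) {
  assert(Offset + 4 <= Code.size() && "relocation outside the buffer");
  std::uint32_t Value = Target;
  if (RT == sk_pcrel_word)
    Value -= CodeBase + static_cast<std::uint32_t>(Offset) + 4; // PC after field
  else if (RT == sk_picrel_word)
    Value -= PICBase; // distance from the PIC base
  std::uint32_t Word = readWord(Code, Offset) + Value;
  std::memcpy(&Code[Offset], &Word, sizeof(Word));
}
```

For example, with the buffer mapped at simulated address 0x1000, a PC-relative relocation at offset 0 against target 0x1010 stores 0x1010 - 0x1004 = 0xC, while an absolute relocation simply adds the target address to the existing addend.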

Target JIT Info


XXXJITInfo.cpp implements the JIT interfaces for target-specific code-generation activities, such as emitting machine code and stubs. At minimum, a target-specific version of XXXJITInfo implements the following:

- getLazyResolverFunction: Initializes the JIT and gives the target a function that is used for compilation.
- emitFunctionStub: Returns a native function with a specified address for a callback function.
- relocate: Changes the addresses of referenced globals, based on relocation types.
- Callback functions that are wrappers to a function stub that is used when the real target is not initially known.

getLazyResolverFunction is generally trivial to implement. It stores the incoming parameter in the global JITCompilerFunction and returns the callback function that will be used as a function wrapper. For the Alpha target (in AlphaJITInfo.cpp), the getLazyResolverFunction implementation is simply:

    TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
                                                JITCompilerFn F) {
      JITCompilerFunction = F;
      return AlphaCompilationCallback;
    }

For the X86 target, the getLazyResolverFunction implementation is a little more complicated, because it returns a different callback function for processors with SSE instructions and XMM registers.

The callback function initially saves and later restores the callee register values, incoming arguments, and frame and return address. The callback function needs low-level access to the registers or stack, so it is typically implemented with assembler.
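The store-then-return pattern of getLazyResolverFunction can be sketched without any LLVM headers. All names in the sketch namespace are hypothetical, the function-pointer types are assumptions modeled on the description above, and the callback is plain C++ here, whereas a real one needs assembler to save and restore registers.

```cpp
#include <cassert>

namespace sketch {

// Assumed shapes for the two function types involved.
using JITCompilerFn = void *(*)(void *);
using LazyResolverFn = void (*)();

static JITCompilerFn JITCompilerFunction = nullptr; // set by the resolver hook
static void *LastResolved = nullptr; // records what the callback produced
static int StubMarker = 0;           // stands in for the address of a stub

// Stand-in compilation callback: hands the stub to the stored JIT compiler
// function and remembers the result.
inline void compilationCallback() {
  if (JITCompilerFunction)
    LastResolved = JITCompilerFunction(&StubMarker);
}

// Stores the compiler function globally and returns the callback wrapper.
inline LazyResolverFn getLazyResolverFunction(JITCompilerFn F) {
  JITCompilerFunction = F;
  return compilationCallback;
}

// Hypothetical compiler function for demonstration: returns its argument.
inline void *fakeCompile(void *Stub) { return Stub; }

} // namespace sketch
```

Calling the returned wrapper after sketch::getLazyResolverFunction(sketch::fakeCompile) routes through the stored function pointer, which is the indirection that lets a stub trigger compilation when the real target is not yet known.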
