BCPL Tutorial
BCPL Tutorial
User Guide
by
Martin Richards
[email protected]
http://www.cl.cam.ac.uk/users/mr10/
Computer Laboratory
University of Cambridge
Revision date: Sat 30 Oct 16:32:25 BST 2021
Abstract
BCPL is a simple systems programming language with a small fast compiler which
is easily ported to new machines. The language was first implemented in 1967
and has been in continuous use since then. It is a typeless and provides machine
independent pointer arithmetic allowing a simple way to represent vectors and
structures. BCPL functions are recursive and variadic but, like C, do not allow
dynamic free variables, and so can be represented by just their entry addresses.
There is no built-in garbage collector and all input-output is done using library
calls.
This document describes both the single threaded BCPL Cintcode System
(called Cintsys) and the Cintcode version of the Tripos portable operating system
(called Cintpos). It gives the definition of standard BCPL including the recently
added features such as floating point expressions and constructs involving oper-
ators such as <> and op:=. The language has recently been extended to include
some of the pattern matching features of MCPL. This manual also describes the
standard library and running environment and the native code version of the
system based on Sial. Installation instructions are included. Since May 2013,
the standard BCPL distribution supports both 32 and 64 bit Cintcode versions.
Since August 2014, standard Cintcode BCPL includes floating point constants
and operators, and since March 2018 it includes the FLT feature to make it easier
to perform floating point calculations. Pattern matching was added in September
2021. These extensions are now in the standard BCPL distribution.
Keywords: Systems programming language, BCPL, Cintcode, Cintpos
2
Contents
Preface vii
i
ii CONTENTS
3 The Library 43
3.1 Manifest constants . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 Global Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.3 Global Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.3.1 Streams . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.3.2 The Filing System . . . . . . . . . . . . . . . . . . . . . . 92
3.4 Random Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.5 RAM streams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.6 Environment Variables . . . . . . . . . . . . . . . . . . . . . . . . 95
3.7 Coroutine examples . . . . . . . . . . . . . . . . . . . . . . . . . . 96
3.7.1 A square wave generator . . . . . . . . . . . . . . . . . . . 97
3.7.2 Hamming’s Problem . . . . . . . . . . . . . . . . . . . . . 98
3.7.3 A Discrete Event Simulator . . . . . . . . . . . . . . . . . 100
3.8 The BMP Graphics Library . . . . . . . . . . . . . . . . . . . . . 106
3.8.1 The Graphics Functions . . . . . . . . . . . . . . . . . . . 107
3.9 The SDL Graphics Library . . . . . . . . . . . . . . . . . . . . . . 110
3.9.1 sdl.h details . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.9.2 Functions defined in sdl.b . . . . . . . . . . . . . . . . . . 112
3.9.3 sys(Sys sdl,...) calls . . . . . . . . . . . . . . . . . . . 117
3.10 The GL Graphics Library . . . . . . . . . . . . . . . . . . . . . . 120
3.11 The Sound Library . . . . . . . . . . . . . . . . . . . . . . . . . . 121
3.11.1 The Sound Constants . . . . . . . . . . . . . . . . . . . . . 121
3.11.2 The Sound Global Variables . . . . . . . . . . . . . . . . . 122
3.11.3 The Sound Functions . . . . . . . . . . . . . . . . . . . . . 122
3.12 The EXT Library . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
CONTENTS iii
12 Installation 239
12.1 Linux Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
12.2 Command Line Arguments . . . . . . . . . . . . . . . . . . . . . . 243
12.3 Installation on Other Machines . . . . . . . . . . . . . . . . . . . 243
12.4 Installation for Windows XP . . . . . . . . . . . . . . . . . . . . . 243
12.5 Installation using Cygwin . . . . . . . . . . . . . . . . . . . . . . 244
12.6 Installation for Windows CE2.0 . . . . . . . . . . . . . . . . . . . 245
12.7 The Native Code Version . . . . . . . . . . . . . . . . . . . . . . . 245
Bibliography 257
The concept for BCPL originated in 1966 and was first outlined in my PhD
thesis [4]. Its was first implemented early in 1967 when I was working at M.I.T.
Its heyday was perhaps from the mid 70s to the mid 80s, but even now it is still
continues to be used at some universities, in industry and by private individuals.
It is a useful language for experimenting with algorithms and for research in
optimizing compilers. Cintpos is the multi-tasking version of the system based
on the Tripos [5]. It is simple and easy to maintain and can be used for real-time
applications such as process control. BCPL was designed many years ago but
is still useful in areas where small size, simplicity and portability are important.
Recently I have decided to augment BCPL with some of the features of MCPL
including particularly the pattern matching mechanism used in the definition of
functions.
This document is intended to provide a record of the main features of the
BCPL in sufficient depth to allow a serious reader to obtain a proper understand-
ing of philosophy behind the language. An efficient interpretive implementation
is presented, the source of which is freely available via my home page [3]. The
implementation is machine independent and should be easy to transfer to almost
any architecture both now and in the future.
The main topics covered by this report are:
A specification of the BCPL language.
A description of its runtime library and the extensions used in the Cintpos
system.
The design and implementation of command language interpreters for both
the single and multi-threaded versions of the system.
A description of OCODE, the intermediate code used in the compiler, and
Cintcode, the compact byte stream target code used by the interpreter.
A description of the single and multi-threaded interactive debugger and
other debugging aids.
The efficient implementation of the Cintcode interpreter for several proces-
sors including both RISC and i386/Pentium based machines.
vii
viii CONTENTS
The SIAL intermediate code that allows easy translation of BCPL in native
code for most architectures, including, for instance, the Raspberry Pi.
MR
Chapter 1
Valid arguments:
1
2 CHAPTER 1. THE SYSTEM OVERVIEW
The Cintsys system is normally started using the cintsys shell command. If
this demonstration was run on a machine called c223 in the directory cintcode,
its opening message was as follows.
c223$ cintsys
BCPL 32-bit Cintcode System (6 Sept 2019)
0.000>
The characters 0.000> are followed by a space character and is the command
language prompt string inviting the user to type a command. The number gives
the execution time in seconds of the preceeding command. A program called
fact.b in directory cintcode/com to compute factorials can be displayed using
the type command as follows:
0.000> type com/fact.b
GET "libhdr"
LET start() = VALOF
{ FOR i = 1 TO 5 DO writef("fact(%n) = %i4*n", i, fact(i))
RESULTIS 0
}
AND fact(n) = n=0 -> 1, n*fact(n-1)
0.000>
is the heading for the declaration of the function start which, by convention, is
the first function to be called when a program is run. The empty parentheses ()
indicate that the function expects no arguments. The text
FOR i = 1 TO 5 DO
introduces a for-loop whose control variable i successively takes the values from
1 to 5. The body of the for-loop is a call of the library function writef whose
effect is to output the format string after replacing the substitution items %n
and %i4 by appropriately formatted representations of i and fact(i). Within
the string *n represents the newline character. The statement RESULTIS 0 exits
from the VALOF construct providing the result of start that indicates the program
completed successfully. The text:
AND fact(n) =
1.1. A CINTSYS CONSOLE SESSION 3
introduces the definition of the function fact which take one argument (n) and
yields n factorial. The word AND causes fact to available to the previously defined
function. This program can be compiled by using the following command:
0.000> bcpl com/fact.b to fact
BCPL (3 Sep 2019) 32 bit with the FLT feature
Code size = 104 bytes 0f 32-bit little ender Cintcode
0.030>
This command compiles the source file fact.b creating an executable object
module in the file called fact. The program can then be run by simply typing
the name of this file.
0.030> fact
fact(1) = 1
fact(2) = 2
fact(3) = 6
fact(4) = 24
fact(5) = 120
0.011>
40: RTN
44: L9920:
44: DATAW 0x6361660F
48: DATAW 0x6E252874
52: DATAW 0x203D2029
56: DATAW 0x0A346925
60: DATAW 0x0000DFDF
64: DATAW 0x6361660B
68: DATAW 0x20202074
72: DATAW 0x20202020
// Entry to: fact
76: L2:
76: JNE0 L5
78: L1
79: RTN
80: L5:
80: LM1
81: AP3
82: LF L2
84: K4
85: LP3
86: MUL
87: RTN
88: L3:
88: DATAW 0x00000000
92: DATAW 0x00000001
96: DATAW 0x00000014
100: DATAW 0x0000005E
Code size = 104 bytes 0f 32-bit little ender Cintcode
0.072>
This output shows the sequence of Cintcode instructions compiled for the func-
tions start and fact. In addition there are some data words holding the string
constant, initialisation data and symbolic information for the debugger. The
data word at location 4 holds a special bit pattern indicating the presence of a
function name placed just before the entry point. As can be seen the name in
this case is start. Similar information is packed at location 60 for the function
fact. Most Cintcode instructions occupy one byte and perform simple opera-
tions on the registers and memory of the Cintcode machine. For instance, the
first two instructions of start (L1 and SP3 at locations 20 and 21) load the
constant 1 into the Cintcode A register and then stores it at word 3 of the cur-
rent stack frame (pointed to by P). This corresponds to the initialisation of the
for-loop control variable i. The start of the for-loop body has label L4 corre-
sponding to location 22. The compilation of fact(i) is LP3 LF L2 K9 which
loads i and the entry address of fact and enters the function incrementing P
by 9 locations. The result of this function is returned in A which is stored in
the stack using SP9 in the appropriate position for the third argument of the
call of writef. The second argument, i, is setup using LP3 SP8, and the first
argument which is the format string is loaded by LLL L9920. The next instruc-
tion (K4G 94) causes the routine writef, whose entry point is in global variable
1.1. A CINTSYS CONSOLE SESSION 5
The asterisk is the prompt inviting the user to enter a debugging command. The
debugger provides facilities for inspecting and changing memory as well as setting
breakpoints and performing single step execution. As an example, a breakpoint
is placed at the first instruction of the routine clihook which is used by the
command language interpreter (CLI) to transfer control to a command. Consider
the following commands:
* g4 b1
* b
1: clihook
*
This first loads the entry point of clihook (held in global variable 4) and sets
(b1) a breakpoint numbered 1 at this position. The command b, without an
argument, lists the current breakpoints confirming that the correct one has been
set. Normal execution is continued using the c command.
* c
0.010>
If we now try to execute the factorial program, we immediately hit the break-
point.
0.000> fact
!! BPT 1: clihook
A= 0 B= 0 17940: K4G 1
*
6 CHAPTER 1. THE SYSTEM OVERVIEW
This indicates that the breakpoint occurred when the Cintcode registers A and
B were both zero, and that the program counter is set to 17940 where the next
instruction to be obeyed is K4G 1. Single step exection can now be performed
using the \ command.
* \ A= 0 B= 0 46276: L1
* \ A= 1 B= 0 46277: SP3
* \ A= 1 B= 0 46278: LP3
*
After each single step execution a summary of the current state is printed. In the
above sequence we see that the execution of the instruction L1 loading 1 into the
A register. The execution of SP3 does not have an immediately observable effect
since it updates a local variable held in the current stack frame, but the stack
frame can be displayed using the t command.
* p t4
P 0: 46420 17942 start 1
*
This confirms that location P3 contains the value 1 corresponding to the initial
value of the for-loop control variable i. At this stage it is possible to change its
value to 3, say.
* 3 sp3
* p t4
P 0: 46420 17942 start 3
*
* \ A= 0 B= 0 46334: L1
* \ A= 1 B= 0 46335: RTN
* \ A= 1 B= 0 46341: LP3
* \ A= 1 B= 1 46342: MUL
* \ A= 1 B= 1 46343: RTN
* \ A= 1 B= 1 46341: LP3
* \ A= 2 B= 1 46342: MUL
* \ A= 2 B= 1 46343: RTN
* \ A= 2 B= 1 46341: LP3
* \ A= 3 B= 2 46342: MUL
* \ A= 6 B= 2 46343: RTN
* \ A= 6 B= 2 46282: SP9
* \ A= 6 B= 2 46283: LP3
* \ A= 3 B= 6 46284: SP8
* \ A= 3 B= 6 46285: LLL 46300
* \ A= 11575 B= 3 46287: K4G 94
*
At this moment the routine writef is just about to be entered to print an message
about factorial 3. We can unset breakpoint 1 and continue normal execution by
typing 0b1 c.
* 0b1 c
fact(3) = 6
fact(4) = 24
fact(5) = 120
0.010>
As one final example in this session we will re-compile the BCPL compiler.
0.010> bcpl com/bcpl.b to junk
BCPL (3 Sep 2019) 32 bit with the FLT feature
Code size = 32044 bytes of 32-bit little ender Cintcode
Code size = 14968 bytes of 32-bit little ender Cintcode
0.478>
This shows that the total size of the compiler is 47,012 bytes and that it can
be compiled (on a 1.6GHz CPU) in 0.478 seconds. Since this involves executing
41,273,342 Cintcode instructions, the rate is about 86 million Cintcode instruc-
tions per second with the current interpreter.
There is a directory called com that holds the BCPL source code of several
Cintpos commands, such as bcpl.b, bench100.b and fact.b. We can inspect
fact.b using the type command as follows.
0.000 1> type com/fact.b
SECTION "fact"
GET "libhdr"
LET f(n) = n=0 -> 1, n*f(n-1)
LET start() = VALOF
{ FOR i = 1 TO 10 DO
writef("f(%i2) = %i8*n", i, f(i))
RESULTIS 0
}
0.000 1>
It can be compiled and run as follows.
0.000 1> c bc fact
bcpl com/fact.b to cin/fact hdrs POSHDRS
BCPL (20 Oct 2009)
Code size = 120 bytes
0.020 1> fact
f( 1) = 1
f( 2) = 2
f( 3) = 6
f( 4) = 24
f( 5) = 120
f( 6) = 720
f( 7) = 5040
f( 8) = 40320
f( 9) = 362880
f(10) = 3628800
0.000 1>
There is a benchmark program called bench100.b which can be compiled and
run as follows.
0.000 1> c bc bench100
bcpl com/bench100.b to cin/bench100 hdrs POSHDRS
BCPL (20 Oct 2009)
Code size = 1444 bytes
0.040 1> bench100
bench mark starting, Count=1000000
starting
finished
qpkt count = 2326410 holdcount = 930563
these results are correct
end of run
9.170 1>
1.2. A CINTPOS CONSOLE SESSION 9
The latest prompt (9.170 1>) indicates that the benchmark program took 9.17
seconds to run and that we are connected to the root command language inter-
preter running as task one.
When Cintpos starts these are six resident tasks which can be seen using the
status command as follows.
0.000 1> status
Task 1: Root_Cli running CLI Loaded command: status
Task 2: Debug_Task waiting DEBUG
Task 3: Console_Handler waiting COHAND
Task 4: File_Handler waiting FH0
Task 5: MBX_Handler waiting MBXHAND
Task 6: TCP_Handler waiting TCPHAND
0.010 1>
Notice that the root CLI (task 1) completes the execution of the run command
and issues a prompt (0.000 1>) before the newly created CLI (task 7) has had
time to load and run the status command. As soon as task 7 finishes running
the status command it commits suicide leaving the original 6 tasks.
The bounce.b program provides a demonstration of how communication be-
tween Cintpos tasks works.
0.000 1> type com/bounce.b
SECTION "bounce"
GET "libhdr"
LET start() BE qpkt(taskwait()) REPEAT
0.000 1>
The status output shows that the bounce program is running as task 7 and is
suspended in taskwait waiting for another task to send it a packet. When it
receives a packet it immediately returns it to the sender and waits for another to
arrive. We can send a suitable packet to bounce using the send command whose
source code is as follows.
0.000 1> type com/send.b
SECTION "send"
GET "libhdr"
GLOBAL { task: 200; count: 201 }
LET start() BE
{ LET pkt = VEC 2
LET argv = VEC 50
UNLESS rdargs("TASK/n,COUNT/n", argv, 50) DO
{ writef("Bad arguments for SEND*n")
stop(20)
}
task, count := 7, 1_000_000
IF argv!0 DO task := !argv!0
IF argv!1 DO count := !argv!1
pkt!0, pkt!1, pkt!2 := notinuse, task, count
writef("*nSending a packet to task %n, %n times*n", task, count)
{ LET k = pkt!2
UNLESS k BREAK
pkt!2 := k-1
qpkt(pkt)
pkt := taskwait()
} REPEAT
writes("Done*n")
}
0.010 1>
1.2. A CINTPOS CONSOLE SESSION 11
This demonstration shows that a packet may be sent from one task to another
2 million times in 3.94 seconds. This corresponds to a rate of just over half a
million times per second.
12 CHAPTER 1. THE SYSTEM OVERVIEW
Chapter 2
The design of BCPL owes much to the work done on CPL (originally Cambridge
Programming Language) which was conceived at Cambridge to be the main lan-
guage to run on the new and powerful Ferranti Atlas computer to be installed
in 1963. At that time there was another Atlas computer in London and it was
decided to make the development of CPL a joint project between the two Uni-
versities. As a result the name changed to Combined Programming Language. It
could reasonably be called Christopher’s Programming Language in recognition
of Christpher Strachey whose bubbling enthusiasm and talent steered the course
of its development.
CPL was an ambitious language in the ALGOL tradition but with many novel
and significant extensions intended to make its area of application more general.
These included a greater richness in control constructs such as the now well known
IF, UNLESS, WHILE, UNTIL, REPEATWHILE, SWITCHON statements. It could handle
a wide variety of data types including string and bit patterns and was one of the
first strictly typed languages to provided a structure mechanism that permitted
convenient handling of lists, trees and directed graphs. Work on CPL ran from
about 1961 to 1967, but was hampered by a number of factors that eventually
killed it. It was, for instance, too large and complicated for the machines available
at the time, and the desire for elegance and mathematical cleanliness outweighed
the more pragmatic arguments for efficiency and implementability. Much of the
implementation was done by research students who came and left during the
lifetime of the project. As soon as they knew enough to be useful they had
to transfer their attention to writing theses. Another problem (that became of
particular interest to me) was that the implementation at Cambridge had to
move from EDSAC II to the Atlas computer about halfway through the project.
The CPL compiler thus needed to be portable. This was achieved by writing it
in a simple subset of CPL which was then hand translated into a sequence of
low level macro calls that could be expanded into the assembly language of either
machine. The macrogenerator used was GPM[6] designed by Strachey specifically
for this task. It was a delightfully elegant work of art in its own right it is well
13
14 CHAPTER 2. THE BCPL LANGUAGE
worth study. A variant of GPM, called BGPM, is included in the standard BCPL
distribution.
BCPL was initially similar to this subset of CPL used in the encoding of
the CPL compiler. An outline of BCPL’s main features first appeared in my
PhD thesis [4] in 1966 but it was not fully designed and implemented until early
the following year when I was working at Project MAC of the Massachussetts
Institute of Technology. Its first implementation was written in Ross’s Algol
Extended for Design (AED-0)[1] which was the only language then available on
CTSS, the time sharing system at Project MAC, other than LISP that allowed
recursion.
2.1.1 Comments
There are two form of comments. One starts with the symbol // and extends
up to but not including the end-of-line character, and the other starts with the
symbol /* and ends at a matching occurrence of */. Comment brackets (/* and
*/ may be nested, and within such a comments the lexical analyser is only looking
for /* and */ and so special care is needed when commenting out fragments of
program containing // comments and string constants. Comments are equivalent
to white space and so may not occur within of multi-character symbols such as
identifiers or constants.
2.1. LANGUAGE OVERVIEW 15
$$tag
$<tag
$~tag
$>tag
where tag is conditional compilation tag composed of letters, digits, dots and
underlines. All tags are initially unset, but may be complemented using the $$tag
directive. All the lexical tokens between $<tag and $>tag are skipped (treated as
comments) unless the specified tag is set. All the lexical tokens between $~tag
and $>tag are skipped unless the specified tag is not set.
The following example shows how this conditional compilation feature can be
used.
$$Linux // Set the Linux conditional compilation tag
$<Linux // Include if the Linux tag is set
$<WinXP $$WinXP $>WinXP // Unset the WinXP tag if set
writef("This was compiled for Linux")
$>Linux
$<WinXP // Include if the WinXP tag is set
writef("This was compiled for Windows XP")
$>WinXP
16 CHAPTER 2. THE BCPL LANGUAGE
2.2 Expressions
Expressions are composed of names, constants and expression operators and may
be grouped using parentheses. The precedence and associativity of the different
expression constructs is given in Section 2.2.11. BCPL expressions always vield
values of the same bit length, normall 32 or 64 bits.
2.2.1 Names
Syntactically a name is of a sequence of letters, digits, dots and underlines starting
with a letter, but it must not one of the reserved words (such as IF, WHILE or
RESULTIS). The use of dots in names is no longer recommended, and should be
replaced by underscores. Double dots are no longer permitted in names because
.. is the range operator used in the pattern matching extension.
A name may be declared as a local variable, a static variable, a global variable,
a manifest constant, a label or a function or routine. Since the language is
typeless, the value of a name is just a bit pattern whose interpretation depends
on how it is used. This is similar to the way values in central registers of most
computers are used.
2.2.2 Constants
Decimal numbers consist of a sequence of digits, while binary, octal or hexadeci-
mal are represented, respectively, by #B, #O or #X followed by digits of the appro-
priate sort. Letters in hexadecimal numbers may use both upper and lower case
and the case of the letters B, O or X after #. The O may be omitted in octal num-
bers. Underlines may be inserted within numbers to improve their readability.
2.2. EXPRESSIONS 17
Since August 2014, floating point constants are now allowed, such as the
following:
1234.0
1.234_456e-5
10e6
Note that 12.34 is a floating point number, but 12..34 is 12 followed by the
range operator .. and 34. A floating point constant must start with a digit and
contain a decimal point (.) or an exponent sign (e or E).
BCPL uses the standard IEEE representation for floating point numbers using
the same word length as other BCPL values For 32-bit BCPL the format is as
follows. The left most bit is the sign with 1 representing negative. The next 8
bits hold an unsigned number e in the range 0 to 255. e = 0 and e = 255 are used
to specify in the representation of some special values such as zero, infinity or
various error values. The values between 1 and 254 specify binary exponents in
the range -126 to +127 equal to e − 127. The remaining 23 bits are the fractional
bits of the significand. For non zero values, the significand has 24 bits with its
left most bit being 1 followed by these 23 fractional bits. This represents a value
greater than or equal to 1.0 and less than 2.0. Note that 1+8+23=32. The value
of the floating point number is the significand multiplied by 2e−127 . As a special
case, the number 0.0 is represented by a bit pattern of 32 zeroes.
For 64-bit numbers the exponent has 11 bits and the significand has 53. Note
that 1+11+52=64.
The compiler does not use any floating point operators or constants using,
where necessary, calls of the form sys(Sys flt,...) to perform any floating
point calculations needed. This allows the compiler to be compiled using older
versions of the compiler. Floating point constants are currently only compiled
correctly if the BCPL word length of the compiler is the same as that of the
target code.
TRUE and FALSE are reserved words that have values -1 and 0, respectively,
representing the two truth values. They can be used in manifest constant ex-
pressions. Whenever a boolean test is made, the value is compared with with
FALSE (=0). BITSPERBCPLWORD is also a reserved word whose value is 32 or 64
giving the BCPL word length currently being used. This constant was added on
16 May 2013 to allow the same header file to be used on both 32- and 64-bit
BCPL systems. It is used in the MANIFEST declarations of constants such as
18 CHAPTER 2. THE BCPL LANGUAGE
bytesperword and minint that are word length dependent. If you are using an
older BCPL compiler with the latest version of libhdr.h you will need to un-
comment a line that declares BITSPERBCPLWORD as a MANIFEST constant with
the appropriate value for the system you are using.
The commands BREAK, LOOP, NEXT, EXIT and ENDCASE are permitted within
expressions and have the same effect as the corresponding commands described
in Section 2.3.8.
A question mark (?) may be used as a constant with undefined value. It can
also be used in statements such as:
LET a, b, count = ?, ?, 0
sendpkt(notinuse, rdtask, ?, ?, Read, buf, size)
Constants of the form: SLCT len:shift:offset pack the three constants len,
shift and offset into a word. Such packed constants are used by the field selection
operator OF to access fields of given length, shift and offset relative to a pointer
as described in Section 2.2.6. The len and shift components are optional. Their
omission has the following effect.
Escape Replacement
*n A newline (end-of-line) character.
*c A carriage return character.
*p A newpage (form-feed) character.
*s A space character.
*b A backspace character.
*t A tab character.
*e An escape character.
*" "
*’ ’
** *
*xhh The single character with number hh (two hexadecimal
digits denoting an integer in the range [0,255]).
*ddd The single character with number ddd (three octal digits
denoting an integer in the range [0,255]).
*#g Set the encoding mode to GB2312 for the rest of this
string or character constant. The default encoding is
UTF8 unless speified by the GB2312 compiler option,
See the specification of the bcpl command on page 131.
*#u Set the encoding mode explicitly to UTF8 for the rest
of this string or character constant.
*#hhhh In UTF8 mode, this specifies a single Unicode character
with up to four hexadecimal digits. In string constants,
this is converted to a sequence of bytes giving its UTF-
8 representation. In character constants, it yields the
integer hhhh. Thus ’*#C13F’=#xC13F.
*##h..h In UTF8 mode, this specifies a Unicode character with
up to eight hexadecimal digits, but is otherwise treated
as the *#hhhh escape.
*#dddd In GB2312 mode, this specifies the GB2312 decimal code
(dddd) for an extended character. In string constants,
this is converted to a sequence of bytes giving its GB2312
representation. In character constants, it yields the in-
teger dddd. Thus ’*#g*#4566’=4566.
*f..f * This sequence is ignored, where f..f stands for a se-
quence of white space characters. In this context, com-
ments introduced by ’//’ are treated as white space,
but those introduced by ’/*’ are not.
length and bytes of the string are packed. If s is a string then s%0 is its length
and s%1 is its first character, see Section 2.2.6. The *# escapes allow Unicode
and GB2312 characters to be handled. For instance, if the following statements
output to a suitable UTF8 configured device:
writef("*#uUnicode hex 2200 prints as: ’*#2200’*n")
writef("%%# in writef can also be used: ’%#’*n", #x2200)
The parentheses are required even if no arguments are given. The last example
above illustrates a call in which the function is specified by an expression. If
the function being called was defined by a routine definition, the result of the
call will be undefined. The arguments are evaluated and laid out in consective
stacklocations where they become the initial values of the formal parameters of
the called function or routine. There is no need for the number of arguments
to be the same as the number of formal parameters. See Section 2.4.5 for more
details.
E#(E1 ,..,En )
Here, E1 points to the fields of an object, with the convention that its zeroth
field (E1 !0) is a pointer to the methods vector containing the possible functions
to call. Element E of this vector is applied to the given set of arguments. E is
normally a manifest constant. An example program illustrating method calls can
be found in BCPL/bcplprogs/demos/objdemo.b in the BCPL distribution.
An :: expression can be used on both right and left hand sides of assignments
but not as the operand of @. When used in a right hand context the selected field
is shifted to the right hand end of the result with vacated positions filled with
zeros. A shift to the left is performed when a field is updated. Suppose p!3
holds #x12345678, the expression (SLCT 12:8:3)::p yields #x456 and after the
assignment:
! f x means ! (f x)
! @ x means ! (@ x)
! v ! i ! j means ! ((v!i)!j)
@ v ! i ! j means @ ((v!i)!j)
x << 1+y >> 1 means (x<<(1+y))>>1)
~ x!y means ~ (x!y)
~ x=y means ~ (x=y)
NOT x=y means NOT (x=y)
b1-> x, b2 -> y,z means b1 -> x, (b2 -> y, z)
9 Names, Literals, ?,
TRUE, FALSE, BITSPERBCPLWORD,
BREAK, LOOP, ENDCASE,
NEXT, EXIT
(E),
9L SLCT : Field selectors
Function and method calls
Subscripted expressions using [ and ]
8L ! % OF Dyadic
7 ! @ Prefixed
6L * / MOD #* #/ #MOD Dyadic and
5 + - ABS #+ #- #ABS monadic
4 = ~= <= >= < > Extended Relations
#= #~= #<= #>= #< #>
4L << >>
3 ~ Bitwise and Boolean operators
3L &
2L |
1L EQV XOR
1R -> , Conditional expressions
0 MATCH EVERY VALOF TABLE
2.3 Commands
The primary purpose of commands is to update variables, to perform in-
put/output operations, and fto control the flow of control. They are described in
the following sections.
2.3.1 Assignments
Simple assignments have the following possible forms:
L:=E
L#:=E
Lop:=E
where op is one of the following operators: !, *, /, MOD, +, -, #*, #/, #MOD, #*,
#-, <<, >>, &, |, EQV or XOR and L is a variable name or an expression of one
of the following forms: E1 !E2 , !E, E1 %E2 , %%E or K OF E. K is normally a
selector of the form SLCT length:shift:offset. For := and #:= assignments,
the right hand side is evaluated and used to update the location specified by
the left hand side. For op:= assignments, the value to assign is Lop:=R. If # is
26 CHAPTER 2. THE BCPL LANGUAGE
present a floating point assignment is performed. This causes the right hand side
to be evaluated in FLT mode with the restriction that the left hand side must
refer to a full word. See Section 2.7 for details. Typical simple assignments are
as follows:
cg_x := 1000
v!i := x+1
!ptr := mk3(op, a, b)
str%k := ch
%strp := ’A’
SLCT 8:10:1 OF p +:= 1
p &:= #x7F
w!p #*:= a
Assignments are not permitted to start with any of the following keywords:
MATCH, EVERY, BREAK, LOOP, ENDCASE, NEXT or EXIT.
L1 ,..,Ln := E1 ,..,En
L1 ,..,Ln #:= E1 ,..,En
L1 ,..,Ln op:= E1 ,..,En
L1 :=E1 ;. . . ; Ln := En
L1 #:=E1 ;. . . ; Ln #:= En
L1 op:=E1 ;. . . ; Ln op:= En
This conversion is performed before the application of the rules of the FLT
feature. See Section 2.7 for details.
IF E DO C1
UNLESS E DO C2
TEST E THEN C1 ELSE C2
2.3. COMMANDS 27
WHILE E DO C
UNTIL E DO C
C REPEAT
C REPEATWHILE E
C REPEATUNTIL E
FOR N = E1 TO E2 BY K DO C
FOR N = E1 TO E2 DO C
FOR N = E1 BY K DO C
FOR N = E1 DO C
Labels of the form DEFAULT: or CASE K: are permitted in the command se-
quence. E is evaluated and control is passed to the matching case label if it
28 CHAPTER 2. THE BCPL LANGUAGE
exists, otherwise a jump is made to the default label but, if that is not given,
control passes to the point just after the switchon command.
{ C1 ;...; Cm }
30 CHAPTER 2. THE BCPL LANGUAGE
This sequencing operator has been included since it was in the extended non
standard version of BCPL at MIT and extensively used in the PAL compiler.
See for instance com/pal75.b.
2.3.10 Blocks
A block is similar to a compound command but may start with some declarations.
The syntax is as follows:
{ D1 ;...; Dn ; C1 ;...; Cm }
2.4 Declarations
Each name used in BCPL program must in the scope of its declaration. The
scope of names declared at the outermost level of a program include the right
hand side of its own declaration and all the remaining declarations in the section.
The scope of names declared at the head of a block include the right hand side of
its own declaration, the succeeding declarations and the body of the block. Such
declarations are introduced by the keywords MANIFEST, STATIC, GLOBAL and LET.
A name is also declared when it occurs as the control variable of a for loop. The
scope of such a name is the body of the for loop.
2.4. DECLARATIONS 31
2.4.1 Labels
The only other way to declare a name is as a label of the form N :. This may
prefix a command or occur just before the closing section bracket of a compound
command or block. The scope of a label is the body of the block or compound
command in which it was declared.
MANIFEST { N1 = K1 ;...; Nn = Kn }
where N1 ,...,Nn are names (see Section 2.2.1) and K1 ,...,Kn are manifest
constant expressions (see Section 2.2.12). Each name is declared to have the
constant value specified by the corresponding manifest expression. The details
have recently changed due to the introduction of the FLT feature, see Section 2.7
on page 40. Manifest names with the FLT tag have floating point values otherwise
they are integers. If a value specification (=K1 ) is omitted in the declaration of the
first name, the value 1 or 1.0 is assumed. If a value specification (=Ki ) is omitted
in later declarations a value 1 or 1.0 greater than the value of the previous name
is used. An automatic conversion between integer and floating point is performed
if necessary. Thus, the declaration:
where N1 ,...,Nn are names possibly prefixed by FLT (see Section 2.2.1) and
K1 ,...,Kn are manifest constants (see Section 2.2.12). Each constant specifies
which global vector element is associated with each variable, and if FLT is specified
the variable is assumed to hold a floating point number.
If a global number (:Ki ) is omitted, the next global variable element is im-
plied. If :K1 is omitted, then :0 is assumed. Thus, the declaration:
32 CHAPTER 2. THE BCPL LANGUAGE
declares the variables a, b, c and d occupy positions 0, 200, 201 and 251 of the
global vector, respectively.
where N1 ,...,Nn are names possibly prefixed by FLT (see Section 2.2.1) and
K1 ,...,Kn are manifest constant expressions (see Section 2.2.12). Each name
is declared to be a statically allocated variable initialised to the corresponding
manifest expression. If a value specification (=Ki ) is omitted, the a value 0 or 0.0
is implied. Thus, the declaration:
When a local variable has the FLT tag it initial value expression is evaluated in
FLT mode. For details see Section 2.7 on page 40.
The query expression (?) should be used on the right hand side when a
variable does not need a specified initial value.
where N is a name which may not be qualified by the FLT tag and K is a manifest
constant. A location is allocated for N and initialized to point to a vector whose
lower bound is 0 and whose upper bound is K. The variable N and the vector
elements (N !0 to N !K) reside in the stack frame of the current function and
only continue to exist while control remains within the function.
Function Definitions
These definitions have the following form:
N ( N1 ,..., Nn ) = E
N ( N1 ,..., Nn ) BE C
where N is the name of the function being defined, and N1 ,...,Nn are its formal
parameters, each of which may be prefixed by FLT. A function defined using =
returns E as result, but one defined using BE and executes the command C and
does not return a defined a result. Functions defined using BE are often called
routines.
Some example functions definitions are as follows.
A function can be defined using pattern matching by giving its name followed
a one or more match items of the form:
: Plist => E.
optionally followed by a dot. The way patterns work is described in on page 35.
A routine can be defined using pattern matching by giving its name followed
a one or more match items of the form:
: Plist BE C.
34 CHAPTER 2. THE BCPL LANGUAGE
optinally followed by a dot. The way patterns work is described in on page 35.
Example functions defined using pattern matching are as follows.
LET ways
: 0, ? => 1
: ?, [0] => 0
: s, coins[>s] => ways(s, coins+1)
: s, coins[v ] => ways(s, coins+1) + ways(s-v, coins)
LET eval
: [Id, x], e => lookup(x, e)
: [Num, k], ? => k
: [Mul, x, y], e => eval(x, e) * eval(y, e)
: [Div, x, y], e => eval(x, e) / eval(y, e)
: [Add, x, y], e => eval(x, e) + eval(y, e)
: [Sub, x, y], e => eval(x, e) - eval(y, e)
GLOBAL { var:200 }
LET f1(...) BE
{ LET oldvar = var // Save the current value of var
var := ... // Use var during the call of f1
...
f2(...) // var may be used in f2
...
IF ... DO f1(...) // f1 may be called recursively
var := oldvar // restore the original value of var
}
2.5 Patterns
This section describes the MCPL style pattern match features that is now avail-
able in BCPL.
Pattern matching is an important feature since it provides a mechanism to
select an outcome based on the values of locations in a structure referenced di-
rectly or indirectly from a set of arguments. Names, called pattern variables,
can be associated with locations in the structure. Such variables have much in
common with ordinary local variables. They can only be used in the pattern and
the expression or command in the match item.
36 CHAPTER 2. THE BCPL LANGUAGE
Patterns are used in function and routine definitions and in MATCH and EVERY
constructs. They are applied to the arguments given by the function or routine
calls or the arguments of MATCH and EVERY constructs. Each matxh construct has
a sequence of one or more match items, consisting of a pattern and an expression
or command which must be of one of the following forms.
: Plist => E
: Plist BE C
where Plist is a list of zero or more pattern terms separated by commas. E is
an expression and C is a command. The sequence of match items is optionally
terminated by a dot. If the first match item in a sequence used the operator =>
then all the subsequent items must also use =>, otherwise all the match items
must use BE.
For functions routine and MATCH expressions and commands, the patterns
are tested in turn and the first to be successful causes the related expression or
command to be evaluated.
For an EVERY command, the commands in all successful match items are exe-
cuted, and for an EVERY expression the values of the expressions of all successful
match items are summed and returned as the result.
A pattern term tests an individual location. It may test the numerical value
contained in the location or, if the location holds a pointer, it can apply a pattern
list to the consecutive memory locations pointed to. Pattern terms in a list test
consective locations in left to right order. A pattern can only successfully match
its arguments if every pattern term it contains is suceessful.
There are many kinds of term and, just as with ordinary expressions, it is
convenient to define the syntax using precedence. The grammar will use syntactic
categories for the different precedence levels. Plist, P1, P2, P3. P4, P5 and P6
are the term categories from lowest to highest precedence.
The syntax of P6 and P5 is as follows:
P6 -> an integer constant
| a floating point constant
As an example of the use of pattern variables and relation terms, the pat-
tern [<x, x] will be successful if the current location contains a pointer to two
consective integers where the first is less than the second.
The term of the form [ em Plist ] assumes the current location holds a pointer
to consecutive locations . The match is only successful if every P1 term in the
Plist successfully matches its location. This construct can only be nested up to
a depth of 4.
If a the position the current location relative to P, the base of the current stack
frame, any pattern variables will associated with one of the following location.
P!a
P!a!r
P!a!r!s
P!a!r!s!t
P!a!r!s!t!u
where r, s, t and u are all offsets in the range 0 to 255, identfying the path from
an argument to the location associated with the declared name.
A P1 term has the following syntax
P1 -> P2 P1
| P2
P1 is thus just a list of consecutive T2 terms. It is is applied to a single location
and is only satisfied if at the T2 terms applied to this location are successful.
The syntax of Plist is as follows:
Plist -> P1 , Plist
— P1
A Plist is applied to the consecutive locations of the argument lists of function
or routine calls or the arguments of MATCH and EVERY constructs. It can also be
applied to the consecutive locations pointed to by a term of the form [ Plist ].
The consecutive P1 terms are applied to the consective locations. The Plist is
only successful if every P1 term successfully matches it argument location.
When these sections are loaded, global 200 is initialized to the entry point of
function f defined in demolib.b and so can be called from the function start
defined in demomain.b.
The header file, libhdr, contains the global declarations of all the resident
library functions and routines making all these accessible to any section that
started with: GET "libhdr". The library is described in the next chapter. Global
variable 1 is called start and is, by convention, the first function to be called
when a program is run.
Automatic global initialisation also occurs if a label declared by colon (:)
occurs in the scope of a global of the same name.
Although the global vector mechanism has disadvantages, particularly in the
organisation of library packages, there are some compensating benefits arising
from its extreme simplicity. One is that the output of the compiler is available
directly for execution without the need for a link editing step. Sections may
also be loaded and unloaded dynamically during the execution of a program
using the library functions loadseg and unloadseg, and so arbitrary overlaying
schemes can be organised easily. An example of where this is used is in the
implementation of the Command Language Interpreter described in Chapter 4.
The global vector also allows for a simple but effective interactive debugging
system without the need for compiler constructed symbol tables. Again, this was
devised when machines were small, disc space was very limited and modern day
linkage editors had not been invented; however, some of its advantages are still
relevant today.
40 CHAPTER 2. THE BCPL LANGUAGE
FLT mode. Expressions giving the values of static, manifest and local variable
names with the FLT tag are also evaluated in FLT mode. All other expressions
are evaluate in non-FLT mode.
(5) If the leading operator of an expression evaluated in FLT mode is one of ABS,
*, /, MOD, +, - or -> it is replaced by the floating point version. If an integer
constant is evaluated in FLT mode it is replaced by the corresponding floating
point constant.
These rules are applied repeatedly until there is no further change. As an
example, the following function:
Notice that the user almost never needs to use # to specify floating point opera-
tions. Note also that the expression (0|#x3840000|) is a way to protect an ex-
pression (#x38400000) from being evaluated in FLT mode. See com/xcmpltest.b
and bcpl4raspi.pdf from my home page for examples of the use of the FLT fea-
ture.
To see why the name in a vector declarations may not be given the FLT tag,
consider LET v = VEC 5. This initialises v with a pointer to a vector with 6
elements and the expression v+1 would point to the element at subscript position
one. Had v been given the FLT tag, v+1 would have been automatically converted
to v#+1.0 which is clearly meaningless.
The FLT tag is not permitted to qualify FOR loop control variables since, due
to floating point truncation and rounding errors, the number of iterations of a
loop such as FOR FLT x = 0.1 TO 10.0 DO... is properly defined.
#!/usr/local/bin/cintsys -c
42 CHAPTER 2. THE BCPL LANGUAGE
as the first line of the compiled object module. This line is ignored by the CLI
but under Linux it allows Cintcode programs to be called directly from a Linux
shell. If objline1 cannot be found no such line is inserted at the start of the
object module.
Chapter 3
The Library
This manual describes three variants of the BCPL system. The simplest is in-
voked by the shell command cintsys and provides a single threaded command
language interpreter. The system invoked by cintpos provides a multi-threaded
system where the individual threads (called tasks) are run in parallel and are
pre-emptible. A third version is available for some architectures and provides a
single threaded version in which the BCPL source is compiled into native machine
code. Although this version is faster, it is more machine dependent, has fewer
debugging aids and will only run a single command.
The libraries of these three systems have much in common and so are all
described together. The description of all constants, variables and functions have
a right justified line such as the following
CIN:y, POS:y, NAT:n
where CIN:, POS: and NAT: denote the single threaded, multi-threaded and native
code versions, respectively, and the letters y and n stand for yes and no, showing
whether the corresponding constant, variable or function is available on that
version of the system.
The resident library functions, variables and manifest constants are declared
in the standard library header file g/libhdr.h. Most of the functions are defined
in BCPL in either sysb/blib.b or sysb/dlib.b, but three functions (sys, chgco
and muldiv) are in the hand written Cintcode file cin/syscin/syslib. Most
functions relating to the multi-threaded version are defined in klib.b.
The following three sections describe the manifest constants, variables and
functions (in alphabetical order) provided by the standard library.
43
44 CHAPTER 3. THE LIBRARY
Most implementations pack 4 bytes into 32-bit words requiring B2Wsh=2, but on 64-bit
implementations, such as native code on the DEC Alpha or the 64-bit Cintcode version
of BCPL, its value is 3.
CloseObj#(obj)
function name is too long its first and last five character are packed into the string
separated by a single quote ’. Typically entryword=#x0000DFDF.
This holds a pointer to the debug count vector. These counters can be incremented
by calls of the form sys(Sys incdcount, n) or by similar calls in C within the Cintcode
interpreter. The zeroth element of dcountv holds it upper bound which is typically
511.
Under Cintpos, this holds the Cintpos device table. The zeroth entry is the ta-
ble’s upperbound and each other entries is either zero, or points to the device control
block (DCB) of the corresponding device. Some devices are handled by polling in the
interpreter thread based on the count of Cintcode instructions obeyed. Currently the
clock (device -1) and ttyout (device -3) are handled in this way. This improved the
performance of output to the screen and causes the clock to have a resolution of about
1 milli-second although the actual clock precision is usually limited by the underlying
operating system.
48 CHAPTER 3. THE LIBRARY
If dumpflag is TRUE when Cintsys or Cintpos exits, the entire Cintcode memory is
dumped in a compacted form to the file DUMP.mem for later inspection by commands
such as dumpsys or dumpdebug.
This rootnode field holds the list of logical name-value pairs used by the functions
setlogval and getlogval, and the CLI command setlogval. The environment vari-
able held in envlist are distinct from those such as BCPLROOT held by the underlying
operating system but have a similar purpose.
rtn mc0, rtn mc1, rtn mc2, rtn mc3 CIN:y, POS:y, NAT:n
These hold the machine address of the start of the Cintcode memory and other
values used by the MC package.
scbt net, scbt file, scbt ram, scbt console, scbt mbx, scbt tcp
CIN:y, POS:y, NAT:n
These constants are mnemonics for the possible values of the type field of a stream
control block. See scb type above.
callco always leaves the global currco is set to point to the target coroutine. This is
done by the Cintcode instruction CHGCO invoked by changeco.
return with result val. The only other use of changeco is in createco. This is more
subtle but can be understood by looking at the description of createco on page 58.
If the encoding field of the current output stream is GB2312, the output is in GB2312
format as described in the following table.
cowait always leaves the global currco is set to point to the resumed coroutine. This
is done by the Cintcode instruction CHGCO invoked by changeco.
resumption point
fn sz c P1 L1
The resumption point is P pointer belonging to the function that caused the sus-
pension of the coroutine. It becomes the value of the P pointer when the coroutine
next resumes execution. The parent link points to the coroutine that called this one,
or is zero if the coroutine not active. The outermost coroutine (or root coroutine) is
marked by the special value -1 in its parent link. As a debugging aid, all coroutines
are chained together in a list held in the global colist. The values fn and sz hold the
main function of the coroutine and its stack size, and c is a private variable used by
the coroutine mechanism.
At any time just one coroutine (the current coroutine) has control, and all the
others are said to be suspended. The current coroutine is held in the global variable
currco, and the Cintcode P register points to a stack frame within its stack. Passing
3.3. GLOBAL FUNCTIONS 59
currco changeco
stack frame
resumption point
P1 L1 a
cptr
P CHGCO
resumption point
PC
P1 L1 a
PC currco P
control from one coroutine to another involves saving the resumption point in the
current coroutine, and setting new values for the program counter (PC), the P pointer
and currco. This is done by changeco(a,cptr) as shown in figure 3.2. The function
changeco is defined by hand in syslib used by cintsys and cintpos and its body
consists of the single Cintcode instruction CHGCO. As can be seen its effect is somewhat
subtle. The only uses of changeco are in the definitions of createco, callco, cowait
and resumeco, and these are the only functions that cause coroutine suspension. In
the native code version of BCPL changeco is defined in mlib.s
The definition of createco is in blib.b and is as follows.
The function createco creates a new coroutine by allocating its stack by the call
gevec(size+6). The variable c holds a pointer to the new coroutine stack and, as can
been seen, its first six words are initialised to hold system information, as follows.
60 CHAPTER 3. THE LIBRARY
createco changeco
stack frame stack frame
P2 L2 fn sz c P1 L1 0 c
PC LP5 {
K9G 24 cowait(c)
LP3
K6 fn( ... )
SP5 c := ...
coroutine chain J -7 } REPEAT
fn sz c
colist
currco
The new coroutine
P
A coroutine with main function fn and given size is created and, if successful, it is
initialised by callco(cptr, @a). Thus, fn should expect a vector containing up to 11
elements. Once the newly created coroutine has initialised itself, it returns control to
initco by means a call of cowait. The result of initco is the newly created coroutine
pointer, or zero on failure. The second result (in result2) is the value returned by the
first call of cowait in the newly created coroutine.
As can be seen, it allocates a vector for the fields of the object, initialises its
zeroth element to point to the methods vector and calls the initialisation method that
is expected to be in element InitObj of fns. The result is a pointer to the initialised
66 CHAPTER 3. THE LIBRARY
fields vector. If it fails, it returns zero. As can be seen the initialisation method receives
a vector of up to 11 initialisation arguments.
Remember that scaled fixed point values can be output conveniently using writef
as in:
writef("%10.6d*n", 123_456789)
123.456789
in global variable randseed which can be set using setseed described below. Its
implementation is as follows:
In this example, there are four possible arguments and their values will be placed in
the first four elements of argv. The first argument has keyword FROM and must receive
a value because of the qualifier /A. The second has alternative keywords TO and AS with
qualifier /K that insists the argument is introduced by one of its keywords. The third
argument has the qualifiers /N and /P indicating that it expects a number and that it
will be prompted for if not already given, and the last argument has the qualifier /S
indicating that it is a switch that can be set by the presence of its keyword.
Table 3.4 shows the values in placed in argv and the result when the call:
rdargs("FROM/A,TO=AS/K,DATA/N/P,N/S", argv, 50)
is given various argument strings. This example illustrates that keyword synonyms can
be defined using = within the key string. Positional arguments are those not introduced
by keywords. When one is encountered, it becomes the value of the lowest numbered
unset non-switch argument.
To consolidate your understanding of rdargs, try compiling and running the pro-
gram: bcplprogs/tests/tstrdargs.b.
the vector v whose upper bound is upb and returns an integer describing the kind of
item read. Table 3.5 gives the kinds of item that can be read and corresponding item
codes.
not exist it returns zero with 101 in result2. If the released task has higher priority
and is runnable it gains control leaving the current task suspended in RUN state. This
function is also called unhold.
resumeco always leaves the global currco is set to point to the resumed coroutine.
This is done by the Cintcode instruction CHGCO invoked by changeco.
sawritef(format,a,b,c,d,e,f ,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z)
CIN:y, POS:y, NAT:y
This function is similar to writef but performs its output using sawrch.
res := sys(Sys callc, c bind, a1, a2, a3) CIN:y, POS:y, NAT:y
This bind socket a1 to remote IP address a2 and remote port a3. If there is an
error the result is -1.
res := sys(Sys callc, c tcpconnect, a1, a2, a3) CIN:y, POS:y, NAT:y
This make a TCP/IP connection through socket a1 to remote IP address a2 and
remote port a3. If there is an error the result is -1.
res := sys(Sys callc, c fd select, a1, a2, a3, a4, a5) CIN:y, POS:y, NAT:y
This inspects bit a1 in the bit vector a2. The result is 1 if the bit was set and 0
otherwise.
The number of the bits to test is in a1. The bit vector identifying read sockets of
interest is in a2, The bit vector identifying write sockets of interest is in a3, The bit
vector identifying other sockets of interest is in a4. A pointer to two words holding
the timeout in seconds and microseconds is in a5. The result is the number of sockets
that can now be read or written to, or 0 if the timeout period has elapsed before any
sockets are ready. A result of -1 indicate an error.
mantissa and leaving the decimal exponent in result2. For example, sys(Sys flt,
fl unmk, 1234.5678) might return 12345678 leaving -4 in result2. However, the
result may vary depending on the BCPL word length and the floating point represen-
tation used.
command:
SECTION "logout"
GET "libhdr"
LET start() BE abort(0)
is
000003E8 0000000E
0000000E 0000FDDF 474F4C0B 2054554F 20202020
0000DFDF 6174730B 20207472 20202020 7B1C2310
00000000 00000001 00000024 0000001C
The first two words (000003E8 0000000E) indicate the presence of a “hunk” of code
of size 14(000000E) words which then follow. The first word of the hunk (000000E)
is again its length. The next four words (0000FDDF 474F4C0B 2054554F 20202020)
contain the SECTION name "logout". These are followed by the four words 0000DFDF
6174730B 20207472 20202020 which hold the name of the function "start". The
body of start is compiled into one word (7B1C2310) which correspond to the Cintcode
instructions:
L0 Load A with 0
K3G 28 Call the function in global 28, incrementing the stack by 3
RTN Return from start – never reached
The remaining 4 words contain global initialisation data that is read backwards during
global initialisation invoked by sys(Sys globin,...). 0000001C (=28) is the highest
global variable referenced by this section. The pair 00000001 00000024 specifies that
the entry point at position 36 is the initial value of global 1, and the next entry
(00000000) marks the end of the global initialisation data.
The manifest constants t hunk, t reloc, t end, t hunk64, t reloc64, t end64,
t bhunk, and t bhunk64 are declared in libhdr for the convenience of programs that
generate or read Cintsys and Cintpos object modules. The example above shows t hunk
loading n 32-bit words encoded in hex bytes. Although the BCPL compiler used in both
Cintsys and Cintpos generates position independent code and has no need to modify
the loaded words of a hunk, other languages may need to perform relocation. This can
be done using t reloc which is followed by a 32-bit word n encoded in hex followed by a
further n words which each give the position of a word in the most recently loaded hunk
that needs to be modified by the addition of the base address of the hunk. The code
t bhunk is similar to t hunk only the data words (not the length field) are provided in
binary rather than hex characters. Such hunks are thus about half the size of character
based ones. The code t end marks the end of an object module, but end-of-file has
the same effect. Those codes containing the characters 64 provide equivalent facilities
for 64-bit versions of BCPL. Neither t reloc nor t reloc64 are currently available in
Cintsys or Cintpos.
This function sets atomically the contents of the machine memory location whose
word address is addr to val returning its previous setting. The address may point to
system work space outside the normal Cintcode memory.
when it was invoked and returns with the result code to the (C) program that called
this invocation of the interpreter. This is normally used to exit from the Cintcode
system, but can also be used to return from recursive invocations of the interpreter
(see sys(Sys interpret,regs) above). A code of zero denotes successful completion
and, if invoked at the outermost level, causes the BCPL Cintcode System to terminate.
aids, does not count instruction executions and does not implement the profiling feature.
Setting count to a negative value causes this faster interpreter to be invoked and setting
count to a positive value causes the slower interpreter to be used. Normally the CLI
command interpreter is used to make this switch, see Section 4.3.
With some debugging versions of fasterp, setting count to -2 causes it to exe-
cute just one instruction before returning with error code 10. This feature assists the
debugging of a new versions of fasterp and is particularly useful when fasterp is
implemented in assembly language.
Both interpreter cinterp and fasterp returns when a fault such as division by zero
occurs or when a call of sys(Sys_quit,...) is made. Before returning, the interpreter
save the Cintcode registers in regs. The returned result is either the second argument
of sys(Sys_quit,...) or one of the builtin return codes in the following table:
K1000 S12 1000 instruction per raster line, 12 bytes per pixel
W10B3W1345B1N 10 white, 3 black, 1345 white, 1 black, newline
W13B3W12B2N etc
...
See the CLI commands raster and rast2ps on page 148 for more information on
how to use the rastering facility. See also the command bits2ps.
res := sys(Sys sound, snd midiOutWrite1, a1, a2) CIN:y, POS:y, NAT:y
This writes a one byte MIDI message (a2) to MIDI device a1.
res := sys(Sys sound, snd midiOutWrite, a1, a2 ...) CIN:y, POS:y, NAT:y
This write a3 MIDI bytes from buffer a2 to MIDI output device a1. The result is
the number of bytes actually sent.
writef(format,a,b,c,d,e,f ,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z)
CIN:y, POS:y, NAT:y
The first argument (format) is a string that is copied character by character to
the currently selected output stream until a substitution item such as %s or %i5 is
encountered when a value (usually the next argument) is output in the specified format.
The substitution items are given in table 3.6.
When a field width (denoted by n in the table) is required, it is specified by a
single character, with 0 to 9 being represented by the corresponding digit and 10 to
35 represented by the letters A to Z. Format characters are case insensitive but field
width characters are not. A recent entension allows the field width to be specified as a
decimal integer immediately following the percent, as in %12i meaning %iB.
Some examples of the %n.md substitution item are given below.
writef("%9.2d", 1234567) writes 12345.67
writef("%9.2d", -1234567) writes -12345.67
writef("%9.0d", 1234567) writes 1234567
writef("%9d", 1234567) writes 1234567
As an example of how the %p substitution item can be used, the following code:
FOR count = 0 TO 2 DO
writef("There %p\ is\are\ %-%n thing%-%ps.*n", count)
outputs:
Item Substitution
%s Write the next argument as a string using writes.
%nt %tn Write the next argument as a left justified string in a field width of
n characters using writet.
%c Write the next argument as a character using wrch.
%# Write the next argument as a in UTF-8 or GB2312 character using
codewrch.
%nb %bn Write the next argument as a binary number in a field width of n
characters using writebin.
%no %on Write the next argument as an octal number in a field width of n
characters using writeoct.
%nx %xn Write the next argument as a hexadecimal number in a field width
of n characters using writehex.
%ni %in Write the next argument as a decimal number in a field width of n
characters using writed.
%n Write the next argument as a decimal number in its natural field
width using writen.
%nu %un Write the next argument as an unsigned decimal number in a field
width of n characters using writeu.
%n.md Write the next argument as a scaled decimal number in a field with
of n with m digits after the decimal point.
%+ Skip over the next argument.
%- Step back to the previous argument.
%% Write the character %.
%pc Plural formation. Write character c if the next argument is not 1.
%p\a\b\ Plural formation. Write text a if the next argument is 1, otherwise
write text b.
%f Take the next argument as a writef format string and call writef
recursively to process it passing it the remaining arguments. The
argument pointer is advanced by the appropriate amount.
%n.mf Write the next argument as a floating point number in a field with
of n with m digits after the decimal point. The output is generated
using writeflt.
%n.me Write the next argument as a floating point number in exponential
form in a field with of n with m digits after the decimal point. The
output is generated using writee.
%m The next argument is taken as a message number and processes
as for %f above using the message format string obtained by the
call get text(messno, str, upb) where str is a vector local to
writef to hold the message string. This provides an easy way
to generate messages in different languages. get text is a global
function typically defined by the user. The default version always
yields the message string "<mess:%-%n>".
3.3.1 Streams
BCPL uses streams as a convenient method of obtaining device independent input and
output. All the information needed to process a stream is held in a vector called a
stream control block (SCB) whose fields have already been summarized in Section 3.1.
The element buf is either zero or holds the stream’s byte buffer which must have
been allocated using getvec and must be freed using freevec when the stream is
closed. The elements pos and end hold positions within the byte buffer, file holds a
file pointer for file streams or -1 for streams connected to the console. The element
id indicates whether the stream is for input, output or both and work is private work
space for the action function rdfn, wrfn which are called, repectively, when the byte
buffer becomes empty on reading or full on output. The function endfn is called to
close the stream.
Input is read from the currently selected input stream whose SCB is held in the
global variable cis. For an input stream, pos holds the position of the next character to
be read, and end points to just past the last available character in the buffer. Characters
are read using rdch whose definition is given in figure 3.7. If a character is available in
the buffer it is returned after incrementing pos. Exceptionally, the character carriage
return (CR) is ignored since on some systems, such as Windows, lines are terminated
with carriage return and linefeed while on others, such as Linux, only linefeed is used.
If the buffer is exhausted, replenish is called to refill it, returning TRUE if one or
more character are transferred. If replenish fails it returns FALSE with the reason why
in result2. Possible reasons are: -1 indicating end of file, -2 indicating a timeout
has occurred and -3 meaning input is in polling mode and no character is currently
available. By setting the timeoutact field of the SCB to -1, a timeout is treated as
end of file.
Whenever possible, the buffer contains the previously read character. This is to
allow for a clean and simple implementation of unrdch whose purpose is to step input
back by one character position. Its definition is given in figure 3.8.
Output is sent to the currently selected output stream whose SCB is held in the
global variable cos. The SCB field pos of an output stream holds the position in the
buffer of the next character to be written, and end holds the position just past the end
of the buffer. Characters are written using the function wrch whose definition is given
in figure 3.9. The character ch is copied into the byte buffer and pos incremented. If
the buffer is full, it is emptied by calling the element wrfn. If writing fails it return
FALSE, causing wrch to abort.
Within BCPL file names slashs (/) and back slashes (\) are regarded as separators
between the components of file names. File names may start with a colon prefix con-
sisting of letters and digits followed by a colon, as in TCP:shep.cl.cam.ac.uk:9000
or G:test.b. Such prefixes allow access to special features such as URLs used in
TCP/IP communication or to other filing systems. These are often dependent on the
host operating system.
A file name starting ’/’ or ’\’ or containing a colon is treated as an abso-
lute name; all others are relative names and are interpreted relative to the cur-
rent directory. A file name consisting of a single asterisk (*) is special and rep-
resents standard input (normally the keyboard) or standard output (normally the
screen) depending on context. Within a file name, the components dot (.) and
double dot (..) represent the current and parent directories, respectively. As
an example, the file name ../bcplprogs/demos/queens.b is valid and automati-
94 CHAPTER 3. THE LIBRARY
After loading the resident system control is passed to BOOT which updates the variable
names appropriately for the system being run. It is unlikely that the user will want
change them using setroot although it might be useful to use setroot to see what
names are currently being used.
If the value of an environment variable represents a list of directories, they should be
given using Linux style slash (/) separators and the directories separated by semicolons
(rather than the Linux style colons). This allows colon prefixes such as G: to be used
in, for instance, Windows version of the system. For compatibility with older systems,
colons may be used as an alternative to semicolons when not running under Windows.
When Cintpos starts up the process is similar except the setting of rtn pathvar is
POSPATH unless explicitly changed using -cin.
When installing cintsys or cintpos for the first time it is common to fail to set
the environment variables correctly. To help repair such mistakes, use the -f option
when calling cintsys or cintpos. This will output a trace of every time any file is
looked up using an environment variable. Even more information is generated if the
-v argument is also given (or even -vv). Until the system is working correctly it is
recommended that it is started using
cintsys -f
or
cintpos -f -v
The following call creates a coroutine that initially generates a square wave with fre-
quency 440Hz and amplitude 5000 at a rate of 44100 samples per second.
FOR i = 1 TO 44100 DO
{ LET sample = callco(sqco)
...
}
sqparmv!0 := newfrequency
sqparmv!1 := newamplitude
Note that the new frequency and amplitude take effect at the start of the next complete
cycle.
Other examples of the use of initco can be found below.
98 CHAPTER 3. THE LIBRARY
This problem is attributed to R.W.Hamming. The solution given here shows how data
can flow round a network of coroutines. It is illustrated in figure 3.10 in which each
box represents a coroutine and the edges represent callco/cowait connections. The
end of a connection corresponding to callco is marked by c, and an end corresponding
to cowait is marked by w. The arrows on the connections show the direction in which
data moves. Notice that, in tee1, callco is sometimes used for input and sometimes
for output.
w w c c w w c c w w
BUF1 TEE1 BUF2 TEE2 BUF3
w w
c c c
X2 X3 X5
w w w
c c
c w c
MER1 MER2
w
c c
MAIN
The coroutine BUF1 controls a queue of integers. Non-zero values can be in-
serted into the queue using callco(BUF1,val), and values can be extracted using
callco(BUF1,0). The coroutines BUF2 and BUF3 are similar. The coroutine TEE1 is
connected to BUF1 and BUF2 and is designed so that callco(TEE1) executed in corou-
tine X2 will yield a value that TEE1 extracted from BUF1, after sending a copy to BUF2.
TEE2 similarly takes values from BUF2 passing them to BUF3 and X3. Values passing
through X2, X3 and X5 are multiplied by 2, 3 and 5, respectively. MER1 merges two
monotonically increasing streams of numbers produced by X2 and X3. The resulting
monotonic stream is then merged by MER2 with the stream produced by X5. The stream
produced by MER2 is the required Hamming sequence, each value of which is printed
by MAIN and then inserted into BUF1.
3.7. COROUTINE EXAMPLES 99
GET "libhdr"
LET buf(args) BE // Body of BUF1, BUF2 and BUF3
{ LET p, q, val = 0, 0, 0
LET v = VEC 200
{ val := cowait(val)
TEST val=0 THEN { IF p=q DO writef("Buffer empty*n")
val := v!(q MOD 201)
q := q+1
}
ELSE { IF p=q+201 DO writef("Buffer full*n")
v!(p MOD 201) := val
p := p+1
}
} REPEAT
}
SECTION "cosim"
GET "libhdr"
GLOBAL {
priq:ug // The vector holding the priority queue
priqupb // The upper bound
priqn // Number of items in the priority queue
wkqv // The vector of work queues
count // count of messages processed
nodes // The number of nodes
ptmax // The maximum processing time
stopco // The stop coroutine
cov // Vector of message coroutines
ranv // A vector used by the random number generator
rani; ranj // subscripts of ranv
simtime // Simulated time
stoptime // Time to stop the simulation
tracing
// Functions
rnd; initrnd; closernd; prq; insertevent; upheap
downheap; getevent; waitfor; prwaitq; qitem; dqitem
stopcofn; messcofn
}
AND upheap(event, i) BE
{ LET eventtime = event!0
//writef("upheap: eventtime=%n i=%n*n", eventtime, i)
{ LET p = i/2 // Parent of i
UNLESS p & eventtime < priq!p!0 DO
{ priq!i := event
RETURN
}
priq!i := priq!p // Demote the parent
i := p
} REPEAT
}
AND downheap(event, i) BE
{ LET j, min = 2*i, ? // j is left child, if present
IF j > priqn DO
{ upheap(event, i)
RETURN
}
min := priq!j!0
// Look at other child, if it exists
IF j<priqn & min>priq!(j+1)!0 DO j := j+1
// promote earlier child
priq!i := priq!j
i := j
} REPEAT
AND waitfor(ticks) BE
{ // Make an event item into the priority queue
LET eventtime, co = simtime+ticks, currco
insertevent(@eventtime) // Insert into the priority queue
cowait() // Wait for the specified number of ticks
}
3.7. COROUTINE EXAMPLES 103
AND qitem(node) BE
// The message has reached this node
// It currently not busy, mark it as busy and return to process
// the message, other append it to the end of the work queue
// for this node.
{ // Make a queue item
LET link, co = 0, currco
LET p = wkqv!node
UNLESS p DO
{ // The node was not busy
wkqv!node := -1 // Mark node as busy
IF tracing DO
writef("%i8: node %i4: node not busy*n", simtime, node)
RETURN
}
// Append item to the end of this queue
IF tracing DO
writef("%i8: node %i4: busy so appending message to end of work queue*n",
simtime, node)
TEST p=-1
THEN wkqv!node := @link // Form a unit list
ELSE { WHILE !p DO p := !p // Find the end of the wkq
!p := @link // Append to end of wkq
}
cowait() // Wait to be activated (by dqitem)
}
AND dqitem(node) BE
// A message has just been processed by this node and is ready to process
// the next, if any.
{ LET item = wkqv!node // Current item (~=0)
UNLESS item DO abort(999)
TEST item=-1
THEN wkqv!node := 0 // The node is no longer busy
ELSE { LET next = item!0
AND co = item!1
wkqv!node := next -> next, -1 // De-queue the item
callco(co) // Process the next message
}
}
104 CHAPTER 3. THE LIBRARY
writef("*nCosim entered*n*n")
writef("Network nodes: %n*n", nodes)
writef("Stop time: %n*n", stoptime)
writef("Max processing time: %n*n", ptmax)
writef("Random number seed: %n*n", seed)
newline()
UNLESS initrnd(seed) DO
{ writef("Can’t initialise the random number generator*n")
RESULTIS 0
}
stopco := 0
wkqv, priq, cov := getvec(nodes), getvec(nodes+1), getvec(nodes)
UNLESS wkqv & priq & cov DO
{ writef("Can’t allocate space for the node work queues*n")
GOTO ret
}
FOR i = 1 TO nodes DO wkqv!i, cov!i := 0, 0
priqn := 0 // Number of events in the priority queue
count := 0 // Count of message processed
simtime := 0 // Simulated time
IF tracing DO writef("%i8: Starting simulation*n", simtime)
// Create and start the stop coroutine
stopco := createco(stopcofn, 200)
IF stopco DO callco(stopco)
// Create and start the message coroutines
FOR i = 1 TO nodes DO
{ LET co = createco(messcofn, 200)
IF co DO callco(co, i)
cov!i := co
}
// Run the event loop
{ LET event = getevent() // Get the earliest event
UNLESS event BREAK
simtime := event!0 // Set the simulated time
IF simtime > stoptime BREAK
callco(event!1)
} REPEAT
IF tracing DO writef("*nSimulation stopped*n*n")
writef("Messages processed: %n*n", count)
ret:
FOR i = nodes TO 1 BY -1 IF cov!i DO deleteco(cov!i)
IF cov DO freevec(cov)
IF wkqv DO freevec(wkqv)
IF priq DO freevec(priq)
IF stopco DO deleteco(stopco)
closernd()
RESULTIS 0
fail:
writef("Unable to initialise the simulator*n")
GOTO ret
}
106 CHAPTER 3. THE LIBRARY
xsize, ysize
These hold the the number of pixels in each row and column of the canvas.
bmpmode
This is set by opengraphics to mode8bit, mode8bitalt or mode24bit.
bpp
This holds the the number of bytes (1 or 3) per pixel in canvas.
rowlen
This holds bpp*xsize, the number of bytes in canvas to represent a row.
canvassize
This holds the number of bytes (rowlen*ysize) in canvas.
canvasupb
This holds the UPB of canvas in words.
canvas
This holds the vector, allocated by getvec(canvasupb), of pixel bytes to represent
the image. Each pixel is either an 8-bit byte identifying a colour in the palette or 3
bytes giving the blue, green and red components of the colour directly.
palettev
If 8-bit pixels are being used, this holds the palette vector of 256 colours. The
colours are specified by values of the form #Xrrggbb in the least significant 24 bits of
each element of palettev. palettev is set to zero when 24 bit pixels are being used.
col white, col majenta, col blue, col cyan, col green, col yellow, col red,
col black
These variables hold either various 8 or 24 bit colour values.
3.8. THE BMP GRAPHICS LIBRARY 107
closegraphics()
This function closes the graphics library returning canvas to freestore and if
palettev was allocated it is also returned.
drawpoint(x, y)
This function places a pixel with colour specified by currcolour at position (x, y)
on the canvas.
drawpoint33(x, y)
This function places a 3x3 square of pixels with the colour specified by currcolour
centred at position (x, y) on the canvas.
drawch(ch)
This function draws a 8x12 array of pixels representing the given character. Its
colour is specified by currcolour on a white background. The bottom leftmost pixel
of the character is at (currx,curry). If ch is ’*n’, currx is set to 10 and curry is
decremented by 14, otherwise currx is incremented by 9.
drawstr(x, y,str)
This function calls drawch for each character in the given string starting at position
(x, y),
moveto(x, y)
This function sets currx and curry to x and y, respectively.
moveby(dx, dy)
This function increments currx and curry by dx and dy, respectively.
drawto(x, y)
108 CHAPTER 3. THE LIBRARY
This function draws a straight line of colour currcolour from (currx, curry) to
(x, y). It leaves (currx and curry) to x and y.
drawby(dx, dy)
This function draws a straight line of colour currcolour from (currx, curry) to
(currx+dx, curry+dy). It then increments currx and curry by dx and dy, respec-
tively.
drawrect(x, y, w, h)
This function draws the outline of the rectangle of width w and height h with the
bottom left corner at (x, y) using currcolour.
drawrndrect(x, y, w, h, radius)
This function draws the outline of the rectangle of width w and height h with its
bottom left corner at (x, y) with rounded corners of given radius. Its colour is specified
by currcolour. If radius is less than or equal to zero the corners are square, and if
radius is greater than half the shorter side length it is reduced to this value. currx and
curry are set to x1 and y1, respectively.
fillrect(x, y, w, h)
This function draws a rectangle of width w and height h with its bottom left corner
at (x, y). It is filled with the colour specified by currcolour.
fillrndrect(x, y, w, h, radius)
This function draws the rectangle of width w and height h with rounded corners of
given radius. The bottom left corner is at (x, y). It colour is specified by currcolour.
If radius is less than or equal to zero the corners are square, and if radius is greater
than half the shorter side length it is reduced to this value.
drawcircle(x, y, radius)
This function draws a circle centred at (x, y) with given radius. Its colour is specified
by currcolour.
fillcircle(x, y, radius)
This function draws a filled circle centred at (x, y) with given radius. Its colour is
specified by currcolour.
wrgraph(filename)
This function writes the image held in canvas to the given file in .bmp format. The
image is (currently) scaled to 300 DPI which corresponds to 11811 pixels per metre.
At this scale the size of an A4 page is 2490x3510 pixels.
There are two programs to illustrate how this graphics library can be used. They
are bcplprogs/tests/grtst.b and bcplprogs/tests/grpalette.b. If you are using
BCPL under Linux you can compile and run them as follows.
3.8. THE BMP GRAPHICS LIBRARY 109
cd ~/distribution/BCPL/bcplprogs/test
cintsys
c b grtst
grtst
ctrl-c
gimp grtst.bmp
ctrl-c
cintsys
c b grpalette
grpalette b8
ctrl-c
gimp palette.bmp
ctrl-c
cintsys
grpalette b24
ctrl-c
gimp palette.bmp
The image displayed by the last call of gimp is shown in figure 3.11 illustrating
some of the colours available when using 24 bit pixels.
GET "libhdr"
MANIFEST { g_sdlbase=nnn } // Only used if the default setting of 450 in
// libhdr is not suitable.
GET "sdl.h"
GET "sdl.b" // Insert the library source code
.
GET "libhdr"
MANIFEST { g_sdlbase=nnn } // Only used if the default setting of 450 in
// libhdr is not suitable.
GET "sdl.h"
There are several programs that use SDL described in Chapter 4 of bcpl4raspi.pdf
available from my home page.
The width and height of the currently seclected surface is held in currxsize and
currysize.
The vectors leftxv, leftzv, rightxv and rightzv are used by the functions such
as drawtriangle and drawtriangle3d defined in g/sdl.b that draw filled objects.
The variables miny and maxy are also used by these functions.
The vector depthv and variables miny and maxy are used by functions in g/sdl.b
and should not be touched by the user. The 3D drawing functions use depthv
to implement the hiding of pixels that are further from the eye than other pixel
at the same position on the screen. The upperbound of depthv is depthvupb
(=screenxsize*screenysize-1). The elements of depthv are scaled integers with
zfac units of depth corresponding to a distance of one pixel. The variab;e zfac is
actually a floating point number since the 3D drawing functions typiclally take floating
point pixel coordinates. If zfac is too small the the line of intersection of two nearly
parallel plane can become inaccurate. The maximum allowable scaled depth is held in
maxdepth (=-1 000 000 000).
Whenever anything is drawn it is given the colour held in the variable currclour.
The variables currx and curry give the starting position of 2D lines and characters.
They are updated after each line or characte is drawn. This allows long sequences of
2D lines or charaters to be drawn conveniently.
There is a similar mechanism for drawing 3D lines using the variables currx3d,
curry3d and currsz3d. These hold the integer x and y pixel coordindates of the next
3D line to be drawn with currsx3d being its scaled integer depth.
The variables mousex and mousey hold the current pointer position 0n the screen,
and mousebuttons is a bitpattern indicating which mouse buttons are currently
pressed.
Some or all of the variables eventtype, eventa1, eventa2, eventa3, eventa4 and
eventa5 are set by the function pollevents as described below.
sdle\_active
sdle\_keydown
sdle\_keyup
sdle\_mousemotion
sdle\_mousebuttondown
sdle\_mousebuttonup
sdle\_joyaxismotion
sdle\_joyballmotion
sdle\_joyhatmotion
sdle\_joybuttondown
sdle\_joybuttonup
sdle\_quit
sdle\_syswmeven
sdle\_videoresize
sdle\_userevent
sdle\_arrowup
sdle\_arrowdown
112 CHAPTER 3. THE LIBRARY
sdle\_arrowright
sdle\_arrowleft
sdl\_init_everything
alloc2dvecs()
this function is only used in drawtriangle. It allocates and initialises the vectors
leftxv and rightxv.
alloc3dvecs()
this function is only used in drawtriangle2d. It allocates and initialises the vectors
leftxv, lefttszv, rightxv and rightszv. .
blitsurf(srcptr, dstptr, x, y)
This copies the source surface into the specified position of the destination surface.
rc := closesdl()
This closes down the SDL library returning all allocated space to freestore.
drawby(dx, dy)
This is just calls drawto(currx+dx, curry+dy).
3.9. THE SDL GRAPHICS LIBRARY 113
drawch(ch)
Draw a 12x8 character at the position specified b y and curry and increment curry
by 9. If ch was ’*n’ set currx to 10 and decrement curry by 11.
drawcircle(x0,y0, radius)
Not yet described
drawfillcircle(x, y, radius)
Not yet described
drawfillrect(x0,y0, x1,y1)
Not yet described
drawpoint(x, y)
This draws a point at location (x,y) on the currently selected surface. It colour is
te one set by the most recent call of setcolour.
drawpoint3di(x, y, sz)
Not yet described
drawquad3d(FLT x1, FLT y1, FLT z1, FLT x2, FLT y2, FLT z2, FLT x3, FLT
y3, FLT z3, FLT x4, FLT y4, FLT z4)
Not yet described
drawrect()
Not yet described
drawrndrect()
Not yet described
drawstr()
Not yet described
114 CHAPTER 3. THE LIBRARY
drawto()
Not yet described
drawto3d()
Not yet described
drawto3d()
Not yet described
drawto3di()
Not yet described
drawtriangle()
Not yet described
drawtriangle3d(FLT x1, FLT y1, FLT z1, FLT x2, FLT y2, FLT z2, FLT x3,
FLT y3, FLT z3)
Not yet described
drawwrch(ch)
Not yet described
fillsurf(col)
Not yet described
freesurface(surfptr)
Not yet described
getevent()
Not yet described
getmousestate()
Not yet described
hidecursor()
Not yet described
rc := initsdl()
This initialises the SDL library and sets the global variables used by sdl.b. It
must be called before any other SDL operations can be performed. It returns TRUE if
successful.
colour := maprgb(r,g,b)
This return a value representing the colour specified by its r, r and r components.
It uses the colour represention chosen during the call of mkscreen.
mksurface(w, h, surfptr)
Not yet described
moveby(dx, dy)
This is just calls moveto(currx+dx, curry+dy).
moveto(x, y)
This selects position (x,y) in the currently selected surface used by subsequent
calls of drawch, drawto and drawby. Its depth coordinate is given the value zero.
sdldelay(msecs) This causes a real time delay of the specified number of millisec-
onds.
sdlmsecs()
This return the number of real time milliseconds since the current command was
entered.
rc := setcaption(title)
This resets the title of the window created by mkscreen. It returns TRUE if success-
ful.
setcolour(col)
This sets the colour to be used in subsequent drawing commands. The colour should
be one returned by a call of maprgb.
116 CHAPTER 3. THE LIBRARY
setcolourkey(col)
This sets the colour that has the special property that when an attempt is made
to to draw a pixel with this colour the pixel is left with its previous colour. This
mechanism is used, for example, when displaying the moving coloured circles by the
program bcplprogs/raspi/bucket.b.
showcursor()
This causes the cursor to be displayed.
standardize(v)
This divides the three floating point elements of the vector v by length of the vector
leaving the elements of v set to the direction cosines of the given vector.
updatescreen()
This causes the surface that is currently being drawn to be copied the display
screen.
GET "libhdr"
MANIFEST { g_glbase=nnn } // Only used if the default setting of 450 in
// libhdr is not suitable.
GET "gl.h"
GET "gl.b" // Insert the library source code
.
GET "libhdr"
MANIFEST { g_glbase=nnn } // Only used if the default setting of 450 in
// libhdr is not suitable.
GET "gl.h"
GET "libhdr"
MANIFEST { g_sndbase=nnn } // Only used if the default setting of 400 in
// libhdr is not suitable.
GET "sound.h"
GET "sound.b" // Insert the library source code
The manifest constant g sndbase specifies the position of the first global variable to
be used by the sound library.
GET "libhdr"
MANIFEST { g_extbase=nnn } // Only used if the default setting of 900 in
// libhdr is not suitable.
GET "ext.h"
GET "ext.b" // Insert the library source code
Chapter 4
The Command Language Interpreter (CLI) is a simple interactive interface between the
user and the system. It loads and executes previously compiled programs that are held
either in the current directory or one of the directories specified by the shell environ-
ment variable (typically BCPLPATH or POSPATH) whose name is in rootnode!rtn path.
These commands are described in Section 4.3 and their source code can be found in
the com directory. The command language is a combination of the features provided
by the CLI and the collection of commands that can be invoked. Under Cintpos, a
similar CLI program provides command language interpreters in several contexts such
as those created by the commands: run, newcli, tcpcli and mbxcli. Details of the
implementation of both CLIs are given at the end of this chapter from page 155.
Commands can set a return code in the global returncode with zero meaning
successful termination and other values indicating the severity of the fault. Commands
that set a non zero return code are expected to leave a reason code in result2. The
CLI copies the return code and reason code of the previous command into the CLI
variables cli returncode and cli result2, respectively. These can be inspected by
commands such as if and why and also used by the CLI to terminate a command-
command if the failure was severe enough. For details, see the command failat on
page 140 below.
123
124 CHAPTER 4. THE COMMAND LANGUAGE
no return link has been stored into the stack, this call of start must not attempt
to return in the normal way; however, its execution can still be terminated using
sys(Sys quit,0).
The global vector and stack shown in figure 4.1 are used by start and form the
running environment both during initialization and while running the debugger. The
CLI, on the other hand, is provided with a new stack and a separate global vector,
thus allowing the debugger to use its own globals freely without interfering with the
command language interpreter or running commands. The global vector of 1000 words
is allocated for the CLI and this is shared by the CLI program and its running com-
mands. The stack, on the other hand, is used exclusively by the command language
interpreter since it creates a coroutine for each command it runs.
Tally vector
blklist
stack globals
rootnode PC P G
Entry to start
Control is passed to the CLI by means of the call sys(Sys interpret,regs) which
recursively enters the intepreter from an initial Cintcode state specified by the vector
regs in which that P and G are set to point to the bases of a new stack and a new global
vector for CLI, respectively, PC is the location of the first instruction of startcli, and
count is set to -1. This call of sys(Sys interpret,regs) is embedded in the loop
shown below that occurs at the end of the body of start.
{ LET res = sys(Sys_interpret, regs) // Call the interpreter
IF res=0 DO sys(Sys_quit, 0)
debug res // Enter the debugger
} REPEAT
At the moment sys(Sys interpret,regs) is first called, only globsize, sys and
rootnode have been set in CLI’s global vector and so the body of startroot must
be coded with care to avoid calling global functions before their entry points have be
placed in the global vector. Thus, for instance, instead of calling globin to initialise
the globals defined in BLIB, SYSLIB and DLIB, the following code is used:
sys(Sys_globin, rootnode!rtn_blib)
If a fault occurs during the execution of CLI or a command that it is running,
the call of sys(Sys interpret,regs) will return with the fault code and regs will
4.2. BOOTSTRAPPING CINTPOS 125
hold the dumped Cintcode registers. A result of zero, signifying successful completion,
causes execution of Cintsys to terminate; however, if a non zero result is returned, the
debugger in entered by means of the call debug(res). Note that the Cintcode registers
are available to the debugger since regs is a global variable. When debug returns,
the REPEAT-loop ensures that the command language interpreter is re-entered. The
debugger is briefly described in the Chapter 7.
On entry to startroot, the coroutine environment is initialised by setting currco
and colist to point to the base of the current stack which is then setup as the root
coroutine. The remaining globals are the initialised and the standard input and output
streams opened before loading the CLI program by means of the following statement:
rootnode!rtn_cli := globin(loadseg("syscin/cli"))
All console input and output within BOOT and the standalone debugger is done us-
ing the standalone version of rdch and wrch, so these globals are updated appropriately.
BOOT next initialises the variables used by the standalone debugger. These include
the vectors bpt addr, bpt instr and bpt dbgvars which respectively hold breakpoint
addresses, breakpoint instructions that have been overwritten by the BRK instruction,
and the vector of the 10 standalone debugger variables V0 to V9. These three vectors
are placed in the rootnode to make them accessible both to the DEBUG task and to
dumpdebug when it is inspecting a system dump.
BOOT now creates and initialises a global vector and a stack to be used during the
further initialisation of the Cintpos system. The all elements of the global vector are
given values of the form globword(=#x8F8F0000)+n, except for the globals globsize,
sys, rootnode, currco and colist, the last two being set to zero. Every element of
the stack is set to stackword (=#xABCD1234). The register set klibregs is initialised,
giving zero to A, B and C, the stack and global vector pointers to P and G, the value
one to ST to indicate execution is in KLIB and interrupts are disabled, and the entry
point startroot in PC. This register set is then handed to a recursive call of the
interpreter. This inner call is the one than performs the rest of the initialisation and
enters the normal execution of Cintpos. In due course the interpreter will return with
a completion code which controls what BOOT should do next.
A completion code of zero signifies successfully completion and BOOT causes the
termination of cintpos. A return code of -1 is special, causing BOOT to re-enter
the interpreter immediately. Its purpose is to allow a running program to change
which interpreter is used. There are typically two interpreters: a slow one in which all
debugging aids are turned on, and a fast one in which most aids are turned off. The call
sys(Sys interpret, regs) selects the fast interpreter if the count register in regs
is -1, otherwise it selects the slow interpreter. The return code -2 allows a running
program to invoke the dumpmem mechanism to write the file DUMP.mem representing
the current state of the entire Cintcode memory. All other completion codes causes
BOOT to invoke the standalone debugger.
BOOT cunningly places a private version of the sys function in its global vector
so that, even if a breakpoint is set in the public version of sys, BOOT and in partic-
ular the standalone debugger can continue to work as normal. When BOOT invokes
the interpreter for the first time execution begins at the start of startroot which is
described in the next section.
4.2.2 startroot
This function creates the Cintpos running environment and loads all the resident system
tasks. Finally it enters the Cintpos scheduler which, in turn, gives control to the Idle
task which sends a packet to the root CLI task. After some initialisation, this issues the
first CLI prompt inviting the user to type in a command. Knowledge of the underlying
structures used by Cintpos is key to understanding how Cintpos works. They are
described in this section in the order in which startroot creates them.
startroot is entered by the recursive call of interpret from BOOT with a new
stack and a different global vector from that used by BOOT. If the interpreter sub-
4.2. BOOTSTRAPPING CINTPOS 127
may return control to the interrupted task or it may enter the scheduler if another task
deserves to gain control.
Before creating the resident tasks, startroot initialises a few more rootnode fields.
These are rtn tcblist and rtn crntask both set to zero since there are currently no
Cintpos tasks, rtn blklist set to the start of the memory block list used by getvec
and freevec, rtn clkintson set to FALSE to globally disable interrupts, rtn clwkq
set to zero representing an empty list of packets for the clock device, and rtn info set
to a cleared table of 50 elements.
The resident tasks are now created using suitable calls of createtask. Each time
createtask is called it allocates a task control block (TCB) giving it the next available
task identifier and updating the appropriate entry in tasktab to point to it. Such tasks
are initially given a state of #b1100 indicating that they are DEAD, not HELD and
have no packets in the work queue. The first task to be created is a special one called
Idle whose body is in cin/syscin/idle and although createtask will have chosen
identifier one for it, this must be replaced by zero and it entry in tasktab removed.
It is given a startup packet and an initial state of #b1101 indicating it is DEAD, not
HELD but has a packet and so can be given control by the scheduler when it is run.
Six more resident tasks are now created, all have state #b1100. They are the root
command language interpreter that initially waits for commands from the keyboard,
and interactive debugging task, the console handler providing communication between
the keyboard and tasks, the file handler providing access to disk files, the mailbox
handler that provides a mechanism that lets tasks send and receive short messages via
named mailboxes and the TCP handler providing TCP/IP communication.
Just after Cintpos starts up the status command will output the following.
Once the kernel structure and all the resident tasks have been set up, the system
can be started by entering the scheduler which is a function called srchwk defined in
sysb/klib.b. It take one argument which is a pointer to the highest priority TCB
that could possibly run. It searches through the chain of TCBs that are linked in
decreasing priority order looking at only the status field of each. This field is sufficient
to tell whether the corresponding task can run or not. It has 4 bits IWHP. The I bit is
a 1 if the task has been interrupted in which case its Cintcode registers will be packed
elsewhere in the TCB. The W bit is a 1 if the task is suspended in taskwait waiting
for a packet to arrive from another task or a device. The H bit is 1 if the task is in
HOLD state indicating that it cannot run even if it otherwise would be ready to do
so, and the P bit is a 1 if the tasks’s work queue is not empty. A task cannot be both
interrupted and waiting for a packet and the setting of both the I and W bits have
a special meaning, namely that the task is in DEAD state having no runtime stack
or global vector. There are thus 16 posible states a task can have of which only six
indicate that it is runnable, they are as follows.
4.3. COMMANDS 129
#b0000
This task is runnable but has no packet on its work queue. It is either the current
task or it gave up control voluntarily by for instance sending a packet to a higher
priority task. When it next gains control it will immediately return from the
function that caused it to give up control.
#b0001
This is just like the case above except there is a packet on its work queue.
#b0101
This indicates that the task is waiting for a packet and that one has arrived.
It is thus runnable and when given control the first packet on its work queue
will be dequeued and returned as the result of the taskwait call that caused its
suspension.
#b1000
This indicates the task is in interrupted state with an empty work queue. It is
thus runnable and when given control it will resume execution using the Cintcode
register values saved in the TCB when it was interrupted.
#b1001
This indicates the task is in interrupted state with a non empty work queue.
It is thus runnable and when given control it will resume execution using the
Cintcode register values save in the TCB when it was interrupted.
#b1101
This is a task in DEAD state (with no stack or global vector) but it now has a
startup packet on its work queue. It is thus runnable and when given control will
be initialised with a new stack and global vector and its main function start
in global variable 1 will be called with the startup packet as its first argument.
This packet will have been dequeued.
4.3 Commands
This section describes the Command Language Interpreter commands whose source
code can be found in either cintcode/com or cintpos/com. The rdargs argument
format string for each command is given.
of minutes. The offset is converted into a signed integer representing the number of
minutes to be added to the time of day as supplied by the system. If adjclock is not
given an argument, it just outputs the current offset.
After three and a half minute a message such as the following will appear.
bbcbcpl FROM/A,TO/K,REPORT/K,NONAMES/S,MAX/S,SECTLEN/S
CIN:y, POS:y, NAT:y
This invokes a reconstruction of the BCPL compiler for the BBC Microcomputer
marketed by my brother’s company RCP in the early 1970s. The FROM argument
specifies the BCPL source file. The TO argument specifies the desination file for the
compiled 16-bit Cintcode. The REPORT argument specifies a file to hold error messages.
The NONAMES argument causes the compiler not to embed function names in the com-
piled code. The MAX argument has no effect in this version of the compiler and SECTLEN
controls whether the length of a section of compiled code includes it length in its first
word.
This reconstruction was created to possibly help reconstruct the BBC Domes-
day Project that ran on the BBC Microcomputer using a Philips 12” laser disc.
Related commands are mapcode and prbbcocode. For more information, see
bcplprogs/bbcmicro/ in the standard BCPL distribution.
bbcbcpl32 FROM/A,TO/K,REPORT/K,NONAMES/S,MAX/S,SECTLEN/S
CIN:y, POS:y, NAT:y
This command is similar to bbcbcpl but generates 32-bit Cintcode suitable for the
current BCPL Cintcode system. The BBC BCPL programs may need minor changes
needed because the BCPL word length has increased from 16 to 32 bits.
This command is now obsolete since using bbc2bcpl is a more satisfactory way to
run BBC BCPL on the modern BCPL Cintcode system.
bbcbcpl32 will soon be deleted.
This command convert the BBC BCPL program given by FROM to a destination file
given by TO. The result can be compiled and run under the modern BCPL Cintcode
system. HARD causes the program to abort on every detected error, EQCASES toggles
whether the case of letters in reserved word and identifiers are ignored. By default the
case of letters is ignored. Several minor replacements are made, such as LOGAND, LOGOR,
LSHIFT, EQ and LE are replaved by &, |, <<. = and <=. Missinf close sections brackets are
automatically inserted with the correct indentation when multiple sections are closed
by a tagged closing bracket. All dots in identediters are replaced by underscores. The
cases of letters used in identifiers is modified to agree with the way each identifier was
written when first encountered. To do this the program reads GET files but leave the
GET directives unchanged.
bcpl FROM/A,TO/K,VER/K,SIZE/K/N,TREE/S,NONAMES/S,
D1/S,D2/S,OENDER/S,EQCASES/S,BIN/S,XREF/S,GDEFS/S,HDRS/K,
GB2312/S,UTF8/S,SAVESIZE/K/N,HARD/S,T32/S,T64/S,
OPT/K,TREE2/S,NOSELST/S CIN:y, POS:y, NAT:y
This invokes the BCPL compiler. The FROM argument specified the name of the file
to be compiled. If the TO argument is given, the compiler generates code to the specified
file. Without the TO argument the compiler will output the OCODE intermediate
form to the file ocode as a compiler debugging aid. This file can be converted to a
more readable form using the procode command, described below. The VER argument
redirects the standard output to a named file. The SIZE argument specified the size
of the compiler’s work space. The default is 100,000 words. The NONAMES switch
causes the compiler not include section and function names in the compiled code. The
switches D1 and D2 control compiler debugging output. D1 causes a readable form of the
compiled Cintcode to be output. D2 causes a detailed trace of the internal working of
the codegenerator to be output. D1 and D2 together causes a slightly more detailed trace
of the internal working of the codegenerator. OENDER causes code to be generated for a
machine with the opposite endianess of the machine on which the compiler is running.
EQCASES causes all identifiers to be converted to uppercase during compilation. This
allows very old BCPL programs to be compiled. BIN causes the target Cintcode to be
in binary rather than the ASCII encoded hexadecimal normally used. The XREF option
causes a line to be output by the compiler for each non local identifier occurring in the
program. A typical such line is as follows:
It shows that the variable all was declared as global variable 201 and its was loaded
in the compilation of statements on line 9 of the program queens.b and the context of
its use was: all&~(ld|col|rd). These lines can be filtered and sorted to form a cross
reference listing of a program. See, for instance, the file BCPL/cintcode/xrefdata
or Cintpos/cintpos/xrefdata. If both VER and XREF are specified the xref data is
appended to the verification stream. This allows the xref data generated by several
separate compilations to be concatenated. The resulting file can be filtered and sorted
by the sortxref command. Typical usage is as follows:
132 CHAPTER 4. THE COMMAND LANGUAGE
delete -f rawxref
c compall "ver rawxref xref"
sort rawxref to xrefdata
delete rawxref
The GDEFS switch is a debugging aid to output the global numbers of any global
function defined in the program. For example:
The UTF8 and GB2312 options specify the default encoding for extended characters
in string and character constants. This default can be overridden in individual constants
using the *#u and *#g escape sequences, as described on page 18.
The SAVESIZE option allows the user to specify the number of words in the argument
stack used to hold function return information. The default value is three making room
for the old P pointer, the return address and the entry point of the current function.
When compiling into native code using the Sial mechanism, the save space size may
be different, since, for instance, some or all of this information may be stored in the
hardware (SP) stack.
The HARD options causes both syntax and translation phase errors to call
abort(100). This is useful in commands such as: c compall hard allowing each
error in a long sequence of compilations to be inspected separately.
The arguments T32 and T64 specify whether the target architecture is for 32 or 64
bit BCPL.
The argument OPT gives a list of conditional compilation option names consisting
of letters, digits, underline and dot, separated by plus signs or any other characters
not allowed in option names. These options are declared at the start of compilation of
every BCPL section.
The debugging options TREE and TREE2 cause the parse tree to be output before
and after the the conversions caused by the FLT feature.
The option NOSELST causes the compiler to avoid using the Ocode instructions
SELLD and SELST when compiling the OF operator. Less efficient code is compiler using
shifts and logical instructions. This option causes the bcpl2sial command not to use the
4.3. COMMANDS 133
new SIAL function codes selld, selst and xselst, enabling older Sial codegenerators
to continue to work.
bcpl2sial FROM/A,TO/K,VER/K,SIZE/K/N,TREE/S,NONAMES/S,
D1/S,D2/S,OENDER/S,EQCASES/S,BIN/S,XREF/S,GDEFS/S,HDRS/K,
GB2312/S,UTF8/S,SAVESIZE/K/N,HARD/S,T32/S,T64/S,
OPT/K,TREE2/S,NOSELST/S CIN:y, POS:y, NAT:y
This command compiles a BCPL program into the internal assembly language Sial
which is designed as a low level intermediate target code for BCPL and is described in
Section 10.1. The command sial-sasm, described below, can be used to convert Sial
into a human readable form and various commands, such as sial-386, sial-alpha and
sial-arm will convert Sial to assembly language for corresponding architectures. The
bcpl2sial command uses the same front end as bcpl and so takes the same command
arguments as the bcpl command.
V Local variable
P Function or Routine
L Label
G Global
M Manifest
S Static
F FOR loop variable
The TO argument can be used to redirect the output to a file, and the PAT argument
supplies a pattern to restrict which names are to be cross referenced. Within a pattern
an asterisk will match any sequence of characters, so the pattern a*b* will match
identifiers such as ab, axxbor axbyy. Upper and lower case letters are equated. This
command has largely been superceded by the xref option in the bcpl command and
the related sortxref command.
ABCDEFGH
12345678
bmake TARGET,FROM/K,TO/K,-m/S,-l/S,-p/S,-r/S,-s/S,-c/S,-d/S
CIN:y, POS:y, NAT:n
This command provides an approximation the make command found in other sys-
tems. It uses a makefile (normally bmakefile) to generate a CLI sequence of commands
to bring a specified target up to date. The makefile is expanded using the BGPM macro-
generator and parsed to form a set of pattern rules and explicit rules. Each rule has a
target, an optional set of items on which the target depends and a possibly empty CLI
command sequence to execute if the target need to be brought up to date.
Pattern rules generate explicit rules when needed. They contain parameters of the
form <tag>. Within a pattern all tags must be the same and must be declared in the
target of the rule.
The optional first argument (TARGET) is normally a file name and specifies the target
to make. If no target is specified, the target of the first rule is used. The optional FROM
argument specified the makefile name. The default makefile is bmakefile. The optional
TO argument specifies where the output is to be sent.
The -m argument causes bmake to output the makefile file after macrogeneration.
The -l argument outputs the makefile as a sequence of lexical tokens. The -p argument
outputs the set of rule patterns. The arguments -r and -s output the explicit rules
before and after the application of the rule patterns, respectively. The -c argument
outputs the sequence of commands required to bring the target up to date. The -d
argument generates a debugging trace of the execution of bmake.
The BGPM macrogenerator is described elsewhere, but the version use in bmake
uses the following special characters:
136 CHAPTER 4. THE COMMAND LANGUAGE
Such rules are only used when there is no explicit rule for a given target. When a rule
pattern is applied all occurrences of its parameter are replaced by the text that allowed
the target item to match the required target. So if cin/echo must be brought up to
date and has no explicit rule, the above pattern will automatically add the following
explicit rule to the set:
A target is out of date if it does not exist or if any of the items it depends on are out
of date or have a modify dates later than that of the target. A target is brought up to
date by, first, bringing the items it depends on up to date and then executing the CLI
command sequence given by the body.
Items may consist of any sequence of characters not including %, [, !, ], {, }, =, or
white space, and < and > may only appear in parameters.
In normal use, bmake generates a command-command file to bring the target up
to date and then returns to the CLI to cause this file to be executed. The -c option
allows the command-command file to be inspected without execution.
4.3. COMMANDS 137
qpkt(taskwait()) REPEAT
which repeatedly suspends the task until a packet is received then immediately returns
it to the sender. Packets are normally sent to the bounce task using the send command,
described below.
All directives must occur at the start of the command file. The .KEY directive
specifies a format string of the form used by rdargs (see page 69) that describes
what arguments can follow the command file name. The .DEFAULT directive specifies
the default value that a specified key should have if the corresponding argument was
omitted. The remaining directives allow the special characters to be changed.
The command sequence occurs after all the directives and may contain items of the
form <key$value> or <key> where key is one of the keys in the format string and value
is a default value. Such items are textually replaced by its corresponding argument or
a default value. If $value is present, this overrides (for this item only) any default that
might have been given by a .DEFAULT directive.
138 CHAPTER 4. THE COMMAND LANGUAGE
0.000> cobounce
Calling the bounce coroutine 10000000 times
About 7812500 coroutine changes per second
done
2.560>
This shows that transferring control between coroutines is about 12 time faster than
transferring control between Cintpos tasks as demonstrated by the send command
described below.
Monday 23-Apr-2010
## ## ######## ## ## ######
## ## ######## ## ## ########
## ## ## ## ## ## ##
######## ###### ## ## ## ##
## ## ## ## ## ## ##
## ## ## ## ## ## ##
## ## ######## ######## ######## ########
## ## ######## ######## ######## ######
hexdump FROM/A,N/N,P/N,RL/K/N,RLB/K/N,TO/K,
X1/S,X2/S,X4/S,LIT/S,BIG/S CIN:y, POS:y, NAT:y
This program dumps a file specified by FROM in a combination of hex and character
forms. If either RL or RLB is given the file is treated as a sequence of records. RL gives
the record length in BCPL words and RLB gives it in bytes. The P and N arguments
give the number of the first record to dump and N specifies how many to dump. If
neither RL nor RLB is given P gives the number of the first byte to dump and N gives the
number of bytes to dump. X1 causes the file to be dumped as a sequence of individual
bytes. X2 causes the file to be dumped as a sequence of 16-bit words, and X4 causes
the file to be dumped as a sequence of 32-bit words. LIT or BIG specify whether to
use little-ender or big-ender ordering when dumping words. It neither are specified the
enderness of the current computer is used. If the file bc is as follows:
#!/home/mr/distribution/BCPL/cintcode/cintsys -s
.k file/a,arg
echo "bcpl com/<file>.b to cin/<file> hdrs BCPLHDRS <arg>"
bcpl com/<file>.b to cin/<file> hdrs BCPLHDRS <arg>
distinct from those already in the BLIB chain. The CANCEL argument specifies the name
of a section to remove from the BLIB chain. The LIST switch argument causes a list
of the section names in the BLIB chain to be output. The argument -g causes a list of
all the global functions defined in the BLIB chain to be output including the names of
the sections they are in. The TO argument specifies the name of a file where the output
is to be sent. It is often useful to sort this file using sortlines. Normally the library
command is only used during the initialisation of special purpose versions of Cintsys
or Cintpos, or when one wishes to see which functions are defined in BLIB.
where prompt is the prompt format string, cpumsecs is the time in milliseconds used
by the previous command, taskno is the current task number under Cintpos and zero
otherwise. The arguments hours, mins, secs and msecs represent the current time of
day. The default prompt format under Cintpos is: "%+%n> " and under the other
systems is: "%5.3d> ". An example of how it might be used is as follows.
0>
0> prompt "%+%+%z2:%z2:%z2 %-%-%-%-%-%5.3d> "
15:11:52 0.000>
15:11:55 0.000> bench100
bench mark starting, Count=1000000
starting
finished
qpkt count = 2326410 holdcount = 930563
these results are correct
end of run
15:12:14 10.690>
This shows that bench100 finished execution 14 seconds after 3:12pm after running for
10.690 seconds.
This command differs from fail since it terminates the execution of a command-
command while fail allows a command-command to continue run.
rast2ps FROM,SCALE/N,TO/K,ML/N,MH/N,MG/N,FL/N,FH/N,FG/N,
DPI/K/N,INCL/K,A5/S,A4/S,A3/S,A2/S,A1/S,A0/S CIN:y, POS:y, NAT:y
This command converts a raster data file (written using the raster command
described below) into a postscript file suitable for printing.
The FROM parameter specifies the name of the raster data file. RASTER is the default.
SCALE specifies a magnification as a percentage. The default is 80. The TO parameter
specifies the name of the postscript file to be generated. RASTER.ps is the default.
The parameters ML and MH specify the low and high limits of the address space to be
processed. MG specifies the separation of the grid line on the memory axis. The default
values of MH and FH are given by the FROM file. The default values of ML and FL are
both zero. Unless MG and FG are given, suitable values are chosen automatically. The
units are in bytes. The parameters FL and FH specify the low and high limits of the
instruction count axis to be displayed. FG specifies the separation of the grid line on the
memory axis. DPI specified the approximate number of dots per inch used by the output
device. The default is 300. An specified the output page size. The default is A4. The
INCL parameter specifies the name of a file to be copied into the postscript file. This
file allows annotations to be made in the picture. The file cintcode/origbcplps.h
was used to annotate the memory time graph shown in Figure 4.2. This file contains
lines such as:
F2 setfont
(SYN) 1.1 35 2 PDL
(TRN) 8.1 30 1.7 PUL
(CG) 15.3 36 2.1 PUR
(GET Stream) 0.45 270 1.7 PUL
...
(OCODE Buffer) 13.9 245 2 PDR
% 8.5 150 MVT (HELLO WORLD) SC
F3 setfont
(Self Compilation of the Cintcode BCPL Compiler) TITLE
The postscript macros PDL, PUL, PUR and PDR draw arrows with specified labels, byte
address, instruction count and arrow lengths. The arrow directions are respectively:
down left, up left, up right and down right. The macro MVT moves to the specified
position in the graph and SC draws a string centered at that position. The TITLE
macro draws the graph title and F2 and F3 are fonts suitable for the labels and title.
The resulting postscript file can, of course, be further editied by hand.
This command converts a raster data file (written using the raster command
described below) into .wav sound file based on the pattern of memory accesses during
the CLI command following the call of raster.
The FROM parameter specifies the name of the raster data file. RASTER is the default.
The TO parameter specifies the name of the .wav file to be generated. RASTER.wav is
148 CHAPTER 4. THE COMMAND LANGUAGE
the default. By default, the program only generated notes that are equal temperament
semitone (12 per octave), but the n argument allows the user to specify a different
number of notes per octave, susch as 24 or 41. The duration of the generated sound file
can be specified using the s or secs argument. The default .wav sample rate is 44100
per second, but 22050 or 11025 can be specified using the r argument. By default notes
are numbered upwards from 0 to 60 with 12 notes per octave the lowest note id C two
octaves below middle C. The d option is a debugging aid that causes all notes other
that a specified one to be silent. This allows the algorithm to choose when to sound a
note to be tested including how it volumes envelope changes. The t argument is not
yet implemented but willin due course generate trace output during the execution of
rast2wav. This command was written the sound generated by EDSAC 2’s loudspreaker
in the 1960s was a remarkably useful debugging aid.
As a demonstration, origbcpl.wav or origbcpl.mp3 is the sound of the early
version of the BCPL compiler compiling itself, and the following command sequence
from a bash prompt will generate an approximation to Bach’s Invention no 10, bwv
784. These demonstrations are not yet ready.
rastsys
c bc bwv784
raster
bwv784
rast2wav
ctrl-c
audacity RASTER.wav
This command controls the generation of raster data but only works when the
BCPL Cintcode system is running under the rastering interpreter rasterp. The im-
plementation uses sys(Sys setraster,...) calls that are described on page 85. If
raster is called without the BITS options it activates the rastering mechanism for the
duration of the next CLI command. Without the BITS option the default TO file is
RASTER. The format of this file is outlined on page 85.
The COUNT argument specifies the number of Cintcode instructions to obey per
raster line. The default is 1000. The SCALE argument gives the number of byte addresses
per unit on the memory axis. The default being 8.
The raster data file can be processed and converted to Postscript using the rast2ps
command described above. Typical use of the raster command is following script,
starting from a linux bash prompt:
rastsys
raster
origbcpl com/origbcpl.b to junk
rast2ps incl origbcplps.h
ctrl-c
ps2pdf RASTER.ps
okular RASTER.pdf
This will create a .pdf file for an early version of the BCPL compiler compiling itself,
4.3. COMMANDS 149
similar to that shown in Figure 4.2. For a more detailed view of the parse tree while
SYN is being compiled, try:
rastsys
raster
origbcpl com/origbcpl.b to junk
rast2ps incl origbcplps.h ml 350000 mh 500000 fh 6000000
ctrl-c
ps2pdf RASTER.ps
okular RASTER.pdf
500K
OCODE Buffer
GET Stream
450K
400K
TRN Parse Tree
SYN Parse Tree CG Parse Tree Compiled Code Buffer
350K
Declaration Vector
300K
250K
Input Stream Code Output Stream
200K
150K
Stack CG
SYN TRN
100K
50K
0K
0.00M 2.00M 4.00M 6.00M 8.00M 10.00M 12.00M 14.00M 16.00M 18.00M 20.00M 22.00M
If raster is called with the BITS option, the next CLI command will generate a bit
stream file corresponding to the fifth bit of every Cintcode byte address accessed. The
default TO file name is RASTER.bits. This file contains one byte for every 8 memory
references so can become very large. It can be converted to a .wav sound file using the
bits2wav command.
the tally vector and causing tallying to be enabled for the next command to be executed.
Subsequent commands are not tallied, making it possible to process the tally vector
while it is in a static state. Typical usage of the stats command is illustrated below:
14:12:36.069
This indicates, for instance, that there are currently 7 blocks of requested size 23
allocated.
to
ABCDEFGHIJKLMNOPQRSTUVWXYZ
1234567890
#####filename#
followed by the encoded file in which all characters with ASCII codes in the range 33
to 126 except for ’#’, ’=’ and ’.’ are copied, spaces are replaced by dots (’.’) and
all other characters (including ’#’ ’=’ and ’.’) are encoded by #hh where hh is the
ASCII code in hex. The encoded files are broken into lines of about 50 characters. The
last file to be encoded is terminated by ######+#.
Such xencode’d files can be decoded by the xdecode command.
As can be seen cli init must either return zero or a function that can be applied
156 CHAPTER 4. THE COMMAND LANGUAGE
writef(cli_prompt,
cpumsecs, // msecs used by last command
taskid, // The task number, if running under Cintpos
hours, mins, secs, msecs) // The time of day
where hours, mins and secs correspond to the current time of day. On single threaded
BCPL systems taskid is set to 1.
When cintsys or cintpos is started a stream is opened to receive input from standard
input which is normally the keyboard and a second stream is opened to allow output
to standard output which is normally the screen. This combination of keyboard and
screen is called the console. The treatment of console streams depends on whether
cintsys or cintpos is being used.
159
160 CHAPTER 5. CONSOLE INPUT AND OUTPUT
is written, or when the buffer becomes full, or a call of deplete is made, the contents
of the buffer is transmitted to the screen by calls of sawrch.
Sequence Purpose
@A Set flag 1 in the currently selected task
@B Set flag 2 in the currently selected task
@C Set flag 3 in the currently selected task
@D Set flag 4 in the currently selected task
@E Send the current incomplete line to the currently selected
task
@F Throw away the current incomplete line and all outstanding
completed lines
@H Hold the currently selected task
@L Throw away the current incomplete line
@Sdd Set the currently selected task to task dd and allow output
from any task
@Tdd Set the currently selected task to task dd and only allow
output from task dd
@U Unhold the currently selected task
@Xhh Input the character with hex code hh
@Y Toggle message tagging. When tagging is enabled every line
of output identifies the originating task
@Z Toggle echo mode. When echoing is off subsequent charac-
ters are not echoed to the screen. This is useful for typing
passwords.
@ddd Input the character with octal code ddd
@@ Input @
5.2.1 Devices
The input and output device intentifiers may be inspected and changed by the following
call:
old_in_devid := sendpkt(notinuse, console_task, Action_devices,
?, ?,
new_in_devid,
new_out_devid)
old_out_devid := result2
The device identifiers are only changed if the new identifiers are non zero. This call
is used, for instance, by the record command to change replace the screen output
device with a task that forwards each character to the screen while recording timing
information. For details, see the programs com/record.b and com/recordtask.b
suspended and client tasks have direct access to the keyboard input device on a first
come first served basis by the call:
Sending an exclusiveinput request with argument FALSE returns the console handler
to its normal line editing mode and causes all outstanding exclusiverdch requests to
return end-of-file characters (-1) to their client tasks.
Cintpos/posprogs/test/inputtst.b
Cintpos/posprogs/test/sardchtst.b
Cintpos/posprogs/test/devrdchtst.b
Cintpos/posprogs/test/xintst.b
Chapter 6
Cintpos Devices
Cintpos allows asynchronous communication with peripheral devices using the qpkt
and taskwait functions. If the pkt id field of packet given to qpkt is negative, the
packet is sent to the identified device. It is returned when the device has completed
the requested operation. Most devices have device control blocks (DCBs) that contain
device related data. There is a device table pointed to by rootnode!rtn_devtab
whose upper bound is held in its zeroth element. The nth element of the device table
is zero if the device does not exist, otherwise it points to the DCB of device -n. Most
devices are implemented using threads of the host operating system, but some devices
such as the clock and screen are special and use a polling mechanism implemented
entirely within the interpreter thread. The extra overhead for this is small since the
interpreter only performs the polling operation about once every 10000 or so Cintcode
instructions. This figure is typically adjusted to cause polling to take place about once
per millisecond. When Cintpos has no work to do it should enter the Idle task and
stop executing Cintcode instructions so that other programs can run. For the polling
mechanism to work, such suspensions must be short. This is normally implemented
using the waitirq sys function with a short timeout. Each time waitirq returns, a
counter in the intepreter is set to zero to cause the polling mechanism to be activated.
The resident Cintpos devices are described below.
163
164 CHAPTER 6. CINTPOS DEVICES
Tcp socket
This attempts to create a port for a two way byte stream using the IPv4 protocol.
If the result is -1 there was an error, otherwise it returns the number of the new socket.
Tcp connect arg1: sock arg2: ipaddr arg3: port arg4: timeout
This attempts to establish a connection to a remote host via the given socket within
the given timeout. If timeout is greater than zero it specifies a timeout time in milli-
seconds, if it is zero there is no timeout and if it is -1 polling will be used but this
is not yet implemented. The result is zero if a connection was established, otherwise
it is negative and the second result indicates why the connection was not established.
A value greater than zero indicates an error, the value -1 the connection was closed
by the remote host, -2 indicates that the connection was not established within the
timeout period, and -3 indicates that when polling the connection has not yet been
established.
Tcp recv arg1: sock arg2: buf arg3: len arg4: timeout
This attempts to read up to len bytes into the given buffer from the specified socket
within a specified timeout period. If timeout is greater than zero it is the timeout
period in milli-seconds, if it is zero there is no timeout and if it is negative the packet
is returned immediately with as many characters as are currently available. A negative
result indicates failure with a reason given in the second result, otherwise it is the
number of bytes actually read.
Tcp send arg1: sock arg2: buf arg3: len arg4: timeout
This attempts to send len bytes from the given buffer via the specified socket within
166 CHAPTER 6. CINTPOS DEVICES
a specified timeout period. If timeout is greater than zero it is the timeout period in
milli-seconds, if it is zero there is no timeout and if it is negative the packet is returned
immediately having written as many bytes as are currently possible. A negative result
indicates failure with a reason given in the second result, otherwise it is the number of
bytes actually sent.
The Debugger
Both Cintsys and Cintpos have interactive debuggers but these are slightly different
and so will be described separately.
560> abort
!! ABORT 99: User requested
*
The asterisk (*) is the debugger’s prompt character. A brief description of the available
debug commands can be display using the query (?) command.
167
168 CHAPTER 7. THE DEBUGGER
* ?
? Print list of debug commands
Gn Pn Rn Vn Variables
G P R V Pointers
n #b101 #o377 #x7FF ’c Constants
*e /e %e +e -e |e &e ^e Dyadic operators
!e Subscription
< > Shift left/right one place
$b $c $d $e $f $o $s $u $x Set the print style
SGn SPn SRn SVn Store in variable
= Print current value
TRn Trace the next n instructions
Tn Print n consecutive locations
I Print current instruction
N Print next instruction
Q Quit
B 0Bn eBn List, Unset or Set breakpoints
C Continue execution
X Equivalent to G4B9C
Z Equivalent to P1B9C
\ Execute one instruction
, Move down one stack frame
. ; [ ] Move to current/parent/first/next coroutine
*
The debugger has a current value that can be loaded, modified and displayed. For
example:
Four areas of memory, namely: the global vector, the current stack frame, the Cint-
code register, and 10 scratch variables are easily accessed using the letters G, P, R, V,
respectively.
Notice that values that appear to be entry points display the first 7 characters of
the function’s name. Other display styles can be specified by the commands $C, $D, $F,
$B, $O, $S, $U or $X. These respectively display values as characters, decimal number,
in function style (the default), binary, octal, string, unsigned decimal and hexadecimal.
It is possible to display Cintcode instructions using the commands I and N. For
example:
A breakpoint can be set at the first instruction of clihook and debugged program
re-entered by the following:
The X command could have been used since it is a shorhand for G4B9C. The function
clihook is defined in BLIB and is called whenever a command is invoked. For example:
Notice that the values of the Cintcode registers A and B are displayed, followed by the
program counter PC and the Cintcode instruction at that point. Single step execution
is possible, for example:
At this point the first instruction of rdargs is about to be executed. Its return address
is in P1, so a breakpoint can be set to catch the return, as follows:
* p1b8
* c
!! BPT 8: 24243
A= createc B= 1 24243: JNE0 24254
*
170 CHAPTER 7. THE DEBUGGER
The next three calls of sys will be to write the characters ABC. The following example
steps through these and displays the state of the runtime stack just before the third
call, before leaving the debugger.
* c
!! BPT 1: sys
A= 11 B= 65 21188: SYS
* c
A
!! BPT 1: sys
A= 11 B= 66 21188: SYS
* c
B
!! BPT 1: sys
A= 11 B= 67 21188: SYS
* . 42844: Active coroutine clihook Size 20000 Hwm 127
43284: sys 11 67 312 43228
* , 43268: cnslwrf 37772
* , 43248: wrch 67 32
* , 43228: writes 42915 67
* , 42888: start 42904 42912 0 4407873
* , 42872: clihook 0
* , Base of stack
* 0b1c Clear breakpoint 1 and resume
C
210>
Command Effect
. Select current coroutine
, Display next stack frame
; Select parent coroutine
[ Select first coroutine
] Select next coroutine
The main additions as Sn to select a task, S. to select the current task and H
to hold or unhold the currently selected task. Since interrupts (particularly from the
clock device) interfere with single stepping of Cintcode instructions, the K command is
provided to turn clock interrupts on and off. The address of the task control block of
the currently selected task is given by W. Thus the first locations of the control block
can be printed by the command Wt10.
The debugger prompt contains a letter indicating whether the next instruction is
to be executed in user mode (a), in kernel mode (k) or within the interrupt service
routine (i). It also contains a number indicating which user task was running.
172 CHAPTER 7. THE DEBUGGER
rastsys
c b testrast -- Compile testraster.b.
raster -- Cause the next command to
-- generate raster data.
testraster -- Actually generate the data.
This creates the raster data file RASTER representing the accesses of memory locations
during the execution of the program testraster.b whose source is:
GET "libhdr"
As can be seen this is a simple test program that that reads every Cintcode memory
word from 0 to 250000. Under 32-bit BCPL these correspond to byte addresses in the
range 0 to 1000000. The resulting file RASTER starts as follows:
The first line specifies the rastering parameters. F1750051 states that the program exe-
cuted 1750051 Cintcode instructions. M1000000 specifies that the highest byte address
referenced was 1000000. K1000 indicates that 1000 Cintcode instructions were executed
7.3. DEBUGGING TECHNIQUES 173
per raster line and S8 says that one unit in the raster lines correspond to 8 address
bytes. What then follows are raster lines with each indentifying which addresses have
been referenced by the previous 1000 instructions. They use run length encoding with
Wn indicating that none the next n units of address space have been referenced and Bn
states that all the next n have been referenced. Each raster line is terminated by an N.
This file and others, some hand written, are used as test data for rast2wav.
The output generated by the program is a .wav file and so it is necessary to fully
understand the format of such a file. The structure is quite simple with a small header
block that describes such things as whether mono or stereo is being used and what the
sample rate is. This block is followed by 16-bit samples. Luckily it is easy to check that
the .wav file structure is correct using the freely available audacity program. This
allows the user inspect, edit and play .wav files.
Probably the most important advice on debugging is to spend sufficient time proof
reading the source code with great care. This is likely to save time in the long run. It
is, of course, essential to thoroughly understand the meaning of every construct in the
code. Misunderstanding the meaning of a statement can lead to bugs that are hard
to find. Luckily BCPL is simple and is easy to learn. Additionally, there are compile
options that help the user to check the meaning of any construct. The precedence of
expression operators such as +, -, * and / are fairly intuitive, and can be checked by
console sessions such as the following.
0.000>
0.000> type t24.b
LET f(x,y,z) = x * y / z
0.012> c b t24 tree
bcpl t24.b to t24 tree
This shows that x*y is computed before dividing by z. Using integer arithmetic the
result would often be different if the division was done first. This difference is significant
in the meaning of q := memvupb * (n+1) / (C7-C2+1) taken from rast2wav.b. The
D1 option is also sometimes helpful, as in:
00.000> c b t24 d1
bcpl t24.b to t24 d1
Note that the three arguments of f are held in positions 3, 4 and 5 relative to the P
pointer.
The precedence of non arithmetic operators are not so intuitive and, indeed,
tend to be different in different languages. A typical BCPL error is the belief that
IF a&7 = b&7 DO means IF (a&7)=(b&7) DO. BCPL uses operators such as & and |
for both Boolean and bit pattern operations and gives them the precedence normally
given to Boolean operators.
The transformations performed by the FLT feature and not always understood but
can be checked using the TREE2 compiler option that outputs the parse tree after these
transformations have been done by the translation phase. As an example study the
following console session.
This shows that the so called simultaneneous assignment is, in fact, a sequence of three
assignments with the second one promoted to floating point. It shows that FLOAT and
FIX are monadic operators more binding that multiplication and division. It also shows
that FLOAT x+y*2 is transformed to FLOAT x#+y#*2.0.
Another useful debugging aid is BCPL’s cross referencing facility. A cross refer-
ence file xrast2wav can be created by the command make xrast2wav. This uses the
following commands in Makefile.
This shows that fcount was declared as global variable 226 on line 97 of rast2wav.b.
This variable is used to hold the number of Cintcode instructions obeyed to reach
the current raster line. On line 643 it is used to compute the time in seconds as a
floating point number corressponding to the time of the current raster line. The actual
statement in rast2wav.b is:
The last three lines show that the function filter was declared to be global 209
and was defined on line 923 and used just once on line 725. We can see that the
arguments in the call matches the parameters in its definition. Careful reading of the
cross reference listing can sometime find errors in the program. This is worth doing
occasionally as the program is being developed.
Another vital tool to assist debugging is the interactive debugger. This is entered
automatically when a fault is detected but can also be entered explicitly by the user.
At an early stage of debugging the following sequence of commands are useful.
0.000> abort
0.011> rast2wav
!! BPT 9: clihook
A= 0 B= 0 25100: K4G 1
*
7.3. DEBUGGING TECHNIQUES 177
* g+200t15
This shows that the 13 global functions in rast2wav.b have been correctly inition-
alised. These are useful since they allow the user to set breakpoints at the the the first
instruction of any of these functions, as in:
* g209= #G0209#
* x
Breakpoint 9 at start of clihook
0.011> rast2wav
!! BPT 9: clihook
A= 0 B= 0 25100: K4G 1
* g209= filter
* b1
* b
1: filter
9: clihook
* c
Converting file RASTER to RASTER.wav
sample_rate = 44100
mono 16-bit samples
Total time with the extra second: 11 seconds
maxaddress = 460876
maxfcount = 58862328
kval=1000 sval=8
c2=0 C3=0 C4=24 C5=36 C6=48 C7=60
Data bytes = 970184
Total number of samples: 440992
debugnote=-1
!! BPT 1: filter
178 CHAPTER 7. THE DEBUGGER
Another good way to enter the debugger is to insert calls of abort into the code
usually preceded by a call of writef or sawritef to output the values of some relevant
variables. In the early stages of debugging it is useful to call abort after the command
arguments have been decoded. For example:
0.000> c bc rast2wav
bcpl com/rast2wav.b to cin/rast2wav
maxaddress = 460876
maxfcount = 58862328
kval=1000 sval=8
This method means we do not need to set the breakpoint in clihook. Giving the call
of abort an essentially random argument makes it easier to find the call in the source
code later.
More to follows.
Chapter 8
BCPL was designed to be a portable language with a compiler that is easily transferred
from machine to machine. To help to achieve this, the compiler is structured as shown
in figure 8.1 so that the codegenerator (CG), which is inherently machine dependent, is
separated from the rest of the compiler. The front end of the compiler performs syntax
analysis producing a parse tree (Tree) which is then translated by the translation phase
(TRN) to produce an intermediate form (OCODE) suitable for code generation.
179
180 CHAPTER 8. THE DESIGN OF OCODE
GET "libhdr"
LET start() BE { LET a, b, c = 1, 0, -1
writef("Answer is %n*n", a+b+c)
}
then the command: bcpl test.b would write the following text to the file ocode.:
These numbers encode the OCODE statements in a natural way as can be verified
by comparing them with the following more readable form of the same statements,
generated by the procode command:
JUMP L2
ENTRY L1 5 ’s’ ’t’ ’a’ ’r’ ’t’
SAVE 3 LN 1 LN 0 LN -1 STORE STACK 9
LSTR 13 ’A’ ’n’ ’s’ ’w’ ’e’ ’r’ ’ ’ ’i’ ’s’ ’ ’ ’%’ ’n’ 10
LP 4 LP 3 ADD LP 5 ADD LG 74 RTAP 6 RTRN STACK 3
ENDPROC STACK 3 LAB L2 STORE GLOBAL 1 1 L1
S
G P
Memory for program and static data
Li Lj
The OCODE machine has five registers G, P, PC, A and L. G points to the global
vector and P points to the stack frame of the currently executing function. PC points
to the next instruction to execute. The current size of the stack frame is held in S but
S does not require a physical register since its value at every point in the program is
8.3. LOADING AND STORING VALUES 181
known by both the frontend of the compiler and the codegenerator. They both know
the effect on S of every OCODE statement.
Static variables, tables and string constants are allocated space embedded in the
compiled code and are referenced using labels such as L36 and L92. All global, local
and static variables are of the same size and, on most modern implementations, they
hold 32 bit values. More recently 64 bit versions of BCPL are becoming more common.
OCODE is normally encoded as a compacted sequence of integers since it is gen-
erated by the frontend of the compiler and read by the codegenerator. For human
consumption a more readable form can be created using the procode command de-
scribed on page 146.. An OCODE statement consists of a function or directive code
possibly followed by operands that are either optionally signed integers, quoted char-
acters or labels of the form Ln where n is a label number. The following are examples
of mnemonic OCODE statements:
There are OCODE statements for loading and storing values, for applying expres-
sion operators, for the implementation of functions and routines, and to control the
flow of execution. There are also directives for the allocation of storage and to allow
information to be passed to the codegenerator.
Statement Meaning
LP n P!S := P!n; S := S+1
LG n P!S := G!n; S := S+1
LL Ln P!S := Ln; S := S+1
LLP n P!S := @P!n; S := S+1
LLG n P!S := @G!n; S := S+1
LLL Ln P!S := @Ln; S := S+1
SP n S := S-1; P!n := P!S
SG n S := S-1; G!n := P!S
SL Ln S := S-1; Ln := P!S
RSTACK n P!n := A; S := n+1
The RSTACK statement is used in conjunction with the RES statement in the com-
pilation of VALOF expressions. See page 186 for details.
The following tables shows the six statements for loading constants.
182 CHAPTER 8. THE DESIGN OF OCODE
Statement Meaning
LF Ln P!S := entry point Ln; S := S+1
LN n P!S := n; S := S+1
FNUM m e P!S := <make float>(m, e); S := S+1
TRUE P!S := TRUE; S := S+1
FALSE P!S := FALSE; S := S+1
QUERY P!S := ?; S := S+1
LSTR n C1 . . . Cn P!S := "C1 . . . Cn "; S := S+1
LF Ln loads the entry point of a non global function onto the stack. LN n loads the
signed integer constant n onto the stack. FNUM m e loads a floating point approximation
of a number with signed integer mantissa m and signed integer decimal exponent e onto
the stack. The statements TRUE and FALSE are present to improve portability between
machines that use different representations for the integers. For instance, on machines
using ones complement or sign and modulus arithmetic, TRUE is not equivalent to LN
-1. QUERY loads an undefined value onto the stack, and the LSTR statement allocates a
string in static memory and loads a pointer to it onto the stack.
Indirect assignments and assignments to elements of word and byte arrays normally
use the statements STIND and PUTBYTE whose meanings are given in table 5.3.
Statement Meaning
STIND !(P!(S-1)) := P!(S-2); S := S-2
PUTBYTE (P!(S-2))%(P!(S-1)) := P!(S-3); S := S-3
LN 12 LG 200 STIND
LN 99 LG 200 LN 3 ADD STIND
LN 65 LG 200 LN 3 PUTBYTE
SELST takes three argments op, len and sh. If op is zero, its effect is equivalent to
SLCT len:sh:0 OF (P!(S-1)) := P!(S-2); S := S-2
8.5. EXPRESSION OPERATORS 183
but if op is non zero it represents and assignment operator (assop) and the statement
is equivalent to:
SLCT len:sh:0 OF (P!(S-1)) assop:= P!(S-2); S := S-2
The mapping between op and assop is given by the following table.
The floating-point assignment operators are only allowed when the specified field
is a full word, typically with len and sh both zero. The SELST operator with len and
sh both zero is used in the compilation assop:= assignments where the left hand side
is a simple variable or a subscripted expression. For instance, the assigment v!3+:=1
might generate the following OCODE.
LN 1
LN 3 LG 200 ADD
SELST ADD 0 0
Statement Meaning
RV P!(S-1) := ! P!(S-1)
ABS P!(S-1) := ABS P!(S-1)
FABS P!(S-1) := FABS P!(S-1)
FLOAT P!(S-1) := FLOAT P!(S-1)
FIX P!(S-1) := FIX P!(S-1)
NEG P!(S-1) := - P!(S-1)
FNEG P!(S-1) := #- P!(S-1)
NOT P!(S-1) := ∼ P!(S-1)
All dyadic expression operators take two operands from stack replacing them the
result and decrementing S by 1. These operators are shown in the following table.
184 CHAPTER 8. THE DESIGN OF OCODE
Statement Meaning
GETBYTE S := S-1; P!(S-1) := P!(S-1) % P!S
MUL S := S-1; P!(S-1) := P!(S-1) * P!S
FMUL S := S-1; P!(S-1) := P!(S-1) #* P!S
DIV S := S-1; P!(S-1) := P!(S-1) / P!S
FDIV S := S-1; P!(S-1) := P!(S-1) #/ P!S
MOD S := S-1; P!(S-1) := P!(S-1) MOD P!S
ADD S := S-1; P!(S-1) := P!(S-1) + P!S
FADD S := S-1; P!(S-1) := P!(S-1) #+ P!S
SUB S := S-1; P!(S-1) := P!(S-1) - P!S
FSUB S := S-1; P!(S-1) := P!(S-1) #- P!S
EQ S := S-1; P!(S-1) := P!(S-1) = P!S
FEQ S := S-1; P!(S-1) := P!(S-1) #= P!S
NE S := S-1; P!(S-1) := P!(S-1) ∼= P!S
FNE S := S-1; P!(S-1) := P!(S-1) #∼= P!S
LS S := S-1; P!(S-1) := P!(S-1) < P!S
FLS S := S-1; P!(S-1) := P!(S-1) #< P!S
GR S := S-1; P!(S-1) := P!(S-1) > P!S
FGR S := S-1; P!(S-1) := P!(S-1) #> P!S
LE S := S-1; P!(S-1) := P!(S-1) <= P!S
FLE S := S-1; P!(S-1) := P!(S-1) #<= P!S
GE S := S-1; P!(S-1) := P!(S-1) >= P!S
FGE S := S-1; P!(S-1) := P!(S-1) #>= P!S
LSHIFT S := S-1; P!(S-1) := P!(S-1) << P!S
RSHIFT S := S-1; P!(S-1) := P!(S-1) >> P!S
LOGAND S := S-1; P!(S-1) := P!(S-1) & P!S
LOGOR S := S-1; P!(S-1) := P!(S-1) | P!S
EQV S := S-1; P!(S-1) := P!(S-1) EQV P!S
XOR S := S-1; P!(S-1) := P!(S-1) XOR P!S
Vector subscription (E1 !E2 is implemented using PLUS and RV. The value of x MOD
y is either zero or it has the same sign as x and its magnitude is less than ABS y. Shifts
by a negative amounts yield undefined results, but on some versions of BCPL the shift
direction is reversed. Shifts greater than the word length yield zero.
ENTRY Li n C1 . . . Cn
SAVE s
body of function or routine
ENDPROC
Li is the label allocated for the entry point. As a debugging aid, the length of the
function or routine name is given by n and its characters by the C1 . . . Cn . The SAVE
statement specifies the initial setting of S, which is just the save space size (=3) plus
the number of formal parameters. For functions defined using pattern matching the
number of formal parameters is determined by the patterns in the match list. The
state of the stack just after entry is shown in figure 8.3.
A1 A2 An
S
P
Figure 8.3: The stack frame on function or routine entry
The save space is used to hold the previous value of P, the return address and the
function entry address. Thus, the first argument of a function is always at position 3
relative to the P pointer. On some older versions of BCPL the size of the save space
may be different.
The end of the body is marked by an ENDPROC statement which is non executable
but allows the code generator to keep track of nested definitions. In early versions of
OCODE, the first two arguments of ENTRY were interchanged and ENDPROC was given
a numerical argument.
The language insists that arguments are laid out in consecutive locations on the
stack and that there is no limit to their number. This suggests that a good strategy is
to place the arguments of a call in the locations they must occupy when the function or
routine is entered. Thus, a typical call E(E1 , . . . , En ) is compiled by first incrementing
S to leave room for the save space in the new stack frame, then generate code to evaluate
the arguments E1 , . . . , En before generating code for E. The state is then as shown
in figure 8.4. Finally, either FNAP k or RTAP k is generated, depending on whether a
function or routine call is being compiled. Notice that k is the distance between the
old and new stack frames.
The return from a routine is performed by RTRN which restores the previous value
of P and resumes execution from the return address. The return from a function is
performed by FNRN just after the function result has been evaluated on the top of the
186 CHAPTER 8. THE DESIGN OF OCODE
E1 E2 En E
k
S
P
stack. FNRN performs the same action as RTRN, after placing the function result in a
special register (A) ready for FNAP to store it in the required location in the previous
stack frame.
8.7 Control
The PC register holds the address of the next instructions to be executed. For most
OCODE statements it is incremented by the size of the instruction, but for control
statements PC is set specifically as needed as needed by, for instance, conditional jumps
or SWITCHON commands. The OCODE statements concerned with function and rou-
tine calls have already been described above. The remaining control statements are
described here.
LAB Ln Ln := PC
This sets the value of label Ln to the current position in the compiled code.
JUMP Ln PC := Ln
JT Ln s := S-1; IF P!S DO PC := Ln
JF Ln S := S-1; UNLESS P!S DO PC := Ln
JUMP causes an unconditional jump to the instruction labelled by Ln. Both JT and
JF pop the top item from the stack then they conditionally jump to label Ln depending
on its value.
GOTO S := S-1; PC := P!S
This is used in the translation of the BCPL GOTO command.
RES Ln S := S-1; A := P!S; PC := Ln
This is used in the compilation of RESULTIS commands. The result is evaluated
and placed on the top of the stack, followed by a RES statement which pops the result
from the stack and places the it in register A before jumping to Ln. This label points
to the first instruction after the code for the VALOF block where there is an RSTACK
statement will push A onto the top of the stack after setting S appropriately for this
point in the program.
If the VALOF block is the body of a function, the compiled code for its RESULTIS
commands is optimised using FNRN rather than RES.
SWITCHON n LdK1 L1 . . . Kn Ln
8.8. DIRECTIVES 187
This is used in the compilations of switches. It makes a jump determined by the value
on the top of the stack. Its first argument (n) is the number of cases in the switch and
the second argument (Ld) is the default label. K1 to Kn are the case constants and L1
to Ln are the corresponding labels.
FINISH Ln S := S-1; A := P!S; PC := Ln
This statement is the compilation of the BCPL FINISH command. It should be con-
verted by the codegenerator into code equivalent to stop(0) by the code generator.
Users are strongly discourage from using FINISH
8.8 Directives
Sometimes the size of the stack frame changes other than in the course of expression
evaluation. This happens, for instance, when control leaves a block in which local
variables were declared. The statement STACK s informs the code generator that the
size of the current stack frame is now s.
The STORE statement is used to inform the code generator that the point separating
the declarations and body of a block has been reached and that any anonymous results
on the stack are actually initialised local variables and so should be stored in their true
stack locations.
Static variables and tables are allocated space in the program area using statements
of the form ITEMN n, where n is the initial value of the static cell. The elements of
table are placed in consecutive locations by consective ITEMN statements. A label may
be set to the address of a static cell by preceding the ITEMN statement by a statement
of the form DATALAB Ln.
The SECTION and NEEDS directives in a BCPL program translate into SECTION and
NEEDS statements of the form:
SECTION n C1 . . . Cn
NEEDS n C1 . . . Cn
where C1 to Cn are the characters of the SECTION or NEEDS name and n is the length.
The end of an OCODE module is marked by the GLOBAL statement which con-
tains information about global functions, routines and labels. The form of the GLOBAL
statement is as follows:
GLOBAL n K1 L1 . . . Kn Ln
where n is the number of items in the global initialisation list. Ki is the global number
and Li is its label. When a module is loaded its global entry points must be initialised.
8.9 Discussion
A very early version of OCODE used a three address code in which the operands were
allowed to be the sum of up to three simple values with a possible indirection. The
intention was that reasonable code should be obtainable even when codegenerating
188 CHAPTER 8. THE DESIGN OF OCODE
one statement at a time. It was soon found more convenient to use an intermediate
code that separates the accessing of values from the application of operators. This
improved portability by making it possible to implement very simple non optimising
codegenerators. Optimising codegenerators could absorb several OCODE statements
before emitting compiled code.
The TRUE and FALSE statements were added in 1968 to improve portability to
machines using sign and modulus or one’s complement arithmetic. Luckily two’s com-
plement arithmetic has now become the norm. Other extensions to OCODE, notably
the ABS, QUERY, GETBYTE and PUTBYTE statements were added as the corresponding
constructs appeared in the language.
In 1980, the BCPL changed slightly to permit position independent code to be
compiled. This change specified that non global functions, routines and labels were
no longer variables, and the current version of OCODE reflects this change by the
introduction of the LF statement and the removal of the old ITEML statement that used
to allocate static cells for such entry points.
Another minor change in this version of OCODE is the elimination of the ENDFOR
statement that was provided to fix a problem on 16-bit word addressed machines with
more than 64 Kbytes of memory.
Chapter 9
The original version of Cintcode was a byte stream interpretive code designed to be
both compact and capable of efficient interpretation on small 16 bit machines based
on 8 bit micro processors such as the Z80 and 6502. Versions that ran on the BBC
Microcomputer and under CP/M were marketed by RCP Ltd [2]. The current version
of Cintcode was extended for 32 bit implementations of BCPL and mainly differs from
the original by the provision of 32 bit operands and the removal of a size restriction of
the global vector.
There is now also a version of Cintcode for 64-bit implementations of BCPL. This is
almost identical to the 32-bit version. A nineth Cintcode register (MW) has been added.
This is normally zero but can be set by a new Cintcode instruction (MW), see below.
On 64-bit implementations, the instructions that take four byte immediate operands,
namely KW, LLPW, LW, LPW, SPW, APW, and AW, sign extend the four byte immediate
operand before adding the MW register into the senior half of the 64-bit result before
resetting the MW to zero. In this version static variables are allocated in 64-bit 8 byte
aligned locations.
The Cintcode machine has nine registers as shown in figure 9.1.
Registers
A
B
C
P
G
ST
PC
Count
MW
189
190 CHAPTER 9. THE DESIGN OF CINTCODE
The registers A and B are used for expression evaluation, and C is used in in byte
subscription. P and G are pointers to the current stack frame and the global vector,
respectively. ST is used as a status register in the Cintpos version of Cintcode, and PC
points to the first byte of the next Cintcode instruction to execute. Count is a register
used by the debugger. While it is positive, Count is decremented on each instruction
execution, raising an exception (code 3) on reaching zero. When negative, it causes a
second (faster) interpreter to be used.
Cintcode encodes the most commonly occurring operations as single byte instruc-
tions, using multi-byte instructions for rarer operations. The first byte of an instruction
is the function code. Operands of size 1, 2 or 4 bytes immediately follow some function
bytes. The two instructions used to implement switches have inline data following the
function byte. Cintcode modules also contains static data for stings, integers, tables
and global initialisation data.
It is clear from table 9.1 that accessing variables and constants requires special care,
and that conditional jumps, addition, calls and indirection are also important. Since
access to local variables accounts for about a quarter of the operations performed, about
this proportion of codes were allocated to instructions concerned with local variables.
Local variables are allocated words in the stack starting at position 3 relative to the P
9.1. DESIGNING FOR COMPACTNESS 191
pointer and, as one would expect, small numbered locals are used far more frequently
than the others, so operations on low numbered locals often have single byte codes.
Although not shown here, other statistics, such as the distribution of relative ad-
dressing offsets and operand values, influenced the design of Cintcode.
LG b B := A; A := G!b
LG1 b B := A; A := G!(b+256)
LGH h B := A; A := G!h
Here, b and h are unsigned 8 and 16 bit values, respectively.
Direct J a dest = x + a
PC x
Indirect J$ b hh dest = q + hh
All relative addressing instructions have two forms: direct and indirect, depending
on the least significant bit of the function byte. The details of both relative address
calculations are shown in figure 9.2, using the instructions J and J$ as examples. For
the direct jump (J), the operand (a) is a signed byte in the range -128 to +127 which
is added to the address (x) of the operand byte to give the destination address (dest).
For the indirect jump, J$, the operand (b) is an unsigned byte in the range 0 to 255
which is doubled and added to the rounded version of x to give the address (q) of a
16 bit signed value hh which is added to q to give the destination address (dest).
The compiler places the resolving half word as late as possible to increase the chance
that it can be shared by other relative addressing instructions to the same desination,
as could happen when several ENDCASE statements occur in a large SWITCHON
9.2. THE CINTCODE INSTRUCTION SET 193
command. The use of a 16 bit resolving word places a slight restriction on the maximum
size of relative references. Any Cintcode module of less than 64K bytes will have no
problem.
compile code with the opposite endianess to that of the machine on which the compiler
is running, see the description of the bcpl command on page 131.
Ln 0 ≤ n ≤ 10 B := A; A := n
LM1 B := A; A := -1
L b B := A; A := b
LM b B := A; A := -b
LH h B := A; A := h
LMH h B := A; A := -h
LW w B := A; A := w
MW w MW := w
These instructions load integer constants. Constants are in the range -1 to 10 are the
most common and have single byte instructions. The other cases use successively larger
instructions. The MW instruction is only used in 64-bit Cintcode. See page 189 for more
details.
LPn 3 ≤ n ≤ 16 B := A; A := P!n
LP b B := A; A := P!b
LPH h B := A; A := P!h
LPW w B := A; A := P!w
These instructions load local variables and anonymous results addressed relative to P.
Offsets in the range 3 to 16 are the most common and use single byte instructions. The
other cases use succesively larger instructions.
LG b B := A; A := G!b
LG1 b B := A; A := G!(b + 256)
LGH h B := A; A := G!h
LG loads the value of a global variables in the range 0 to 255, LG1 load globals in the
range 256 to 511, and LGH can load globals up to 65535. Global numbers must be in
the range 0 to 65535.
LL Ln B := A; A := variable Ln
LL$ Ln B := A; A := variable Ln
LF Ln B := A; A := entry point Ln
LF$ Ln B := A; A := entry point Ln
196 CHAPTER 9. THE DESIGN OF CINTCODE
LL loads the value of a static variable and LF loads the entry address of a function,
routine or label in the current module.
LLP b B := A; A := @P!b
LLPH h B := A; A := @P!h
LLPW w B := A; A := @P!w
LLG b B := A; A := @G!b
LLG1 b B := A; A := @G!(b + 256)
LLGH h B := A; A := @G!h
LLL Ln B := A; A := @(variable Ln)
LLL$ Ln B := A; A := @(variable Ln)
These instructions load the BCPL pointers to local, global and static variables.
These instructions are used in the implementation of byte and word indirection oper-
ators % and ! in right hand contexts.
MUL A := B * A
DIV A := B / A
MOD A := B MOD A
ADD A := B + A
SUB A := B - A
LSH A := B << A
RSH A := B >> A
9.2. THE CINTCODE INSTRUCTION SET 197
AND A := B & A
OR A := B | A
XOR A := B XOR A
These instructions provide for all the normal arithmetic and bit pattern dyadic opera-
tors. The instructions DIV and MOD generate exception 5 if the divisor is zero. Evalua-
tion of relational operators in non conditional contexts involve conditional jumps and
the FHOP instruction, see page 200. Addition is the most frequently used arithmetic
operation and so there are various special instructions improve its efficiency.
An 1≤n≤5 A := A + n
Sn 1≤n≤4 A := A - n
A b A := A + b
AH h A := A + h
AW w A := A + w
S b A := A - b
SH h A := A - h
These instructions implement addition and subtraction by constant integer amounts.
There are single byte instructions for incrementing by 1 to 5 and decremented by 1 to
4. For other values longer instructions are available.
APn 3 ≤ n ≤ 12 A := A + P!n
AP b A := A + P!b
APH h A := A + P!h
APW w A := A + P!w
AG b A := A + G!b
AG1 b A := A + G!(b+256)
AGH h A := A + G!h
These instructions allow local and global variables to be added to A. Special instructions
for addition by static variables are not provided, and subtraction by a variable is not
common enough to warrant special treatment.
E2 En
Kn 3 ≤ n ≤ 11
K b
KH h
KW w
These instructions call the function or routine whose entry point is in A and whose first
argument (if any) is in B. The new stack frame at position k relative to P where k is n,
b, h or w depending on which instruction is used. The effect of these instructions is as
follows:
As can be seen, three words of link information (the old P pointer, the return address
and entry address) are stored in the base of the new stack frame.
KnG b 3 ≤ n ≤ 11
KnG1 b 3 ≤ n ≤ 11
KnGH h 3 ≤ n ≤ 11
These instructions deal with the common situation where the entry point of the function
is in the global vector and the stack increment is in the range 3 to 11. The global number
gn is b, b + 256 or h depending on which function code is used and stack increment k
is n. The first argument (if any) is in A. The effect of these instructions is as follows:
RTN
This instruction causes a return from the current function or routine using the previous
P pointer and the return address held in P!0 and P!1. The effect of the instruction is
as follows:
J Ln PC := Ln
J$ Ln PC := Ln
Jrel Ln IF B rel A DO PC := Ln
Jrel$ Ln IF B rel A DO PC := Ln
Jrel0 Ln IF A rel 0 DO PC := Ln
Jrel0$ Ln IF A rel 0 DO PC := Ln
The destinations of these jump instructions are computed using the relative addressing
mechanism described in Section 9.1.3. Notice than when the comparison is with zero,
A holds the left operand of the relation.
GOTO PC := A
FHOP A := 0; PC := PC+1
The FHOP instruction is only used in the compilation of relational expressions in non
conditional contexts as in the compilation. The assignment: x := y < z is typically
compiled as follows:
LP4 Load y
LP5 Load z
JLS 2 Jump to the LM1 instruction if y<z
FHOP A := FALSE; and hop over the LM1 instruction
LM1 A := TRUE
SP3 Store in x
This instruction is used when there are sufficient case constants all within a small
enough range. It performs the jump by selecting an element from a vector of 16 bit
resolving half words. The quantities n, dlab, and L0 to Ln−1 are 16 bit half words,
aligned on 16 bit boundaries by the optional filler byte. If A is in the range 0 to n − 1 it
uses the appropriate resolving half word LA , otherwise it uses the resolving half word
9.2. THE CINTCODE INSTRUCTION SET 201
dlab to jump to the default label. See Section 9.1.3 for details on how resolving half
words are interpreted.
This instruction is used when the range of case constants is too large for SWL to be
economical. It performs the jump using a binary chop strategy. The quantities n, dlab,
K1 to Kn and L1 to Ln are 16 bit half words aligned on 16 bit boundaries by the
option filler byte. This instruction successively tests A with the case constants in the
balanced binary tree given in the instruction. The tree is structured in a way similar
to that used in heapsort with the children of the node at position i at positions 2i and
2i + 1. References to nodes beyond n are treated as null pointers. Within this tree, Ki
is greater than all case constants in the tree rooted at position 2i, and less than those
in the tree at 2i + 1. The search starts at position 1 and continues until a matching
case constant is found or a null pointer is reached. If A is equal to some Ki then PC is
set using the resolving half word Li , otherwise it uses the resolving half word dlab to
jump to the default label. See Section 9.1.3 for details on how resolving half words are
interpreted.
The use of this structure is particularly good for the hand written machine code
interpreter for the Pentium where there are rather few central registers. Cunning use
can be made of the add with carry instruction (adcl). In the following fragment of
code, %esi points to n, %eax holds i and A is held in %eab. There is a test elsewhere
to ensure that A is in the range 0 to 65535.
The compiler ensures that the tree always has at least 7 nodes allowing the code can
be further improved by preceeding this loop with two copies of:
9.2.10 Miscellaneous
XCH Exchange A and B
ATB B := A
ATC C := A
BTC C := B
202 CHAPTER 9. THE DESIGN OF CINTCODE
NOP
SYS
This instruction is used in body of the hand written library routine sys. If A is zero,
the interpreter returns with exception code P!4.
If A is -1 it sets register count to P!4, setting A to the previous value of count.
Changing the value of count may change which of the two interpreters is used. For
more details see Section 4.3.
Otherwise, it performs a system operation returning the result in A. In the C im-
plementation of the interpreter this is done by the following code:
c = dosys(p, g);
MDIV
This instruction is used as the one and only instruction in the body of the hand written
library routine muldiv, see Section 3.3. It divides P!5 into the double length product of
P!3 and P!4 placing the result in A and the remainder in the global variable result2.
It then performs a function return (RTN). Its effect is as follows:
A := <the result>
G!Gn_result2 := <the remainder>
PC := P!1 // PC := P!1
P := P!0 // P := P!0
CHGCO
This instruction is used in the implementation of coroutines. It is the one and only
instruction in the body of the hand written library routine changeco(val,cptr) where
val is passed in Cintcode register A and cptr is in P!4. Its effect, which is rather subtle,
is shown below. For more information see page 56.
BRK
This instruction is used by the debugger to implement break points. It causes the
interpreter to return with exception code 2.
9.2. THE CINTCODE INSTRUCTION SET 203
In the above table, b is a signed byte representing a decimal exponent in the range -128
to +127. Floating point numbers with exponents outside this range can be generated
using sys(Sys flt, fl mk, x, e) as described on page 3.3.
The mapping between op and its corresponding expression operator is given by the
table on page 183.
9.2.14 Corruption of B
To improve the efficiency of some hand written machine code interpreters, the following
instructions are permitted to corrupt the value held in B:
All other instructions either set B explicitly or leave its value unchanged.
9.2.15 Exceptions
When an exception occurs, the interpreter saves the Cintcode registers in its register
vector and yields the exception number as result. For exceptions caused by non existent
instructions, BRK, DIV or MOD the program counter is left pointing to the offend-
ing instruction. For more details see the description of sys(Sys interpret,...) on
page 80.
Chapter 10
Sial is an internal intermediate assembly language designed for BCPL. The first version
was called Cial (Compact Internal Assembly Language) was pronounced “seal”. It was
essentially an assembly language for Cintcode with the same function code mnemonics
and the same abstract machine registers. It was soon found that rather than having a
variety of codes to load an integer constant (such as L0, L1, L2, LM1, LW, LH or L), it was
better to have one function code to load positive integers and another for negative ones
with the values specified by operands. This form is more convenient for translation and
easier to compress. The new language is called Sial (also pronouced “seal”) with the S
standing for smaller. Sial therefore has fewer function codes than Cintcode and most
of them take operands but still uses the same abstract machine registers. Although
Cintcode load instructions save the value of the A register in B before setting A, Sial
loads typically do not. The current version of Sial has not yet been updated to deal
with the extended BCPL features such as floating point and op:= assignments.
As as example of the use of Sial, consider the program com/hello.b which is as
follows:
GET "libhdr"
LET start() = VALOF
{ writef("Hello*n")
RESULTIS 0
}
This can be translated into Sial using bcpl2sial com/hello.b to hello.sial. The
resulting file is:
F104
F113 K5 C115 C116 C97 C114 C116
F111 L1
F112 M9001
F32 P3 G94
F11 K0
F77
F107 M9001 K6 C72 C101 C108 C108 C111 C10
F106 K1 G1 L1 G94
F105
205
206 CHAPTER 10. THE DESIGN OF SIAL
This can be converted into something slightly more readable using the command:
sial-sasm hello.sial to * giving: This can be translated into Sial using the
bcpl2sial command as follows.
Alternatively, the Sial can be translated, statement by statement, into the assembly
language of a machine such as the Pentium as follows.
call *%eax
# L K0
xorl %ebx,%ebx
# RTN
movl 4(%ebp),%eax
movl 0(%ebp),%ebp
jmp *%eax
# STRING M9001 K6 C72 C101 C108 C108 C111 C10
.data
.align 4
MA9001:
.byte 6
.byte 72
.byte 101
.byte 108
.byte 108
.byte 111
.byte 10
.text
# GLOBAL K1
.globl prog
.globl _prog
prog:
_prog:
movl 4(%esp),%eax
# G1 L1
movl $LA1,4(%eax)
# G94
ret
# MODEND
0.020>
F An opcode or directive
P A stack offset, 0 to #xFFFFFF
G A global variable number, 0 to 65535
K A 24-bit unsigned constant, often small in value
W A signed integer, used for static data and large constants
C A byte in range 0 to 255
L A label generated by translation phase
M A label generated by the Sial codegenerator
The instructions are for an abstract machine with the following internal registers.
a The main accumulator, function first arg and result register
b The second accumulator used in dyadic operations
c Register used by pbyt and xpbyt, and possibly currupted by
some other instructions, such as mul, div, rem, xdiv and xrem
P Pointer to the base of the current stack frame
G Pointer to the base of the Global Vector
PC Set by jump and call instrunctions
The opcodes and directives are as follows:
Mnemonic Operand(s) Meaning
lp Pn a := P!n
lg Gn a := G!n
ll Ln a := !Ln
llp Pn a := @ P!n
llg Gn a := @ G!n
lll Ln a := @ !Ln
lf Ln a := address of entry point Ln
l Kn a := n
lm Kn a := - n
sp Pn P!n := a
sg Gn G!n := a
sl Ln !Ln := a
ap Pn a := a + P!n
ag Gn a := a + G!n
a Kn a := a + n
s Kn a := a - n
10.1. THE SIAL SPECIFICATION 209
lkp Kk Pn a := P!n!k
lkg Kk Gn a := G!n!k
rv a := ! a
rvp Pn a := P!n!a
rvk Kn a := a!k
st !a := b
stp Pn P!n!a := b
stk Kn a!n := b
stkp Kk Pn P!n!k := a
skg Kk Gn G!n!k := a
xst !b := a
k Pn Call a(b,...) incrementing P by n
leaving b in a
kpg Pn Gg Call Gg(a,...) incrementing P by n
neg a := - a
not a := ~ a
abs a := ABS a
xdiv a := a / b; c := ?
xmod a := a MOD b; c := ?
xsub a := a - b; c := ?
mul a := b * a; c := ?
div a := b / a; c := ?
mod a := b MOD a; c := ?
add a := b + a
sub a := b - a
eq a := b = a
ne a := b ~= a
ls a := b < a
gr a := b > a
le a := b <= a
ge a := b >= a
eq0 a := a = 0
ne0 a := a ~= 0
ls0 a := a < 0
gr0 a := a > 0
le0 a := a <= 0
ge0 a := a >= 0
210 CHAPTER 10. THE DESIGN OF SIAL
lsh a := b << a
rsh a := b >> a
and a := b & a
or a := b | a
xor a := b XOR a
eqv a := b EQV a
gbyt a := b % a
xgbyt a := a % b
pbyt b % a := c
xpbyt a % b := c
swb Kn Ld K1 L1 ... Kn Ln Binary chop switch, Ld default
swl Kn Ld L1 ... Ln Label vector switch, Ld default
xch Swap a and b
atb b := a
atc c := a
bta a := b
btc c := b
atblp Pn b := a; a := P!n
atblg Gn b := a; a := G!n
atbl Kk b := a; a := k
j Ln Jump to Ln
rtn Function or routine return
goto PC := a
ikp Kk Pn a := P!n + k; P!n := a
ikg Kk Gn a := G!n + k; G!n := a
ikl Kk Ln a := !Ln + k; !Ln := a
ip Pn a := P!n + a; P!n := a
ig Gn a := G!n + a; G!n := a
il Ln a := !Ln + a; !Ln := a
jeq Ln Jump to Ln if b = a
jne Ln Jump to Ln if b ~= a
jls Ln Jump to Ln if b < a
jgr Ln Jump to Ln if b > a
jle Ln Jump to Ln if b <= a
jge Ln Jump to Ln if b >= a
jeq0 Ln Jump to Ln if a = 0
jne0 Ln Jump to Ln if a ~= 0
jls0 Ln Jump to Ln if a < 0
jgr0 Ln Jump to Ln if a > 0
jle0 Ln Jump to Ln if a <= 0
jge0 Ln Jump to Ln if a >= 0
jge0m Mn Jump to Mn if a >= 0
10.1. THE SIAL SPECIFICATION 211
jfeq Ln Jump to Ln if b #= a; b := ?
jfne Ln Jump to Ln if b #~= a; b := ?
jfls Ln Jump to Ln if b #< a; b := ?
jfgr Ln Jump to Ln if b #> a; b := ?
jfle Ln Jump to Ln if b #<= a; b := ?
jfge Ln Jump to Ln if b #>= a; b := ?
jfeq0 Ln Jump to Ln if a #= 0; b := ?
jfne0 Ln Jump to Ln if a #~= 0; b := ?
jfls0 Ln Jump to Ln if a #< 0; b := ?
jfgr0 Ln Jump to Ln if a #> 0; b := ?
jfle0 Ln Jump to Ln if a #<= 0; b := ?
jfge0 Ln Jump to Ln if a #>= 0; b := ?
Notice that all floating point instructions currently leave register b undefined, but
this may be changed later. They may also may corrupt global 11 (tempval).
A second example of the use of Sial is the following program (com/fact.b):
SECTION "fact"
GET "libhdr"
LET start() = VALOF
{ FOR i = 1 TO 5 DO writef("fact(%n) = %i4*n", i, fact(i))
RESULTIS 0
}
AND fact(n) = n=0 -> 1, n*fact(n-1)
F104
F103 K4 C102 C97 C99 C116
F113 K5 C115 C116 C97 C114 C116
F111 L1
F11 K1
F13 P3
F111 L4
F3 P3
F69
F9 L2
F31 P9
F13 P9
F3 P3
F13 P8
F112 M1
F32 P4 G94
F79 K1 P3
F75 K5
F89 L4
F11 K0
F77
F107 M1 K15 C102 C97 C99 C116 C40 C37 C110
10.1. THE SIAL SPECIFICATION 213
Using the sial-sasm command we obtain the following more readable version:
MODSTART
SECTION K4 C102 C97 C99 C116
//Entry to: start
ENTRY K5 C115 C116 C97 C114 C116
LAB L1
L K1
SP P3
LAB L4
LP P3
ATB
LF L2
K P9
SP P9
LP P3
SP P8
LSTR M1
KPG P4 G94
IKP K1 P3
ATBL K5
JLE L4
L K0
RTN
STRING M1 K15 C102 C97 C99 C116 C40 C37 C110 C41 C32
C61 C32 C37 C105 C52 C10
//Entry to: fact
ENTRY K4 C102 C97 C99 C116
LAB L2
JNE0 L5
L K1
RTN
LAB L5
LM K1
AP P3
ATB
214 CHAPTER 10. THE DESIGN OF SIAL
LF L2
K P4
ATBLP P3
MUL
RTN
GLOBAL K1
G1 L1
G94
MODEND
This can be translated into assembly language using the program com/sial-686.b
which is a simple program based on sial-sasm.b. This version can now compile the
floating point instructions recently added to Sial. It generates the readable version
of the Sial source as comments interspersed with the corresponding Pentium assembly
code. For the example program given above, it outputs the following assembly language.
# SP P8
movl %ebx,32(%ebp)
# LSTR M1
leal MA1,%ebx
shrl $2,%ebx
# KPG P4 G94
movl 376(%esi),%eax
leal 16(%ebp),%edx
call *%eax
# IKP K1 P3
movl 12(%ebp),%ebx
incl %ebx
movl %ebx,12(%ebp)
# ATBL K5
movl %ebx,%ecx
movl $5,%ebx
# JLE L4
cmpl %ebx,%ecx
jle LA4
# L K0
xorl %ebx,%ebx
# RTN
movl 4(%ebp),%eax
movl 0(%ebp),%ebp
jmp *%eax
# STRING M1 K15 C102 C97 C99 C116 C40 C37 C110 C41 C32
# C61 C32 C37 C105 C52 C10
.data
.align 4
MA1:
.byte 15
.byte 102
.byte 97
.byte 99
.byte 116
.byte 40
.byte 37
.byte 110
.byte 41
.byte 32
.byte 61
.byte 32
.byte 37
.byte 105
.byte 52
.byte 10
.text
# Entry to: fact
# ENTRY K4 C102 C97 C99 C116
# LAB L2
LA2:
movl %ebp,0(%edx)
movl %edx,%ebp
popl %edx
216 CHAPTER 10. THE DESIGN OF SIAL
movl %edx,4(%ebp)
movl %eax,8(%ebp)
movl %ebx,12(%ebp)
# JNE0 L5
orl %ebx,%ebx
jne LA5
# L K1
movl $1,%ebx
# RTN
movl 4(%ebp),%eax
movl 0(%ebp),%ebp
jmp *%eax
# LAB L5
LA5:
# LM K1
movl $-1,%ebx
# AP P3
addl 12(%ebp),%ebx
# ATB
movl %ebx,%ecx
# LF L2
leal LA2,%ebx
# K P4
movl %ebx,%eax
movl %ecx,%ebx
leal 16(%ebp),%edx
call *%eax
# ATBLP P3
movl %ebx,%ecx
movl 12(%ebp),%ebx
# MUL
movl %ecx,%eax
imul %ebx
movl %eax,%ebx
# RTN
movl 4(%ebp),%eax
movl 0(%ebp),%ebp
jmp *%eax
# GLOBAL K1
.globl fact
.globl _fact
fact:
_fact:
movl 4(%esp),%eax
# G1 L1
movl $LA1,4(%eax)
# G94
ret
# MODEND
When implementing sial-686 it was necessary to decide how the Intel registers
were to be used and what the BCPL calling sequence should be. The chosen register
allocation was as follows:
10.1. THE SIAL SPECIFICATION 217
The structure of sial-686 is simple. It mainly consists of a large switch within the
function scan that has a case for each Sial function code and directive. For example,
the case for the function code kpg is approximately as follows:
The call cvfpg("KPG") reads the Sial statement knowing it is of the form: KPG
Pk Gn. This outputs the statement as an assembly language comment after placing k
and n in pval and gval, respectively. The three writef calls then output the three
assembly language instructions for the KPG operation, and ENDCASE transfers control to
where the next Sial statement is processed. All the other cases are equally simple.
To improve the efficiency of the floating point code, instructions that normally load
a value into A or B are delay until it is known how the value is to be used. If a floating
point operation is about to be performed it is better to load the value into X. Where
possible, sial-686.b remembers what value (such as which local or global) is currently
held in A, B, X and S.
The section name of the program, which must be present, compiles into a C callable
function that initialises the BCPL global vector with the entry points defined within
this module. To complete the 686 implementation, there is a short handwritten assem-
bly language library natbcpl/sysasm/mlib.s that defines the BCPL callable func-
tions sys, changeco and muldiv. The program must be linked the compiled ver-
sions of the BCPL library modules BLIB and DLIB, and also clib whose source is in
natbcpl/sysc/clib.c.
Every section must contain the definition of a function to initialise the global vector
with the entry points of functions defined in the section. For a section defined in
BCPL, the name of the initialisation function is the section name which must have been
specified in the BCPL source. A C program such as initprog.c must be provided to
with a definition of a function called initsections that calls the initialisation function
of every section of the program. The command makeinit, described on page 143, can
be used to create the initialiation program. For the program prog.b given above, the
following command:
If needed, the runtime stack size and the size of the global vector can be specified
by arguments to makeinit. The compilation of initprog must be linked in with the
object code of all the other sections needed by prog.b when building its executable.
Assuming the executable is placed in bin/prog, it can be executed by the bash shell
command ./bin/prog or possibly just prog if the PATH environment variable is suitably
set.
The MC Package
This chapter describes the MC package which provides a machine independent way to
generate and execute native machine code at runtime. The work on this package started
in January 2008 and is still under development, however, it currently works well enough
to run the n-queens problem on i386 machines about 24 times faster than the normal
Cintcode interpretive version. MC package development is performed in the directory
BCPL/bcplprogs/mc/ and fairly stable versions are copied to BCPL/cintcode/g/mc.h,
BCPL/cintcode/com/mci386.b and BCPL/cintcode/cin/mci386 which can be used
from any working directory. Currently the MC package does not have any floating
point operations. This will be rectified in due course.
The package is based on a simple machine independent abstract machine code
called MC which is easily translated into machine instructions for most architectures.
Although native code is generated by MC calls such as mcRDX(mc add, mc b, 20,
mc d), MC has a corresponding assembly language to assist debugging. The assembly
form of the instruction generated by the previous call is ADD B,20(D) meaning set
register B to the sum of B and the contents of the memory location whose address is 20
plus the value of register D. MC instructions are fairly low level and typically translate
into single native code instructions for most architectures. This example translates into
the i386 GNU statement: addl 20(%edx),%ebx.
The first operand is the destination for any instruction that updates a regis-
ter or memory location. Thus assignments are always from right to left as in
most programming languages but unlike many assembly codes where, for instance,
movl 20(%edx),%ebx updates the second operand.
The MC machine has six registers A, B, C, D, E and F that are directly available
to the programmer, and also a program counter, stack pointer, stack frame pointer and
a condition code register, although these cannot be accessed explicitly.
11.1 MC Example
The following program is a simple demonstration of the i386 version of the MC package.
GET "libhdr"
GET "mc.h"
221
222 CHAPTER 11. THE MC PACKAGE
MANIFEST {
A=mc_a; B=mc_b; C=mc_c; D=mc_d; E=mc_e; F=mc_f
a1=1; a2; a3
}
LET start() = VALOF
{ // Load the dynamic code generation package for i386 machines.
LET mcseg, mcb, n = globin(loadseg("mci386")), 0, 0
UNLESS mcseg DO
{ writef("Trouble with MC package: mci386*n")
GOTO fin
}
// Create an MC instance for 10 functions with a data space
// of 100 words and code space of 4000 words.
mcb := mcInit(10, 100, 4000)
UNLESS mcb DO
{ writef("Unable to create an mci386 instance*n")
GOTO fin
}
mc := 0 // Currently no selected MC instance.
mcSelect(mcb) // Select the new MC instance.
mcK(mc_debug, #b0011) // Trace comments and MC instructions.
mcKKK(mc_entry, 1, 3, 5) // Entry point for function 1
// having 3 arguments and 5 local variables
mcK(mc_debug, #b1111) // Trace comments, MC instructions, target
// instructions and the compiled code.
mcRA(mc_mv, A, a1) // A := <arg 1>
mcRA(mc_add, A, a2) // A := A + <arg 2>
n := mcNextlab()
mcL(mc_lab, n) // Ln:
mcRA(mc_add, A, a3) // A := A + <arg 3>
mcR(mc_dec, A) // A := A - 1
mcRK(mc_cmp, A, 100)
mcJS(mc_jlt, n) // IF A<100 JMP Ln
mcK(mc_debug, #b0011) // Trace only comments and MC instructions.
mcF(mc_rtn) // Return from function 1 with result in A.
mcF(mc_endfn) // End of function 1 code.
mcF(mc_end) // End of dynamic code generation.
writef("*nF1(10, 20, 30) => %n*n", mcCall(1, 10, 20, 30))
fin:
IF mcseg DO unloadseg(mcseg)
RESULTIS 0
}
// ENTRY 1 3 5
// DEBUG 15
11.1. MC EXAMPLE 223
// MV A,A1
movl 20(%ebp), %eax
573: 8B 45 14
// ADD A,A2
addl 24(%ebp), %eax
576: 03 45 18
// LAB L1
lab L1
579: L1:
// ADD A,A3
addl 28(%ebp), %eax
579: 03 45 1C
// DEC A
decl %eax
582: 48
// CMP A,$100
cmpl $100, %eax
583: 83 F8 64
// JLT L1
jl L1
586: 7C F7
// DEBUG 3
// RTN
// ENDFN
// END
F1(10, 20, 30) => 117
The result of 117 (= 10+20+(30-1)*3) shows that the body of the loop was correctly
executed three times.
The header file (mc.h) defines manifests (such as mc mv and mc add) and globals
(such as mcK and mcRA) provided by the package. The package itself must be dynami-
cally loaded (by globin(loadseg("mci386"))) and then selected (by mcSelect(mcb)).
MC instructions are compiled by calls such as mcRA(op,... or mcRK(op,... where op
specifies the instruction or directive and the letters following mc (eg RA or RK) specify
the sort of operands supplied.
A register operand is denoted by R and an integer operand by K. There are 9 possible
kinds of memory operands denoted by A, V, G, M, L, D, DX, DXs and DXsB. A denotes an
specified argument of the current function, V denotes a specified local variable of the
current function, G denotes a specified BCPL global variable, M denotes a location in
Cintcode memory specified by a BCPL pointer, L denotes the position within the data
or code areas of the compiled code corresponding to a given label, D denotes a specified
absolute machine address, DX denotes a location whose machine address is the sum of
a given byte offset and register, DXs is similar to DX only the index register is scaled
by a given factor of 1, 2, 4 or 8 and finally DXsB is like DXs but has a second specified
register added into the effective address.
The following table summarises the MC code generation functions. The first ar-
gument is always specifies the directive or instruction and the remaining arguments
specify the operands. The destination of any instruction that updates a register or
memory location is always the first operand.
224 CHAPTER 11. THE MC PACKAGE
Function Operands
mcF No operand
mcK One integer operand
mcR One MC register operand
mcA One operand specifying an argument number
mcV One operand specifying an local variable number
mcG One operand specifying a global variable number
mcM One operand giving the word address of a location in Cintcode mem-
ory
mcL One numeric label operand, defaulting to 32-bit relative
mcD One operand giving an absolute machine address
mcDX One memory operand specified by an offset added to an index register
mcDXs One memory operand specified by an offset added to an index register
scaled by s which must be 1, 2, 4 or 8
mcDXsB One memory operand specified by an offset added to a base register
and an index register scaled by s which must be 1, 2, 4 or 8
mcJS Jump instructions with near relative destinations
mcJL Jump instructions with possibly distant relative destinations
mcJR Jump instructions with destination given by resister
mcRA Two operands, R and A
mcRV Two operands, R and V
mcRG Two operands, R and G
mcRM Two operands, R and M
mcRL Two operands, R and L
mcRD Two operands, R and D
mcRDX Two operands, R and DX
mcRDXs Two operands, R and DXs
mcRDXsB Two operands, R and DXsB
mcRR Two operands, R and R
mcAR Two operands, A and R
mcVR Two operands, V and R
mcGR Two operands, G and R
mcMR Two operands, M and R
mcLR Two operands, L and R
mcDR Two operands, D and R
mcDXR Two operands, DX and R
mcDXsR Two operands, DXs and R
mcDXsBR Two operands, DXsB and R
11.2. MC LIBRARY FUNCTIONS 225
mcSelect(mcb)
Select an instance of the MC package by assigning mcb to the global variable mc.
For efficiency reasons, mcSelect copies various field in the control block to global
variables. If mc was non zero, the previous setting of the globals are saved in the
previously selected MC instance. It is thus important to set mc to zero before the first
call od mcSelect.
mcClose()
Close the currently selected MC instance deleting all its workspace and compiled
code. It also sets mc to zero.
mcPRF(mess, reg)
This function is an invaluable debugging aid which compiles code to call the C
function printf with the given format string (packed in the data area) and the value
of the specified register. All registers, including the condition code, are preserved. The
register argument may be omitted if the format string requires no register argument.
Typical use of mcPRF is as follows:
mcRK(mc_mv, D, #x01234567)
mcRK(mc_mv, A, #x89ABCDEF)
226 CHAPTER 11. THE MC PACKAGE
mcRK(mc_mv, A, #x10000000)
mcPRF("With D=%8x ", D)
mcPRF("A=%8x ", A)
mcPRF("B=%8x*n", B)
mcR(mc_div, B)
mcPRF("the instruction: DIV B*n")
mcPRF("gives D=%8x ", D)
mcPRF("A=%8x ", A)
mcPRF("B=%8x*n", B)
n := mcNextlab()
Allocate the next available label assigning its number to n. Labels are use by
instructions that refer to static data and in jump instructions. There is essentially no
limit to the number of labels that may be allocated.
mcComment(format, a, b,..., k)
This is a debugging aid to make the compiled code more readable using writef to
write a message to the listing output during code generation if the least significant bit
of mcDebug is a one. The variable mcDebug is set by the DEBUG directive described
below.
res := mcDatap()
res := mcCodep()
These calls return the current positions in the data and code area respectively.
All the other functions compile MC directives and instructions and are described
below.
ALIGNC K
Align the next instruction to an address which is a multiple of k which must be 2,
4 or 8.
228 CHAPTER 11. THE MC PACKAGE
ALIGND K
Align the next item of data to an address which is a multiple of k which must be
2, 4 or 8.
CALL KK
Call the function who number is the first argument with n arguments that have
already been pushed onto the stack when n is the second operand. On return these
arguments will have been popped and, by convention, the result will be in register A.
CDQ F
Sign extend register A into D. That is, if A is positive set D to zero, otherwise it is
to #xFFFFFFFF. This is normally used in conjuction with DIV.
will compile code to jump the label L25 is B<=100, using signed arithmetic.
DATAB K
Assemble one byte of data with the specified value.
DATAK K
Assemble one aligned word of data with the specified value.
DATAL L
Assemble one aligned word of data initialised with the absolute address of code or
data specified by the given label.
DEBUG K
Set the debug tracing level (mcDebug) to the specified value. The least significant
four bits of mcDebug control the level of tracing as follows.
DLAB L
Set the specified label to the absolute address of the next available byte in the data
area.
ENDFN F
This marks the end of the body of the current function.
END F
This directive specifies that no more code generation will be done. The system
will free all temporary work space only preseving the MC control block, the function
dispatch table, and the data and code areas.
ENTRY KKK
This specifies the entry point of the function whose number is given by the first
operand. The second operand specifies how many arguments the function takes and the
third specified how many local variables the function may use. Calls to this function
must have the required number of arguments pushed onto the stack, and on return
this number of values will be automatically popped from the stack. Functions called
directly from BCPL using mcCall always take three arguments, but functions called
using the CALL instruction can take any number of arguments.
JEQ JS JL JR
Jump to the specified location if the first operand of a previous CMP instruction was
equal to its second operand.
JGE JS JL JR
Jump to the specified location if the first operand of a previous CMP instruction was
greater than or equal to its second operand using signed arithmetic.
JGT JS JL JR
Jump to the specified location if the first operand of a previous CMP instruction was
greater than its second operand using signed arithemetic.
JLE JS JL JR
Jump to the specified location if the first operand of a previous CMP instruction was
less than or equal to its second operand using signed arithmetic.
230 CHAPTER 11. THE MC PACKAGE
JLT JS JL JR
Jump to the specified location if the first operand of a previous CMP instruction was
less than its second operand using signed arithmetic.
JMP JS JL JR
Unconditionally jump to the specified location.
JNE JS JL JR
Jump to the specified location if the first operand of a previous CMP instruction was
not equal to its second operand.
LAB L
Set the specified label to the machine address of the current position in the code
area.
LSH RK RR
Shift to the left the value in the register specified by the first operand by the
amount specified by the second operand. If the second operand is a register is must be
C. Vacated positions are filled with zeros. The effect is undefined if the shift distance
is not in the range 0 to 31.
NOP F
Performs no operation.
RSH RR RK
Shift to the right the value in the register specified by the first operand by the amount
specified by the second operand. If the second operand is a register is must be C.
Vacated positions are filled with zeros. The effect is undefined if the shift distance is
not in the range 0 to 31.
RTN F
This causes a return from the current function. The result, if any, should be in A.
232 CHAPTER 11. THE MC PACKAGE
SEQ R
Set the specified register to one if the first operand of a previous CMP instruction
was equal to its second operand, otherwise set it to zero.
SGE R
Set the specified register to one if the first operand of a previous CMP instruction
was greater than or equal to its second operand using signed arithmetic, otherwise set
it to zero.
SGT R
Set the specified register to one if the first operand of a previous CMP instruction
was greater than its second operand using signed arithmetic, otherwise set it to zero.
SLE R
Set the specified register to one if the first operand of a previous CMP instruction
was less than or equal to its second operand using signed arithmetic, otherwise set it
to zero.
SLT R
Set the specified register to one if the first operand of a previous CMP instruction
was less than its second operand using signed arithmetic, otherwise set it to zero.
SNE R
Set the specified register to one if the first operand of a previous CMP instruction
was not equal to its second operand, otherwise set it to zero.
UJGE JS JL JR
Jump to the specified location if the first operand of a previous CMP instruction was
greater than or equal to its second operand using unsigned arithmetic.
UJGT JS JL JR
Jump to the specified location if the first operand of a previous CMP instruction was
greater than its second operand using unsigned arithmetic.
UJLE JS JL JR
Jump to the specified location if the first operand of a previous CMP instruction was
less than or equal to its second operand using unsigned arithmetic.
UJLT JS JL JR
Jump to the specified location if the first operand of a previous CMP instruction was
less than its second operand using unsigned arithmetic.
USGE R
Set the specified register to one if the first operand of a previous CMP instruction
was greater than or equal to its second operand using unsigned arithmetic, otherwise
set it to zero.
USGT R
Set the specified register or memory word to one if the first operand of a previous CMP
instruction was greater than its second operand using unsigned arithmetic, otherwise
set it to zero.
USLE R
Set the specified register to one if the first operand of a previous CMP instruction
was less than or equal to its second operand using unsigned arithmetic, otherwise set
it to zero.
USLT R
Set the specified register to one if the first operand of a previous CMP instruction
was less than its second operand using unsigned arithmetic, otherwise set it to zero.
GET "libhdr"
GET "mc.h"
MANIFEST {
// Register mnemonics
ld = mc_a
col = mc_b
rd = mc_c
poss = mc_d
p = mc_e
count = mc_f
}
LET start() = VALOF
{ // Load the dynamic code generation package
LET argv = VEC 50
LET lo, hi, dlevel = 1, 16, #x0000
LET mcname = "mci386" // Default setting
LET mcseg = 0
LET mcb = 0
11.5. THE N-QUEENS DEMONSTRATION 235
}
AND gencode(n) BE
{ LET all = (1<<n) - 1
mcKKK(mc_entry, n, 3, 0)
mcRK(mc_mv, ld, 0)
mcRK(mc_mv, col, 0)
mcRK(mc_mv, rd, 0)
mcRK(mc_mv, count, 0)
cmpltry(1, n, all) // Compile the outermost call of try
mcRR(mc_mv, mc_a, count) // return count
mcF(mc_rtn)
mcF(mc_endfn)
}
AND cmpltry(i, n, all) BE
{ LET L = mcNextlab()
mcComment("*n// Start of code from try(%n, %n, %n)*n", i, n, all)
mcRR(mc_mv, poss, ld) // LET poss = (~(ld | col | rd)) & all
mcRR(mc_or, poss, col)
mcRR(mc_or, poss, rd)
mcR (mc_not, poss)
mcRK(mc_and, poss, all)
mcRK(mc_cmp, poss, 0) // IF poss DO
TEST n-i<=2
THEN mcJS(mc_jeq, L) // (use a short jump if near the last row)
ELSE mcJL(mc_jeq, L)
TEST i=n
THEN { // We can place a queen in the final row.
mcR(mc_inc, count) // count := count+1
}
ELSE { // We can place queen(s) in a non final row.
LET M = mcNextlab()
mcL (mc_lab, M) // { Start of REPEATWHILE loop
mcRR(mc_mv, p, poss) // LET p = poss & -poss
mcR (mc_neg, p)
mcRR(mc_and, p, poss) // // p is a valid queens position
mcRR(mc_sub, poss, p) // poss := poss - p
In this implementation all the working variables are held in registers and all re-
cursive calls are unwound knowing that the depth of recursion will be limited, in this
case to no more than 16. The stack is used to save the state at the moment when a
recursive call would have been made in the original program. An optimisation is done
based on the knowledge that if a queen can be placed on the nth row of n × n board
then the solution count can be incremented.
When running on a Pentium IV this implementation executes approximately 24
times faster than the normal interpretive Cintcode version and 25% faster than the
corresponding optimised C version of the algorithm.
238 CHAPTER 11. THE MC PACKAGE
Chapter 12
Installation
The implementation of BCPL described in this report is freely available via my Home
Page [3] to individuals for private use and to academic institutions. If you install the
system, please send me an email (to [email protected]) so I can keep a record of
who is interested in it.
This implementation is designed to be machine independent being based on an in-
terpreter written in C. There are, however, hand written assembly language versions
of the interpreter for several architectures (including i386, MIPS, ALPHA and Hitachi
SH3), although these are now little used and are becoming out of date. For Win-
dows XP and Windows 10 there are precompiled .exe files such as wincintsys.exe
and winrastsys.exe. These files should be copied into the appropriate bin directory
and renamed as cintsys.exe and rastsys.exe. For all the other architectures it is
necessary to rebuild the system, but this should be reasonably easy to do.
The simplest installation is for 32-bit Linux machines which will be covered in detail
here.
ch $HOME/distribution
tar zxvf ../Downloads/bcpl.tgz
This creates the directory BCPL containing all the files needed to rebuild the system.
Next enter the cintcode directory by typing the following command.
239
240 CHAPTER 12. INSTALLATION
cd BCPL/cintcode
You are now ready to rebuild the system, but first you must ensure that the envi-
ronment variables BCPLROOT, BCPLHDRS, BCPLPATH, BCPLSCRIPTS are properly defined.
For convenience, there is a bash shell script in os/Linux/setbcplenv. This script also
adds the directory cintcode/bin to the PATH variable. To set all the variables run the
following command.
. $(HOME)/distribution/BCPL/cintcode/os/linux/setbcplenv
It is probably even better to place this line near the end of ~/.bashrc so that the
environment variables are setup every time a new shell window is created.
You are now ready to rebuild the BCPL system by typing the following commands.
cd $(HOME)/distribution/BCPL/cintcode
make clean
make
This should recompile all the C and BCPL code required by the BCPL system and
leave it waiting for the user to type a BCPL Cmmand Language (CLI) command. To
test it type the following commands.
type com/hello.b
c bc hello
hello
bcpl com/bcpl.b to junk
junk com/bcpl.b to junk
c bc bcpl
bench100
c bc cmpltest
cmpltest
What the make command did perhaps needs some explanation. Without arguments
make reads the file Makefile from the current directory and performs the first action
it finds in this file. This first causes bin/cintsys to be created by compiling and
linking all the source files needed to build cintsys. But before doing this it creates
the #include file sysc/INT.h by compiling and running sysc/mkint-h.c. INT.h con-
tains several #define macros that that allow the C programs to determine important
properties of the host machine, such as the C types for signed and unsigned characters.
The C source files for cintsys are all in the directory sysc/ and are: cintsys.c,
cinterp.c, kblib.c, cfuncs.c, joyfn.c, sdlfn.c, glfn.c and extfn.c.
Although cintsys can now be called, it will only work if precompiled Cintcode
compilations of sysb/boot.b, sysb/blib.b, sysb/dlib.b, sysb/cli.b, are placed in
the directorie cin/syscin/. The hand written Cintcode file syslib must also be placed
there for the (trivial) definitions of the functions sys, changeco and muldiv. Compiled
12.1. LINUX INSTALLATION 241
versions of the commands abort, c, echo and bcpl are then placed in cin/. Finally
several scripts such as b, bc and bs are placed in cintcode/. Most of these files have
different versions depending on whether the host is a big or little ender machine.
Finally, make causes the command c compall to be executed on the newly created
system. This compiles all the resident system components contained in sysb and the
standard commands in com/. The system is then ready for use.
You will notice that directory BCPL contains BCPL/cintcode, BCPL/bcplprogs and
BCPL/natbcpl. The directory BCPL/cintcode contains all the source files of the BCPL
Cintcode System, BCPL/bcplprogs contains a collection of directories holding demon-
stration programs, and BCPL/natbcpl contains a version of BCPL that compiles into
native code (for Intel and ALPHA machines) using a mechanism based on the Sial
abstract machine code.
Once the system hase been built it is normally entered using the command cintsys
which can be called when in any directory. If anything has gone wrong various debug-
ging aids can be turned on using either
cintsys -f -v
or
cintsys -f -vv
make clean64
make sys64
cintsys64
The resulting system is almost identical to the standard 32-bit Cintcode BCPL system
but uses a BCPL word length of 64 bits rather that the normal 32.
242 CHAPTER 12. INSTALLATION
Other versions of the system that can be created using other make files, for instance:
make -f MakefileSDL clean
make -f MakefileSDL
This will provide a version with an interface to the SDL graphics library. An interface
to the OpenGL graphics library is provided if MakefileGL is used. The GL version can
be demonstrated by the follow sequence of commands.
cd $(HOME)
cd ../bcplprogs/raspi
cintsys
sysinfo
c b engine
engine
c b dragon
dragon
c b bucket
bucket
c b gltst
gltst
When you enter cintsys you can choose one of two Cintcode interpreters. These
can be selected by the commands fast and slow. The slow interpreter performs
more runtime checks and has mode debugging aids than the fast interpreter and is
thus somewhat slower. Both interpreters are compilations of the same source file
sysc/cinterp.c with the differences controlled by conditional compilation statements
such as #ifdef FASTERPyes.
The make command actually creates rastsys in addition to cintsys. This is a ve-
rion of the system that allows the user to generate raster data that can be used to make
graphs such as the one in Figure 4.2 on page 149 showing memory references during
the compilation of a BCPL compiler. This version is built from the same the source
programs in C using conditional compilation statements such as #ifdef RASTERPyes.
There is a different but related system called cintpos that is closely related to
cintsys. It is a Cintcode based implementation of the Tripos Portable Operating
System originally implemented at Cambridge in the late 1970s. This system allows the
user to create tasks which in modern terminology would be called threads since they
all use the same address space. Information can be sent from one task to another using
the call qpkt(pkt). This appends the packet on the end of a work queue belonging to
the destination task. A task can extract the first packet on its work queue using a call
of taskwait(). If the work queue is empty the task becomes suspended. Every task
has a distinct integer priority and there is a scheduler that ensures the highest priority
task that can run is given control. As with the BCPL distribution, Cintpos has its own
directory ~/distribution/Cintpos and all its files are contained in cintpos.tgz.
Two of the main programs of Cintpos are called cintpos.c and cinterp.c. These
have much in common with cintsys.c and cinterp.c of the BCPL distribution and
the plan is make the C programs in Cintpos identical to the corresponding ones in
the BCPL distribution with the the differences controlled by conditional compilation
statements such as #ifdef CINTSYSyes and #ifdef CINTPOSyes. This change is still
under development.
12.2. COMMAND LINE ARGUMENTS 243
The rastering versions of the interpreter rastsys, rastsys64 can receive the same
arguments.
make
Variants of the above should work for the other architectures running Unix.
wincintsys
244 CHAPTER 12. INSTALLATION
It may be more convenient to move them into a different directory and rename
them as cintsys.exe and rastsys.exe.
I have recently upgraded the Windows version of BCPL so that it can be compiled
and run using the freely available Microsoft C compiler and libraries. On a new PC
I installed the freely available .NET Framework 3.5 and the corresponding SDK 3.5.
This provided amongst many other things a C compiler and all the relevant libraries.
I then created a shortcut on the desktop with
Target: %SystemRoot%\system32\cmd.exe /q /k os\windows\VC8env.bat
and
Start in: E:\distribution\BCPL\cintcode
Double clicking on this shortcut opens a Shell window with the required environ-
ment variable all set up C compilation and the BCPL running environment. If they
are not correct you may have to edit VC8env.bat. The BCPL system was then rebuilt
by the commands:
nmake /f os/windows/MakefileVC clean
nmake /f os/windows/MakefileVC
This should recompile and link all the C code of the BCPL Cintcode system and
then recompile all the standard BCPL system programs and commands. For good
measure, once the BCPL Cintcode system has been entered, recompile all the BCPL
code again by typing:
c compall
A version (64 bit) for the DEC Alpha is also available but is now out of date and
has not been tested recently. To re-build it, it is necessary to comment out the lines
for Linux and uncomment the lines for the ALPHA in Makefile, before running make.
Recently, a version for the ARM processor has been added, particularly for the
Raspberry Pi machine. In directory BCPL/natbcpl on the Raspberry Pi, try typing
If you have the SDL libraries installed (see bcpl4raspi.pdf), you could try
It is useful to know how the make commands such as those above work. Here is a
brief explanation.
The command make clean just deletes all previously built executables together
with all files in the directories obj, sial, temps and tempc since these can easily be
recreated.
The call make prog causes the required BCPL programs to be compiled, if neces-
sary, into Pentium assembly language by executing the following CLI commands. This
also ensures the C program tempc/initprog.c is up to date.
246 CHAPTER 12. INSTALLATION
If necessary make prog also updates the header file tempc/INT.h needed by clib.c
using the following bash commands.
Finally it updates the executable prog, if necessary, by compiling and linking all
the required C and assembly language programs.
The native code program can now be executed in a bash shell using the command
prog or possibly ./prog.
Chapter 13
Example Programs
13.1 Coins
The following program prints out how many different ways a sum of money can be
composed from coins of various denominations.
GET "libhdr"
LET coins(sum) = c(sum, (TABLE 200, 100, 50, 20, 10, 5, 2, 1, 0))
AND c(sum, t) = sum<0 -> 0,
sum=0 -> 1,
!t=0 -> 0,
c(sum, t+1) + c(sum-!t, t)
LET start() = VALOF
{ writes("Coins problem*n")
t(0); t(1); t(2); t(5); t(21); t(100); t(200)
RESULTIS 0
}
AND t(n) BE writef("Sum = %i3 number of ways = %i6*n", n, coins(n))
247
248 CHAPTER 13. EXAMPLE PROGRAMS
13.2 Primes
The following program prints out a table of all primes less than 1000, using the sieve
method.
GET "libhdr"
GLOBAL { count: ug }
MANIFEST { upb = 999 }
LET start() = VALOF
{ LET isprime = getvec(upb)
count := 0
FOR i = 2 TO upb DO isprime!i := TRUE // Until proved otherwise.
FOR p = 2 TO upb IF isprime!p DO
{ LET i = p*p
UNTIL i>upb DO { isprime!i := FALSE; i := i + p }
out(p)
}
writes("*nend of output*n")
freevec(isprime)
RESULTIS 0
}
AND out(n) BE
{ IF count MOD 10 = 0 DO newline()
writef(" %i3", n)
count := count + 1
}
13.3 Queens
The following program calculates the number of ways n queens can be placed on a n×n
chess board without any two occupying the same row, column or diagonal.
GET "libhdr"
GLOBAL { count:200; all:201 }
LET try(ld, col, rd) BE TEST col=all
THEN count := count + 1
ELSE { LET poss = all & ~(ld | col | rd)
UNTIL poss=0 DO
{ LET p = poss & -poss
poss := poss - p
try(ld+p << 1, col+p, rd+p >> 1)
}
}
13.4. FRIDAYS 249
13.4 Fridays
The following program prints a table of how often the 13th day of the month lies on
each day of the week over a 400 year period. Since there are an exact number of weeks
in 4 centuries, program shows that the 13th is most of a Friday!
GET "libhdr"
MANIFEST { mon=0; sun=6; jan=0; feb=1; dec=11 }
LET start() = VALOF
{ LET count = TABLE 0, 0, 0, 0, 0, 0, 0
LET daysinmonth = TABLE 31, ?, 31, 30, 31, 30,
31, 31, 30, 31, 30, 31
LET days = 0
FOR year = 1973 TO 1973+399 DO
{ daysinmonth!feb := febdays(year)
FOR month = jan TO dec DO
{ LET day13 = (days+12) MOD 7
count!day13 := count!day13 + 1
days := days + daysinmonth!month
}
}
FOR day = mon TO sun DO
writef("%i3 %sdays*n",
count!day,
select(day,
"Mon","Tues","Wednes","Thurs","Fri","Sat","Sun")
)
RESULTIS 0
}
AND febdays(year) = year MOD 400 = 0 -> 29,
year MOD 100 = 0 -> 28,
year MOD 4 = 0 -> 29,
28
AND select(n, a0, a1, a2, a3, a4, a5, a6) = n!@a0
250 CHAPTER 13. EXAMPLE PROGRAMS
GET "libhdr"
MANIFEST {
// selectors
H1=0; H2; H3; H4
// Expression operators and tokens
Id=1; Num; Pos; Neg; Mul; Div;Add; Sub
Eq; Cond; Lam; Ap; Y
Lparen; Rparen; Comma; Eof
}
GLOBAL {
space:200; str; strp; strt; ch; token; lexval
}
LET lookup(bv, e) = VALOF
{ WHILE e DO { IF bv=H1!e RESULTIS H2!e
e := H3!e
}
writef("Undeclared name %c*n", H2!bv)
RESULTIS 0
}
AND eval(x, e) = VALOF SWITCHON H1!x INTO
{ DEFAULT: writef("Bad exppression, Op=%n*n", H1!x)
RESULTIS 0
CASE Id: RESULTIS lookup(H2!x, e)
CASE Num: RESULTIS H2!x
CASE Pos: RESULTIS eval(H2!x, e)
CASE Neg: RESULTIS - eval(H2!x, e)
CASE Add: RESULTIS eval(H2!x, e) + eval(H3!x, e)
CASE Sub: RESULTIS eval(H2!x, e) - eval(H3!x, e)
CASE Mul: RESULTIS eval(H2!x, e) * eval(H3!x, e)
CASE Div: RESULTIS eval(H2!x, e) / eval(H3!x, e)
CASE Eq: RESULTIS eval(H2!x, e) = eval(H3!x, e)
CASE Cond: RESULTIS eval(H2!x, e) -> eval(H3!x, e), eval(H4!x, e)
CASE Lam: RESULTIS mk3(H2!x, H3!x, e)
CASE Ap: { LET f, a = eval(H2!x, e), eval(H3!x, e)
LET bv, body, env = H1!f, H2!f, H3!f
RESULTIS eval(body, mk3(bv, a, env))
}
CASE Y: { LET bigf = eval(H2!x, e)
// bigf should be a closure whose body is an
// abstraction eg Lf Ln n=0 -> 1, n*f(n-1)
LET bv, body, env = H1!bigf, H2!bigf, H3!bigf
// Make a closure with a missing environment
LET yf = mk3(H2!body, H3!body, ?)
// Make a new environment including an item for bv
LET ne = mk3(bv, yf, env)
H3!yf := ne // Now fill in the environment component
RESULTIS yf // and return the closure
}
}
13.5. LAMBDA EVALUATOR 251
AND try(expr) BE
{ LET v = VEC 2000
space := v+2000
writef("Trying %s*n", expr)
writef("Answer: %n*n", eval(parse(expr), 0))
}
AND start() = VALOF
{ try("(Lx x+1) 2")
try("(Lx x) (Ly y) 99")
try("(Ls Lk s k k) (Lf Lg Lx f x (g x)) (Lx Ly x) (Lx x) 1234")
try("(Y (Lf Ln n=0->1,n**f(n-1))) 5")
RESULTIS 0
}
GET "libhdr"
MANIFEST {
modulus = #x10001 // 2**16 + 1
$$ln10 // Set condition compilation flag to select data size
//$$walsh
$<ln16 omega = #x00003; ln = 16 $>ln16 // omega**(2**16) = 1
$<ln12 omega = #x0ADF3; ln = 12 $>ln12 // omega**(2**12) = 1
$<ln10 omega = #x096ED; ln = 10 $>ln10 // omega**(2**10) = 1
$<ln4 omega = #x08000; ln = 4 $>ln4 // omega**(2**4) = 1
$<ln3 omega = #x0FFF1; ln = 3 $>ln3 // omega**(2**3) = 1
$<walsh omega=1 $>walsh // The Walsh transform
N = 1<<ln // N is a power of 2
upb = N-1
}
STATIC { data=0 }
13.6. FAST FOURIER TRANSFORM 255
AND reorder(v, n) BE
{ LET j = 0
FOR i = 0 TO n-2 DO
{ LET k = n>>1
// j is i with its bits is reverse order
IF i<j DO { LET t = v!j; v!j := v!i; v!i := t }
// k = 100..00 10..0000..00
// j = 0xx..xx 11..10xx..xx
// j’ = 1xx..xx 00..01xx..xx
// k’ = 100..00 00..0100..00
WHILE k<=j DO { j := j-k; k := k>>1 } //) "increment" j
j := j+k //)
}
}
AND check(w, n) BE
{ // Check that w is a principal nth root of unity
LET x = 1
FOR i = 1 TO n-1 DO { x := mul(x, w)
IF x=1 DO writef("omega****%n = 1*n", i)
}
UNLESS mul(x, w)=1 DO writef("Bad omega**%n should be 1*n", n)
}
AND pr(v, max) BE
{ FOR i = 0 TO max DO { writef("%I5 ", v!i)
IF i MOD 8 = 7 DO newline()
}
newline()
}
AND dv(a, m, b, n) = a=1 -> m,
a=0 -> m-n,
a<b -> dv(a, m, b MOD a, m*(b/a)+n),
dv(a MOD b, m+n*(a/b), b, n)
[1] D.T. Ross et al. AED-0 programmer’s guide and user kit. Technical report, Elec-
tronic Systems Laboratory M.I.T, 1964.
[2] C. Jobson and J.M. Richards. BCPL for the BBC Microcomputer. Acornsoft Ltd,
Cambridge, 1983.
[5] M. Richards, A.R. Aylward, P. Bond, R.D. Evans, and B.J. Knight. The Tripos
Portable Operating System for Minicomputers. Software-Practice and Experience,
9:513–527, June 1979.
257
258 BIBLIOGRAPHY
Appendix A
The syntax of standard BCPL is specified using the transition diagrams given in fig-
ures A.1, A.2, A.3 and A.4. In extended BCPL the floating point operators have
the same precedence as the corresponding integer ones, and the op:= operators are
syntactically identical to the := operator. The syntax of the more binding sequenc-
ing operator (<>) requires some new diagrams to be drawn. This will be done in
due course. It is sufficient to know that <> is more binding than DO, THEN, ELSE,
REPEAT, REPEATWHILE, REPEATUNTIL, and colon. Within the diagrams the syntactic
categories program, section, declaration, command and expressionn are represented
by the rounded boxes: program , section , D , C , En , respectively.
The rectangular boxes are called test boxes and can only be traversed if the con-
dition labelling the box matches the current input. When the label is a token, as in
WHILE and := , it must match the next input token for the test to succeed. The
test box eof is only satisfied if the end of file has been reached. Sometimes the
test box contains a side condition, as in REM n<6 , in which case the side condition
must also be satisfied. The only other test boxes are is call and is name which
are only satisfied if the most recently read expression is syntactically a function call
or a name, respectively. By setting n successively from 0 to 8 in the definition of the
category En , we obtain the definitions of E0 to E8 . Starting from the
definition of program , we can construct an infinite transition diagram containing
only test boxes by simply replacing all rounded boxes by their definitions, recursively.
The parsing algorithm searches through this infinite diagram for a path with the same
sequence of tokens as the program being parsed. In order to eliminate ambiguities,
the left hand branch at a branch point is tried first. Notice how this rule causes the
command
to be equivalent to
259
260 APPENDIX A. BCPL SYNTAX DIAGRAMS
and not
A useful property of these diagrams is that, once a test box has been successfully
traversed, previous branching decisions need not be reconsidered and so the parser
never needs to backtrack.
program
section eof
section
SECTION string ;
NEEDS string ;
MANIFEST name = E0 ; } ;
STATIC {
name : E0 ; }
GLOBAL {
AND
, E0
=
LET name ( name )
BE C
MANIFEST name = E0 ; }
STATIC {
name : E0 ; }
GLOBAL {
AND
,
= E0
LET name ( name )
BE C
= VEC E0
, ,
, name = E0
BREAK REPEATWHILE E0
LOOP REPEATUNTIL
ENDCASE REPEAT
RETURN
FINISH
SKIP
GOTO E0
RESULTIS
DO
DO
IF E0
UNLESS
WHILE
UNTIL
{ D ; C ; }
SWITCHON E0 INTO {
, ,
E0 , E0 := E0
:= E0
is call
is name : C
CASE E0
DEFAULT
En
,
TRUE ( n<9 E0 )
FALSE
! n<8 E8
?
% n<8
name
OF n<8
number
* n<6 E6
character
/ n<6
string
MOD n<6
( E0 ) + n<5 E5
− n<5
! E7
= n<4 = n<4
@
~= n<4 ~= n<4
+ E5
< n<4 < n<4
−
> n<4 > n<4
ABS
<= n<4 <= n<4
NOT E3
>= n<4 >= n<4
,
E4
TABLE E0
<< n<4 E4
VALOF C
>> n<4
: E9 | n<2 E2
EQV n<1 E1
: E9
XOR n<1
−> n<1 E0 , E0