This manual documents the July 2020 build of GnuCOBOL 3.1 RC-1.
This document describes the syntax, semantics and use of the COBOL programming language as implemented by GnuCOBOL, formerly known as OpenCOBOL.
The original principal developers of GnuCOBOL were Keisuke Nishida and Roger While. Since then, many members of the community have been involved in its development.
This document is intended to serve as a fully functional reference and user’s guide, suitable both as a training tool for readers learning COBOL for the first time and as a reference for those already familiar with another dialect of COBOL.
A separate manual — containing only the details of the GnuCOBOL implementation and designed for experienced COBOL programmers — has been taken from this guide. That document (GnuCOBOL Quick Reference) contains no training subject matter.
Other documents that should be read are gnucobol.pdf, found in the doc directory of the compiler sources, and the file NEWS, supplied in the top-level directory of the GnuCOBOL compiler source code. There you will find the latest COBOL language features that have been added, some of which may not be in this document due to time constraints. If you find any, please report them as bugs against the Programmer’s Guide so that they can be fixed.
Another must-read document, which delves deeper into the compiler, is the FAQ available via the GnuCOBOL Manuals and Guides, although it could do with a wee clean-up to make it easier to read and to find the required information.
For those wishing to learn COBOL for the first time, Gary can strongly recommend the following resources.
If you like to hold a book in your hands, I strongly recommend Murach’s Structured COBOL, by Mike Murach, Anne Prince and Raul Menendez (2000) - ISBN 9781890774059. Mike Murach and his various writing partners have been writing outstanding COBOL textbooks for decades. It’s an excellent book for those familiar with the concepts of programming in other languages, but unfamiliar with COBOL.
Would you prefer a web-based tutorial? Try the University of Limerick (Ireland) - COBOL web site.
In addition there is the GnuCOBOL FAQ — which has now exceeded 1,400 pages — available as HTML or a downloadable .pdf file.
Every release of the compiler sources includes the file NEWS. It contains up-to-the-minute updates regarding the compiler and additional COBOL language elements which may not yet be included in this manual.
If you already know a programming language other than COBOL, chances are that language is Java, C or C++. You will find COBOL much different from those; sometimes the differences are a good thing and sometimes they aren’t. The thing to remember about COBOL is this: it was designed to solve business problems.
COBOL, first introduced to the programming public in 1959, was the very first programming language to become standardized (in 1960). This meant that a standard-compliant COBOL program written on computer “A” made by company “B” would be able to be compiled and executed on computer “X” made by company “Y” with very few, if any, changes. This may not seem like such a big deal today, but it was a radical departure from all programming languages that came before it and even many that came after it.
The name COBOL actually says it all — COBOL is an acronym that stands for “(CO)mmon (B)usiness (O)riented (L)anguage”. Note the fact that the word “common” comes before all others. The word “business” is a close second. Therein lies the key to COBOL’s success.
Despite statements from industry “insiders”, the COBOL programming language is not dead, even though newer and so-called “modern” languages like Java, C#, .NET, Ruby on Rails and so on appear to have become the languages of choice in the Information Technology world. These languages have become popular because they address the following desired requirements for “modern” programming:
Just because COBOL doesn’t traditionally support objects, classes, and the like doesn’t mean that its “procedural” approach to computing isn’t valuable; after all, it runs 70% of the world’s business transactions, and does so:
Today’s IT managers and business leaders are faced with a challenging dilemma — how do you maintain the enormous COBOL code base that is still running their businesses when academia has all but abandoned the language they need their people to use to keep the wheels rolling? The problem is compounded by the fact that those programmers that are skilled in COBOL are retiring and taking their knowledge with them. In some markets, this appears to be having an inflationary effect on the cost of resources (COBOL programmers) whose supply is becoming smaller and smaller. The pressure to update applications to make use of more up-to-date graphical user interfaces is also perceived as a reason to abandon COBOL in favour of GUI-friendly languages such as Java.
Businesses are addressing the COBOL challenge in different ways:
It’s probably true that an IT professional can no longer afford to allow COBOL to be the only wrench in their toolbox, but with a massive code base still in production now and for the foreseeable future, adding COBOL to a multi-lingual curriculum vitae (CV) and/or resume (yes, they ARE different) is not a bad thing at all. Knowing COBOL as well as the language du jour will make you the smartest person in the room when the discussion of migrating the current “legacy” environment to a “modern” implementation comes around.
You’ll find COBOL an easy language to learn and a FAR EASIER language to master than many of the “modern” languages.
The whole reason you’re reading this is that you’ve discovered GnuCOBOL — another implementation of COBOL in addition to those mentioned earlier. The distinguishing characteristic of GnuCOBOL versus those others is that GnuCOBOL is FREE open-source and therefore FREE to obtain and use. It is community-enhanced and community-supported. Later in this document (see So What is GnuCOBOL?), you’ll begin to learn more about this COBOL implementation’s capabilities.
Throughout the history of computer programming, the search for new ways to improve the productivity of programmers has been a major consideration. Other than for hobbyists, programming is an activity performed for money, and businesses abhor spending anything more than is absolutely necessary; even government agencies try to spend as little money on projects as is absolutely necessary.
The amount of programming necessary to accomplish a given task, including any rework needed to correct errors found during testing, is the measure of programmer productivity. (Testing is sometimes jokingly defined as that time during which an application is actually in production, allowing users to discover the problems.) Anything that reduces that effort will reduce the time spent in such activities, and therefore their expense. When the expense of programming is reduced, programmer productivity is increased.
Sometimes the quest for improved programmer productivity (and therefore reduced programming expense) has taken the form of introducing new features in programming languages, or even new languages altogether. Sometimes it has resulted in new ways of using the existing languages.
While many technological and procedural developments have made evolutionary improvements to programmer productivity, each of the following three events has been responsible for revolutionary improvements:
The reality is, however, that good programmers have been practising code re-usability for more than a half-century. Up until recently, COBOL programmers have had some of the best code re-usability tools available. They’ve been doing it with copybooks and subprograms rather than classes, methods and attributes, but the net results have been similar. With the COBOL2002 and COBOL2014 standards, the COBOL programming language has become just as “object-oriented” as the “modern” languages, while preserving the ability to support, modify, compile and execute “legacy” COBOL programs as well.
While GnuCOBOL supports few of the OOP programming constructs defined by the COBOL2002 and COBOL2014 standards, it supports every aspect of the ANSI 85 standard and therefore fully meets the needs of points #1 and #2, above. With its supported feature set (see So What is GnuCOBOL?), it provides significant programmer productivity capabilities.
GnuCOBOL is a free and open-source COBOL compiler and runtime environment, written using the C programming language. GnuCOBOL is typically distributed in source-code form, and must then be built for your computer’s operating system using the system’s C compiler and loader. While originally developed for the UNIX and Linux operating systems, GnuCOBOL has also been successfully built for computers running OSX and Windows, utilizing the UNIX-emulation features of such tools as Cygwin and MinGW. Also see the GNU website for more information.
The MinGW approach is a personal favourite with the author of this manual because it creates a GnuCOBOL compiler and runtime library that require only a single MinGW DLL to be available for the GnuCOBOL compiler, runtime library and user programs. That DLL is freely distributable under the terms of the GNU General Public License. A MinGW build of GnuCOBOL fits easily on and runs from a 128MB flash drive with no need to install any software onto the Windows computer that will be using it. Some functionality of the language is sacrificed, however: the sharing of files between concurrently executing GnuCOBOL programs, and record locking on certain types of files, rely on underlying operating system routines that aren’t available to Windows and aren’t provided by MinGW. The current version for MinGW is available at the download link along with various other platforms at the GnuCOBOL download website.
GnuCOBOL has also been built as a truly native Windows application utilizing Microsoft’s freely-downloadable Visual Studio Express package to provide the C compiler and linker/loader. This approach does not lend itself well to a “portable” distribution.
The GnuCOBOL compiler generates C code from your COBOL programs; that C code is then automatically compiled and linked using your system’s C compiler (typically, but not limited to, gcc).
GnuCOBOL supports nearly all of the ANSI 85 standard for COBOL (the only major exclusion is the Communications Module) and also supports some of the components of the COBOL2002 and COBOL2014 standards, such as the SCREEN SECTION (see SCREEN SECTION), table-based SORT (see Table SORT) and user-defined functions. There are others, with more being added almost weekly.
This chapter describes the syntax, semantics and usage of the COBOL programming language as implemented by the current version of GnuCOBOL. For the rest of this document, the language name is spelt COBOL to ease reading; the compiler name, however, retains the mixed case of GnuCOBOL.
This document is intended to serve as a full-function reference and user’s guide, suitable both as a training tool for readers learning COBOL for the first time and as a reference for those already familiar with some dialects of the COBOL language.
A separate manual contains just the details of the COBOL grammar as implemented in GnuCOBOL. It is designed strictly for experienced COBOL programmers and is taken from this guide. It does NOT contain any training subject matter.
The extra manuals are the GnuCOBOL Quick Reference, a short document containing just the COBOL semantics and grammar, and GnuCOBOL Sample Programs, which shows detailed example COBOL programs with an indication of the syntax used in each program.
For each implementation of the GnuCOBOL compiler, the supplied file NEWS should also be read for any last-minute updates, along with the files README and INSTALL for building the compiler.
COBOL programs consist of a sequence of words and symbols. Words, which consist of sequences of letters (upper- and/or lower-case), digits, dashes (‘-’) and/or underscores (‘_’), may have a pre-defined, specific meaning to the compiler or may be invented by the programmer for their own purposes.
The GnuCOBOL language specification defines over 1130 Reserved Words: words to which the compiler assigns a special meaning. This number applies to the default list, which covers many implementations. It is possible to limit the list to a specific implementation via -std=xyz[-strict], or to manually unreserve words if they are used in existing sources as user-defined words.
Programmers may use a reserved word as part of a word they are creating themselves, but may not create their own word as an exact duplicate (without regard to case) of a COBOL reserved word. Note that, in this context, reserved words include all classes of pre-defined words: intrinsic function names, mnemonic names, system routine names and the reserved words proper. The list of reserved words can be changed by adding or removing specific words, either for a given compilation or as a default, by use of the -std and -conf compiler options. See the specific config files that are, by default, held in /usr/local/share/gnucobol/config. Using the option ‘FUNCTION ALL INTRINSIC’ will also add another 100+ reserved words. The config files can be modified to match the requirements of a business or project team, but be warned that they are updated when a new version of the compiler is built, so it might be more prudent to create your own configuration, based on an existing one but with a different name.
In addition, you can add or remove reserved words by passing options to cobc: use -freserved-words=value or -freserved=word to add, and -fnot-reserved=word to remove. Use -freserved=word:alias to create an alias for a word, and -fregister=word or -fnot-register=word to add or remove a special register word.
See Appendix B - Reserved Word List, for a complete list of GnuCOBOL reserved words.
For any given version of GnuCOBOL you can also list the full current set of reserved words by running cobc with --list-reserved, --list-intrinsic, --list-system and --list-mnemonics. Again, the output is subject to variation depending on the -std option in use.
When you write GnuCOBOL programs, you’ll need to create a variety of words to represent various aspects of the program, the program’s data and the external environment in which the program will run. This will include internal names by which data files will be referenced, data item names and names of executable logic procedures.
User-defined words may be composed from the characters ‘A’ through ‘Z’ (upper- and/or lower-case), ‘0’ through ‘9’, dash (‘-’) and underscore (‘_’). User-defined words may neither start nor end with hyphen or underscore characters.
Other programming languages provide the programmer with a similar capability of creating their own words (names) for parts of a program; COBOL is somewhat unusual when compared to other languages in that user-defined words may start with a digit.
With the exception of logic procedure names, which may consist entirely of nothing but digits, user-defined words must contain at least one letter.
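The naming rules above can be illustrated with a minimal sketch. All of the names used here (WordDemo, 1st-Quarter-Sales and so on) are invented for illustration, and the source is laid out in fixed format, which is assumed to be the compiler default.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. WordDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> A user-defined word may begin with a digit, provided it
      *> also contains at least one letter.
       01  1st-Quarter-Sales   PIC 9(5) VALUE 1200.
       01  2nd-Quarter-Sales   PIC 9(5) VALUE 1350.
      *> Underscores are allowed inside a word, but not at its
      *> start or end.
       01  Half_Year_Sales     PIC 9(6) VALUE 0.
       PROCEDURE DIVISION.
      *> A procedure name may consist entirely of digits.
       100.
           ADD 1st-Quarter-Sales 2nd-Quarter-Sales
               GIVING Half_Year_Sales.
       200.
           DISPLAY "Half-year sales: " Half_Year_Sales
           STOP RUN.
```

Execution simply falls through from paragraph 100 into paragraph 200, so the sketch does not depend on how a reference to an all-digit procedure name is written.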
All COBOL implementations allow the use of both upper and lower case letters in program coding. GnuCOBOL is completely insensitive to the case used when writing reserved words or user-defined names. Thus, AAAAA, aaaaa, Aaaaa and AaAaA are all the same word as far as GnuCOBOL is concerned.
The only time the case used does matter is within quoted character strings, where character values will be exactly as coded.
By convention throughout this document, COBOL reserved words will be shown entirely in UPPER-CASE while those words that were created by a programmer will be represented by tokens in mixed or lower case.
This isn’t a bad practice to use in actual programs, as it leads to programs where it is much easier to distinguish reserved words from user-defined ones!
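As a small sketch of the case rules just described (the program and data names are illustrative only), the following refers to one data item using three different spellings, and shows that quoted strings keep their case:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CaseDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Greeting            PIC X(12) VALUE "Hello".
       PROCEDURE DIVISION.
      *> Reserved words and user-defined words ignore case, so
      *> all three statements display the same data item.
           display GREETING
           Display greeting
           DISPLAY Greeting
      *> Inside quotes, characters are kept exactly as coded.
           IF "hello" = "HELLO"
               DISPLAY "Quoted strings ignore case"
           ELSE
               DISPLAY "Quoted strings are case-sensitive"
           END-IF
           STOP RUN.
```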
Critics of COBOL frequently focus on the wordiness of the language, often citing the case of a so-called “Hello World” program as the “proof” that COBOL is so much more tedious to program in than more “modern” languages. This tedium is cited as such a significant impact to programmer productivity that, in their opinions, COBOL can’t go away quickly enough.
Here are two different “Hello World” applications, one written in Java and the second in GnuCOBOL. First, the Java version:
class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World!");
    }
}
And here is the same program, written in GnuCOBOL:
IDENTIFICATION DIVISION.
PROGRAM-ID. HelloWorld.
PROCEDURE DIVISION.
DISPLAY "Hello World!".
Both of the above programs could have been written on a single line, if desired, and both languages allow a programmer to use (or not use) indentation as they see fit to improve program readability. Sounds like a tie so far.
Let’s look at how much more “wordy” COBOL is than Java. Count the characters in the two programs. The Java program has 95 (not counting carriage returns and any indentation). The COBOL program has 89 (again, not counting carriage returns and indentation)! Technically, it could have been only 65 because the IDENTIFICATION DIVISION. header is actually optional. Clearly, “Hello World” doesn’t look any more concise in Java than it does in COBOL.
Let’s look at a different problem. Surely a program that asks a user to input a positive integer, generates the sum of all positive integers from 1 to that number and then prints the result will be MUCH shorter and MUCH easier to understand when coded in Java than in COBOL, right?
You can be the judge. First, the Java version:
import java.util.Scanner;
public class sumofintegers {
    public static void main(String[] arg) {
        System.out.println("Enter a positive integer");
        Scanner scan=new Scanner(System.in);
        int n=scan.nextInt();
        int sum=0;
        for (int i=1;i<=n;i++) {
            sum+=i;
        }
        System.out.println("The sum is "+sum);
    }
}
And now for the COBOL version:
IDENTIFICATION DIVISION.
PROGRAM-ID. SumOfIntegers.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 n BINARY-LONG.
01 i BINARY-LONG.
01 tot BINARY-LONG VALUE 0.
PROCEDURE DIVISION.
DISPLAY "Enter a positive integer"
ACCEPT n
PERFORM VARYING i FROM 1 BY 1 UNTIL i > n
    ADD i TO tot
END-PERFORM
DISPLAY "The sum is " tot.
My familiarity with COBOL may be prejudicing my opinion, but it doesn’t appear to me that the Java code is any simpler than the COBOL code. In case you’re interested in character counts, the Java code comes in at 278 (not counting indentation characters). The COBOL code is 298 (274 without the IDENTIFICATION DIVISION. header).
Despite what you’ve seen here, the more complex the programming logic being implemented, the more concise the Java code will appear to be, even compared to 2002-standard COBOL. That conciseness comes with a price though — program code readability. Java (or C or C++ or C#) programs are generally intelligible only to trained programmers. COBOL programs can, however, be quite understandable by non-programmers. This is actually a side-effect of the “wordiness” of the language, where COBOL statements use natural English words to describe their actions. This inherent readability has come in handy many times throughout my career when I’ve had to learn obscure business (or legal) processes by reading the COBOL program code that supports them.
The “modern” languages, like Java, also have their own “boilerplate” infrastructure overhead that must be coded in order to write the logic that is necessary in the program. Take for example the public static void main(String[] arg) and import java.util.Scanner; statements. The critics tend to forget about this when they criticize COBOL for its structural “overhead”.
When it was first developed, COBOL’s easily-readable syntax made it profoundly different from anything that had been seen before. For the first time, it was possible to specify logic in a manner that was, at least to some extent, comprehensible even to non-programmers. Take, for example, the following code written in FORTRAN, a language developed only a few years before COBOL:
EXT = PRICE * IQTY
INVTOT = INVTOT + EXT
With its original limitation on the length of variable names (one- to six-character names comprised of a letter followed by up to five letters and/or digits), its implicit rule that variables were automatically created as real (floating-point) unless their name started with a letter in the range I-N, and its use of algebraic notation to express actions being taken, FORTRAN wasn’t a particularly readable language, even for programmers. Compare this with the equivalent COBOL code:
MULTIPLY price BY quantity GIVING extended-amount
ADD extended-amount TO invoice-total
Clearly, even a non-programmer could at least conceptually understand what was going on! Over time, languages like FORTRAN evolved more robust variable names, and COBOL introduced a more formula-based syntactical capability for arithmetic operations, but FORTRAN was never as readable as COBOL.
Because of its inherent readability, I would MUCH rather be handed an assignment to make significant changes to a COBOL program about which I know nothing than to be asked to do the same with a C, C++, C# or Java program.
Those that argue that it is too boring / wasteful / time-consuming / insulting (pick one) to have to code a COBOL program “from scratch” are clearly ignorant of the following facts:
COBOL programs are structured into four major areas of coding, each with its own purpose. These four areas are known as divisions.
Each division may consist of a variety of sections and each section consists of one or more paragraphs. A paragraph consists of sentences, each of which consists of one or more statements.
This hierarchical structure of program components standardizes the composition of all COBOL programs. Much of this manual describes the various divisions, sections, paragraphs and statements that may comprise any COBOL program.
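The hierarchy just described can be sketched as a small skeleton program. The division names are standard COBOL; the section, paragraph and data names are invented for illustration, and fixed source format is assumed.

```cobol
      *> The four divisions, in the order they must appear.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. Skeleton.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       REPOSITORY.
           FUNCTION ALL INTRINSIC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Counter             PIC 9(2) VALUE 0.
       PROCEDURE DIVISION.
      *> A section holds paragraphs; a paragraph holds sentences.
       Main-Logic SECTION.
       Start-Up.
           MOVE 1 TO Counter.
       Report-It.
      *> One sentence (ended by a period) with two statements.
           DISPLAY "Counter is " Counter
           ADD 1 TO Counter.
      *> A second sentence in the same paragraph.
           DISPLAY "Counter is now " Counter
           STOP RUN.
```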
A Copybook is a segment of program code that may be utilized by multiple programs simply by having those programs use the COPY statement (see COPY) to import that code. This code may define files, data structures or procedural code.
Most current programming languages have a statement (usually named “import”, “include” or “#include”) that performs this same function. What makes the COBOL copybook feature different from the “include” facility in newer languages, however, is the fact that the COPY statement can edit the imported source code as it is being copied. This capability makes copybook libraries extremely valuable in making code reusable. Also see the COPY and REPLACE commands in section 3, Compiler Directing Facility.
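A rough sketch of this editing capability follows. It assumes a hypothetical copybook file named custrec.cpy, whose assumed contents are shown in the leading comments; that file must exist where the compiler can find it before the program will compile. All data names are invented for illustration.

```cobol
      *> Assumed contents of the hypothetical copybook file
      *> "custrec.cpy":
      *>
      *>     01  REC-NAME.
      *>         05  REC-ID      PIC 9(6).
      *>         05  REC-TEXT    PIC X(30).
      *>
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CopyDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Import the layout, renaming its items as it is copied.
       COPY "custrec.cpy"
           REPLACING ==REC-NAME== BY ==Customer-Rec==
                     ==REC-ID==   BY ==Customer-Id==
                     ==REC-TEXT== BY ==Customer-Name==.
       PROCEDURE DIVISION.
           MOVE 123456 TO Customer-Id
           MOVE "Example customer" TO Customer-Name
           DISPLAY Customer-Id " " Customer-Name
           STOP RUN.
```

Another program could copy the same file with different REPLACING operands and obtain an identically shaped record under different names.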
A contiguous area of storage within the memory space of a program that may be referenced, by name, in a COBOL program is referred to as a Data Item. Other programming languages use the term variable, property or attribute to describe the same thing.
COBOL introduced the concept of structured data. The principle of structured data in COBOL is based on the idea of being able to group related and contiguously-allocated data items together into a single aggregate data item, called a Group Item. For example, a 35-character ‘Employee-Name’ group item might consist of a 20-character ‘Last-Name’ followed by a 14-character ‘First-Name’ and a 1-character ‘Middle-Initial’.
A data item that isn’t itself formed from other data items is referred to in COBOL as an Elementary Item. In the previous example, ‘Last-Name’, ‘First-Name’ and ‘Middle-Initial’ are all elementary items.
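The Employee-Name example might be coded along the following lines. This is only a sketch: the level numbers, PICTURE clauses and sample values are assumptions chosen to match the sizes given above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GroupDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Group item: 20 + 14 + 1 = 35 characters in total.
       01  Employee-Name.
           05  Last-Name       PIC X(20).
           05  First-Name      PIC X(14).
           05  Middle-Initial  PIC X(1).
       PROCEDURE DIVISION.
      *> The elementary items are set individually...
           MOVE "Smith" TO Last-Name
           MOVE "Jane"  TO First-Name
           MOVE "Q"     TO Middle-Initial
      *> ...and the group item refers to all of them at once.
           DISPLAY "[" Employee-Name "]"
           DISPLAY "Length: " FUNCTION LENGTH (Employee-Name)
           STOP RUN.
```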
One of COBOL’s strengths is the wide variety of data files it is capable of accessing. GnuCOBOL programs, like those created with other COBOL implementations, need to have the structure of any files they will be reading and/or writing described to them. The highest-level characteristic of a file’s structure is defined by specifying the organization of the file, as follows:
ORGANIZATION LINE SEQUENTIAL
These are files with the simplest of all internal structures. Their contents are structured simply as a series of identically- or differently-sized data records, each terminated by a special end-of-record delimiter character. An ASCII line-feed character (hexadecimal 0A) is the end-of-record delimiter character used by any UNIX or pseudo-UNIX (MinGW, Cygwin, OSX) GnuCOBOL build. A truly native Windows build would use a carriage-return, line-feed (hexadecimal 0D0A) sequence.
Records must be read from or written to these files in a purely sequential manner. The only way to read (or write) record number 100 would be to have read (or written) records number 1 through 99 first.
When the file is written to by a GnuCOBOL program, the delimiter sequence will be automatically appended to each data record as it is written to the file. A WRITE (see WRITE) to this type of file will be done as if a BEFORE ADVANCING 1 LINE clause were specified on the WRITE, if no ADVANCING clause is coded.
When the file is read, the GnuCOBOL runtime system will strip the trailing delimiter sequence from each record. The data will be padded (on the right) with spaces if the data just read is shorter than the area described for data records in the program. If the data is too long, it will be truncated and the excess will be lost.
These files should not be defined to contain any exact binary data fields because the contents of those fields could inadvertently have the end-of-record sequence as part of their values — this would confuse the runtime system when reading the file, and it would interpret that value as an actual end-of-record sequence.
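A minimal sketch of writing and then reading a line sequential file is shown below. The file name lineseq-demo.txt and all data names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LineSeqDemo.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *> The file name is a hypothetical example.
           SELECT Text-File ASSIGN TO "lineseq-demo.txt"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  Text-File.
       01  Text-Rec            PIC X(10).
       WORKING-STORAGE SECTION.
       01  End-Flag            PIC X VALUE "N".
       PROCEDURE DIVISION.
      *> Write two records; the runtime adds the delimiter.
           OPEN OUTPUT Text-File
           MOVE "ABC" TO Text-Rec
           WRITE Text-Rec
           MOVE "DEFGHIJKLM" TO Text-Rec
           WRITE Text-Rec
           CLOSE Text-File
      *> Read them back; the delimiter is stripped and the
      *> record area is space-filled on the right.
           OPEN INPUT Text-File
           PERFORM UNTIL End-Flag = "Y"
               READ Text-File
                   AT END
                       MOVE "Y" TO End-Flag
                   NOT AT END
                       DISPLAY "[" Text-Rec "]"
               END-READ
           END-PERFORM
           CLOSE Text-File
           STOP RUN.
```

The brackets in the DISPLAY statement make the trailing spaces of the first record visible. The resulting file can also be opened in an ordinary text editor.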
LINE ADVANCING
These are files with an internal structure similar to that of a line sequential file. These files are defined (without an explicit ORGANIZATION specification) using the LINE ADVANCING clause on their SELECT statement (see SELECT).
When this kind of file is written to by a GnuCOBOL program, an end-of-record delimiter sequence will be automatically added to each data record as it is written to the file. A WRITE to this type of file will be done as if an AFTER ADVANCING 1 LINE clause were specified on the WRITE, if no ADVANCING clause is coded.
Like line sequential files, these files should not be defined to contain any exact binary data fields because the contents of those fields could inadvertently have the end-of-record sequence as part of their values — this would confuse the runtime system when reading the file, and it would interpret that value as an actual end-of-record sequence.
ORGANIZATION SEQUENTIAL
These files also have a simple internal structure. Their contents are structured simply as an arbitrarily-long sequence of data characters. This sequence of characters will be treated as a series of fixed-length records simply by logically splitting the sequence of characters up into fixed-length segments, each as long as the maximum record size defined in the program. There are no special end-of-record delimiter characters in the file and when the file is written to by a GnuCOBOL program, no delimiter sequence is appended to the data.
Records in this type of file are all the same physical length, except possibly for the very last record in the file, which may be shorter than the others. If variable-length logical records are defined to the program, each physical record in the file will occupy the space described by the longest record description in the program.
So, if a file contains 1275 characters of data, and a program defines the structure of that file as containing 100-character records, then the file contents will consist of twelve (12) 100-character records with a final record containing only 75 characters.
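The arithmetic behind this example can be sketched in COBOL itself. The data names below are invented for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RecCount.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Total-Chars         PIC 9(6) VALUE 1275.
       01  Rec-Len             PIC 9(4) VALUE 100.
       01  Full-Records        PIC 9(6).
       01  Leftover-Chars      PIC 9(4).
       PROCEDURE DIVISION.
      *> Integer division gives the count of full-length
      *> records; the remainder is the size of the short
      *> final record.
           DIVIDE Total-Chars BY Rec-Len
               GIVING Full-Records REMAINDER Leftover-Chars
           DISPLAY "Full records: " Full-Records
           DISPLAY "Final record: " Leftover-Chars " characters"
           STOP RUN.
```

With the values shown, the quotient is 12 and the remainder is 75, matching the figures in the text.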
It would appear that it should be possible to locate and process any record in the file directly, simply by calculating its starting character position based upon the program-defined record size. Even so, records must still be read from or written to these files in a purely sequential manner. The only way to read (or write) record number 100 would be to have read (or written) records number 1 through 99 first.
When the file is read, the data is transferred into the program exactly as it exists in the file. In the event that a short record is read as the very last record, that record will be padded (to the right) with spaces.
Care must be taken that programs reading such a file describe records whose length is exactly the same as that used by the program that created the file. For example, the following shows the contents of a SEQUENTIAL file created by a program that wrote five 6-character records to it. The ‘A’, ‘B’, … values reflect the records that were written to the file:
AAAAAABBBBBBCCCCCCDDDDDDEEEEEE
Now, assume that another program reads this file, but describes 10-character records rather than 6. Here are the records that program will read:
AAAAAABBBB
BBCCCCCCDD
DDDDEEEEEE
There may be times where this is exactly what you were looking for. More often than not, however, this is not desirable behaviour. Suggestion: use a copybook to describe the record layouts of any file; this guarantees that multiple programs accessing that file will “see” the same record sizes and layouts by coding a COPY statement (see COPY) to import the record layout(s) rather than hand-coding them.
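The record-length mismatch described above can be reproduced with a short sketch. To keep it in one program, the same physical file is described twice, once with 6-character records and once with 10-character records. The file name seq-demo.dat and all data names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SeqMismatch.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *> Two views of one physical file (hypothetical name).
           SELECT Out-File ASSIGN TO "seq-demo.dat"
               ORGANIZATION IS SEQUENTIAL.
           SELECT In-File ASSIGN TO "seq-demo.dat"
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  Out-File.
       01  Out-Rec             PIC X(6).
       FD  In-File.
       01  In-Rec              PIC X(10).
       WORKING-STORAGE SECTION.
       01  End-Flag            PIC X VALUE "N".
       PROCEDURE DIVISION.
      *> Write five 6-character records, AAAAAA to EEEEEE.
           OPEN OUTPUT Out-File
           MOVE ALL "A" TO Out-Rec
           WRITE Out-Rec
           MOVE ALL "B" TO Out-Rec
           WRITE Out-Rec
           MOVE ALL "C" TO Out-Rec
           WRITE Out-Rec
           MOVE ALL "D" TO Out-Rec
           WRITE Out-Rec
           MOVE ALL "E" TO Out-Rec
           WRITE Out-Rec
           CLOSE Out-File
      *> Read the same data back as 10-character records.
           OPEN INPUT In-File
           PERFORM UNTIL End-Flag = "Y"
               READ In-File
                   AT END
                       MOVE "Y" TO End-Flag
                   NOT AT END
                       DISPLAY "[" In-Rec "]"
               END-READ
           END-PERFORM
           CLOSE In-File
           STOP RUN.
```

Given the behaviour described in this section, the read loop should see three records that straddle the original record boundaries, rather than the five records that were written.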
These files can contain exact binary data fields. Because there is no character sequence that constitutes an end-of-record delimiter, the contents of record fields are irrelevant to the reading process.
ORGANIZATION RELATIVE
The contents of these files consist of a series of fixed-length data records prefixed with a four-byte record header. The record header contains the length of the data, in bytes. The byte-count does not include the four-byte record header.
Records in this type of file are all the same physical length. If variable-length logical records are defined to the program, each physical record in the file will occupy the maximum possible space, and the logical record length field will contain the number of bytes of data in the record that are actually in use.
This file organization was defined to accommodate either sequential or random processing. With a RELATIVE file, it is possible to read or write record 100 directly, without having to have first read or written records 1-99. The GnuCOBOL runtime system uses the program-defined maximum record size to calculate a relative byte position in the file where the record header and data begin, and then transfers the necessary data to or from the program.
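Direct access to record 100 might look like the following sketch. The file name rel-demo.dat and the data names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RelDemo.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *> The file name is a hypothetical example.
           SELECT Slot-File ASSIGN TO "rel-demo.dat"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS Slot-Number.
       DATA DIVISION.
       FILE SECTION.
       FD  Slot-File.
       01  Slot-Rec            PIC X(20).
       WORKING-STORAGE SECTION.
       01  Slot-Number         PIC 9(4).
       PROCEDURE DIVISION.
      *> Write record 100 without first writing records 1-99.
           OPEN OUTPUT Slot-File
           MOVE 100 TO Slot-Number
           MOVE "Record one hundred" TO Slot-Rec
           WRITE Slot-Rec
               INVALID KEY
                   DISPLAY "Write failed"
           END-WRITE
           CLOSE Slot-File
      *> Read record 100 directly by its record number.
           OPEN INPUT Slot-File
           MOVE 100 TO Slot-Number
           READ Slot-File
               INVALID KEY
                   DISPLAY "Record not found"
               NOT INVALID KEY
                   DISPLAY Slot-Number ": " Slot-Rec
           END-READ
           CLOSE Slot-File
           STOP RUN.
```

The RELATIVE KEY data item carries the record number in both directions: the program sets it before a random READ or WRITE to say which record is wanted.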
When the file is written by a GnuCOBOL program, no delimiter sequence is appended to the data, but a record-length field is added to the beginning of each physical record.
When the file is read, the data is transferred into the program exactly as it exists in the file.
Care must be taken that programs reading such a file describe records whose length is exactly the same as that used by the programs that created the file. It won’t end well if the GnuCOBOL runtime library interprets a four-byte ASCII character string as a record length when it transfers data from the file into the program!
Suggestion: use a copybook to describe the record layouts of any file; this guarantees that multiple programs accessing that file will “see” the same record sizes and layouts by coding a COPY statement (see COPY) to import the record layout(s) rather than hand-coding them.
These files can contain exact binary data fields. The contents of record fields are irrelevant to the reading process as there is no end-of-record delimiter.
ORGANIZATION INDEXED
This is the most advanced file structure available to GnuCOBOL programs. It’s not possible to describe the physical structure of such files because that structure will vary depending upon which advanced file-management facility was included in the GnuCOBOL build you will be using (Berkeley Database [BDB], VBISAM, etc.). We will, instead, discuss the logical structure of the file.
There will be multiple structures stored for an INDEXED file. The first will be a data component, which may be thought of as being similar to the internal structure of a relative file. Data records may not, however, be directly accessed by their record number as would be the case with a relative file, nor may they be processed sequentially by their physical sequence in the file.
The remaining structures will be one or more index components. An index component is a data structure that (somehow) enables the contents of a field, called a primary key, within each data record (a customer number, an employee number, a product code, a name, etc.) to be converted to a record number so that the data record for any given primary key value can be directly read, written and/or deleted. Additionally, the index data structure is defined in such a manner as to allow the file to be processed sequentially, record-by-record, in ascending sequence of the primary key field values. Whether this index structure exists as a binary-searchable tree structure (b-tree), an elaborate hash structure or something else is pretty much irrelevant to the programmer; the behaviour of the structure will be as it was just described. The actual mechanism used will depend upon the advanced file-management package that was included in your GnuCOBOL implementation when it was built.
The runtime system will not allow two records to be written to an indexed file with the same primary key value.
The capability exists for an additional field to be defined as what is known as an alternate key. Alternate key fields behave just like primary keys, allowing both direct and sequential access to record data based upon the alternate key field values, with one exception. That exception is the fact that alternate keys may be allowed to have duplicate values, depending upon how the alternate key field is described to the GnuCOBOL compiler.
There may be any number of alternate keys, but each key field comes with a disk space penalty as well as an execution time penalty. As the number of alternate key fields increases, it will take longer and longer to write and/or modify records in the file.
These files can contain exact binary data fields. The contents of record fields are irrelevant to the reading process as there is no end-of-record delimiter.
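The key rules above can be sketched with a small program. Everything named here (IDXDEMO, custdemo.dat, the CUST- fields) is hypothetical, and the sketch assumes a GnuCOBOL build that includes indexed-file support. The alternate key is declared WITH DUPLICATES, so two records may share a name, while a second WRITE with an existing primary key is rejected.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IDXDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    File and data names here are hypothetical
           SELECT CUST-FILE ASSIGN TO "custdemo.dat"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS CUST-NUMBER
               ALTERNATE RECORD KEY IS CUST-NAME
                   WITH DUPLICATES
               FILE STATUS IS WS-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUST-FILE.
       01  CUST-RECORD.
           05  CUST-NUMBER          PIC 9(6).
           05  CUST-NAME            PIC X(20).
       WORKING-STORAGE SECTION.
       01  WS-STATUS                PIC XX.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN OUTPUT CUST-FILE
           MOVE 100 TO CUST-NUMBER
           MOVE "SMITH" TO CUST-NAME
           WRITE CUST-RECORD
      *    Same alternate key, new primary key: accepted
           MOVE 200 TO CUST-NUMBER
           WRITE CUST-RECORD
      *    Same primary key again: rejected by the runtime
           WRITE CUST-RECORD
               INVALID KEY
                   DISPLAY "Duplicate primary key, status "
                           WS-STATUS
           END-WRITE
           CLOSE CUST-FILE
           STOP RUN.
```

Dropping WITH DUPLICATES from the alternate key would make the second WRITE fail as well, since both records carry the same CUST-NAME.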
All files are initially described to a GnuCOBOL program using a SELECT statement (see SELECT). In addition to defining a name by which the file will be referenced within the program, the SELECT statement will specify the name and path by which the file will be known to the operating system along with its organization, locking and sharing attributes.
A file description in the FILE SECTION (see FILE SECTION) will define the structure of records within the file, including whether or not variable-length records are possible and, if so, what the minimum and maximum length might be. In addition, the file description entry can specify file I/O block sizes.
Other programming languages have arrays; COBOL has tables. They’re basically the same thing. There are two special statements that exist in the COBOL language — SEARCH and SEARCH ALL — that make finding data in a table easy.
SEARCH searches a table sequentially, stopping only when either a table entry matching one of any number of search conditions is found, or when all table entries have been checked against the search criteria and none matched any of those criteria.
SEARCH ALL performs an extremely fast search against a table sorted by a key field contained in each table entry. The algorithm used for such a search is a binary search. The algorithm ensures that only a small number of entries in the table need to be checked in order to find a desired entry or to determine that the desired entry doesn’t exist in the table. The larger the table, the more effective this search becomes. For example, a binary search of a table containing 32,768 entries will locate a particular entry or determine the entry doesn’t exist by looking at no more than sixteen (16) entries! The algorithm is explained in detail in the documentation of the SEARCH ALL statement (see SEARCH ALL).
Finally, COBOL has the ability to perform in-place sorts of the data that is found in a table.
The COBOL language includes a powerful SORT statement that can sort large amounts of data according to arbitrarily complex key structures. This data may originate from within the program or may be contained in one or more external files. The sorted data may be written automatically to one or more output files or may be processed, record-by-record in the sorted sequence.
A companion statement — MERGE — can combine the contents of multiple files together, provided those files are all pre-sorted in a similar manner according to the same key structure. The resulting output will consist of the contents of all of the input files, merged together and sequenced according to the common key structure(s). The output generated by a MERGE statement may be written automatically to one or more output files or may be processed internally by the program.
A special form of the SORT statement also exists just to sort the data that resides in a table. This is particularly useful if you wish to use SEARCH ALL against the table.
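The combination of a table SORT followed by SEARCH ALL might look like the sketch below. The program and data names (TBLDEMO, CODE-TABLE and so on) are hypothetical. Note that SEARCH ALL relies on the KEY and INDEXED BY phrases of the OCCURS clause, and gives meaningful results only once the table really is in key sequence.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TBLDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Ten unsorted one-character entries (hypothetical data)
       01  CODE-VALUES              PIC X(10) VALUE "MXBQAZKDRT".
       01  CODE-TABLE.
           05  CODE-ENTRY           OCCURS 10 TIMES
                                    ASCENDING KEY IS CODE-CHAR
                                    INDEXED BY CODE-IDX.
               10  CODE-CHAR        PIC X.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE CODE-VALUES TO CODE-TABLE
      *    Table SORT: put the entries into key sequence in place
           SORT CODE-ENTRY ASCENDING KEY CODE-CHAR
           DISPLAY "Sorted table: " CODE-TABLE
      *    SEARCH ALL: binary search, valid only on a sorted table
           SEARCH ALL CODE-ENTRY
               AT END
                   DISPLAY "Q not found"
               WHEN CODE-CHAR (CODE-IDX) = "Q"
                   DISPLAY "Q found in the table"
           END-SEARCH
           STOP RUN.
```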
There have been programming languages designed specifically for the processing of text strings, and there have been programming languages designed for the sole purpose of performing high-powered numerical computations. Most programming languages fall somewhere in the middle.
COBOL is no exception, although it does include some very powerful string manipulation capabilities; GnuCOBOL actually has even more string-manipulation capabilities than many other COBOL implementations. The following summarizes GnuCOBOL’s string-processing capabilities:
CONCATENATE intrinsic function (see CONCATENATE).
STRING statement (see STRING).
LOCALE-TIME intrinsic function (see LOCALE-TIME).
LOCALE-DATE intrinsic function (see LOCALE-DATE).
CHAR intrinsic function (see CHAR). Add 1 to argument before invoking the function; the description of the CHAR intrinsic function presents a technique utilizing the MOVE statement that will accomplish the same thing without the need of adding 1 to the numeric argument value first.
LOWER-CASE intrinsic function (see LOWER-CASE).
C$TOLOWER built-in system subroutine (see C$TOLOWER).
CBL_TOLOWER built-in system subroutine (see CBL_TOLOWER).
UPPER-CASE intrinsic function (see UPPER-CASE).
C$TOUPPER built-in system subroutine (see C$TOUPPER).
CBL_TOUPPER built-in system subroutine (see CBL_TOUPPER).
C$PRINTABLE built-in system subroutine (see C$PRINTABLE).
ORD intrinsic function (see ORD). Subtract 1 from the result; the description of the ORD intrinsic function presents a technique utilizing the MOVE statement that will accomplish the same thing without the need to subtract 1 from the result afterwards.
INSPECT statement (see INSPECT) with the TALLYING clause.
LENGTH intrinsic function (see LENGTH).
BYTE-LENGTH intrinsic function (see BYTE-LENGTH).
MOVE statement (see MOVE) with picture-symbol editing applied to the receiving field.
C$JUSTIFY built-in system subroutine (see C$JUSTIFY).
INSPECT statement (see INSPECT) with the CONVERTING clause.
TRANSFORM statement (see TRANSFORM).
SUBSTITUTE intrinsic function (see SUBSTITUTE).
SUBSTITUTE-CASE intrinsic function (see SUBSTITUTE-CASE).
UNSTRING statement (see UNSTRING).
TRIM intrinsic function (see TRIM).
MOVE statement (see MOVE) with a reference modifier on the “receiving” field (see Reference Modifiers).
INSPECT statement (see INSPECT) with a REPLACING clause.
SUBSTITUTE intrinsic function (see SUBSTITUTE).
SUBSTITUTE-CASE intrinsic function (see SUBSTITUTE-CASE).
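A few of the facilities listed above are combined in the following sketch. All program and data names are hypothetical; the statements and intrinsic functions used (STRING, UNSTRING, INSPECT, TRIM and UPPER-CASE) are the ones named in the list.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FULL-NAME                PIC X(30) VALUE SPACES.
       01  FIRST-NAME               PIC X(10) VALUE "Grace".
       01  LAST-NAME                PIC X(10) VALUE "Hopper".
       01  CSV-LINE                 PIC X(20) VALUE "red,green,blue".
       01  COLOR-1                  PIC X(10).
       01  COLOR-2                  PIC X(10).
       01  COLOR-3                  PIC X(10).
       01  E-COUNT                  PIC 9(3)  VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Concatenate with the STRING statement
           STRING FUNCTION TRIM(FIRST-NAME) DELIMITED BY SIZE
                  " "                       DELIMITED BY SIZE
                  FUNCTION TRIM(LAST-NAME)  DELIMITED BY SIZE
               INTO FULL-NAME
           END-STRING
           DISPLAY FUNCTION UPPER-CASE(FULL-NAME)
      *    Parse on a delimiter with UNSTRING
           UNSTRING CSV-LINE DELIMITED BY ","
               INTO COLOR-1 COLOR-2 COLOR-3
           END-UNSTRING
           DISPLAY COLOR-2
      *    Count occurrences with INSPECT ... TALLYING
           INSPECT CSV-LINE TALLYING E-COUNT FOR ALL "e"
           DISPLAY E-COUNT
      *    One-for-one replacement with INSPECT ... CONVERTING
           INSPECT CSV-LINE CONVERTING "," TO ";"
           DISPLAY CSV-LINE
           STOP RUN.
```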
The COBOL2002 standard formalizes extensions to the COBOL language that allow for the definition and processing of text-based screens, as is a typical function on mainframe and midframe computers as well as on many point-of-sale (i.e. “cash register”) systems. GnuCOBOL implements virtually all the screen-handling features described by COBOL2002.
These features allow fields to be displayed at specific row/column positions, various colors and video attributes to be assigned to screen fields and the pressing of specific function keys (F1, F2, …) to be detectable. All of this takes place through the auspices of the SCREEN SECTION (see SCREEN SECTION) and special formats of the ACCEPT statement (see ACCEPT) and the DISPLAY statement (see DISPLAY).
The COBOL2002 standard, and therefore GnuCOBOL, only covers textual user interface (TUI) screens (those comprised of ASCII characters presented using a variety of visual attributes) and not the more-advanced graphical user interface (GUI) screen design and processing capabilities built into most modern operating systems. There are subroutine-based packages available that can do full GUI presentation — most of which may be called by GnuCOBOL programs, with a moderate research time investment (Tcl/Tk, for example) — but none are currently included with GnuCOBOL.
A Sample Screen Produced by a GnuCOBOL Program:
================================================================================
GCic (2014/01/02 11:24) GnuCOBOL 2.1 23NOV2013 Interactive Compilation
+------------------------------------------------------------------------------+
: Filename: GCic.cbl :
: Folder: E:\Programs\GCic\2013-11-23 :
+------------------------------------------------------------------------------+
Set/Clr Switches Via F1-F9; Set Config Via F12; ENTER Key Compiles; ESC Quits
+------------------------------------------------------------------------------+
: F1 Assume WITH DEBUGGING MODE F6 >"FUNCTION" Is Optional : Current :
: F2 Procedure+Statement Trace F7 >Enable All Warnings : Config: :
: F3 Make a Library (DLL) F8 Source Is Free-Format : DEFAULT :
: F4 Execute If Compilation OK F9 >No COMP/BINARY Truncation : :
: F5 Listing Off : :
+------------------------------------------------------------------------------+
Extra "cobc" Switches, If Any ("-save-temps=xxx" Prevents Listings):
+------------------------------------------------------------------------------+
: ____________________________________________________________________________ :
: ____________________________________________________________________________ :
+------------------------------------------------------------------------------+
Program Execution Arguments, If Any:
+------------------------------------------------------------------------------+
: ____________________________________________________________________________ :
: ____________________________________________________________________________ :
+------------------------------------------------------------------------------+
GCic for Windows/MinGW Copyright (C) 2009-2014, Gary L. Cutler, GPL
================================================================================
The above screen was produced by the GnuCOBOL Interactive Compiler, or GCic. See GCic in GnuCOBOL Sample Programs, for the source and cross-reference listing of this program. PDF versions of this document will include an actual graphical image of this sample screen.
Screens are defined in the screen section of the data division. Once defined, screens are used at run-time via the ACCEPT and DISPLAY statements.
GnuCOBOL supports the following visual attribute specifications in the SCREEN SECTION (see SCREEN SECTION):
Eight (8) different colors may be specified for both the background (screen) and foreground (text) color of any row/column position on the screen. Colors are specified by number, although a copybook supplied with all GnuCOBOL distributions (screenio.cpy) defines COB-COLOR-xxxxxx names for the various colors so they may be specified as a more meaningful name rather than a number. The eight colors, by number, with the constant names defined in screenio.cpy, are as follows:
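| Number | Color | Constant name |
|---|---|---|
| 0 | Black | COB-COLOR-BLACK |
| 1 | Blue | COB-COLOR-BLUE |
| 2 | Green | COB-COLOR-GREEN |
| 3 | Cyan | COB-COLOR-CYAN |
| 4 | Red | COB-COLOR-RED |
| 5 | Magenta | COB-COLOR-MAGENTA |
| 6 | Yellow | COB-COLOR-YELLOW |
| 7 | White | COB-COLOR-WHITE |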
There are three possible brightness levels supported for text — lowlight (dim), normal and highlight (bright). Not all GnuCOBOL implementations will support all three (some treat lowlight the same as normal). The deciding factor as to whether two or three levels are supported lies with the version of the curses package that is being used. This is a utility screen-IO package that is included into the GnuCOBOL run-time library when the GnuCOBOL software is built.
As a general rule of thumb, Windows implementations support two levels while Unix ones support all three.
Blinking, too, is a video feature that is dependent upon the curses package built into your version of GnuCOBOL. If blinking is enabled in that package, text displayed in fields defined in the screen section as being blinking will endlessly cycle between the brightest possible setting (highlight) and an “invisible” setting where the text color matches that of the field background color. A Windows build, which generally uses the PDCurses package, will use a brighter-than-normal background color to signify “blinking”.
Reverse video simply swaps the foreground and background colors and display options.
It is possible, if supported by the curses package being used, to draw borders on the top, left and/or bottom edges of a field.
If desired, screen fields used as input fields may be defined as “secure” fields, where each input character (regardless of what was actually typed) will appear as an asterisk (*) character. The actual character whose key was pressed will still be stored into the field in the program, however. This is very useful for password or account number fields.
Input fields may have any character used as a fill character. These fill characters provide a visual indication of the size of the input field, and will automatically be transformed into spaces when the input field is processed by the program. If no such character is defined for an input field, an underscore (‘_’) will be assumed.
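The attributes described above might be combined as in the following sketch. The screen layout and all data names are hypothetical, and the sketch assumes that the screenio.cpy copybook can be found on the compiler's copybook search path. It uses color constants from that copybook, a highlighted title, an input field with an explicit prompt character and a secure input field.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SCRDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Brings in the COB-COLOR-xxxxxx constants
       COPY screenio.
       01  WS-USER                  PIC X(12) VALUE SPACES.
       01  WS-PASSWORD              PIC X(12) VALUE SPACES.
       SCREEN SECTION.
       01  LOGIN-SCREEN
               BLANK SCREEN
               BACKGROUND-COLOR COB-COLOR-BLUE
               FOREGROUND-COLOR COB-COLOR-WHITE.
           05  LINE 2 COLUMN 5 VALUE "Sign on" HIGHLIGHT.
           05  LINE 4 COLUMN 5 VALUE "User:".
           05  LINE 4 COLUMN 15 PIC X(12) USING WS-USER
               PROMPT CHARACTER IS ".".
           05  LINE 5 COLUMN 5 VALUE "Password:".
      *    SECURE hides whatever is typed into this field
           05  LINE 5 COLUMN 15 PIC X(12) USING WS-PASSWORD
               SECURE.
       PROCEDURE DIVISION.
       MAIN-PARA.
           DISPLAY LOGIN-SCREEN
           ACCEPT LOGIN-SCREEN
           STOP RUN.
```

How each attribute is actually rendered depends on the terminal and on the curses package in your build, as noted above.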
GnuCOBOL includes an implementation of the Report Writer Control System, or RWCS. The reportwriter module is now fully implemented as of version 3.0. This is a standardized, optional add-on feature to the COBOL language which automates much of the mechanics involved in the generation of printed reports by:
SORT statement (see SORT) keys — have changed. This is known as a control break. The RWCS can automatically perform the following reporting actions when a control break occurs:
The REPORT SECTION (see REPORT SECTION) documentation explores the description of reports and the PROCEDURE DIVISION (see PROCEDURE DIVISION) chapter documents the various language statements that actually produce reports. Before reading these, you might find it helpful to read Report Writer Usage, which is dedicated to putting the pieces together for you.
There are three ways in which data division data gets initialized.
SPACES.
ZERO.
VALUE (see VALUE) clause in their definition will be initialized to that specific value.
The various sections of the data division each have their own rules as to when the actions described above will occur — consult the documentation on those sections for additional information.
These default initialization rules can vary quite substantially from one COBOL implementation to another. For example, it is quite common for data division storage to be initialized to all binary zeros except for those data items where VALUE clauses are present. Take care when working with applications originally developed for another COBOL implementation to ensure that GnuCOBOL’s default initialization rules won’t prove disruptive.
The INITIALIZE statement (see INITIALIZE) may be used to initialise any group or elementary data item at any time. This statement provides far more initialization options than just the simple rules stated above.
When the ALLOCATE statement (see ALLOCATE) is used to allocate a data item or to simply allocate an area of storage of a size specified on the ALLOCATE, that allocation may occur with or without initialization, as per the programmer’s needs.
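As a small illustration of run-time initialization, the sketch below (all names hypothetical) shows a group item first as set by its VALUE clauses, then after a plain INITIALIZE, and finally after INITIALIZE with the ALL TO VALUE phrase, which re-applies the VALUE clauses.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INITDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ORDER-REC.
           05  ORDER-ID             PIC 9(5)  VALUE 12345.
           05  ORDER-NAME           PIC X(10) VALUE "WIDGET".
           05  ORDER-QTY            PIC 9(3)  VALUE 1.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Contents as set by the VALUE clauses at program start
           PERFORM SHOW-REC
      *    Plain INITIALIZE: numeric to zero, alphanumeric to spaces
           INITIALIZE ORDER-REC
           PERFORM SHOW-REC
      *    ALL TO VALUE re-applies each item's VALUE clause
           INITIALIZE ORDER-REC ALL TO VALUE
           PERFORM SHOW-REC
           STOP RUN.
       SHOW-REC.
           DISPLAY ORDER-ID " [" ORDER-NAME "] " ORDER-QTY.
```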
Syntax of the GnuCOBOL language will be described in special syntax diagrams using the following syntactical-description techniques:
Reserved words of the COBOL language will appear in UPPER-CASE. When they appear underlined, as this one is, they are required reserved words.
When reserved words appear without underlining, as this one is, they are optional; such reserved words are available in the language syntax merely to improve readability — their presence or absence has no effect upon the program.
When only a portion of a reserved word is underlined, it indicates that the word may either be coded in its full form or may be abbreviated to the portion that is underlined.
Generic terms representing user-defined substitutable items will be shown entirely in lower-case in syntax diagrams. When such items are referenced in text, they will appear as substitutable-items.
Items appearing in Mixed Case within a syntax diagram represent complex clauses of other syntax elements that may appear in that position. Some COBOL syntax gets quite complicated, and using a convention such as this significantly reduces the complexity of a syntax diagram. When such items are referenced in text, they will appear as Complex-Syntax-Clause.
Square bracket meta characters on syntax diagrams document language syntax that is optional. The [] characters themselves should not be coded. If a syntax diagram contains ‘a [b] c’, the ‘a’ and ‘c’ syntax elements are mandatory but the ‘b’ element is optional.
Vertical bar meta characters on syntax diagrams document simple choices. The | character itself should not be coded. If a syntax diagram contains ‘a|b|c’, exactly one of the items ‘a’, ‘b’ or ‘c’ must be selected.
A vertical list of items, bounded by multiple brace characters, is another way of signifying a choice between a series of items where exactly one item must be selected. This form is used to show choices when one or more of the selections is more complex than just a single word, or when there are too many choices to present horizontally with ‘|’ meta characters.
A vertical list of items, bounded by multiple vertical bar characters, signifies a choice between a series of items where one or more of the choices could be selected.
The ... meta character sequence signifies that the syntax element immediately preceding it may be repeated. The ... sequence itself should not be coded. If a syntax diagram contains a b... c, syntax element ‘a’ must be followed by at least one ‘b’ element (possibly more) and the entire sequence must be terminated by a ‘c’ syntax element.
The braces (‘{’ and ‘}’) meta characters may be used to group a sequence of syntax elements together so that they may be treated as a single entity. The {} characters themselves should not be coded. These are typically used in combination with the ‘|’ or ‘...’ meta characters.
Any of these characters appearing within a syntax diagram are to be interpreted literally, and are characters that must be coded — where allowed — in the statement whose format is being described. Note that a ‘.’ character is a literal character that must be coded on a statement whereas a ‘...’ symbol is the meta character sequence described above.
Prior to the COBOL2002 standard, source statements in COBOL programs were structured around 80-column punched cards. This means that each source line in a COBOL program consisted of five different “areas”, defined by their column number(s).
As of the COBOL2002 standard, a second mode now exists for COBOL source code statements — in this mode of operation, COBOL statements may each be up to 255 characters long, with no specific requirements as to what should appear in which columns.
Of course, in keeping with the long-standing COBOL tradition of maintaining backwards compatibility with older standards, programmers (and, of course, compliant COBOL compilers) are capable of working in either mode. It is even possible to switch back and forth in the same program. The terms Fixed Format Mode and Free Format Mode are used to refer to these two modes of source code formatting.
The GnuCOBOL compiler (cobc) supports both of these source line format modes, defaulting to Fixed Format Mode lacking any other information.
The compiler can be instructed to operate in either mode in any of the following four ways:
SOURCEFORMAT AS FIXED and SOURCEFORMAT AS FREE clauses of the >>SET CDF directive (see >>SET) within your source code to switch to Fixed or Free Format Mode, respectively.
>>FORMAT IS FIXED and FORMAT IS FREE clauses of the >>DEFINE CDF directive (see >>DEFINE) within your source code to switch to Fixed or Free Format Mode, respectively.
>>SOURCE CDF directive (see >>SOURCE) to switch to Free Format Mode (>>SOURCE FORMAT IS FREE) or Fixed Format Mode (>>SOURCE FORMAT IS FIXED).
Using methods 2-4 above, you may switch back and forth between the two formats at will.
The last three options above are all equivalent; all three are supported by GnuCOBOL so that source code compatibility may be maintained with a wide variety of other COBOL implementations. With all three, if the compiler is currently in Fixed Format Mode, the >> must begin in column 8 or beyond, provided no part of the directive extends past column 72. If the compiler is currently in Free Format Mode, the >> may appear in any column, provided no part of the directive extends past column 255.
Depending upon which source format mode the compiler is in, you will need to follow various rules for the format mode currently in effect. These rules are presented in the upcoming paragraphs.
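As a minimal sketch of switching modes from within the source, the hypothetical program below starts in the default Fixed Format Mode, so the directive itself begins in column 8; every line after it is then read in Free Format Mode and may start in any column.

```cobol
       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. FMTDEMO.
PROCEDURE DIVISION.
*> From here on, statements may begin in any column.
DISPLAY "Compiled in Free Format Mode"
STOP RUN.
```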
The following discussion presents the various components of every GnuCOBOL source line record when the compiler is operating in Fixed Format Mode. Remember that this is the default mode for the GnuCOBOL compiler.
Sequence Number Area
Historically, back in the days when punched-cards were used to submit COBOL program source to a COBOL compiler, this part of a COBOL statement was reserved for a six-digit sequence number. While the contents of this area are ignored by COBOL compilers, it existed so that a program actually punched on 80-character cards could — if the card deck were dropped on the floor — be run through a card sorter machine and restored to its proper sequence. Of course, this isn’t necessary today; if truth be told, it hasn’t been necessary for a long time.
See Marking Changes in Programs, for discussion of a valuable use to which the sequence number area may be put today.
Indicator Area
Column 7 serves as an indicator in which one of five possible values will appear — space, D (or d), - (dash), / or *. The meanings of these characters are as follows:
space: No special meaning; this is the normal character that will appear in this area.
‘D’ or ‘d’: The line contains a valid GnuCOBOL statement that is normally treated as a comment unless the program is being compiled in debugging mode.
‘*’: The line is a comment.
‘/’: The line is a comment that will also force a page eject in the compilation listing. While GnuCOBOL will honour such a line as a comment, it will not form-feed any generated listing.
‘-’: The line is a continuation of the previous line. These are needed only when an alphanumeric literal (quoted character string), reserved word or user-defined word is being split across lines.
Area A
Language DIVISION, SECTION and paragraph headers must begin in Area A (columns 8 through 11), as must the level numbers 01 and 77 in data description entries and the FD and SD file and SORT description headers.
Area B
All other COBOL programming language components are coded in Area B (columns 12 through 72).
Program Name Area
This is another obsolete area of COBOL statements. This part of every statement also hails back to the day when programs were punched on cards; it was expected that the name of the program (or at least the first 8 characters of it) would be punched here so that — if a dropped COBOL source deck contained more than one program — that handy card sorter machine could be used to first separate the cards by program name and then sort them by sequence number. Today’s COBOL compilers (including GnuCOBOL) simply ignore anything past column 72.
See Marking Changes in Programs, for discussion of a valuable use to which the program name area may be put today.
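The areas described above can be seen together in a short Fixed Format Mode sketch. The program itself is hypothetical. Columns 1-6 carry sequence numbers, which the compiler ignores; column 7 carries the indicator; headers and the 01 level number begin in Area A (column 8); all other code sits in Area B (column 12 onwards).

```cobol
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. AREADEMO.
000300* An asterisk in column 7 makes this line a comment
000400 DATA DIVISION.
000500 WORKING-STORAGE SECTION.
000600 01  WS-COUNT                 PIC 9(3) VALUE 0.
000700 PROCEDURE DIVISION.
000800 MAIN-PARA.
000900     ADD 1 TO WS-COUNT
001000D    DISPLAY "Debug line, count = " WS-COUNT
001100     DISPLAY "Done"
001200     STOP RUN.
```

The line with ‘D’ in column 7 is treated as a comment unless the program is compiled in debugging mode.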
[ IDENTIFICATION DIVISION. ]
~~~~~~~~~~~~~~~~~~~~~~~
PROGRAM-ID|FUNCTION-ID. name-1 [ Program-Options ] .
~~~~~~~~~~ ~~~~~~~~~~~
[ ENVIRONMENT DIVISION. ]
~~~~~~~~~~~ ~~~~~~~~
[ CONFIGURATION SECTION. ]
~~~~~~~~~~~~~ ~~~~~~~
[ SOURCE-COMPUTER. Compilation-Computer-Specification . ]
~~~~~~~~~~~~~~~
[ OBJECT-COMPUTER. Execution-Computer-Specification . ]
~~~~~~~~~~~~~~~
[ REPOSITORY. Function-Specification... . ]
~~~~~~~~~~
[ SPECIAL-NAMES. Program-Configuration-Specification . ]
~~~~~~~~~~~~~
[ INPUT-OUTPUT SECTION. ]
~~~~~~~~~~~~ ~~~~~~~
[ FILE-CONTROL. General-File-Description... . ]
~~~~~~~~~~~~
[ I-O-CONTROL. File-Buffering-Specification... . ]
~~~~~~~~~~~
[ DATA DIVISION. ]
~~~~~~~~~~~~~
[ FILE SECTION. Detailed-File-Description... . ]
~~~~~~~~~~~~
[ WORKING-STORAGE SECTION. Permanent-Data-Definition... . ]
~~~~~~~~~~~~~~~ ~~~~~~~
[ LOCAL-STORAGE SECTION. Temporary-Data-Definition... . ]
~~~~~~~~~~~~~ ~~~~~~~
[ LINKAGE SECTION. Subprogram-Argument-Description... . ]
~~~~~~~ ~~~~~~~
[ REPORT SECTION. Report-Description... . ]
~~~~~~ ~~~~~~~
[ SCREEN SECTION. Screen-Layout-Definition... . ]
~~~~~~ ~~~~~~~
PROCEDURE DIVISION [ { USING Subprogram-Argument... } ]
~~~~~~~~~ ~~~~~~~~ { ~~~~~ }
{ CHAINING Main-Program-Argument... }
~~~~~~~~
[ RETURNING identifier-1 ] .
~~~~~~~~~
[ DECLARATIVES. ]
~~~~~~~~~~~~
[ Event-Handler-Routine... . ]
[ END DECLARATIVES. ]
~~~ ~~~~~~~~~~~~
General-Program-Logic
[ Nested-Subprogram... ]
[ END PROGRAM|FUNCTION name-1 ]
~~~ ~~~~~~~ ~~~~~~~~
Each program consists of up to four Divisions (major groupings of sections, paragraphs and descriptive or procedural coding that all relate to a common purpose), named Identification, Environment, Data and Procedure.
IDENTIFICATION DIVISION. header is always optional.
SOURCE-COMPUTER and OBJECT-COMPUTER, for example). Each of these paragraphs serves a specific purpose. If no code is required for the purpose one of the paragraphs serves, the entire paragraph may be omitted.
ENVIRONMENT DIVISION. header itself may be omitted.
DATA DIVISION. header itself may be omitted.
DECLARATIVES. and END DECLARATIVES. lines may be omitted.
END PROGRAM or END FUNCTION statement is optional.
END PROGRAM or END FUNCTION statements. The final program in such a source code file need not have an END PROGRAM or END FUNCTION statement.
END PROGRAM A. or END FUNCTION A. statement. For now, that’s all that will be said about nesting. See Independent vs Contained vs Nested Subprograms, for more information.
The following information describes how comments may be embedded into GnuCOBOL program source to provide documentation.
| Comment Type | Fixed Format Mode | Free Format Mode |
|---|---|---|
| Blank lines | Blank lines may be inserted as desired. | Blank lines may be inserted as desired. |
| Full-line comments | An entire source line will be treated as a comment (and will be ignored by the compiler) by coding an asterisk (‘*’) in column seven (7). | An entire source line will be treated as a comment (and will be ignored by the compiler) by coding the sequence ‘*>’, starting in any column, as the first non-blank characters on the line. |
| Full-line comments with form-feed | An entire source line will be treated as a comment by coding a slash (‘/’) in column seven (7). Many COBOL compilers will also issue a form-feed in the program listing so that the ‘/’ line is at the top of a new page. The GnuCOBOL compiler does not support this form-feed behaviour. The GnuCOBOL Interactive Compiler, or GCic, does support this form-feed behaviour when it generates program source listings! See GCic in GnuCOBOL Sample Programs, for the source and cross-reference listing (produced by GCic) of this program; you can see the effect of ‘/’ there. | There is no Free Source Mode equivalent to ‘/’. |
| Partial-line comments | Any text following the character sequence ‘*>’ on a source line will be treated as a comment. The ‘*’ must appear in column seven (7) or beyond. | Any text following the character sequence ‘*>’ on a source line will be treated as a comment. The ‘*’ may appear in any column. |
| Comments that may be treated as code, typically for debugging purposes | By coding a ‘D’ (upper- or lower-case) in column 7, an otherwise valid GnuCOBOL source line will be treated as a comment by the compiler. | By specifying the character sequence ‘>>D’ (upper- or lower-case) as the first non-blank characters on a source line, an otherwise valid GnuCOBOL source line will be treated as a comment by the compiler. Debugging statements may be compiled either by specifying the -fdebugging-line switch on the GnuCOBOL compiler or by adding the |
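The Free Format Mode comment styles from the table above might appear in a program as follows. This is a sketch with a hypothetical program name; the first line switches the compiler out of its default Fixed Format Mode.

```cobol
       >>SOURCE FORMAT IS FREE
*> Full-line comment: '*>' is the first non-blank text on the line
IDENTIFICATION DIVISION.
PROGRAM-ID. CMTDEMO.
PROCEDURE DIVISION.
    DISPLAY "Hello"   *> partial-line comment after a statement
>>D DISPLAY "Compiled only when debugging lines are enabled"
    STOP RUN.
```

Compiled normally, the ‘>>D’ line is treated as a comment; compiled with the -fdebugging-line switch, it becomes an ordinary DISPLAY statement.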
Literals are constant values that will not change during the execution of a program. There are two fundamental types of literals — numeric and alphanumeric.
9.92E25, representing 9.92 x 10^25 (10 raised to the 25th power) or 5.7E-14, representing 5.7 x 10^-14 (10 raised to the -14th power). Both the mantissa (the number before the ‘E’) and the exponent (the number after the ‘E’) may be explicitly specified as positive (with a ‘+’), negative (with a ‘-’) or unsigned (and therefore implicitly positive). A floating-point literal’s value must be within the range -1.7 x 10^308 to +1.7 x 10^308 with no more than 15 decimal digits of precision.
L"characters".
H# or X# ‘0’ - ‘F’.
B" character ".
BX" hex character ".
N" character " or NC" character ".
NX" character ".
An alphanumeric literal is not valid for use in arithmetic expressions unless it is first converted to its numeric computational equivalent; there are three numeric conversion intrinsic functions built into GnuCOBOL that can perform this conversion — NUMVAL (see NUMVAL), NUMVAL-C (see NUMVAL-C) and NUMVAL-F (see NUMVAL-F).
Alphanumeric literals may take any of the following forms:
Alphanumeric literals too long to fit on a single line may be continued to the next line in one of two ways:
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890123
       01  LONG-LITERAL-VALUE-DEMO     PIC X(60) VALUE "This is a long l
      -                                                "ong literal that
      -                                                " must be continu
      -                                                "ed.".
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890123
       01  LONG-LITERAL-VALUE-DEMO     PIC X(60) VALUE "This is a" &
                                       " long literal that must " &
                                       "be continued.".
If your program is using Free Format Mode, there’s less need to continue long alphanumeric literals because statements may be as long as 255 characters.
Numeric literals may be split across lines just as alphanumeric literals are, using either of the above techniques. Both reserved and user-defined words can be split across lines too, using the first technique. The continuation of numeric literals and user-defined/reserved words exists merely for compatibility with older COBOL versions and programs and should not be used in new programs; it just makes for ugly-looking programs.
Figurative constants are reserved words that may be used as literals anywhere the figurative constant’s value could be interpreted as an arbitrarily long sequence of the characters in question. When a specific length is required, such as would be the case with an argument to a subprogram, a figurative constant may not be used. Thus, the following are valid uses of figurative constants:
05 FILLER PIC 9(10) VALUE ZEROS.
...
MOVE SPACES TO Employee-Name
But this is not:
CALL "SUBPGM" USING SPACES
The following are the GnuCOBOL figurative constants and their respective equivalent values.
ZERO
This figurative constant has a value of numeric 0 (zero). ZEROS and ZEROES are both synonyms of ZERO.
SPACE
This figurative constant has a value of one or more space characters. SPACES is a synonym of SPACE.
QUOTE
This figurative constant has a value of one or more double-quote characters ("). QUOTES is a synonym of QUOTE.
LOW-VALUE
This figurative constant has a value of one or more of whatever character occupies the lowest position in the program’s collating sequence as defined in the OBJECT-COMPUTER (see OBJECT-COMPUTER) paragraph or, if no such specification was made, in whatever default character set the program is using (typically, this is the ASCII character set). LOW-VALUES is a synonym of LOW-VALUE.
When the character set in use is ASCII with no collating sequence modifications, the LOW-VALUES figurative constant value is the ASCII NUL character. Because character sets can be redefined, however, you should not rely on this fact. Use the NULL figurative constant instead.
HIGH-VALUE
This figurative constant has a value of one or more of whatever character occupies the highest position in the program’s collating sequence as defined in the OBJECT-COMPUTER paragraph or, if no such specification was made, in whatever default character set the program is using (typically, this is the ASCII character set). HIGH-VALUES is a synonym of HIGH-VALUE.
NULL
A character composed entirely of zero bits (regardless of the program’s collating sequence).
Programmers may create their own figurative constants via the SYMBOLIC CHARACTERS (see Symbolic-Characters-Clause) clause of the SPECIAL-NAMES (see SPECIAL-NAMES) paragraph.
The use of comma characters can cause confusion to a COBOL compiler if the DECIMAL-POINT IS COMMA clause is used in the SPECIAL-NAMES (see SPECIAL-NAMES) paragraph, as might be the case in Europe. The following statement, which calls a subroutine passing it two arguments (the numeric constants 1 and 2):
Would, with DECIMAL-POINT IS COMMA in effect, actually be interpreted as a subroutine call with 1 argument (the non-integer numeric literal whose value is 1 and 2 tenths). For this reason, it is best to always follow a comma with a space.
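The effect can be sketched without a subroutine at all. The following program is only an illustration under stated assumptions: the program name and the data name Amount are hypothetical, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; all names here are illustrative only
IDENTIFICATION DIVISION.
PROGRAM-ID. comma-demo.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
    DECIMAL-POINT IS COMMA.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Amount  PIC 9V9.
PROCEDURE DIVISION.
    *> No space after the comma: 1,2 is ONE literal (one and two tenths)
    MOVE 1,2 TO Amount
    DISPLAY Amount
    *> A space after the comma: two separate integer literals, 1 and 2
    DISPLAY 1, 2
    STOP RUN.
```

Coding the space after every comma keeps the source unambiguous whether or not DECIMAL-POINT IS COMMA is in effect.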
The rules for where and when periods are needed in the procedure division are somewhat complicated. See Use of Periods, for the details.
Through the CALL statement, COBOL programs may invoke other COBOL programs serving as subprograms. This is quite similar to cross-program linkage capabilities provided by other languages. In GnuCOBOL’s case, the CALL facility is powerful enough to be tailored to the point where a GnuCOBOL program can communicate with operating system, database management and run-time library APIs, even if they weren’t written in COBOL themselves. See GnuCOBOL Main Programs CALLing C Subprograms, for an example of how a GnuCOBOL program could invoke a C-language subprogram, passing information back and forth between the two.
The fact that GnuCOBOL supports a full-featured two-way interface with C-language programs means that — even if you cannot access a library API directly — you could always do so via a small C “wrapper” program that is CALLed by a GnuCOBOL program.
COBOL uses parentheses to specify the subscripts used to reference table entries (tables in COBOL are what other programming languages refer to as arrays).
For example, observe the following data structure which defines a 4 column by 3 row grid of characters:
01  GRID.
    05  GRID-ROW OCCURS 3 TIMES.
        10  GRID-COLUMN OCCURS 4 TIMES.
            15  GRID-CHARACTER  PIC X(1).
If the structure contains the following grid of characters:
A B C D
E F G H
I J K L
Then GRID-CHARACTER (2, 3) references the ‘G’ and GRID-CHARACTER (3, 2) references the ‘J’.
Subscripts may be specified as numeric (integer) literals, numeric (integer) data items, data items created with any of the picture-less integer USAGE (see USAGE) specifications, USAGE INDEX data items or arithmetic expressions resulting in a non-zero integer value.
In the above examples, a comma is used as a separator character between the two subscript values; semicolons (;) are also valid subscript separator characters, as are spaces! The use of a comma or semicolon separator in such a situation is technically optional, but by convention most COBOL programmers use one or the other. The use of no separator character (other than a space) is not recommended, even though it is syntactically correct, as this practice can lead to programmer-unfriendly code. It isn’t too difficult to read and understand GRID-CHARACTER(2 3), but it’s another story entirely when trying to comprehend GRID-CHARACTER(I + 1 J / 3) (instead of GRID-CHARACTER(I + 1, J / 3)). The compiler accepts it, but too much of this would make my head hurt.
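The grid above can be tried out with a short program. This is a sketch only: the program name and the data items I and J are hypothetical, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; grid-demo, I and J are illustrative names
IDENTIFICATION DIVISION.
PROGRAM-ID. grid-demo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  GRID.
    05  GRID-ROW OCCURS 3 TIMES.
        10  GRID-COLUMN OCCURS 4 TIMES.
            15  GRID-CHARACTER  PIC X(1).
01  I  PIC 9 VALUE 1.
01  J  PIC 9 VALUE 6.
PROCEDURE DIVISION.
    *> Fill the 12 characters row by row: ABCD / EFGH / IJKL
    MOVE "ABCDEFGHIJKL" TO GRID
    *> Literal subscripts: row 2, column 3 and row 3, column 2
    DISPLAY GRID-CHARACTER (2, 3)
    DISPLAY GRID-CHARACTER (3, 2)
    *> Arithmetic expressions as subscripts: row I + 1, column J / 3
    DISPLAY GRID-CHARACTER (I + 1, J / 3)
    STOP RUN.
```

The first two DISPLAY statements reference the same entries discussed above (the ‘G’ and the ‘J’); the third uses comma-separated arithmetic expressions, which is the readable form recommended here.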
COBOL allows data names to be duplicated within a program, provided references to those data names may be made in such a manner as to make those references unique through a process known as qualification.
To see qualification at work, observe the following segments of two data records defined in a COBOL program:
01  EMPLOYEE.                             01  CUSTOMER.
    05  MAILING-ADDRESS.                      05  MAILING-ADDRESS.
        10  STREET          PIC X(35).            10  STREET          PIC X(35).
        10  CITY            PIC X(15).            10  CITY            PIC X(15).
        10  STATE           PIC X(2).             10  STATE           PIC X(2).
        10  ZIP-CODE.                             10  ZIP-CODE.
            15  ZIP-CODE-5  PIC 9(5).                 15  ZIP-CODE-5  PIC 9(5).
            15  FILLER      PIC X(4).                 15  FILLER      PIC X(4).
Now, let’s deal with the problem of setting the CITY portion of an EMPLOYEE’s MAILING-ADDRESS to ‘Philadelphia’. Clearly, MOVE 'Philadelphia' TO CITY cannot work because the compiler will be unable to determine which of the two CITY fields you are referring to.
In an attempt to correct the problem, we could qualify the reference to CITY as MOVE 'Philadelphia' TO CITY OF MAILING-ADDRESS.
Unfortunately, that too is insufficient because it still does not specify which CITY is being referenced. To truly identify which specific CITY you want, you’d have to code MOVE 'Philadelphia' TO CITY OF MAILING-ADDRESS OF EMPLOYEE.
Now there can be no confusion as to which CITY is being changed. Fortunately, you don’t need to be quite so specific; COBOL allows intermediate and unnecessary qualification levels to be omitted. This allows MOVE 'Philadelphia' TO CITY OF EMPLOYEE to do the job nicely.
If you need to qualify a reference to a table, do so by coding something like identifier-1 OF identifier-2 ( subscript(s) ).
The reserved word IN may be used in lieu of OF.
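Here is a small sketch of qualification at work, using a cut-down version of the two records above. The program name and the second city value are made up for illustration, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; qualify-demo is an illustrative name
IDENTIFICATION DIVISION.
PROGRAM-ID. qualify-demo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  EMPLOYEE.
    05  MAILING-ADDRESS.
        10  STREET  PIC X(35).
        10  CITY    PIC X(15).
01  CUSTOMER.
    05  MAILING-ADDRESS.
        10  STREET  PIC X(35).
        10  CITY    PIC X(15).
PROCEDURE DIVISION.
    *> Fully qualified reference
    MOVE "Philadelphia" TO CITY OF MAILING-ADDRESS OF EMPLOYEE
    *> Intermediate level omitted; IN used in lieu of OF
    MOVE "Pittsburgh" TO CITY IN CUSTOMER
    DISPLAY "Employee city: " CITY OF EMPLOYEE
    DISPLAY "Customer city: " CITY OF CUSTOMER
    STOP RUN.
```

An unqualified reference such as MOVE "Philadelphia" TO CITY is the form the compiler cannot resolve, for the reason given above.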
identifier-1 [ OF|IN identifier-2 ] [ (subscript...) ] (start:[ length ])
               ~~ ~~
intrinsic-function-reference (start:[ length ])
The COBOL 1985 standard introduced the concept of a reference modifier to facilitate references to only a portion of a data item; GnuCOBOL fully supports reference modification.
The start value indicates the starting character position being referenced (character position values start with 1, not 0 as is the case in some programming languages) and length specifies how many characters are wanted.
If no length is specified, a value equivalent to the remaining character positions from start to the end of identifier-1 or to the end of the value returned by the function will be assumed.
Both start and length may be specified as integer numeric literals, integer numeric data items or arithmetic expressions with an integer value.
Here are a few examples:
CUSTOMER-LAST-NAME (1:3)
References the first three characters of CUSTOMER-LAST-NAME.
CUSTOMER-LAST-NAME (4:)
References all character positions of CUSTOMER-LAST-NAME from the fourth onward.
FUNCTION CURRENT-DATE (5:2)
References the current month as a 2-digit number in character form. See CURRENT-DATE, for more information.
Hex-Digits (Nibble + 1:1)
Assuming that Nibble is a numeric data item with a value in the range 0-15, and Hex-Digits is a PIC X(16) item with a value of 0123456789ABCDEF, this converts that numeric value to a hexadecimal digit.
Table-Entry (6) (7:5)
References characters 7 through 11 (5 characters in total) in the 6th occurrence of Table-Entry.
Reference modification may be used anywhere an identifier is legal, including serving as the receiving field of statements like MOVE (see MOVE), STRING (see STRING) and ACCEPT (see ACCEPT), to name a few.
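The following sketch exercises several of those forms, including a reference-modified receiving field. The program name and the initial values are hypothetical, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; names and values are illustrative only
IDENTIFICATION DIVISION.
PROGRAM-ID. refmod-demo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Customer-Last-Name  PIC X(10) VALUE "Washington".
01  Hex-Digits          PIC X(16) VALUE "0123456789ABCDEF".
01  Nibble              PIC 99    VALUE 11.
PROCEDURE DIVISION.
    *> First three characters
    DISPLAY Customer-Last-Name (1:3)
    *> Fourth character onward (length omitted)
    DISPLAY Customer-Last-Name (4:)
    *> Start position given as an arithmetic expression
    DISPLAY Hex-Digits (Nibble + 1:1)
    *> Current month, taken from an intrinsic function result
    DISPLAY FUNCTION CURRENT-DATE (5:2)
    *> A reference-modified item as the receiving field of a MOVE
    MOVE "XYZ" TO Customer-Last-Name (1:3)
    DISPLAY Customer-Last-Name
    STOP RUN.
```

Because positions start at 1, the expression Nibble + 1 is what maps the values 0 through 15 onto the sixteen characters of Hex-Digits.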
Unary-Expression-1 { **|^ } Unary-Expression-2
                   {  *|/ }
                   {  +|- }

{ [ +|- ] { ( Arithmetic-Expression-1 )          } }
{         { [ LENGTH OF ] { identifier-1       } } }
{         {   ~~~~~~ ~~   { literal-1          } } }
{         {               { Function-Reference } } }
{ Arithmetic-Expression-2                          }
Arithmetic expressions are formed using four categories of operations — exponentiation, multiplication & division, addition & subtraction, and sign specification.
In complex expressions composed of multiple operators and operands, a precedence of operation applies whereby those operations having a higher precedence are computed first before operations with a lower precedence.
As is the case in almost any other programming language, the programmer is always free to use pairs of parentheses to enclose sub-expressions of complex expressions that are to be evaluated before other sub-expressions rather than let operator precedence dictate the sequence of evaluation.
In highest to lowest order of precedence, here is a discussion of each category of operation:
Level 1 (highest): unary sign specification (+ and - with a single argument)
The unary “minus” (-) operator returns the arithmetic negation of its single argument, effectively returning as its value the product of its argument and -1.
The unary “plus” (+) operator returns the value of its single argument, effectively returning as its value the product of its argument and +1.
Level 2: exponentiation (** or ^)
The value of the left argument is raised to the power indicated by the right argument. Non-integer powers are allowed. The ^ and ** operators are both supported to provide compatibility with programs written for other COBOL implementations.
Level 3: multiplication (*) and division (/)
The * operator computes the product of the left and right arguments while the / operator computes the value of the left argument divided by the value of the right argument. If the right argument has a value of zero, expression evaluation will be prematurely terminated before a value is generated. This may cause program failure at run-time.
A sequence of multiple 3rd-level operations (A * B / C, for example) will evaluate in strict left-to-right sequence if no parentheses are used to control the order of evaluation.
Level 4: addition (+) or subtraction (-)
The + operator calculates the sum of the left and right arguments while the - operator computes the value of the right argument subtracted from that of the left argument.
A sequence of multiple 4th-level operations (A - B + C, for example) will evaluate in strict left-to-right sequence if no parentheses are used to control the order of evaluation.
The syntactical rules of COBOL, allowing a dash (-) character in data item names, can lead to some ambiguity.
01 C   PIC 9 VALUE 5.
01 D   PIC 9 VALUE 2.
01 C-D PIC 9 VALUE 7.
01 I   PIC 9 VALUE 0.
…
COMPUTE I=C-D+1
The COMPUTE (see COMPUTE) statement will evaluate the arithmetic expression C-D+1 and then save that result in I.
What value will be stored in I? The number 4, which is the result of subtracting the value of D (2) from the value of C (5) and then adding 1? Or, will it be the number 8, which is the value of adding 1 to the value of data item C-D (7)?
The right answer is 8 — the value of data item C-D plus 1! Hopefully, that was the intended result.
The GnuCOBOL compiler actually went through the following decision-making logic when generating code for the COMPUTE statement:
1. Is a data item named C-D defined? If so, use its value for the character sequence C-D.
2. If there is no C-D data item, are there C and D data items? If not, the COMPUTE statement is in error. If there are, however, then code will be generated to subtract the value of D from C and add 1 to the result.
Had there been at least one space to the left and/or the right of the -, there would have been no ambiguity — the compiler would have been forced to use the individual C and D data items.
To avoid any possible ambiguity, as well as to improve program readability, it’s considered good COBOL programming practice to always code at least one space to both the left and right of every operator in arithmetic expressions as well as the = sign on a COMPUTE.
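A short sketch makes the difference visible by coding the expression both ways. The data items are the ones defined above; the program name is hypothetical, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; dash-demo is an illustrative name
IDENTIFICATION DIVISION.
PROGRAM-ID. dash-demo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  C    PIC 9 VALUE 5.
01  D    PIC 9 VALUE 2.
01  C-D  PIC 9 VALUE 7.
01  I    PIC 9 VALUE 0.
PROCEDURE DIVISION.
    *> No spaces around the dash: the single data item C-D, plus 1
    COMPUTE I = C-D + 1
    DISPLAY "C-D + 1   = " I
    *> Spaces around the dash: C minus D, plus 1
    COMPUTE I = C - D + 1
    DISPLAY "C - D + 1 = " I
    STOP RUN.
```

With the values shown, the first COMPUTE corresponds to the 8 discussed above and the second to the 4.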
Here are some examples of how the precedence of operations affects the results of arithmetic expressions (all examples use numeric literals, to simplify the discussion).
| Expression | Result | Notes |
|---|---|---|
| 3 * 4 + 1 | 13 | * has precedence over + |
| 4 * 2 ^ 3 - 10 | 22 | 2^3 is 8 (^ has precedence over *), times 4 is 32, minus 10 is 22. |
| (4 * 2) ^ 3 - 10 | 502 | Parentheses provide for a recursive application of the arithmetic expression rules, effectively allowing you to alter the precedence of operations. 4 times 2 is 8 (the use of parentheses “trumps” the exponentiation operator, so the multiplication happens first); 8 ^ 3 is 512, minus 10 is 502. |
| 5 / 2.5 + 7 * 2 - 1.15 | 14.85 | Integer and non-integer operands may be freely intermixed. 5 / 2.5 is 2 and 7 * 2 is 14; 2 plus 14 is 16, minus 1.15 is 14.85. |
Of course, arithmetic expression operands may be numeric data items (any USAGE except POINTER or PROGRAM-POINTER) as well as numeric literals.
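The expressions from the table can be checked with a program along the following lines. This is a sketch: the program name and the numeric-edited item Result are hypothetical, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; precedence-demo and Result are illustrative
IDENTIFICATION DIVISION.
PROGRAM-ID. precedence-demo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Result  PIC -(4)9.99.
PROCEDURE DIVISION.
    *> Multiplication before addition
    COMPUTE Result = 3 * 4 + 1
    DISPLAY "3 * 4 + 1              = " Result
    *> Exponentiation before multiplication before subtraction
    COMPUTE Result = 4 * 2 ^ 3 - 10
    DISPLAY "4 * 2 ^ 3 - 10         = " Result
    *> Parentheses force the multiplication to happen first
    COMPUTE Result = (4 * 2) ^ 3 - 10
    DISPLAY "(4 * 2) ^ 3 - 10       = " Result
    *> Integer and non-integer operands mixed: 2 + 14 - 1.15
    COMPUTE Result = 5 / 2.5 + 7 * 2 - 1.15
    DISPLAY "5 / 2.5 + 7 * 2 - 1.15 = " Result
    STOP RUN.
```

Every operator is coded with a space on each side, following the practice recommended earlier.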
Conditional expressions are expressions which identify the circumstances under which a program may take an action or cease taking an action. As such, conditional expressions produce a value of TRUE or FALSE.
There are seven types of conditional expressions, as discussed in the following sections.
These are the simplest of all conditions. Observe the following code:
05  SHIRT-SIZE          PIC 99V9.
    88  TINY            VALUE 0 THRU 12.5.
    88  XS              VALUE 13 THRU 13.5.
    88  S               VALUE 14, 14.5.
    88  M               VALUE 15, 15.5.
    88  L               VALUE 16, 16.5.
    88  XL              VALUE 17, 17.5.
    88  XXL             VALUE 18, 18.5.
    88  XXXL            VALUE 19, 19.5.
    88  VERY-LARGE      VALUE 20 THRU 99.9.
The condition names TINY, XS, S, M, L, XL, XXL, XXXL and VERY-LARGE will have TRUE or FALSE values based upon the values within their parent data item (SHIRT-SIZE).
A program wanting to test whether or not the current SHIRT-SIZE value can be classified as XL could have that decision coded as a combined condition (the most complex type of conditional expression), as either:
IF SHIRT-SIZE = 17 OR SHIRT-SIZE = 17.5
- or -
IF SHIRT-SIZE = 17 OR 17.5
Or it could simply utilize the condition name XL as follows:
IF XL
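A minimal sketch of condition names in use follows, with a cut-down version of the SHIRT-SIZE item. The program name and the test values are hypothetical, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; size-demo and the test values are illustrative
IDENTIFICATION DIVISION.
PROGRAM-ID. size-demo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  SHIRT-SIZE          PIC 99V9.
    88  XL              VALUE 17, 17.5.
    88  VERY-LARGE      VALUE 20 THRU 99.9.
PROCEDURE DIVISION.
    *> 17.5 is one of the two values listed for XL
    MOVE 17.5 TO SHIRT-SIZE
    IF XL
        DISPLAY "17.5 is an XL"
    END-IF
    *> 42 falls inside the THRU range of VERY-LARGE
    MOVE 42 TO SHIRT-SIZE
    IF VERY-LARGE
        DISPLAY "42 is VERY-LARGE"
    END-IF
    IF NOT XL
        DISPLAY "42 is not an XL"
    END-IF
    STOP RUN.
```

The condition names carry no storage of their own; each test simply inspects the current value of SHIRT-SIZE.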
identifier-1 IS [ NOT ] { NUMERIC          }
                  ~~~   { ~~~~~~~          }
                        { ALPHABETIC       }
                        { ~~~~~~~~~~       }
                        { ALPHABETIC-LOWER }
                        { ~~~~~~~~~~~~~~~~ }
                        { ALPHABETIC-UPPER }
                        { ~~~~~~~~~~~~~~~~ }
                        { OMITTED          }
                        { ~~~~~~~          }
                        { class-name-1     }
Class conditions evaluate the type of data that is currently stored in a data item.
The NUMERIC class test considers only the characters ‘0’, ‘1’, … , ‘9’ to be numeric; only a data item containing nothing but digits will pass a NUMERIC class test. Spaces, decimal points, commas, currency signs, plus signs, minus signs and any other characters except the digit characters will all fail NUMERIC class tests.
The ALPHABETIC class test considers only upper-case letters, lower-case letters and spaces to be alphabetic in nature.
The ALPHABETIC-LOWER and ALPHABETIC-UPPER class conditions consider only spaces and the respective type of letters to be acceptable in order to pass such a class test.
The NOT option reverses the TRUE/FALSE value of the condition.
What counts as a letter, and as upper or lower case, may be influenced by CHARACTER CLASSIFICATION specifications in the OBJECT-COMPUTER (see OBJECT-COMPUTER) paragraph.
Only data items whose USAGE (see USAGE) is either explicitly or implicitly defined as DISPLAY may be used in NUMERIC or any of the ALPHABETIC class conditions.
Some COBOL implementations disallow the use of PIC A items with NUMERIC class conditions and the use of PIC 9 items with ALPHABETIC class conditions. GnuCOBOL has no such restrictions.
The OMITTED class condition is used when it is necessary for a subprogram to determine whether or not a particular argument was passed to it. In such class conditions, identifier-1 must be a linkage section item defined on the USING clause of the subprogram’s PROCEDURE DIVISION header. See PROCEDURE DIVISION USING, for additional information.
The class-name-1 option allows you to test for a user-defined class. Here’s an example. First, assume the following SPECIAL-NAMES (see SPECIAL-NAMES) definition of the user-defined class ‘Hexadecimal’:
SPECIAL-NAMES.
    CLASS Hexadecimal IS '0' THRU '9', 'A' THRU 'F', 'a' THRU 'f'.
Now observe the following code, which will execute the 150-Process-Hex-Value procedure if Entered-Value contains nothing but valid hexadecimal digits:
IF Entered-Value IS Hexadecimal
    PERFORM 150-Process-Hex-Value
END-IF
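Put together as a complete program, the fragments above might look like the following sketch. The program name and the initial value of Entered-Value are hypothetical, DISPLAY statements stand in for the 150-Process-Hex-Value procedure, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; class-demo and the test value are illustrative
IDENTIFICATION DIVISION.
PROGRAM-ID. class-demo.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
    CLASS Hexadecimal IS '0' THRU '9', 'A' THRU 'F', 'a' THRU 'f'.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Entered-Value  PIC X(4) VALUE "1aF9".
PROCEDURE DIVISION.
    *> User-defined class test: every character must be a hex digit
    IF Entered-Value IS Hexadecimal
        DISPLAY "Entered-Value is hexadecimal"
    END-IF
    *> Built-in class test: letters are present, so this is not NUMERIC
    IF Entered-Value IS NOT NUMERIC
        DISPLAY "Entered-Value is not numeric"
    END-IF
    STOP RUN.
```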
identifier-1 IS [ NOT ] { POSITIVE }
                  ~~~   { ~~~~~~~~ }
                        { NEGATIVE }
                        { ~~~~~~~~ }
                        { ZERO     }
                          ~~~~
Sign conditions evaluate the numeric state of a data item defined with a PICTURE (see PICTURE) and/or USAGE (see USAGE) that supports numeric values.
A POSITIVE or NEGATIVE sign condition will be TRUE only if the value of identifier-1 is strictly greater than or less than zero, respectively.
A ZERO sign condition can be passed only if the value of identifier-1 is exactly zero.
The NOT option reverses the TRUE/FALSE value of the condition.
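A brief sketch of the three sign conditions follows. The program name, the data name Net-Change and its value are hypothetical, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; sign-demo and Net-Change are illustrative
IDENTIFICATION DIVISION.
PROGRAM-ID. sign-demo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Net-Change  PIC S9(5)V99 VALUE -12.50.
PROCEDURE DIVISION.
    *> TRUE only when the value is strictly less than zero
    IF Net-Change IS NEGATIVE
        DISPLAY "Net-Change is negative"
    END-IF
    *> NOT reverses the result of the ZERO test
    IF Net-Change IS NOT ZERO
        DISPLAY "Net-Change is not zero"
    END-IF
    *> After this MOVE neither POSITIVE nor NEGATIVE is TRUE
    MOVE 0 TO Net-Change
    IF Net-Change IS NOT POSITIVE AND Net-Change IS NOT NEGATIVE
        DISPLAY "Net-Change is exactly zero"
    END-IF
    STOP RUN.
```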
In the SPECIAL-NAMES paragraph, an external switch name can be associated with one or more condition names. These condition names may then be used to test the ON/OFF status of the external switch.
Here are the relevant sections of code in a program named testprog, which is designed to simply announce if SWITCH-1 is on:
…
ENVIRONMENT DIVISION.
SPECIAL-NAMES.
    SWITCH-1 ON STATUS IS Switch-1-Is-ON.
…
PROCEDURE DIVISION.
…
    IF Switch-1-Is-ON
        DISPLAY "Switch 1 Is On"
    END-IF
…
The following are two different command window sessions (the left on a Unix/Cygwin/OSX system and the right on a Windows system) that will set the switch on and then execute the testprog program. Notice how the message indicating that the program detected the switch was set is displayed in both examples:
$ COB_SWITCH_1=ON               C:>SET COB_SWITCH_1=ON
$ export COB_SWITCH_1           C:>testprog
$ ./testprog                    Switch 1 Is On
Switch 1 Is On                  C:>
$
{ identifier-1            } IS [ NOT ] RelOp { identifier-2            }
{ literal-1               }      ~~~         { literal-2               }
{ arithmetic-expression-1 }                  { arithmetic-expression-2 }
{ index-name-1            }                  { index-name-2            }

{ EQUAL TO                 }
{ ~~~~~                    }
{ EQUALS                   }
{ ~~~~~~                   }
{ GREATER THAN             }
{ ~~~~~~~                  }
{ GREATER THAN OR EQUAL TO }
{ ~~~~~~~      ~~ ~~~~~    }
{ LESS THAN                }
{ ~~~~                     }
{ LESS THAN OR EQUAL TO    }
{ ~~~~      ~~ ~~~~~       }
{ =                        }
{ >                        }
{ >=                       }
{ <                        }
{ <=                       }
These conditions evaluate how two different values "relate" to each other.
When two numeric values are compared, the USAGE (see USAGE) and number of significant digits in either value are irrelevant as the comparison is performed using the actual algebraic values.
When two strings are compared, the comparison proceeds character by character until a TRUE/FALSE value for the relation test can be established. Characters are compared according to their relative position in the program’s COLLATING SEQUENCE (as defined in SPECIAL-NAMES (see SPECIAL-NAMES)), not according to the bit-pattern values the characters have in storage.
The default COLLATING SEQUENCE will, however, be based entirely on the bit-pattern values of the various characters.
There is no difference between the use of the spelled-out version (IS EQUAL TO, IS LESS THAN, …) versus the symbolic version (=, <, …) of the actual relation operators.
[ ( ] Condition-1 [ ) ] { AND } [ ( ] Condition-2 [ ) ]
                        { ~~~ }
                        { OR  }
                        { ~~  }
A combined condition is one that computes a TRUE/FALSE value from the TRUE/FALSE values of two other conditions (which could themselves be combined conditions).
If either condition has a value of TRUE, ORing the two together will result in a value of TRUE. ORing two FALSE conditions will result in a value of FALSE.
In order for AND to yield a value of TRUE, both conditions must have a value of TRUE. In all other circumstances, AND produces a FALSE value.
When the same subject and relation operator are repeated in a combined condition, the condition may be abbreviated. For example, the following:
IF ACCOUNT-STATUS = 1 OR ACCOUNT-STATUS = 2 OR ACCOUNT-STATUS = 7
Could be abbreviated as:
IF ACCOUNT-STATUS = 1 OR 2 OR 7
AND takes precedence over OR in combined conditions. Use parentheses to change this precedence, if necessary. For example:
FALSE AND FALSE OR TRUE AND TRUE
Evaluates to TRUE
(FALSE AND FALSE) OR (TRUE AND TRUE)
Evaluates to TRUE; since AND has precedence over OR, this is identical to the previous example
(FALSE AND (FALSE OR TRUE)) AND TRUE
Evaluates to FALSE
NOT Condition-1
~~~
A condition may be negated by prefixing it with the NOT operator.
The NOT operator has the highest precedence of all logical operators, just as a unary minus sign (which “negates” a numeric value) is the highest precedence arithmetic operator.
NOT TRUE AND FALSE AND NOT FALSE
Evaluates to FALSE AND FALSE AND TRUE which evaluates to FALSE
NOT (TRUE AND FALSE AND NOT FALSE)
Evaluates to NOT (FALSE) which evaluates to TRUE
NOT TRUE AND (FALSE AND NOT FALSE)
Evaluates to FALSE AND (FALSE AND TRUE) which evaluates to FALSE
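The precedence rules for AND, OR and NOT can be seen with real data in a sketch like the one below. The program name, the data name Account-Balance and the values are hypothetical, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; names and values are illustrative only
IDENTIFICATION DIVISION.
PROGRAM-ID. logic-demo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  ACCOUNT-STATUS   PIC 9    VALUE 1.
01  Account-Balance  PIC 9(4) VALUE 50.
PROCEDURE DIVISION.
    *> Abbreviated combined condition
    IF ACCOUNT-STATUS = 1 OR 2 OR 7
        DISPLAY "Status is 1, 2 or 7"
    END-IF
    *> AND is applied before OR: TRUE OR (FALSE AND FALSE)
    IF ACCOUNT-STATUS = 1 OR ACCOUNT-STATUS = 2 AND Account-Balance > 100
        DISPLAY "Without parentheses: TRUE"
    ELSE
        DISPLAY "Without parentheses: FALSE"
    END-IF
    *> Parentheses apply the OR first: (TRUE OR FALSE) AND FALSE
    IF (ACCOUNT-STATUS = 1 OR ACCOUNT-STATUS = 2) AND Account-Balance > 100
        DISPLAY "With parentheses: TRUE"
    ELSE
        DISPLAY "With parentheses: FALSE"
    END-IF
    *> NOT is applied before AND: (NOT FALSE) AND TRUE
    IF NOT ACCOUNT-STATUS = 2 AND Account-Balance < 100
        DISPLAY "Status is not 2 and the balance is under 100"
    END-IF
    STOP RUN.
```

The comments show how each condition reduces for the values given, using the same TRUE/FALSE notation as the examples above.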
All COBOL implementations distinguish between sentences and statements in the procedure division. A Statement is a single executable COBOL instruction. For example, these are all statements:
MOVE SPACES TO Employee-Address
ADD 1 TO Record-Counter
DISPLAY "Record-Counter=" Record-Counter
Some COBOL statements have a scope of applicability associated with them where one or more other statements can be considered to be part of or related to the statement in question. An example of such a situation might be the following, where the interest on a loan is being calculated and displayed at 4% interest if the loan balance is under $10,000, and 4.5% otherwise. (WARNING: the following code has an error!):
IF Loan-Balance < 10000
    MULTIPLY Loan-Balance BY 0.04 GIVING Interest
ELSE
    MULTIPLY Loan-Balance BY 0.045 GIVING Interest
DISPLAY "Interest Amount = " Interest
In this example, the IF statement actually has a scope that can include two sets of associated statements: one set to be executed when the IF (see IF) condition is TRUE, and another if it is FALSE.
Unfortunately, there’s a problem with the above. A human being looking at that code would probably infer that the DISPLAY (see DISPLAY) statement, because of its lack of indentation, is to be executed regardless of the TRUE/FALSE value of the IF condition. The compiler (any COBOL compiler), however, won’t see it that way because it really couldn’t care less what sort of indentation, if any, is used. In fact, any COBOL compiler would be just as happy to see the code written like this:
IF Loan-Balance < 10000 MULTIPLY Loan-balance BY 0.04 GIVING Interest ELSE MULTIPLY Loan-Balance BY 0.045 GIVING Interest DISPLAY "Interest Amount = " Interest
How then do we inform the compiler that the DISPLAY statement is outside the scope of the IF?
That’s where sentences come in.
A COBOL Sentence is defined as any arbitrarily long sequence of statements, followed by a period (.) character. The period character is what terminates the scope of a set of statements. Therefore, our example should have been coded like this:
IF Loan-Balance < 10000
    MULTIPLY Loan-Balance BY 0.04 GIVING Interest
ELSE
    MULTIPLY Loan-Balance BY 0.045 GIVING Interest.
DISPLAY "Interest Amount = " Interest
See the period at the end of the second MULTIPLY (see MULTIPLY)? That is what terminates the scope of the IF, thus making the DISPLAY statement’s execution completely independent of the TRUE/FALSE status of the IF.
Prior to the 1985 COBOL standard, using a period character was the only way to signal the end of a statement’s scope.
Unfortunately, this caused some problems. Take a look at this code:
IF A = 1
    IF B = 1
        DISPLAY "A & B = 1"
ELSE *> This ELSE has a problem!
    IF B = 1
        DISPLAY "A NOT = 1 BUT B = 1"
    ELSE
        DISPLAY "NEITHER A NOR B = 1".
The problem with this code is that indentation — so critical to improving the human-readability of a program — can provide an erroneous view of the logical flow. An ELSE is always associated with the most-recently encountered IF; this means the emphasized ELSE will be associated with the IF B = 1 statement, not the IF A = 1 statement as the indentation would appear to imply.
This sort of problem led to a band-aid solution being added to the COBOL language: the NEXT SENTENCE clause:
IF A = 1
    IF B = 1
        DISPLAY "A & B = 1"
    ELSE
        NEXT SENTENCE
ELSE
    IF B = 1
        DISPLAY "A NOT = 1 BUT B = 1"
    ELSE
        DISPLAY "NEITHER A NOR B = 1".
NEXT SENTENCE informs the compiler that if the B = 1 condition is false, control should fall into the first statement that follows the next period.
With the 1985 standard for COBOL, a much more elegant solution was introduced. Any COBOL Verb (the first reserved word of a statement) that needed such a thing was allowed to use an END-verb construct to end its scope without disrupting the scope of any other statement it might have been in. Any COBOL 85 compiler would have allowed the following solution to our problem:
IF A = 1
    IF B = 1
        DISPLAY "A & B = 1"
    END-IF
ELSE
    IF B = 1
        DISPLAY "A NOT = 1 BUT B = 1"
    ELSE
        DISPLAY "NEITHER A NOR B = 1".
This new facility made the period almost obsolete, as our program segment would probably be coded like this today:
IF A = 1
    IF B = 1
        DISPLAY "A & B = 1"
    END-IF
ELSE
    IF B = 1
        DISPLAY "A NOT = 1 BUT B = 1"
    ELSE
        DISPLAY "NEITHER A NOR B = 1"
    END-IF
END-IF
COBOL (GnuCOBOL included) still requires that each procedure division paragraph contain at least one sentence if there is any executable code in that paragraph, but a popular coding style is now to simply code a single period right before the end of each paragraph.
The standard for the COBOL language shows the various END-verb clauses are optional because using a period as a scope-terminator remains legal.
If you will be porting existing code over to GnuCOBOL, you’ll find it an accommodating facility capable of conforming to whatever language and coding standards that code is likely to use. If you are creating new GnuCOBOL programs, however, I would strongly counsel you to use the END-verb structures in those programs.
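As a sketch of that coding style, here is the loan-interest example rewritten with END-IF and a single period coded right before the end of the paragraph. The program name, paragraph name, picture clauses and initial value are hypothetical, and Free Format Mode is assumed (for example, cobc -x -free).

```cobol
*> Free Format Mode assumed; names, pictures and values are illustrative
IDENTIFICATION DIVISION.
PROGRAM-ID. loan-demo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Loan-Balance  PIC 9(6)V99 VALUE 8000.
01  Interest      PIC Z(4)9.99.
PROCEDURE DIVISION.
100-Main.
    IF Loan-Balance < 10000
        MULTIPLY Loan-Balance BY 0.04 GIVING Interest
    ELSE
        MULTIPLY Loan-Balance BY 0.045 GIVING Interest
    END-IF
    *> END-IF closed the IF, so this DISPLAY always executes
    DISPLAY "Interest Amount = " Interest
    STOP RUN
    .
```

Because the scope of the IF is ended explicitly, moving or adding statements inside the paragraph cannot accidentally change which statements the IF controls.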
The manipulation of data files is one of the COBOL language’s great strengths. There are features built into COBOL to deal with the possibility that multiple programs may be attempting to access the same file concurrently. Multiple program concurrent access is dealt with in two ways — file sharing and record locking.
Not all GnuCOBOL implementations support file sharing and record-locking options. Whether they do or not depends upon the operating system they were built for and the build options that were used when the specific GnuCOBOL implementation was generated.
GnuCOBOL controls concurrent-file access at the highest level through the concept of file sharing, enforced when a program attempts to open a file. This is accomplished via a UNIX operating-system routine called fcntl. That module is not currently supported by Windows and is not present in the MinGW Unix-emulation package. GnuCOBOL builds created using a MinGW environment will be incapable of supporting file-sharing controls — files will always be shared in such environments. A GnuCOBOL build created using the Cygwin environment on Windows would have access to fcntl and therefore will support file sharing. Of course, actual Unix builds of GnuCOBOL, as well as OSX builds, should have no issues because fcntl should be available.
Any limitations imposed on a successful OPEN (see OPEN) will remain in place until your program either issues a CLOSE (see CLOSE) against the file or the program terminates.
File sharing is controlled through the use of a SHARING clause:
SHARING WITH { ALL OTHER }
~~~~~~~      { ~~~       }
             { NO OTHER  }
             { ~~        }
             { READ ONLY }
               ~~~~ ~~~~
This clause may be used either in the file’s SELECT statement (see SELECT), on the OPEN statement (see OPEN) which initiates your program’s use of the file, or both. If a SHARING option is specified in both places, the specifications made on the OPEN statement will take precedence over those from the SELECT statement.
Here are the meanings of the three options:
ALL OTHER
When your program opens a file with this sharing option in effect, no restrictions will be placed on other programs attempting to OPEN the file after your program did. This is the default sharing mode.
NO OTHER
When your program opens a file with this sharing option in effect, your program announces that it is unwilling to allow any other program to have any access to the file as long as you are using that file; OPEN attempts made in other programs will fail with a file status of 37 (PERMISSION DENIED) until such time as you CLOSE (see CLOSE) the file.
READ ONLY
Opening a file with this sharing option indicates you are willing to allow other programs to OPEN the file for input while you have it open. If they attempt any other OPEN, theirs will fail with a file status of 37. Of course, your program may fail if someone else got to the file first and opened it with a sharing option that imposed file-sharing limitations.
If the SELECT of a file is coded with a FILE STATUS clause, OPEN failures — including those induced by sharing failures — will be detectable by the program and a graceful recovery (or at least a graceful termination) will be possible. If no such clause was coded, however, a runtime message will be issued and the program will be terminated.
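The following sketch combines a SHARING clause with a FILE STATUS clause so that an OPEN failure can be handled gracefully. The program name, file name and data names are hypothetical, Free Format Mode is assumed (for example, cobc -x -free), and whether the sharing option is actually enforced depends on the GnuCOBOL build, as described above.

```cobol
*> Free Format Mode assumed; file and data names are illustrative only
IDENTIFICATION DIVISION.
PROGRAM-ID. share-demo.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT Customer-File ASSIGN TO "customer.dat"
        ORGANIZATION IS LINE SEQUENTIAL
        SHARING WITH READ ONLY
        FILE STATUS IS Customer-Status.
DATA DIVISION.
FILE SECTION.
FD  Customer-File.
01  Customer-Record  PIC X(80).
WORKING-STORAGE SECTION.
01  Customer-Status  PIC XX.
PROCEDURE DIVISION.
    OPEN INPUT Customer-File
    *> With FILE STATUS coded, a failed OPEN does not end the program
    IF Customer-Status NOT = "00"
        DISPLAY "OPEN failed, file status " Customer-Status
    ELSE
        DISPLAY "File opened; others may open it for input only"
        CLOSE Customer-File
    END-IF
    STOP RUN.
```

Any status other than 00 is reported here; a status of 37 would be the sharing-related failure described above.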
Record locking is supported by advanced file-management software built into the GnuCOBOL implementation you are using. This software provides a single point of control for access to files, usually ORGANIZATION INDEXED files. One such runtime package capable of doing this is the Berkeley Database (BDB) package, which is frequently used in GnuCOBOL builds to support indexed files.
The various I/O statements your program can execute are capable of imposing limitations on access by other concurrently-executing programs to the file record they just accessed. These limitations are syntactically imposed by placing a lock on the record using a LOCK clause. Other records in the file remain available, assuming that file-sharing limitations imposed at the time the file was opened didn’t prevent access to the entire file.
DB_HOME run-time environment variable.
If the file’s SELECT (see SELECT) statement or file OPEN (see OPEN) specifies SHARING WITH NO OTHER, record locking will be disabled.
If the file’s SELECT contains a LOCK MODE IS AUTOMATIC clause, every time a record is read from the file, that record is automatically locked. Other programs may access other records within the file, but not a locked record.
If the file’s SELECT contains a LOCK MODE IS MANUAL clause, locks are placed on records only when a READ statement executed against the file includes a LOCK clause (this clause will be discussed shortly).
If a LOCK ON clause is specified in the file’s SELECT, locks (either automatically or manually acquired) will continue to accumulate as more and more records are read, until they are explicitly released. This is referred to as multiple record locking.
Locks acquired via multiple record locking remain in effect until the program holding the lock…
If a LOCK ON clause is not specified, then the next I/O statement your program executes, except for START (see START), will release the lock. This is referred to as single record locking.
A LOCK clause, which may be coded on a READ (see READ), REWRITE (see REWRITE) or WRITE statement (see WRITE), looks like this:
{ IGNORING LOCK    }
{ ~~~~~~~~ ~~~~    }
{ WITH [ NO ] LOCK }
{        ~~   ~~~~ }
{ WITH KEPT LOCK   }
{      ~~~~ ~~~~   }
{ WITH IGNORE LOCK }
{      ~~~~~~ ~~~~ }
{ WITH WAIT        }
       ~~~~
The WITH [ NO ] LOCK option is the only one available to REWRITE or WRITE statements.
The meanings of the various record locking options are as follows:
IGNORING LOCK
WITH IGNORE LOCK
These options (which are synonymous) inform GnuCOBOL that any locks held by other programs should be ignored.
WITH LOCK
Access to the record by other programs will be denied.
WITH NO LOCK
The record will not be locked. This is the default for all statements.
WITH KEPT LOCK
When single record locking is in effect, as a new record is accessed, locks held for previous records are released. By using this option, not only is the newly accessed record locked (as WITH LOCK would do), but prior record locks will be retained as well. A subsequent READ without the KEPT LOCK option will release all “kept” locks, as will the UNLOCK statement.
WITH WAIT
This option informs GnuCOBOL that the program is willing to wait for a lock held (by another program) on the record being read to be released.
Without this option, an attempt to read a locked record will be immediately aborted and a file status of 51 will be returned.
With this option, the program will wait for a preconfigured time for the lock to be released. If the lock is released within the preconfigured wait time, the read will be successful. If the preconfigured wait time expires before the lock is released, the read attempt will be aborted and a 51 file status will be issued.
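The locking options above can be illustrated with a minimal sketch of a manually locked READ followed by a REWRITE. The file name customers.dat, the record layout and all data names are hypothetical, and the sketch assumes an existing indexed file and a GnuCOBOL build with indexed-file and record-locking support.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOCKDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * Hypothetical indexed file using manual record locking.
           SELECT CUST-FILE ASSIGN TO "customers.dat"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS CUST-ID
               LOCK MODE IS MANUAL
               FILE STATUS IS CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUST-FILE.
       01  CUST-REC.
           05 CUST-ID     PIC 9(6).
           05 CUST-NAME   PIC X(30).
       WORKING-STORAGE SECTION.
       01  CUST-STATUS    PIC XX.
       PROCEDURE DIVISION.
           OPEN I-O CUST-FILE
           MOVE 100 TO CUST-ID
      * The lock is requested explicitly on the READ.
           READ CUST-FILE WITH LOCK
           EVALUATE CUST-STATUS
               WHEN "00"
                   MOVE "NEW NAME" TO CUST-NAME
                   REWRITE CUST-REC
               WHEN "51"
                   DISPLAY "Record is locked by another program"
               WHEN OTHER
                   DISPLAY "READ failed, status " CUST-STATUS
           END-EVALUATE
      * Release any lock still held before closing.
           UNLOCK CUST-FILE
           CLOSE CUST-FILE
           GOBACK.
```

Testing the file status after the READ, rather than assuming success, is what lets the program distinguish a locked record (51) from other failures.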
The Compiler Directing Facility, or CDF, is a means of controlling the compilation of GnuCOBOL programs. CDF provides a mechanism for dynamically setting or resetting certain compiler switches, introducing new source code from one or more source code libraries, making dynamic source code modifications and conditionally processing or ignoring source statements altogether. This is accomplished via a series of special CDF statements and directives that will appear in the program source code.
When the compiler is operating in Fixed Format Mode, all CDF statements must begin in column eight (8) or beyond.
There are two types of supported CDF statements in GnuCOBOL — Text Manipulation Statements and Compiler Directives.
The CDF text manipulation statements COPY and REPLACE are used to introduce new code into programs either with or without changes, or may be used to modify existing statements already in the program. Text manipulation statements are always terminated with a period.
CDF directives, denoted by the presence of a >> character sequence as part of the statement name itself, influence the process of program compilation.
Compiler directives are never terminated with a period.
The compiler command-line option -D offers additional control (see cobc - The GnuCOBOL Compiler).
>>CALL-CONVENTION { COBOL }
~~~~~~~~~~~~~~~~~ { EXTERN }
{ STDCALL }
{ STATIC }
This directive instructs the compiler how to treat references to program names and may be used to determine other details for interacting with a function or program. There are four options with COBOL being the default.
COBOL: The program name is treated as a COBOL word that maps to the externalised name of the program to be called, cancelled or referenced in the program-address-identifier, applying the same mapping rules as for a program name for which no AS phrase is specified.
(This is the default.)
EXTERN: The program name is treated as an external reference.
STDCALL: [more info needed]
STATIC: The program is called as a statically included element rather than dynamically, which is the normal default.
COPY copybook-name
~~~~
[ IN|OF library-name ]
~~ ~~
[ SUPPRESS PRINTING ]
~~~~~~~~
[ REPLACING { Phrase-Clause | String-Clause }... ] .
~~~~~~~~~
{ ==pseudo-text-1== } BY { ==pseudo-text-2== }
{ identifier-1 } ~~ { identifier-2 }
{ literal-1 } { literal-2 }
{ word-1 } { word-2 }
[ LEADING|TRAILING ] ==partial-word-1== BY ==partial-word-2==
  ~~~~~~~ ~~~~~~~~                      ~~
COPY statements are used to import copybooks (see Copybooks) into a program.
COPY statements may be used anywhere within a COBOL program where the code contained within the copybook would be syntactically valid.
The SUPPRESS clause (with or without the optional PRINTING reserved word) is syntactically valid but non-functional. It is supported to facilitate compatibility with source code written for other versions of COBOL.
There is no difference between the word IN and the word OF; use the one you prefer.
A period is absolutely mandatory at the end of every COPY statement, even if the statement occurs within the scope of another one where a period might appear disruptive, such as within the scope of an IF (see IF) statement. This mandatory period at the end of the statement does not, however, affect the statement scope in which the COPY occurs.
All COPY statements are located and the contents of the corresponding copybooks inserted into the program source code before the actual compilation process begins. If a copybook contains a COPY statement, the copybook insertion process will be repeated to resolve the embedded COPY. This will continue until no unresolved COPY statements remain. At that point, actual program compilation will begin.
The optional REPLACING clause allows for one or more of either of the following kinds of text replacements to be made:
Phrase-Clause: Replacement of one or more complete reserved words, user-defined identifiers or literals; the following points apply to this option:
Whatever precedes the BY will be referred to here as the search string.
identifier-1, literal-1 or word-1 being replaced.
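A search string of more than one word must be specified using the ==pseudo-text-1== option. For example, to replace all occurrences of UPON PRINTER, you would specify ==UPON PRINTER==.
The replacement string, which follows the BY, may be specified using any of the four options.
A replacement string of more than one word must be specified using the ==pseudo-text-2== option. If pseudo-text-2 is null (in other words, the replacement text is specified as ====), all encountered occurrences of the search string will be deleted.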
String-Clause: Using this, you may replace character sequences that occur at the beginning (see LEADING) or end (see TRAILING) of reserved or user-defined words. For example, to change all words of the form "0100-xxxxxx" to "020-xxxxxx", code LEADING ==0100-== BY ==020-==. To simply remove all "0100-" prefixes from words, code LEADING ==0100-== BY ====.
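As an illustration of the REPLACING clause, the sketch below assumes a hypothetical copybook file named custrec.cpy, located where the compiler can find it, with the contents shown in the leading comment lines. The copybook is named here with a quoted literal, a form the syntax diagram above does not show; this and all data names are assumptions made for the example.

```cobol
      * Assumed contents of the hypothetical copybook custrec.cpy:
      *    01  PFX-RECORD.
      *        05 PFX-ID     PIC 9(6).
      *        05 PFX-NAME   PIC X(30).
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COPYDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Every word starting with PFX- is renamed to start with CUST-.
       COPY "custrec.cpy" REPLACING LEADING ==PFX-== BY ==CUST-==.
       PROCEDURE DIVISION.
           MOVE 42 TO CUST-ID
           MOVE "EXAMPLE" TO CUST-NAME
           DISPLAY CUST-ID " " CUST-NAME
           GOBACK.
```

Using a neutral prefix in the copybook and substituting it at each COPY is one way to import the same record layout more than once under different names.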
REPLACE [ ALSO ] { Phrase-Clause | String-Clause }... .
~~~~~~~ ~~~~
REPLACE [ LAST ] OFF .
~~~~~~~   ~~~~   ~~~
{ ==pseudo-text-1== } BY { ==pseudo-text-2== }
~~
[ LEADING|TRAILING ] ==partial-word-1== BY ==partial-word-2==
  ~~~~~~~ ~~~~~~~~                      ~~
The REPLACE statement provides a mechanism for changing all or part of one or more GnuCOBOL statements.
A period is absolutely mandatory at the end of every REPLACE statement (either format), even if the statement occurs within the scope of another one where a period might appear disruptive (such as within the scope of an IF (see IF) statement); the period will not, however, affect the statement scope in which the REPLACE occurs.
The following points apply to the Format 1 REPLACE statement:
The REPLACE statement can be used to make changes to program source code in much the same way as the REPLACING option of the COPY statement can, via these options:
Phrase-Clause: Replace one or more complete reserved words, user-defined identifiers or literals; the following points apply to this option:
Whatever precedes the BY will be referred to here as the search string.
Search strings on a REPLACE are always specified using the ==pseudo-text-1== option. For example, to replace all occurrences of UPON PRINTER, you would specify ==UPON PRINTER==.
The replacement string, which follows the BY, is specified using the ==pseudo-text-2== option. If pseudo-text-2 is null (in other words, the replacement text is specified as ====), all encountered occurrences of the search string will be deleted.
String-Clause: Using this, you may replace character sequences that occur at the beginning (see LEADING) or end (see TRAILING) of reserved or user-defined words. For example, to change all words of the form "0100-xxxxxx" to "020-xxxxxx", code LEADING ==0100-== BY ==020-==. To simply remove all "0100-" prefixes from words, code LEADING ==0100-== BY ====.
Once a Format 1 REPLACE statement is encountered in the currently compiling source file, Replace Mode becomes active, and the change(s) specified by that statement will be automatically made on all subsequent source statements the compiler reads from the file.
Replace Mode remains in effect until another Format 1 REPLACE is encountered, the end of the currently compiling program source file is reached or a Format 2 REPLACE statement is encountered.
If a Format 1 REPLACE statement with the ALSO keyword is encountered without Replace Mode being currently active, the effect will be as if the ALSO had not been specified. If Replace Mode already was in effect, the effect will be to “push” the current change specification(s) onto the top of a stack and add the specification(s) of the new statement to those that were already in effect.
If a Format 1 REPLACE without the ALSO keyword is encountered, any stacked change specification(s) will be discarded and the currently in-effect change specification(s), if any, will be replaced by those of the new statement.
The following points apply to the Format 2 REPLACE statement:
If Replace Mode is not currently active, a Format 2 REPLACE statement will be ignored.
REPLACE OFF. will deactivate Replace Mode and discard any replace specification(s) on the stack. The compiler will henceforth operate as if no REPLACE had ever been encountered, until such time as another Format 1 REPLACE is encountered.
REPLACE LAST OFF. will replace the current replace specification(s) with those popped off the top of the stack. If there were no replace specification(s) on the stack, the effect will be as if a REPLACE OFF. had been coded.
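A short sketch of both REPLACE formats follows. The program name, the data name WS-COUNT and the placeholder words SHOW-COUNT and TMP- are all hypothetical; the intent is only to show Replace Mode being switched on with a whole-word and a LEADING change specification, and then switched off again.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REPLDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-COUNT  PIC 9(3) VALUE 0.
       PROCEDURE DIVISION.
      * Format 1: activate Replace Mode with two specifications.
       REPLACE ==SHOW-COUNT== BY ==DISPLAY "Count: " WS-COUNT==
               LEADING ==TMP-== BY ==WS-==.
      * TMP-COUNT is compiled as WS-COUNT.
           ADD 5 TO TMP-COUNT
      * SHOW-COUNT is compiled as the DISPLAY statement above.
           SHOW-COUNT
      * Format 2: deactivate Replace Mode.
       REPLACE OFF.
           GOBACK.
```

Because the changes are made before compilation proper, the compiler only ever sees WS-COUNT and the DISPLAY statement, never the placeholder words.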
>>DEFINE [ CONSTANT ] cdf-variable-1 AS { OFF }
~~~~~~~~ ~~~~~~~~ { ~~~ }
{ literal-1 [ OVERRIDE ] }
{ ~~~~~~~~ }
{ PARAMETER [ OVERRIDE ] }
~~~~~~~~~ ~~~~~~~~
Use the >>DEFINE CDF directive to create CDF variables and (optionally) assign them either literal or environment variable values.
AS is optional and may be included, or not, at the discretion of the programmer. The presence or absence of this word has no effect upon the program.
END PROGRAM or END FUNCTION directive is encountered in the input source.
The >>DEFINE CDF directive is one way to create CDF variables that may be processed by other CDF statements such as >>IF (see >>IF). The >>SET CDF directive (see >>SET) provides another way to create them.
CONSTANT option is not specified, but such names are not recommended.
The CONSTANT option is valid only in conjunction with literal-1. When CONSTANT is specified, the CDF variable that is created may be used within your regular COBOL code as if it were a literal value. Without this option, the CDF variable may only be referenced on other CDF statements.
The OFF option is used to create a variable without assigning it any value.
The PARAMETER option is used to create a variable whose value is that of the environment variable of the same name. Note that this value assignment occurs at compilation time, not program execution time.
Without the OVERRIDE option, cdf-variable-1 must not yet have been defined. When the OVERRIDE option is specified, cdf-variable-1 will be created with the specified value if it had not yet been defined. If it had already been defined, it will be redefined with the new value.
>>IF CDF-Conditional-Expression-1
~~~~
   [ Program-Source-Lines-1 ]
[ >>ELIF CDF-Conditional-Expression-2
  ~~~~~~
   [ Program-Source-Lines-2 ] ]...
[ >>ELSE
  ~~~~~~
   [ Program-Source-Lines-3 ] ]
>>END-IF
~~~~~~~~
{ cdf-variable-1 } IS [ NOT ] { DEFINED }
{ literal-1 } ~~~ { ~~~~~~~ }
{ SET }
{ ~~~ }
{ CDF-RelOp { cdf-variable-2 } }
{ { literal-2 } }
>= or GREATER THAN OR EQUAL TO
~~~~~~~ ~~ ~~~~~
> or GREATER THAN
~~~~~~~
<= or LESS THAN OR EQUAL TO
~~~~ ~~ ~~~~~
< or LESS THAN
~~~~
= or EQUAL TO
~~~~~
<> or EQUAL TO (with "NOT")
~~~~~
The >>IF CDF directive causes the GnuCOBOL compiler to process or ignore COBOL source statements, CDF text-manipulation statements and/or CDF directives depending upon the value of one or more conditional expressions based upon CDF variables.
IS, THAN and TO are optional and may be omitted. The presence or absence of these words has no effect on the program.
Each >>IF directive must be terminated by an >>END-IF directive.
There may be any number of >>ELIF clauses following an >>IF, including zero.
There may be no more than one >>ELSE clause following an >>IF. When >>ELSE is used, it must follow the >>IF and all >>ELIF clauses.
Only one of the Program-Source-Lines-n blocks of statements within the scope of an >>IF … >>END-IF may be processed by the compiler. Which one (if any) gets processed is decided as follows:
Each CDF-Conditional-Expression-n is evaluated in turn, starting with that of the >>IF and proceeding through any >>ELIF clauses that may be present, until one evaluates to TRUE. Once one of them evaluates to TRUE, the Program-Source-Lines-n block of code that corresponds to the TRUE CDF-Conditional-Expression-n will be the one that is processed. All others within the >>IF->>END-IF scope will be ignored.
If none of the CDF-Conditional-Expression-n values evaluates to TRUE, and there is an >>ELSE clause, the Program-Source-Lines-3 block of statements following the >>ELSE clause will be processed by the compiler and all others within the >>IF->>END-IF scope will be ignored.
If none of the CDF-Conditional-Expression-n values evaluates to TRUE and there is no >>ELSE clause, then none of the Program-Source-Lines-n blocks of statements within the >>IF->>END-IF scope will be processed by the compiler.
>>IF->>END-IF structure.
The DEFINED option tests whether cdf-variable-1 has been defined but not yet assigned a value (>>DEFINE … OFF); use the NOT option to test for the variable not being defined.
The SET option tests whether cdf-variable-1 has been given a value, either via a >>SET statement or via a >>DEFINE without the OFF option.
Unlike the standard IF statement (see IF), multiple comparisons cannot be ANDed or ORed together; you may, however, nest a second >>IF inside the first to simulate an AND, and an OR may be simulated via the >>ELIF option.
The <> symbol stands for NOT EQUAL TO.
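The selection rules above can be sketched with a small conditional-compilation example. The CDF variable TRACE-LEVEL and the program name are hypothetical; only one of the three DISPLAY statements should reach the compiler, chosen by the first conditional expression that evaluates to TRUE.

```cobol
       >>DEFINE TRACE-LEVEL AS 2
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IFDEMO.
       PROCEDURE DIVISION.
       >>IF TRACE-LEVEL IS NOT DEFINED
           DISPLAY "tracing not configured"
       >>ELIF TRACE-LEVEL >= 2
           DISPLAY "verbose tracing compiled in"
       >>ELSE
           DISPLAY "basic tracing compiled in"
       >>END-IF
           GOBACK.
```

The directives start in column eight, as required in Fixed Format Mode. Changing the literal on the >>DEFINE line and recompiling selects a different branch without touching the procedure code.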
>>SET { [ CONSTANT ] cdf-variable-1 [ literal-1 ] }
~~~~~ { ~~~~~~~~ }
{ SOURCEFORMAT AS FIXED|FREE }
{ ~~~~~~~~~~~~ ~~~~~ ~~~~ }
{ NOFOLDCOPYNAME }
{ ~~~~~~~~~~~~~~ }
{ FOLDCOPYNAME AS UPPER|LOWER }
~~~~~~~~~~~~ ~~~~~ ~~~~~
The >>SET CDF directive provides an alternate means of performing the actions of the >>DEFINE and >>SOURCE directives, as well as a means of controlling the compiler’s -free switch, -fixed switch and -ffold-copy switch from within program source code.
AS is optional (only on the SOURCEFORMAT and FOLDCOPYNAME clauses) and may be included, or not, at the discretion of the programmer. The presence or absence of this word has no effect upon the program.
END PROGRAM or END FUNCTION directive is encountered in the input source.
The FOLDCOPYNAME option provides the equivalent of specifying the compiler -ffold-copy=xxx switch, where xxx is either UPPER or LOWER.
The NOFOLDCOPYNAME option turns off the effect of either the >>SET FOLDCOPYNAME statement or the compiler -ffold-copy=xxx switch.
If the CONSTANT option is used, literal-1 must also be used. This option provides another means of defining constants that may be used anywhere in the program that a literal could be specified.
These forms of the >>SET CDF directive provide equivalent functionality to the >>DEFINE and >>SOURCE directives, as follows:
>>SET form                                 Equivalent directive
>>SET cdf-variable-1                       >>DEFINE cdf-variable-1 AS OFF
>>SET cdf-variable-1 AS literal-1          >>DEFINE cdf-variable-1 AS literal-1
>>SET CONSTANT cdf-variable-1 literal-1    >>DEFINE CONSTANT cdf-variable-1 literal-1
>>SET SOURCEFORMAT AS FIXED                >>SOURCE FORMAT IS FIXED
>>SET SOURCEFORMAT AS FREE                 >>SOURCE FORMAT IS FREE
>>SET XFD literal-1                        [to do]
>>SET Micro-Focus-Directive                [to do]
>>SOURCE FORMAT IS FIXED|FREE|VARIABLE
~~~~~~~~           ~~~~~ ~~~~ ~~~~~~~~
The >>SOURCE CDF directive puts the compiler into FIXED or FREE source-code format mode. This, in effect, provides yet another mechanism for controlling the compiler’s -free switch and -fixed switch.
FORMAT and IS are optional and may be included, or not, at the discretion of the programmer. The presence or absence of these words has no effect upon the program.
You may switch between FIXED and FREE mode as desired.
You may also use the >>SET CDF directive to perform this function.
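A minimal sketch of switching source formats from within the source file follows. It assumes the compiler starts in Fixed Format Mode, so the directive itself begins in column eight; everything after it is written in free format. The program name is hypothetical.

```cobol
       >>SOURCE FORMAT IS FREE
identification division.
program-id. srcdemo.
procedure division.
    *> From here on, column positions are no longer significant.
    display "compiled from free-format source"
    goback.
```

Placing the directive on the first line keeps the choice of format with the source file rather than with the command line used to compile it.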
>>TURN { exception-name-1 [ file-name-1 ]... }...
~~~~~~
{ OFF }
{ ~~~ }
{ CHECKING ON [ WITH LOCATION ] }
~~~~~~~~ ~~ ~~~~~~~~
The directive will (de-)activate exception checks.
>>D
~~~
This directive marks a floating debugging line. If debugging mode is not active, the compiler removes the line; otherwise, it ignores the >>D prefix and compiles the rest of the line.
>>DISPLAY source-text [ VCS = version-string ]
~~~~~~~~~               ~~~
The directive is a v1.0 extension and will display messages during compilation.
>>PAGE
~~~~~~
The directive allows usage of the IBM paging controls EJECT, SKIP1, SKIP2, SKIP3 and TITLE.
>>LISTING {ON}
~~~~~~~~~ {OFF}
The directive allows the program listing to be (de-)activated.
>>LEAP-SECONDS
~~~~~~~~~~~~~~
The >>LEAP-SECONDS CDF directive is syntactically recognized but is otherwise non-functional.
Allows for more than 60 seconds per minute.
$ (Dollar) Directives - Active.
These directives are active and have the same function as the ones starting with >>:
$DEFINE
$DISPLAY ON|OFF
$IF
$ELIF
$ELSE
$ELSE-IF
$END
$SET
It is recommended to use the standard directives instead of the MF directives (when possible), as these have a higher chance of being portable.
$ (Dollar) Directives - Not Active.
These are NOT active and will produce a warning message:
$DISPLAY VCS ...
Recognised but otherwise ignored:
@OPTIONS options-text
Additional Micro-Focus directives accepted:
ADDRSV | ADD-RSV literal-1
ADDSYN | ADD-SYN literal-1 = literal-2
ASSIGN "EXTERNAL" | "DYNAMIC"
BOUND
CALLFH literal-1
COMP1 | COMP-1 "BINARY" | "FLOAT"
FOLDCOPYNAME | FOLD-COPY-NAME AS "UPPER" | "LOWER"
MAKESYN | MAKE-SYN
NOBOUND | NO-BOUND
NOFOLDCOPYNAME | NOFOLD-COPY-NAME | NO-FOLD-COPY-NAME
OVERRIDE literal-1 = literal-2
REMOVE literal-1
SOURCEFORMAT | SOURCE-FORMAT "FIXED" | "FREE" | "VARIABLE"
SSRANGE "2"
NOSSRANGE | NO-SSRANGE
Offers support for MF Compiler Directives.
GnuCOBOL defines compilation variables when various conditions are true. If the condition associated with a variable is false, the variable is not defined.
DEBUG: The -d debug flag is specified.
EXECUTABLE: The module being compiled contains the main program.
GCCOMP: The size of a COMP item is determined according to the GnuCOBOL scheme, where for a PICTURE of length:
item = 1 byte
item = 2 bytes
item = 4 bytes
item = 8 bytes
GNUCOBOL: GnuCOBOL is compiling the source unit.
HOSTSIGNS: A signed packed decimal item’s value may be considered NUMERIC if sign = X"F".
IBMCOMP: The size of a COMP item is determined according to the IBM scheme, where for a PICTURE of length:
item = 2 bytes
item = 4 bytes
item = 8 bytes
MODULE: The element being compiled does not contain the main program.
NOHOSTSIGNS: A signed packed decimal item’s value may not be considered NUMERIC if sign = X"F".
NOIBMCOMP: The size of a COMP item is not determined according to the IBM scheme.
NOSTICKY-LINKAGE: Sticky linkage (linkage section items remaining allocated between invocations) is not active.
NOTRUNC: Numeric data items are truncated according to their internal representation.
P64: Pointers are greater than 32 bits.
STICKY-LINKAGE: Sticky linkage (linkage section items remaining allocated between invocations) is active.
TRUNC: Numeric data items are truncated according to their PICTURE clauses.
While still supported, the following variables may well be removed in the future and should not be used. See GCCOMP and GNUCOBOL instead:
OCCOMP: The size of a COMP item is determined according to the GnuCOBOL scheme, where for a PICTURE of length:
item = 1 byte
item = 2 bytes
item = 4 bytes
item = 8 bytes
OPENCOBOL: GnuCOBOL is compiling the source unit.
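As a hedged illustration of how one of these compiler-defined variables might be tested, the sketch below checks P64 with >>IF. The program name is hypothetical, and the use of IS DEFINED (rather than IS SET) to test a compiler-defined variable is an assumption based on the statement above that the variable is simply not defined when its condition is false.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VARDEMO.
       PROCEDURE DIVISION.
       >>IF P64 IS DEFINED
           DISPLAY "compiled with pointers wider than 32 bits"
       >>ELSE
           DISPLAY "compiled with 32-bit pointers"
       >>END-IF
           GOBACK.
```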
[{ IDENTIFICATION } DIVISION. ]
{ ~~~~~~~~~~~~~~ } ~~~~~~~~
{ ID }
~~
{ PROGRAM-ID. } { program name } .
{ ~~~~~~~~~~ } { literal-1 } [ AS { literal-2 } ] [ Type-clause ] .
{ FUNCTION-ID. } { literal-3 } [ AS literal-4 ] .
~~~~~~~~~~~ { function-name } .
{ OPTIONS. }
~~~~~~~
[ DEFAULT ROUNDED MODE IS {AWAY-FROM-ZERO }
~~~~~~~ ~~~~~~~ {NEAREST-AWAY-FROM-ZERO }
{NEAREST-EVEN }
{NEAREST-TOWARDS-ZERO }
{PROHIBITED }
{TOWARDS-GREATER }
{TOWARDS-LESSER }
{TRUNCATION }]
[ ENTRY-CONVENTION IS {COBOL }
~~~~~~~~~~~~~~~~ {EXTERN }
{STDCALL }]
[ AUTHOR. comment-1. ]
~~~~~~
[ DATE-COMPILED. comment-2. ]
~~~~~~~~~~~~~
[ DATE-MODIFIED. comment-3. ]
~~~~~~~~~~~~~
[ DATE-WRITTEN. comment-4. ]
~~~~~~~~~~~~
[ INSTALLATION. comment-5. ]
~~~~~~~~~~~~
[ REMARKS. comment-6. ]
~~~~~~~
[ SECURITY. comment-7. ]
~~~~~~~~
The AUTHOR, DATE-COMPILED, DATE-MODIFIED, DATE-WRITTEN, INSTALLATION, REMARKS and SECURITY paragraphs are supported by GnuCOBOL only to provide compatibility with programs written for the ANS1974 (or earlier) standards. As of the ANS1985 standard, these paragraphs have become obsolete and should not be used in new programs.
IS [ COMMON ] [ INITIAL|RECURSIVE PROGRAM ]
~~~~~~ ~~~~~~~ ~~~~~~~~~
The identification division provides basic identification of the program by giving it a name and optionally defining some high-level characteristics via the eight pre-defined paragraphs that may be specified.
AS, IS and PROGRAM are optional and may be included, or not, at the discretion of the programmer. The presence or absence of these words has no effect upon the program.
PROGRAM-ID is specified. If one is coded, either COMMON, COMMON INITIAL or COMMON RECURSIVE must be specified.
While the IDENTIFICATION DIVISION or ID DIVISION header is optional, the PROGRAM-ID / FUNCTION-ID paragraphs are not; only one or the other, however, may be coded.
The PROGRAM-ID and FUNCTION-ID paragraphs serve to identify the program to the external (i.e. operating system) environment. If there is no AS clause present, the program-id will serve as that external identification. If there is an AS clause specified, that specified literal will serve as the external identification. For the remainder of this document, that "external identification" will be referred to as the primary entry-point name.
The INITIAL, COMMON and RECURSIVE words are used only within subprograms serving as subroutines. Their purposes are as follows:
COMMON should be used only within subprograms that are nested subprograms. A nested subprogram declared as COMMON may be called from any nested program in the source file being compiled, not just those "above" it in the nesting structure.
The RECURSIVE clause, if specified, will cause the compiler to generate different object code for the subprogram that will enable it to invoke itself and to properly return back to the program that invoked it.
User-defined functions (i.e. FUNCTION-ID) are always recursive.
The INITIAL clause, if specified, guarantees the subprogram will be in its initial (i.e. compiled) state each and every time it is executed, not just the first time.
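The effect of the INITIAL clause can be sketched with a nested subprogram that is called twice. The program names MAINPROG and COUNTER and the data name CALL-COUNT are hypothetical. Because COUNTER is declared INITIAL, its working storage should be back in its compiled state on every call, so the counter is expected to start from zero each time rather than accumulate.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MAINPROG.
       PROCEDURE DIVISION.
           CALL "COUNTER"
           CALL "COUNTER"
           GOBACK.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. COUNTER IS INITIAL PROGRAM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CALL-COUNT  PIC 9 VALUE 0.
       PROCEDURE DIVISION.
      * With INITIAL, CALL-COUNT is reset before each execution.
           ADD 1 TO CALL-COUNT
           DISPLAY "CALL-COUNT = " CALL-COUNT
           GOBACK.
       END PROGRAM COUNTER.
       END PROGRAM MAINPROG.
```

Removing the IS INITIAL PROGRAM phrase and recompiling is a simple way to observe the difference, since the counter would then be retained between calls.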
ENVIRONMENT DIVISION.
~~~~~~~~~~~ ~~~~~~~~
[ CONFIGURATION SECTION. ]
  ~~~~~~~~~~~~~ ~~~~~~~
[ SOURCE-COMPUTER. Compilation-Computer-Specification . ]
  ~~~~~~~~~~~~~~~
[ OBJECT-COMPUTER. Execution-Computer-Specification . ]
  ~~~~~~~~~~~~~~~
[ SPECIAL-NAMES. Program-Configuration-Specification . ]
  ~~~~~~~~~~~~~
[ REPOSITORY. Function-Specification... . ]
  ~~~~~~~~~~
[ INPUT-OUTPUT SECTION. ]
  ~~~~~~~~~~~~ ~~~~~~~
[ FILE-CONTROL. General-File-Description... . ]
  ~~~~~~~~~~~~
[ I-O-CONTROL. File-Buffering Specification... . ]
  ~~~~~~~~~~~
This division defines the external computer environment in which the program will be operating. This includes defining any files that the program may be accessing.
Each section contains a series of paragraphs (SOURCE-COMPUTER and OBJECT-COMPUTER, for example), each of which serves a specific purpose. If no code is required for the purpose one of the paragraphs serves, the entire paragraph may be omitted.
If none of its sections are needed, the ENVIRONMENT DIVISION. header itself may be omitted.
CONFIGURATION SECTION.
~~~~~~~~~~~~~ ~~~~~~~
[ SOURCE-COMPUTER. Compilation-Computer-Specification . ]
  ~~~~~~~~~~~~~~~
[ OBJECT-COMPUTER. Execution-Computer-Specification . ]
  ~~~~~~~~~~~~~~~
[ SPECIAL-NAMES. Program-Configuration-Specification . ]
  ~~~~~~~~~~~~~
[ REPOSITORY. Function-Specification... . ]
  ~~~~~~~~~~
This section defines the computer system upon which the program is being compiled and executed and also specifies any special environmental configuration or compatibility characteristics.
CONFIGURATION SECTION. header may be omitted from the program.
SOURCE-COMPUTER. computer-name [ WITH DEBUGGING MODE ] .
~~~~~~~~~~~~~~~                       ~~~~~~~~~ ~~~~
This paragraph defines the computer upon which the program is being compiled and provides one way in which debugging code embedded within the program may be activated.
WITH is optional and may be omitted. The presence or absence of this word has no effect upon the program.
A nested program inherits the SOURCE-COMPUTER settings of its parent program.
This paragraph must precede the OBJECT-COMPUTER paragraph, if any.
The DEBUGGING MODE clause, if present, will inform the compiler that debugging lines (those with a ‘D’ in column 7 if Fixed Source Mode is in effect, or those prefixed with a >>D if Free Source Mode is in effect), which are normally treated as comments, are to be compiled.
Even without the DEBUGGING MODE clause, it is still possible to compile debugging lines: they may also be compiled by specifying the -fdebugging-line switch to the GnuCOBOL compiler.
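A minimal fixed-format sketch of a debugging line follows. The computer name MY-PC, the program name and the data name TOTAL are hypothetical. The line with ‘D’ in column 7 should be compiled only because WITH DEBUGGING MODE is coded; without that clause (and without the -fdebugging-line switch) it would be treated as a comment.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DBGDEMO.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SOURCE-COMPUTER. MY-PC WITH DEBUGGING MODE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  TOTAL  PIC 9(4) VALUE 0.
       PROCEDURE DIVISION.
           ADD 42 TO TOTAL
      D    DISPLAY "DEBUG: TOTAL = " TOTAL
           DISPLAY "Done"
           GOBACK.
```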
OBJECT-COMPUTER. [ computer-name ]
~~~~~~~~~~~~~~~
[ MEMORY SIZE IS integer-1 WORDS|CHARACTERS ]
~~~~~~ ~~~~ ~~~~~ ~~~~~~~~~~
[ PROGRAM COLLATING SEQUENCE IS alphabet-name-1 ]
~~~~~~~~~
[ SEGMENT-LIMIT IS integer-2 ]
~~~~~~~~~~~~~
[ CHARACTER CLASSIFICATION IS { locale-name-1 } ]
~~~~~~~~~~~~~~ { LOCALE }
{ ~~~~~~ }
{ USER-DEFAULT }
{ ~~~~~~~~~~~~ }
{ SYSTEM-DEFAULT }
~~~~~~~~~~~~~~
.
The MEMORY SIZE and SEGMENT-LIMIT clauses are syntactically recognized but are otherwise non-functional.
This paragraph describes the computer upon which the program will execute.
The computer-name, if specified, must immediately follow the OBJECT-COMPUTER paragraph name. The remaining clauses may be coded in any sequence.
CHARACTER, IS, PROGRAM and SEQUENCE are optional and may be omitted. The presence or absence of these words has no effect on the program.
This paragraph must follow the SOURCE-COMPUTER paragraph, if any.
The OBJECT-COMPUTER paragraph is not allowed in a nested subprogram. A nested program inherits the OBJECT-COMPUTER settings of its parent program.
The COLLATING SEQUENCE clause allows you to specify a customized character collating sequence to be used when alphanumeric values are compared to one another. Data will still be stored in the character set native to the computer, but the logical sequence in which characters are ordered for comparison purposes can be altered from that defined by the computer’s native character set. The alphabet-name-1 you specify needs to be defined in the SPECIAL-NAMES (see SPECIAL-NAMES) paragraph.
If no COLLATING SEQUENCE clause is specified, the collating sequence implied by the character set native to the computer (usually ASCII) will be used.
The optional CLASSIFICATION clause may be used to specify a locale for the environment in which the program will execute, for the purpose of influencing the upper-case and lower-case mappings of characters for the UPPER-CASE (see UPPER-CASE) and LOWER-CASE (see LOWER-CASE) intrinsic functions and the classification of characters for the ALPHABETIC, ALPHABETIC-LOWER and ALPHABETIC-UPPER class tests. The definitions of these classes are taken from the cultural convention specification (LC_CTYPE) of the specified locale.
The meanings of the four locale specifications are as follows:
locale-name-1 references a LOCALE (see SPECIAL-NAMES) definition.
LOCALE refers to the current locale (in effect at the time the program is executed).
USER-DEFAULT references the default locale specified for the user currently executing this program.
SYSTEM-DEFAULT denotes the default locale specified for the computer upon which the program is executing.
The absence of a CLASSIFICATION clause will cause character classification to occur according to the rules for the computer’s native character set (ASCII, EBCDIC, etc.).
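The COLLATING SEQUENCE clause can be sketched by pairing it with an ALPHABET definition in SPECIAL-NAMES. The alphabet name MAINFRAME-ORDER, the program name and the data names are hypothetical. The sketch assumes that the program collating sequence is applied to the alphanumeric comparison shown; in EBCDIC order lower-case letters sort before upper-case letters, which is the reverse of ASCII order.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COLLDEMO.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       OBJECT-COMPUTER.
           PROGRAM COLLATING SEQUENCE IS MAINFRAME-ORDER.
       SPECIAL-NAMES.
           ALPHABET MAINFRAME-ORDER IS EBCDIC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  LOWER-A  PIC X VALUE "a".
       01  UPPER-A  PIC X VALUE "A".
       PROCEDURE DIVISION.
      * The data is stored natively; only the ordering changes.
           IF LOWER-A < UPPER-A
               DISPLAY "lower-case sorts first (EBCDIC order)"
           ELSE
               DISPLAY "upper-case sorts first (ASCII order)"
           END-IF
           GOBACK.
```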
SPECIAL-NAMES.
~~~~~~~~~~~~~
[ CALL-CONVENTION integer-1 IS mnemonic-name-1 ]
~~~~~~~~~~~~~~~
[ CONSOLE IS CRT ]
~~~~~~~ ~~~
[ CRT STATUS IS identifier-1 ]
~~~ ~~~~~~
[ CURRENCY SIGN IS literal-1 ]
~~~~~~~~ ~~~~
[ CURSOR IS identifier-2 ]
~~~~~~
[ DECIMAL-POINT IS COMMA ]
~~~~~~~~~~~~~ ~~~~~
[ EVENT STATUS IS identifier-3 ]
~~~~~ ~~~~~~
[ LOCALE locale-name-1 IS literal-2 ]...
~~~~~~
[ NUMERIC SIGN IS TRAILING SEPARATE ]
~~~~~~~ ~~~~ ~~~~~~~~ ~~~~~~~~
[ SCREEN CONTROL IS identifier-4 ]
~~~~~~ ~~~~~~~
[ device-name-1 IS mnemonic-name-2 ]...
[ feature-name-1 IS mnemonic-name-3 ]...
[ Alphabet-Clause ]...
[ Class-Definition-Clause ]...
[ Switch-Definition-Clause ]...
[ Symbolic-Characters-Clause ]...
.
The EVENT STATUS and SCREEN CONTROL clauses are syntactically recognized but are otherwise non-functional.
The SPECIAL-NAMES paragraph provides a means for specifying various program and operating environment configuration options.
The clauses of the SPECIAL-NAMES paragraph may be coded in any order.
IS is optional and may be omitted. The presence or absence of this word has no effect upon the program.
The SPECIAL-NAMES paragraph is not allowed in a nested subprogram. A nested program inherits the SPECIAL-NAMES settings of its parent program.
The CALL-CONVENTION clause allows a decimal integer, representing a series of ON/OFF switch settings, to be associated with a mnemonic name which may then be coded on a CALL statement (see CALL). The switch settings defined by this mnemonic will then control how the linkage to a subroutine invoked by the CALL statement that references mnemonic-name-1 will be handled.
The CONSOLE IS CRT clause, if specified, will cause a DISPLAY statement lacking an explicit UPON clause to be treated as a DISPLAY screen-data-item statement (see DISPLAY screen-data-item), and any ACCEPT statement lacking a FROM clause to be treated as an ACCEPT screen-data-item statement (see ACCEPT screen-data-item).
If the CRT STATUS clause is not specified, an implicit COB-CRT-STATUS identifier (with a PICTURE 9(4)) will be allocated for the purpose of receiving screen ACCEPT statuses. If CRT STATUS is specified, then identifier-1 must be defined in the program as a PICTURE 9(4) field.
The CURRENCY SIGN clause may be used to redefine the character to be used as a currency sign in a PICTURE (see PICTURE) clause. The default currency sign is a dollar-sign (‘$’). You may specify any character except 0-9, A-Z, a-z, +, -, ,, ., *, /, ;, (, ), =, \, quote (‘"’) or space.
The CURSOR IS clause allows you to specify a 4- or 6-character data item into which the cursor’s screen location is stored at the time a screen ACCEPT is satisfied. The value will be returned as rrcc or rrrccc, depending upon the length of the specified identifier-2, where rr and rrr represent the row number (starting at zero) and cc and ccc represent the column number (also starting at zero). There is no default data item allocated for this data if the CURSOR IS clause is not specified, and it is the programmer’s responsibility to define identifier-2 if the clause is specified.
The DECIMAL-POINT IS COMMA clause reverses the definition of the ‘,’ and ‘.’ characters when they are used as PICTURE editing symbols and within numeric literals. This can have unwanted side-effects - see Punctuation.
The LOCALE clause may be used to associate external OS-defined locale names (literal-2) with an internal name (locale-name-1) that may then be referenced within the program. Locale names are defined by the Operating System and/or C compiler GnuCOBOL will be utilizing on your computer.
af_ZA, am_ET, ar_AE, ar_BH, ar_DZ, ar_EG, ar_IQ, ar_JO, ar_KW, ar_LB, ar_LY, ar_MA, ar_OM, ar_QA, ar_SA, ar_SY, ar_TN, ar_YE, arn_CL, as_IN, az_Cyrl_AZ, az_Latn_AZ
ba_R, be_BY, bg_BG, bn_IN, bo_BT, bo_CN, br_FR, bs_Cyrl_BA, bs_Latn_BA
ca_ES, cs_CZ, cy_GB
da_DK, de_AT, de_CH, de_DE, de_LI, de_LU, dsb_DE, dv_MV
el_GR, en_029, en_AU, en_BZ, en_CA, en_GB, en_IE, en_IN, en_JM, en_MY, en_NZ, en_PH, en_SG, en_TT, en_US, en_ZA, en_ZW, es_AR, es_BO, es_CL, es_CO, es_CR, es_DO, es_EC, es_ES, es_GT, es_HN, es_MX, es_NI, es_PA, es_PE, es_PR, es_PY, es_SV, es_US, es_UY, es_VE, et_EE, eu_ES
fa_IR, fi_FI, fil_PH, fo_FO, fr_BE, fr_CA, fr_CH, fr_FR, fr_LU, fr_MC, fy_NL
ga_IE, gbz_AF, gl_ES, gsw_FR, gu_IN
ha_Latn_NG, he_IL, hi_IN, hr_BA, hr_HR, hu_HU, hy_AM
id_ID, ig_NG, ii_CN, is_IS, it_CH, it_IT, iu_Cans_CA, iu_Latn_CA
ja_JP
ka_GE, kh_KH, kk_KZ, kl_GL, kn_IN, ko_KR, kok_IN, ky_KG
lb_LU, lo_LA, lt_LT, lv_LV
mi_NZ, mk_MK, ml_IN, mn_Cyrl_MN, mn_Mong_CN, moh_CA, mr_IN, ms_BN, ms_MY, mt_MT
nb_NO, ne_NP, nl_BE, nl_NL, nn_NO, ns_ZA
oc_FR, or_IN
pa_IN, pl_PL, ps_AF, pt_BR, pt_PT
qut_GT, quz_BO, quz_EC, quz_PE
rm_CH, ro_RO, ru_RU, rw_RW
sa_IN, sah_RU, se_FI, se_NO, se_SE, si_LK, sk_SK, sl_SI, sma_NO, sma_SE, smj_NO, smj_SE, smn_FI, sms_FI, sq_AL, sr_Cyrl_BA, sr_Cyrl_CS, sr_Latn_BA, sr_Latn_CS, sv_FI, sv_SE, sw_KE, syr_SY
ta_IN, te_IN, tg_Cyrl_TJ, th_TH, tk_TM, tmz_Latn_DZ, tn_ZA, tr_IN, tr_TR, tt_RU
ug_CN, uk_UA, ur_PK, uz_Cyrl_UZ, uz_Latn_UZ
vi_VN
wen_DE, wo_SN
xh_ZA
yo_NG
zh_CN, zh_HK, zh_MO, zh_SG, zh_TW, zu_ZA
NUMERIC SIGN TRAILING SEPARATE specification causes all signed numeric USAGE DISPLAY data items to be created as if the SIGN IS TRAILING SEPARATE CHARACTER clause was included in their definitions.
device-name-1 IS mnemonic-name-2 clause allows you to specify an alternate name (device-name-1) for one of the built-in GnuCOBOL device name mnemonic-name-2. The list of device names built-into GnuCOBOL, and the physical device associated with that name, are as follows:
CONSOLE: This is the (screen-mode) display of the PC or Unix system.
STDIN, SYSIN, SYSIPT: These devices (they are all synonymous) represent standard system input (pipe 0). On a PC or UNIX system, this is typically the keyboard. The contents of a file may be delivered to a GnuCOBOL program for access via one of these device names by adding the sequence ‘0< filename’ to the end of the program’s execution command.
PRINTER, STDOUT, SYSLIST, SYSLST, SYSOUT: These devices (they are all synonymous) represent standard system output (pipe 1). On a PC or UNIX system, this is typically the display. Output sent to one of these devices by a GnuCOBOL program can be sent to a file by adding the sequence ‘1> filename’ to the end of the program’s execution command.
STDERR, SYSERR: These devices (they are synonymous) represent standard system error output (pipe 2). On a PC or UNIX system, this is typically the display. Output sent to one of these devices by a GnuCOBOL program can be sent to a file by adding the sequence ‘2> filename’ to the end of the program’s execution command.
The feature-name-1 IS mnemonic-name-3 clause allows mnemonic names to be assigned to any of the 13 printer channel (i.e. vertical page positioning) feature names: Cnn (nn = 01-12) and CSP. Once a channel position has been assigned a mnemonic name, statements of the form WRITE record-name AFTER ADVANCING mnemonic-name-3 may be coded to write the specified print record at the channel position assigned to mnemonic-name-3.
Printers supporting channel positioning are generally mainframe-type line printers. When writing to printers that do not support channel positioning, a formfeed will be issued to the printer.
The CSP positioning option stands for “No Spacing”. Testing on a MinGW build of GnuCOBOL shows that this too results in a formfeed being issued.
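The device-name clause described above can be illustrated with a minimal sketch. The program name DEVDEMO and the mnemonic names Normal-Output and Error-Output are hypothetical, chosen only for this illustration; SYSOUT and SYSERR are taken from the device list above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEVDEMO.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      *> Hypothetical mnemonic names for two built-in devices
           SYSOUT IS Normal-Output
           SYSERR IS Error-Output.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Written to standard output (pipe 1)
           DISPLAY "Report line" UPON Normal-Output
      *> Written to standard error (pipe 2)
           DISPLAY "Something went wrong" UPON Error-Output
           STOP RUN.
```

If the compiled program is run with ‘2> filename’ appended to its execution command, as described above, only the second message should end up in that file.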
ALPHABET alphabet-name-1 IS { ASCII             }
~~~~~~~~                    { ~~~~~             }
                            { EBCDIC            }
                            { ~~~~~~            }
                            { NATIVE            }
                            { ~~~~~~            }
                            { STANDARD-1        }
                            { ~~~~~~~~~~        }
                            { STANDARD-2        }
                            { ~~~~~~~~~~        }
                            { Literal-Clause... }

literal-1 [ { THRU|THROUGH literal-2 } ]
            { ~~~~ ~~~~~~~            }
            { {ALSO literal-3}...     }
               ~~~~
The ALPHABET clause relates alphabet-name-1 to a specified character code set or collating sequence, including one you define yourself using the literal-1 option.
IS is optional and may be omitted. The presence or absence of this word has no effect upon the program.
THRU and THROUGH are interchangeable.
GnuCOBOL considers ASCII, STANDARD-1 and STANDARD-2 to be interchangeable.
NATIVE specifies the system default character set.
NATIVE character set, either by its actual text value (alphanumeric quoted character) or by ordinal position in the NATIVE character set (integer),
NATIVE character set.
You may specify any of the figurative constants SPACE, SPACES, ZERO, ZEROS, ZEROES, QUOTE, QUOTES, HIGH-VALUE, HIGH-VALUES, LOW-VALUE or LOW-VALUES for any of the literal-1, literal-2 or literal-3 specifications.
An alphabet-name defined by this clause may be referenced in CODE-SET, COLLATING SEQUENCE, or SYMBOLIC CHARACTERS clauses elsewhere in the program.
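As a rough sketch of a user-defined alphabet, the following program declares one with the Literal-Clause form and names it as the program collating sequence. The names ALPHDEMO, Lower-First, Item-1 and Item-2 are hypothetical, and the use of the PROGRAM COLLATING SEQUENCE clause of OBJECT-COMPUTER to apply the alphabet to comparisons is an assumption of this example rather than something stated in the text above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ALPHDEMO.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       OBJECT-COMPUTER.
      *> Assumption: comparisons follow this alphabet
           PROGRAM COLLATING SEQUENCE IS Lower-First.
       SPECIAL-NAMES.
      *> Hypothetical alphabet: lower-case letters come first
           ALPHABET Lower-First IS 'a' THRU 'z'
                                   'A' THRU 'Z'.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Item-1            PIC X VALUE 'b'.
       01  Item-2            PIC X VALUE 'B'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Report which of the two items collates first
           IF Item-1 < Item-2
               DISPLAY "'b' collates before 'B'"
           ELSE
               DISPLAY "'B' collates before 'b'"
           END-IF
           STOP RUN.
```

Removing the OBJECT-COMPUTER paragraph and recompiling gives a simple way to compare the behaviour against the NATIVE character set.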
CLASS class-name-1 IS { literal-1 [ THRU|THROUGH literal-2 ] }...
~~~~~                               ~~~~ ~~~~~~~
IS is optional and may be omitted. The presence or absence of this word has no effect upon the program.
THRU and THROUGH are interchangeable.
The following example defines a class named Hexadecimal, the definition of which specifies the only characters that may be present in an alphanumeric data item if that data item is to be part of the Hexadecimal class:
CLASS Hexadecimal IS '0' THRU '9'
                     'A' THRU 'F'
                     'a' THRU 'f'
Once class Hexadecimal has been defined, program code could then use a statement such as IF input-item IS Hexadecimal to determine whether all of the characters in a data item are valid according to that class.
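A complete program built around the class definition might look like the sketch below. The program name HEXCHK, the data item Input-Item and the two sample values are hypothetical and exist only to exercise the class test.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HEXCHK.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
           CLASS Hexadecimal IS '0' THRU '9'
                                'A' THRU 'F'
                                'a' THRU 'f'.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Input-Item        PIC X(4).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Every character is a member of the class
           MOVE "1aF9" TO Input-Item
           PERFORM CHECK-ITEM
      *> 'G' is not a member of the class
           MOVE "12G4" TO Input-Item
           PERFORM CHECK-ITEM
           STOP RUN.
       CHECK-ITEM.
           IF Input-Item IS Hexadecimal
               DISPLAY Input-Item " is valid hexadecimal"
           ELSE
               DISPLAY Input-Item " is NOT valid hexadecimal"
           END-IF.
```

The class test is true only when every character of the data item belongs to the class, so a single stray character is enough to fail it.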
switch-name-1 [ IS mnemonic-name-1 ]
    [ ON STATUS IS condition-name-1 ]
      ~~
    [ OFF STATUS IS condition-name-2 ]
      ~~~
The switch-definition clause associates a condition-name with a run-time execution switch so that the status of that switch may be tested from within a program.
IS and STATUS are optional and may be omitted. The presence or absence of these words has no effect upon the program.
The valid switch-name-1 names are SWITCH-n (n = 0-36).
The names SWn (n = 0-15) are also valid; they correspond to SWITCH-0 through SWITCH-15, respectively. Also valid are SWITCH-16 through SWITCH-36, SWITCH 0 through SWITCH 26 and SWITCH A through SWITCH Z.
Switches are set on or off at run time through the COB_SWITCH_n run-time environment variable, where n will have the value ‘0’ through ‘15’. Any of these sixteen environment variables that have the value ON (regardless of upper- or lower-case value) will be considered to be set “on”. Any of these sixteen environment variables having no value at all or a value other than ON will be considered OFF.
Each switch must have at least one of the IS mnemonic-name-1, ON STATUS or OFF STATUS options defined for it, otherwise there will be no way to reference the switch from within a GnuCOBOL program.
The IS mnemonic-name-1 syntax provides a means for setting the switch to either an ON or OFF value via the SET statement (see SET).
The ON STATUS and OFF STATUS syntax provides a way of associating a condition-name with either the on or off status of the switch, so that status may be tested at execution time via the IF statement (see IF).
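The switch options described above can be combined as in this minimal sketch. The names SWDEMO, Debug-Switch, Debug-On and Debug-Off are hypothetical; the initial state of the switch depends on whether the COB_SWITCH_1 environment variable is set to ON when the program starts.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SWDEMO.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      *> Hypothetical names attached to run-time switch 1
           SWITCH-1 IS Debug-Switch
               ON STATUS IS Debug-On
               OFF STATUS IS Debug-Off.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Initial state comes from COB_SWITCH_1
           IF Debug-On
               DISPLAY "Debug switch is ON at startup"
           ELSE
               DISPLAY "Debug switch is OFF at startup"
           END-IF
      *> The mnemonic name lets the program change the switch
           SET Debug-Switch TO ON
           IF Debug-On
               DISPLAY "Debug switch is now ON"
           END-IF
           STOP RUN.
```

The condition names are used for testing and the mnemonic name for setting, which is why a switch that has neither cannot be referenced at all.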
SYMBOLIC CHARACTERS
~~~~~~~~
    { symbolic-character-1... IS|ARE integer-1... }...
    [ IN alphabet-name-1 ]
      ~~
This clause may be used to define your own figurative constants.
ARE, CHARACTERS and IS are optional and may be omitted. The presence or absence of these words has no effect upon the program.
There must be exactly as many integer-1 values specified as there are symbolic-character-1 names.
IN clause. The integer values are selecting characters from the alphabet by their ordinal position and not by their numeric value; thus, an integer of 15 will select the 15th character in the specified alphabet, regardless of the actual numeric value of the bit pattern that constitutes that character.
The following two SYMBOLIC CHARACTERS clauses are equivalent:
SYMBOLIC CHARACTERS NUL IS 1
                    SOH IS 2
                    BEL IS 8
                    DC1 IS 18
                    DC2 IS 19

SYMBOLIC CHARACTERS NUL SOH BEL DC1 DC2
                    ARE 1 2 8 18 19
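The ordinal-position rule can be tried out with a short sketch. It assumes an ASCII native character set, in which the asterisk has the numeric code 42 and therefore occupies ordinal position 43. The names SYMDEMO, Asterisk and Star-Char are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SYMDEMO.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      *> 43rd character of the alphabet; '*' if it is ASCII
           SYMBOLIC CHARACTERS Asterisk IS 43.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Star-Char         PIC X.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Use the symbolic character like a figurative constant
           MOVE Asterisk TO Star-Char
           DISPLAY "Symbolic character: " Star-Char
           STOP RUN.
```

On a system with a different native character set the same integer would select a different character, which is the point of the ordinal-position rule.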
REPOSITORY.
~~~~~~~~~~
    FUNCTION { function-prototype-name-1 [ AS literal-1 ] }...
    ~~~~~~~~ {                             ~~             }
             { intrinsic-function-name-1 [ AS literal-2 ] }
             {                             ~~             }
             { intrinsic-function-name-2 INTRINSIC        }
             {                           ~~~~~~~~~        }
             { ALL INTRINSIC                              }
               ~~~ ~~~~~~~~~
The REPOSITORY paragraph provides a way to control access to the various built-in intrinsic functions and any user defined functions that your program will be using.
A REPOSITORY paragraph is not allowed in a nested subprogram. A nested program inherits the REPOSITORY settings of its parent program.
The INTRINSIC clause allows you to flag one or more (or ALL) built-in intrinsic functions as being usable without the need to code the keyword FUNCTION in front of the function names.
As an alternative to using the ALL INTRINSIC clause, you may instead compile your GnuCOBOL programs using the -fintrinsics=ALL switch.
You may use the AS clause to provide an alias name for a built-in intrinsic function.
The following example allows all intrinsic functions to be used without the FUNCTION keyword, names two user-defined functions, MY-FUNCTION-1 and MY-FUNCTION-2, that will be used by the program and defines the aliases SIGMA for the intrinsic function STANDARD-DEVIATION and MF2 for MY-FUNCTION-2.
REPOSITORY.
FUNCTION ALL INTRINSIC.
FUNCTION MY-FUNCTION-1.
FUNCTION MY-FUNCTION-2 AS "MF2".
FUNCTION STANDARD-DEVIATION AS "SIGMA".
A special note about user-defined functions — because you must name a user-defined function that your program will be using in the REPOSITORY paragraph, you may always reference that function from your program’s procedure division without needing to use the FUNCTION keyword.
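The effect of the ALL INTRINSIC clause can be seen in a small sketch such as the one below. The program name REPODEMO and the data item Name-Text are hypothetical; UPPER-CASE, LOWER-CASE and TRIM are built-in intrinsic functions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REPODEMO.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       REPOSITORY.
           FUNCTION ALL INTRINSIC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Name-Text         PIC X(20) VALUE "  GnuCOBOL".
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> Intrinsic functions coded without FUNCTION
           DISPLAY UPPER-CASE(TRIM(Name-Text))
      *> The FUNCTION keyword may still be coded
           DISPLAY FUNCTION LOWER-CASE(Name-Text)
           STOP RUN.
```

Without the REPOSITORY paragraph (or the -fintrinsics=ALL switch), the first DISPLAY statement would need the FUNCTION keyword in front of each function name.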
[ INPUT-OUTPUT SECTION. ]
  ~~~~~~~~~~~~ ~~~~~~~
[ FILE-CONTROL. ]
  ~~~~~~~~~~~~
    [ SELECT-Statement... ]
[ I-O-CONTROL. ]
  ~~~~~~~~~~~
    [ MULTIPLE-FILE-Statement ]
    [ SAME-RECORD-Statement ]
The INPUT-OUTPUT section provides for the definition of any files the program will be accessing as well as control of the I/O buffering process against those files through the FILE-CONTROL and I-O-CONTROL paragraphs, respectively.
INPUT-OUTPUT SECTION. header itself may be omitted, otherwise it is normally required.
With the relaxed-syntax-check configuration option set to ‘yes’, the FILE-CONTROL and I-O-CONTROL paragraphs may be specified without the INPUT-OUTPUT SECTION header having been coded.
When more than one statement is coded in the I-O-CONTROL paragraph, the order in which those statements are coded is irrelevant.
SELECT [ [ NOT ] OPTIONAL ] file-name-1
~~~~~~     ~~~   ~~~~~~~~
    [ ASSIGN { TO    } [{ EXTERNAL }] [{ DISC|DISK      }] [{ identifier-1 }] ]
      ~~~~~~ { USING }  { ~~~~~~~~ }   { ~~~~ ~~~~      }   { word-1       }
                        { DYNAMIC  }   { DISPLAY        }   { literal-1    }
                          ~~~~~~~      { ~~~~~~~        }
                                       { KEYBOARD       }
                                       { ~~~~~~~~       }
                                       { LINE ADVANCING }
                                       { ~~~~ ~~~~~~~~~ }
                                       { PRINTER        }
                                       { ~~~~~~~        }
                                       { RANDOM         }
                                       { ~~~~~~         }
                                       { TAPE           }
                                         ~~~~
    [ COLLATING SEQUENCE IS alphabet-name-1 ]
      ~~~~~~~~~
    [ [ FILE|SORT ] STATUS IS identifier-2 [ identifier-3 ] ]
        ~~~~ ~~~~   ~~~~~~
    [ LOCK MODE IS { MANUAL|AUTOMATIC } ]
      ~~~~         { ~~~~~~ ~~~~~~~~~ }
                   { EXCLUSIVE [ WITH { LOCK ON MULTIPLE RECORDS } ] }
                     ~~~~~~~~~        { ~~~~ ~~ ~~~~~~~~ ~~~~~~~ }
                                      { LOCK ON RECORD           }
                                      { ~~~~ ~~ ~~~~~~           }
                                      { ROLLBACK                 }
                                      { ~~~~~~~~                 }
    [ ORGANIZATION Clause ]
      ~~~~~~~~~~~~
    [ ORGANISATION Clause ]
      ~~~~~~~~~~~~
    [ RECORD DELIMITER IS STANDARD-1 ]
      ~~~~~~ ~~~~~~~~~    ~~~~~~~~~~
    [ RESERVE integer-1 AREAS ]
      ~~~~~~~
    [ SHARING WITH { ALL OTHER } ]
      ~~~~~~~      { ~~~       }
                   { NO OTHER  }
                   { ~~        }
                   { READ ONLY }
                     ~~~~ ~~~~
The COLLATING SEQUENCE, RECORD DELIMITER, RESERVE and ALL OTHER clauses are syntactically recognized but are otherwise non-functional.
The SELECT statement creates a definition of a file and links that COBOL definition to the external operating system environment.
AREAS, IS, MODE, OTHER, SEQUENCE, TO, USING and WITH are optional and may be omitted. The presence or absence of these words has no effect upon the program.
The OPTIONAL clause, to be used only for files that will be used to provide input data to the program, indicates that the file may or may not actually be available at run-time. An attempt to OPEN an OPTIONAL file that does not exist will receive a special non-fatal file status value (see status 05 in the list of file status values below) indicating the file is not available; a subsequent attempt to READ that file will return an AT END (end-of-file) condition. Files may also be explicitly designated as NOT OPTIONAL. This is useful when specifying the compiler’s -foptional-file switch, which automatically makes all files OPTIONAL except for those explicitly declared as NOT OPTIONAL.
The ASSIGN clause specifies how, at runtime when file-name-1 is opened, either a logical device (STDIN, STDOUT) or a file anywhere in one of the currently-mounted file systems will be associated with file-name-1, as follows:
The rules that follow refer to three components of the ASSIGN clause:
Type: EXTERNAL, DYNAMIC or neither
Device: the list of device choices
Locator: shown as a choice between identifier-1, word-1 and literal-1.
ASSIGN TO DISC file-name-1 will be assumed if there is no ASSIGN clause on a SELECT.
If an ASSIGN clause is coded without a Device, the device DISC will be assumed.
If the Locator is coded as word-1 and the Type is EXTERNAL, then word-1 itself will serve as the File Location String that will identify the data file. If, however, a Type of EXTERNAL was not specified, the compiler will create a PIC X(1024) data item named word-1 within the program; the contents of that data item at the time the program opens file-name-1 will then serve as the File Location String that will identify the data file.
DISC or DISK will assume an attachment to a file named file-name-1 in whatever directory is current at the time the file is opened.
DISPLAY will assume an attachment to the STDOUT logical device; these files should only be used for output.
KEYBOARD will assume an attachment to the STDIN logical device; these files should only be used for input.
PRINTER will assume an attachment to the LPT1 logical device/port; these files should only be used for output.
RANDOM or TAPE will behave exactly as DISC does. These two additional Devices are provided to facilitate the compilation of COBOL source from other COBOL implementations.
The LINE ADVANCING device requires that a Locator be specified; these files should only be used for output. A COBOL Line Advancing file will allow carriage-control characters such as line-feeds and form-feeds to be written to the attached operating system file, via the ADVANCING clause of the WRITE statement (see WRITE).
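To tie the OPTIONAL, ASSIGN and FILE STATUS clauses together, here is a minimal sketch of a SELECT statement in a complete program. The program name SELDEMO, the file name "input.txt" and the data names are hypothetical. No Device is coded, so DISC is assumed as described above, and the literal acts as the Locator.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SELDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *> "input.txt" is a hypothetical file name
           SELECT OPTIONAL Input-File
               ASSIGN TO "input.txt"
               ORGANIZATION IS LINE SEQUENTIAL
               FILE STATUS IS Input-Status.
       DATA DIVISION.
       FILE SECTION.
       FD  Input-File.
       01  Input-Record      PIC X(80).
       WORKING-STORAGE SECTION.
       01  Input-Status      PIC X(2).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *> OPEN does not fail if the OPTIONAL file is missing
           OPEN INPUT Input-File
           DISPLAY "OPEN status: " Input-Status
           READ Input-File
               AT END
                   DISPLAY "No data available"
               NOT AT END
                   DISPLAY "First record: " Input-Record
           END-READ
           CLOSE Input-File
           STOP RUN.
```

Running the program once with the file present and once without it shows the difference in the file status value returned by the OPEN statement.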
With a filename-mapping value of yes, the GnuCOBOL runtime system will first attempt to identify a currently-defined environment variable whose value will serve as the data file’s path and filename, as follows:
mf as the assign-clause value, then the File Locator String will be interpreted according to Microfocus COBOL rules — namely, everything before the last ‘-’ in the File Locator String will be ignored; the characters after the last ‘-’ will be treated as the base of an environment variable name. If there is no ‘-’ character in the File Locator String then the entire File Locator S