XPL is a programming language based on PL/I, a portable one-pass compiler written in its own language, and a parser generator tool for easily implementing similar compilers for other languages. XPL was designed in 1967 as a way to teach compiler design principles and as starting point for students to build compilers for their own languages.
XPL was designed and implemented by William M. McKeeman,David B. Wortman , James J. Horning and others at Stanford University. XPL was first announced at the 1968 Fall Joint Computer Conference. The methods and compiler are described in detail in the 1971 textbook A Compiler Generator.
They called the combined work a 'compiler generator'. But that implies little or no language- or target-specific programming is required to build a compiler for a new language or new target. A better label for XPL is a translator writing system. It helps to write a compiler with less new or changed programming code.
The XPL language is a simple, small, efficient dialect of PL/I intended mainly for the task of writing compilers. The XPL language was also used for other purposes once it was available. XPL can be compiled easily to most modern machines by a simple compiler. Compiler internals can be written easily in XPL, and the code is easy to read. The PL/I language was designed by an IBM committee in 1964 as a comprehensive language replacing Fortran, COBOL, and ALGOL, and meeting all customer and internal needs. These ambitious goals made PL/I complex, hard to implement efficiently, and sometimes surprising when used. XPL is a small dialect of the full language. XPL has one added feature not found in PL/I: a STRING datatype with dynamic lengths. String values live in a separate text-only heap memory space with automatic garbage collection of stale values. Much of what a simple compiler does is manipulating input text and output byte streams, so this feature helps simplify XPL-based compilers.
The XPL compiler, called XCOM, is a one-pass compiler using a table-driven parser and simple code generation techniques. Versions of XCOM exist for different machine architectures, using different hand-written code generation modules for those targets. The original target was IBM System/360, which is a proper subset of IBM System/370, IBM System/390 and IBM System z.
XCOM compiles from XPL source code, but since XCOM itself is written in XPL it can compile itself – it is a self-compiling compiler, not reliant on other compilers. Several famous languages have self-compiling compilers, including Burroughs B5000 Algol, PL/I, C, LISP, and Java. Creating such compilers is a chicken-and-egg conundrum. The language is first implemented by a temporary compiler written in some other language, or even by an interpreter (often an interpreter for an intermediate code, as BCPL can do with intcode or O-code).
XCOM began as an Algol program running on Burroughs machines, translating XPL source code into System/360 machine code. The XPL team manually turned its Algol source code into XPL source code. That XPL version of XCOM was then compiled on Burroughs, creating a self-compiling XCOM for System/360 machines. The Algol version was then thrown away, and all further improvements happened in the XPL version only. This is called bootstrapping the compiler. The authors of XPL invented the tombstone diagram or T-diagram to document the bootstrapping process.
Retargeting the compiler for a new machine architecture is a similar exercise, except only the code generation modules need to be changed.
XCOM is a one-pass compiler (but with an emitted code fix-up process for forward branches, loops and other defined situations). It emits machine code for each statement as each grammar rule within a statement is recognized, rather than waiting until it has parsed the entire procedure or entire program. There are no parse trees or other required intermediate program forms, and no loop-wide or procedure-wide optimizations. XCOM does, however, perform peephole optimization. The code generation response to each grammar rule is attached to that rule. This immediate approach can result in inefficient code and inefficient use of machine registers. Such are offset by the efficiency of implementation, namely, the use of dynamic strings mentioned earlier: in processing text during compilation, substring operations are frequently performed. These are as fast as an assignment to an integer; the actual substring is not moved. In short, it is quick, easy to teach in a short course, fits into modest-sized memories, and is easy to change for different languages or different target machines.
The XCOM compiler has a hand-written lexical scanner and a mechanically-generated parser. The syntax of the compiler's input language (in this case, XPL) is described by a simplified BNF grammar. XPL's grammar analyzer tool ANALYZER or XA turns this into a set of large data tables describing all legal combinations of the syntax rules and how to discern them. This table generation step is re-done only when the language is changed. When the compiler runs, those data tables are used by a small, language-independent parsing algorithm to parse and respond to the input language. This style of table-driven parser is generally easier to write than an entirely hand-written recursive descent parser. XCOM uses a bottom-up parsing method, in which the compiler can delay its decision about which syntax rule it has encountered until it has seen the rightmost end of that phrase. This handles a wider range of programming languages than top-down methods, in which the compiler must guess or commit to a specific syntax rule early, when it has only seen the left end of a phrase.
XPL includes a minimal runtime support library for allocating and garbage-collecting XPL string values. The source code for this library must be included into most every program written in XPL.
The last piece of the XPL compiler writing system is an example compiler named SKELETON. This is just XCOM with parse tables for an example toy grammar instead of XPL's full grammar. It is a starting point for building a compiler for some new language, if that language differs much from XPL.
XPL is run under the control of a monitor, XMON, which is the only operating system-specific part of this system, and which acts as a "loader" for XCOM itself or any programs which were developed using XCOM, and also provides three auxiliary storage devices for XCOM's use, and which are directly accessed by block number. The originally published XMON was optimized for IBM 2311s. An XMON parameter FILE= enabled the monitor to efficiently use other disks with larger block sizes. The working disk block size was also a compile-time constant in XCOM.
XMON used a very simple strategy for disk direct access. NOTE provided the address of a disk track. POINT set the location of the next disk track to be the address previously returned by NOTE. This strategy was adopted to allow easy porting of XMON to other OSes, and to avoid the much more complicated disk direct access options available at that time.
Converting XMON from its primitive use of NOTE, POINT and READ/WRITE disk operations—with precisely 1 block per track—to EXCP (i.e., write/create new records) and XDAP (i.e., read/update old records)—with n blocks per track, where n was computed at run-time from the target device's physical characteristics and could be significantly greater than 1—achieved significantly improved application performance and decreased operating system overhead.
Although originally developed for OS/360, XMON (either the original NOTE, POINT and READ/WRITE implementation; or the EXCP and XDAP enhancement) will run on subsequently released IBM OSes, including OS/370, XA, OS/390 and z/OS, generally without any changes.
XCOM originally used a now-obsolete bottom-up parse table method called Mixed Strategy Precedence, invented by the XPL team (although the officially released version retains the MSP parser and does not include later-released "peephole optimizations" and additional data types which were developed outside of the original implementation team.) MSP is a generalization of the simple precedence parser method invented by Niklaus Wirth for PL360. Simple precedence is itself a generalization of the trivially simple operator precedence methods that work nicely for expressions like A+B*(C+D)-E. MSP tables include a list of expected triplets of language symbols. This list grows larger as the cube of the grammar size, and becomes quite large for typical full programming languages. XPL-derived compilers were difficult to fit onto minicomputers of the 1970s with limited memories.[nb 1] MSP is also not powerful enough to handle all likely grammars. It is applicable only when the language designer can tweak the language definition to fit MSP's restrictions, before the language is widely used.
The University of Toronto subsequently changed XCOM and XA to instead use a variant of Donald Knuth's LR parser bottom-up method.[nb 2] XCOM's variant is called Simple LR or SLR. It handles more grammars than MSP but not quite as many grammars as LALR or full LR(1). The differences from LR(1) are mostly in the table generator's algorithms, not in the compile-time parser method. XCOM and XA predate the widespread availability of Unix and its yacc parser generator tool. XA and yacc have similar purposes.
XPL is open source. The System/360 version of XPL was distributed through the IBM SHARE users organization. Other groups ported XPL onto many of the larger machines of the 1970s. Various groups extended XPL, or used XPL to implement other moderate-sized languages.
XPL has been used to develop a number of compilers for various languages and systems.
Edited: 2021-06-18 18:22:10