Lessons in Program Representation

This paper is a brief technical memoir. It describes a few personal lessons I have learned in the course of forty years of thinking about application-development languages. These lessons seem to have arisen from a now-unusual beginning tempered by a minimalist aesthetic.

Contents:

Reveal Only One Program Representation

Factor Out Sequence

Distinguish Applications from Algorithms

1. Reveal Only One Program Representation

Background

When I started programming in 1956, IBM's big scientific computer, the 704, had a minimum memory configuration of 4K 36-bit words. (A 36-bit word held one floating-point number.) I think you could get up to 16K words, but there were 4K machines that had to be taken into account if you were writing software for general use.

Computer science didn't exist then as a recognized discipline; if it had, it would have been of little practical use, because the meager hardware couldn't support the cost of implementing its concepts. What real programmers did in the 50s was struggle to get 20 pounds of computation into a 5 pound bag (abstracting again). To succeed at that was a matter of pride, not as a macho exercise, but because that's how the computing job had to get done. The way you achieved that kind of compression was to come up with a powerful intermediate representation of your problem solution which had enough expressive power that the memory left over after implementing the intermediate representation (completely in memory, for speed) had space for much more computation than you could write in the native language using the whole memory. (The benchmark of that era was the Bell Laboratories Interpretive System for the IBM 650, which had a memory capacity of 2000 10-digit words.)

This was the environment in which I learned my craft. As hardware got more powerful, I found that I stayed at my learning edge by hanging out with the smallest machines, below the knee of the power curve, where you couldn't get anything interesting done unless you came up with a powerful computational representation. Thus, when I was a consultant in the 1970's, my most interesting client was Monroe Calculator (now defunct). Around 1972 I discovered the programmable calculator, and I became aware of a dissonance, imperceptible to most but grating to me. For the next 20 years that dissonance bugged me.

The dissonance is between the totally different computational representations which characterize "build" time (i.e., "source" language, what humans can read) and "run" time (i.e., "machine" or "object" language, what devices can execute). Computer people take this duality for granted. But in the world of programmable calculators, the only difference between solving a problem for yourself a step at a time or teaching the calculator to solve it autonomously is a feature of the calculator called "learn" mode, in which the calculator records the steps you take to solve the problem, and then is able to repeat your steps in "run" mode. The single computational representation of the programmable calculator is the language of its keyboard, whether you are solving a problem for yourself or teaching the calculator to solve it.

The duality arose in the world of computers because of the desire for powerful algorithmic languages, beginning with Fortran and Cobol in the 1950s. Because of the memory limitations of the hardware, translating a source program written in Fortran, say, to an object program in the machine language of the IBM 704, say, required "compilation," a multi-stage process involving storing the two forms of the program as well as intermediate results on magnetic tape. Thus was born the edit-compile-run-debug cycle. Originally this cycle arose from economic necessity; then it became the prevailing paradigm; now it's so ubiquitous that it's invisible.

I reacted to the build/run dissonance in the late 70s by doing some private research to see if I could resolve it with the Pascal source language which, as the world knew, required the edit-compile-run-debug cycle to use. Then, when I saw the Rockwell AIM-65, I found the vehicle for solving this problem below the knee of the power curve, where I most enjoyed working.

The Challenge

I proposed to Rockwell to build an "Instant Pascal" trainer to run on the AIM-65, and the proposal was accepted. The AIM-65 was a single-board computer with an 8-bit Rockwell 6502 as the processor, 4K bytes (that's right, 4096 bytes) of internal "random-access" memory (RAM), up to 20K bytes of on-board "read-only" memory (ROM), and various byte-at-a-time input-output possibilities. The user interface consists of a teletype-like keyboard, a single-line 20-position LED alphanumeric display, and a thermal strip printer with 20 characters per line. (The only generally used external storage was an audio cassette recorder.) The intention was that the user would get Instant Pascal in the form of a set of ROMs. You plug these ROMs into the AIM-65 and you have a Pascal trainer.

The AIM-65 had multiple hardware limitations. The major one, of course, was the 4K RAM. But the single-line display wasn't helpful. Clearly, any Pascal editor would have to be a "line" editor, not a "screen" editor. The traditional solution would have been to prepare a program in four steps: (1) the entry step, in which you prepare a source tape; (2) the edit step, in which you modify your source program by doing tape-to-tape copies under control of the keyboard and display; (3) the compile step, in which you read a source tape and write an object tape; and (4) the run step, in which you load the object tape and execute it. Debugging is clearly a problem because you have to have both the source program in view (so you can find and fix your mistakes) and the object program in memory (so the program can run). How can you do that in 4K of memoory? And how are you going to do all four (or five, if there is a separate debugger) steps under control of 20K of ROM? I had no intention of building a language trainer that way.

There was an alternative: to store the Pascal source code in the 4K RAM and interpret it directly. Waterloo Pascal did that. They had the right idea, but directly interpreting source code made it really slow compared to BASIC, which was the alternative at the time. I was, frankly, relieved to discover how slow Waterloo Pascal was, because I already had the outline of another approach.

The Approach

As with Waterloo Pascal, my solution to eliminating the build-run dissonance was a single internal representation of the program which stayed in RAM. It had to be compact, efficiently interpretable, and readily translatable back to Pascal source so that no source program needed to be stored. The approach I took is today called a "tokenized" representation of the Pascal program. The internal representation of the Pascal program was syntactically correct Pascal: I specified the token language using the formal Pascal syntax. This assured syntactic (but not morphological) similarity between external and internal representations of a program. I put one constraint on the user: text had to be entered in "prettyprint" form (no indenting necessary), one line per prettyprint line. Fortunately, this was not a source of complaint, because entering code this way was considered to be good practice; thus this most serious limitation was easily reframed as a feature. The prettyprint input constraint permitted line-by-line translation into token form (and local syntax checking, with immediate feedback), and there was no need to store more than one line of Pascal text internally.

The parts of the program consisted of (1) a single-line source-code tokenizer/syntax checker ("forward translator" from keyboard to RAM); (2) its reverse, a single-line token-to-source "backward translator" for RAM to printer; and (3) a token interpreter that executed the program. There was a fourth component, a "binder," that scanned the tokenized program after you entered the "run" command, and established (and checked) those syntactic connections that spanned multiple lines. The binder had very little to do, and its execution time was typically insignificant. Turnaround from source entry to execution was almost instantaneous; hence the product's name "Instant Pascal." Source-code debugging fell out of this design: any line being executed by the interpreter can at the same time be output as Pascal source by the backward translator. Thus, the system could easily single-step at the source level, stop on lines with breakpoints, and evaluate source-level "watch" expressions during step-by-step execution.

The Lesson

The result of this design was the illusion of executing Pascal source language, but at the speeds of typical BASIC programs. In fact, putting forth and maintaining the illusion of a single program representation became the principal user-interface design goal. It is this illusion of always dealing with your program at the source-code level which resolves the build/run dissonance.

But the resolution wasn't complete. Instant Pascal was a trainer, not a construction tool. As compiler writers know, you don't have a real production tool until you can build the tool using itself, and this could not be done.

2. Factor Out Sequence

Background

My first computer job in 1956 was as a technical support (they called it "applied science") trainee in the IBM Cleveland branch office, supporting an IBM 650 installation at Cleveland Pneumatic Tool, manufacturer of the landing gear for the Boeing 707. The 650 would spend much of its time chugging away at big matrix inversions, using the Bell Interpretive System.

Because of its low cost (relative to its predecessors) and IBM's dominant influence, the introduction of the vacuum-tube IBM 650 was probably the major impetus for moving the industrial-computation paradigm from passing punched-card decks (representing data sets such as matrices) through semiautomatic mechanical calculators (such as the IBM 602A) to stored-program, stored-data computation. Before the 650 (if you were not rich enough to have access to a 704), information processing was done with punched-card data sets. This required a different, and very interesting, mindset.

Business information processing was file processing; a file was a deck of cards, and each card was typically a record of the file. Doing anything at all required multiple passes of these decks through various punched-card machines. Each file-processing function, such as sorting, collating (merging files or separating merged files), duplicating, or printing, was embodied in a separate type of machine. And each machine (except the simplest, the sorter) was programmed by plugging wires into a plugboard. The then-contemporary equivalents of system analysts and programmers wired plugboards and scheduled card-processing runs through rooms full of punched-card machines. At the top of the punched-card-machine food chain was the tabulator or "accounting machine," which read a card deck, performed sums, printed documents such as management reports and utility bills, and (if a punch machine was connected) occasionally punched out a total card. The king of the tabulator hill was the IBM 407, whose plugboard ("control panel") was a couple of square feet in area (the contact holes were perhaps 1/3 inch apart). Almost all of the wires in a typical panel were concerned with routing data (i.e., formatting the printed page), not programming as we would think of it today. A typical wire routed one card column to one character position on the printed page. The power of the computational paradigm of the 407 (indeed, of all plugboard-programmed punched card machines) lay in what was implied, not in what was expressed by wiring in the panel.

A few years after the 650 IBM introduced the (transistor-based) 1401, which began to move the large body of business punched-card shops from control-panel wiring to stored programming, and from card decks to magnetic tape. IBM recognized that this transition required a major paradigm shift for the workers involved, and it introduced the RPG (report program generator) language to make programming look more like wiring. The original RPG was a remarkable, elegantly simple language that was uniquely successful in capturing the mindset behind the wiring panel. Like wiring, almost all information in RPG specified data formatting. None of the sequential control normally necessary for repetitively processing records was present. I studied RPG to understand why; this is what I learned.

The Underlying Card Cycle

RPG is a language for processing files. In the course of executing a program you read one or two input files and write one output file, with possibly a second output file where you write input records that have not been processed because of error conditions. File processing is one big loop. (A loop is a program segment that is executed repeatedly.) Each time through the loop you process a record. In a billing application, for example, the input file contains customer transactions, sorted and grouped by customer number. The customer name and address information occurs once, at the beginning of each group of customer transaction cards. You read in this customer header card, then the transaction records, maintaining running totals. There might be "control breaks" (changes in the value of an identifying input field) which signal the need to write a subtotal line if, for example, a subaccount number (also sorted) changes. Finally, when there is a top-level control break (the customer number changes) totals are printed (and punched) out and a new customer is begun. This kind of application is what the 407 did well.

The 407 has a standard card cycle, as follows. In one "read" cycle one card is transferred from the input hopper to the first read station, the card in the first read station is moved to the second read station, and the card in the second read station is moved to the output hopper. Each of the eighty columns of the two read stations connects to one or two hubs (contacts) in the wiring panel. The first read station is used only for comparing control fields, and a special comparator section of the control panel accepts column data from the two read stations and emits control signals when there is a difference; this is how control breaks are discovered. Card data are processed from the second read station; these hubs can be wired to accumulators (mechanical adders) and printer positions, for example. The control signals from the comparators can be used to initiate special "total" cycles, in which cards are not moved, but accumulators are read out and cleared, and total printing occurs. There can be several levels of total cycles, corresponding to different control/subtotal levels, and sequencing from one level of total cycle to the next, and back to the read cycle, is automatic.

The Lesson

RPG is not a general-purpose programming language; this is the basis of its power. Underlying every RPG program is the 407 sequencing algorithm, hard-coded into the software so that nothing needs to be specified by the programmer. The RPG programming language uses fixed-format cards whose fields recall the sections of the 407 control panel, so that programming is a fill-in-the-blanks operation. For example, if columns 47-53 of a transaction card holds the part number, which is used as a control field for control level 2, that's all you have to say.

Versions of RPG after the first were corrupted by the pressure to turn it into a general-purpose programming language. Fortunately I was able to study it before its elegance disappeared, and generalize from its lessons. In a few cases I was able to design extremely simple languages for production programming of specific devices (for example, a Monroe billing machine), based on the following generalizations of the lessons of RPG.

The universe of all applications can be partitioned into classes according to the underlying loop (repetitive processing sequence) which characterizes each application class. For example, file processing operations are characterized by an RPG/407-like loop. Similarly, event-driven GUI applications are characterized by an underlying event detection and distribution loop that starts the response to a user-interface event..
For each class of applications there is at least one powerful development language design that hides this underlying processing sequence. (The processing sequence can be invisible because it is common to all programs in the class.)
What remains of an application specification after factoring out (abstracting) the underlying processing sequence is a small, almost static, modification of that underlying processing sequence; this is the "programming language." (Visual GUI-building tool such as Visual Basic do this for event-driven languages. After the event detection and distribution process has been factored out, what is largely left is a set of static specifications of windows and subwindows. What sequential processing remains (event-processing routines) has been partitioned by subwindow and event within subwindow, so that these event-processing routines are largely independent).
A measure of success of language design according to this paradigm is how little sequence is left in the "programming language" after the underlying processing sequence has been factored out. Thus, static is good.

3. Distinguish Applications from Algorithms

Another lesson from RPG is that there is a distinction between languages for algorithms vs. languages for applications. RPG's productivity as a tool lay in the fact that it was not a general-purpose programming language. It has been my observation that the most productive application development languages minimize the demands on the developer to devise sequential strategies. Hence the power of RPG and Visual Basic, both of which clearly fall in the static application, and not the sequential algorithm, camp.

Having come upon this distinction, my practical bent led me into the static application camp. This decision gave me welcome permission to stop worrying about languages such as C and C++. It turns out (and contemporary development tools make this painfully obvious) that if you approach the writing of event-driven GUI applications (i.e., those used in Macintosh, Windows, and Web browsers) coming from the perspective of algorithmic languages like C or C++, the job is devilishly complicated. I felt that there had to be a more comfortable model for these applications.

One of my more recent realizations has been that the job of a static application language is not to eliminate algorithms, but to hide them. In the static wiring language described in my patent, it turns out that there are two distinct kinds of algorithms to be hidden: algorithms hidden inside reusable data types (for example, discount rules associated with customer and product business objects) and algorithms hidden inside reusable application components (for example, the behavior of a user-interface list box). These two kinds of algorithms are logically distinct, and the wiring language treats them differently. The fact that algorithm programming is confined to reusable objects means that, once these reusable objects become mature, programming is not necessary for their use. Programming of algorithms continues to be done by programming professionals, but building of applications can be done by a much larger group of application developers. Thus, as Conway's law has suggested, there are parallels between the types of our artifacts and the types of the people who use these artifacts.