The Greatest Unix Programs

The author of the article, Douglas McIlroy, is an American mathematician, engineer, and programmer. He is best known for developing the pipeline in the Unix operating system, the principles of component-oriented programming, and several original utilities: spell, diff, sort, join, speak, tr.

Sometimes you come across a really wonderful program. After rummaging through my memory, I compiled a list of real Unix gems from over the years. Mostly these are quite rare and not so necessary programs. But what makes them stand out is their originality. I can't imagine having come up with the idea for any of them myself.

Which programs have impressed you just as much? Share them.

PDP-7 Unix

For starters, the PDP-7 Unix system itself. Its simplicity and power lured me away from a powerful mainframe to a tiny machine. It offered the essence of a hierarchical file system, a separate shell, and user-level process control, which Multics on the mainframe had been unable to deliver after hundreds of man-years of development. What Unix left out (such as record structure in the file system) was just as instructive and liberating as its innovations (such as shell I/O redirection).

dc

Robert Morris's variable-precision desk calculator used backward error analysis to determine the precision required at each step to achieve a user-specified precision in the result. In my talk on software components at the 1968 NATO Software Engineering Conference, I proposed reference routines that could deliver any desired accuracy, but I did not know how to put them into practice. dc is still the only program I know of that can do this.
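To make the idea of working backward from the requested precision concrete, here is a minimal sketch in Python. It is not dc's algorithm. The function names `guard_digits` and `sum_of_sqrts` are invented for illustration, and the example assumes the task is a sum of square roots of non-negative integers. Given the number of digits wanted in the result, it first decides how many extra digits each intermediate square root must carry so that the accumulated truncation error stays below one unit in the last requested digit.

```python
from math import isqrt


def guard_digits(num_terms):
    """Extra decimal digits needed so that num_terms truncation errors,
    each below one unit in the last working digit, together stay below
    one unit in the last requested digit."""
    g = 0
    while 10 ** g < num_terms:
        g += 1
    return g


def sum_of_sqrts(values, digits):
    """Approximate sum(sqrt(v) for v in values) using integers only.

    Returns (scaled_total, scale); the approximation is
    scaled_total / scale, and it is low by less than 10**-digits.
    """
    work = digits + guard_digits(len(values))  # working precision per step
    scale = 10 ** work
    # isqrt(v * scale**2) equals floor(sqrt(v) * scale), so each term is
    # low by less than 1/scale.
    total = sum(isqrt(v * scale * scale) for v in values)
    return total, scale


if __name__ == "__main__":
    total, scale = sum_of_sqrts([2, 3], 10)
    print(f"{total // scale}.{total % scale:0{len(str(scale)) - 1}d}")
```

The point of the sketch is the order of the reasoning: the caller states the precision of the result, and the working precision of each step is derived from it rather than chosen up front.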

typo

Typo arranged the words of a text according to their similarity to the rest of the text. Misspellings like 'hte' tended to be at the end of the list. Robert Morris proudly said that the program would work equally well for any language. Although typo didn't help you find phonetic errors, it was a real boon for all typesetters, and it did a lot of good before the much less interesting but more accurate dictionary-based spell checker came along.

Typo is just as unexpected on the inside as it is on the outside. The similarity measurement algorithm is based on the frequency of occurrence of trigrams, which are counted in a 26×26×26 array. The tiny memory barely had enough space for one-byte counters, so a scheme was implemented to compress large numbers into small counters. To avoid overflow, the counters were updated on a probabilistic basis, maintaining an estimate of the logarithm of the counter value.
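The probabilistic counter described above can be sketched in a few lines of Python. This is an illustration of the general technique (often called approximate or Morris counting), not typo's actual code. The class name `MorrisCounter`, the base of 2, and the saturation at 255 are assumptions chosen to mimic a one-byte cell.

```python
import random


class MorrisCounter:
    """Approximate counter that stores only a small exponent, not the count."""

    def __init__(self, base=2.0, max_exponent=255, rng=None):
        self.base = base                  # growth factor (assumed, tunable)
        self.max_exponent = max_exponent  # largest value a one-byte cell holds
        self.exponent = 0                 # the only state that is kept
        self.rng = rng or random.Random()

    def increment(self):
        # Bump the exponent with probability base**(-exponent), so the
        # exponent tracks roughly the logarithm of the true count.
        if self.exponent >= self.max_exponent:
            return  # saturate instead of overflowing the byte
        if self.rng.random() < self.base ** (-self.exponent):
            self.exponent += 1

    def estimate(self):
        # Unbiased estimate of how many times increment() was called.
        return (self.base ** self.exponent - 1) / (self.base - 1)


if __name__ == "__main__":
    counter = MorrisCounter(rng=random.Random(1))
    for _ in range(100_000):
        counter.increment()
    print(counter.exponent, counter.estimate())
```

A single counter of this kind is noisy: with base 2 the estimate is often off by a large fraction of the true count. A base closer to 1 trades range for accuracy, which is the kind of tuning a 26×26×26 table of one-byte cells would invite.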

eqn

With the advent of phototypesetting, it became possible, but terribly tedious, to typeset classical mathematical notation. Lorinda Cherry decided to develop a higher-level description language, and Brian Kernighan soon joined her. Their brilliant move was to put the oral tradition into writing, so eqn was remarkably easy to learn. The first preprocessor of its kind for a mathematical expression language, eqn hasn't been improved on much since.

struct

Brenda Baker started developing her Fortran-to-Ratfor converter against the advice of her boss, me. I thought it would produce an ad hoc reordering of the original text, free of statement numbers but otherwise no more readable than well-structured Fortran code. Brenda proved me wrong. She discovered that every Fortran program has a canonically structured form. Programmers preferred the canonical form to what they themselves had originally written.

pascal

The syntax diagnostics in the compiler created by Sue Graham's group at Berkeley were the most helpful I've ever seen, and they were generated automatically. On a syntax error, the compiler suggests a token to insert so that parsing can continue. It makes no attempt to explain what is wrong. With this compiler, I learned Pascal in one evening without any manual at hand.

parts

Hidden inside WWB (Writer's Workbench), Lorinda Cherry's parts module determines the parts of speech of words in an English text, based on only a small dictionary plus spelling and grammar rules. From this annotation, WWB derives stylometric indicators of the text, such as the prevalence of adjectives, subordinate clauses and complex sentences. When Lorinda was interviewed on NBC's Today and talked about the innovative grammar checking in WWB, it was the first mention of Unix on television.

egrep

Al Aho expected his deterministic regular expression recognizer to overtake Ken Thompson's classic non-deterministic recognizer. Unfortunately, on complex regular expressions the latter had already finished its pass while egrep was still building its deterministic automaton. To win the race after all, Al Aho got around the curse of exponential growth in the automaton's state table by inventing a way to build on the fly only those table entries that are actually visited during recognition.
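The on-the-fly construction can be illustrated with a small Python sketch. It is not egrep's code. The class `LazyDFA`, the dictionary encoding of the NFA, and the whole-string matching (rather than grep-style searching within a line) are assumptions made to keep the example short. Each deterministic state is a set of NFA states, and a transition is computed and cached only the first time the input actually requires it.

```python
class LazyDFA:
    """Simulate an NFA as a DFA whose transition table is built on demand."""

    def __init__(self, transitions, start, accepting):
        # transitions: {state: {symbol: set_of_next_states}};
        # the symbol None marks an epsilon move.
        self.transitions = transitions
        self.accepting = set(accepting)
        self.start = self._closure({start})
        self.table = {}  # (dfa_state, symbol) -> dfa_state, filled lazily

    def _closure(self, states):
        # Follow epsilon moves until no new NFA state is reachable.
        stack = list(states)
        seen = set(states)
        while stack:
            s = stack.pop()
            for t in self.transitions.get(s, {}).get(None, ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return frozenset(seen)

    def _step(self, dfa_state, symbol):
        key = (dfa_state, symbol)
        if key not in self.table:  # build this entry only when first visited
            moved = set()
            for s in dfa_state:
                moved |= set(self.transitions.get(s, {}).get(symbol, ()))
            self.table[key] = self._closure(moved)
        return self.table[key]

    def matches(self, text):
        state = self.start
        for ch in text:
            state = self._step(state, ch)
            if not state:  # dead state: no NFA state survives
                return False
        return any(s in self.accepting for s in state)


# NFA for (a|b)*a(a|b)(a|b): the third symbol from the end is 'a'.
EXAMPLE_NFA = {
    0: {"a": {0, 1}, "b": {0}},
    1: {"a": {2}, "b": {2}},
    2: {"a": {3}, "b": {3}},
}

if __name__ == "__main__":
    dfa = LazyDFA(EXAMPLE_NFA, start=0, accepting={3})
    print(dfa.matches("abb"), len(dfa.table))
```

For this pattern the full deterministic automaton has eight states and sixteen table entries, and the count doubles with every extra position added to the pattern. After matching "abb", the lazy table holds only the three entries the input touched.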

crabs

Luca Cardelli's charming meta-program for the Blit windowing system released virtual crabs that roamed the empty screen space, nibbling away more and more at the edges of active windows.

Some general thoughts

Although it is not visible from the outside, theory and algorithms played a decisive role in the creation of most of these programs: typo, dc, struct, pascal, egrep. In fact, it is the unusual application of the theory that is most surprising.

Nearly half of the list (pascal, struct, parts, eqn) was originally written by women, a share far exceeding the proportion of women in computer science.

Douglas McIlroy
March, 2020


Source: habr.com
