Discussion: standard UNIX utilities that few people used then or use now
A week ago, Douglas McIlroy, developer of the UNIX pipeline and originator of the concept of "component-oriented programming", talked about interesting and unusual UNIX programs that never became widely used. His post sparked a lively discussion on Hacker News. We have collected the most interesting points and would be glad if you joined the discussion.
UNIX-like operating systems ship with a standard set of text formatting tools. The typo utility scans a document for typographical errors and hapaxes, words that occur only once in the text. Interestingly, this typo finder does not use a dictionary. It relies only on the information in the file itself and performs a frequency analysis of trigrams (sequences of three characters). All the counters it needs are kept in a 26x26x26 array. According to Douglas McIlroy, the available memory barely had room for that many one-byte counters, so to save space the counts were stored in logarithmic form.
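The trigram idea can be sketched in a few lines of shell and awk. This is only an illustration under stated assumptions, not a reconstruction of typo itself: the function name `trigram_scores` is hypothetical, the score used here (the count of a word's rarest trigram) is a simplification chosen for clarity, and counts are kept as plain integers rather than in the logarithmic one-byte form described above. For scale, a 26x26x26 array holds 17,576 counters.

```shell
# Hypothetical helper: print "<score> <word>" for every distinct word of
# three or more letters, where <score> is the number of times the rarest
# trigram of that word occurs in the whole file. Low scores come first,
# so the most unusual-looking words are at the top.
trigram_scores() {
    tr -cs 'A-Za-z' '\n' < "$1" | tr 'A-Z' 'a-z' | awk '
        length($0) >= 3 {
            words[$0] = 1
            # Count every trigram of every word occurrence.
            for (i = 1; i <= length($0) - 2; i++)
                count[substr($0, i, 3)]++
        }
        END {
            for (w in words) {
                min = -1
                for (i = 1; i <= length(w) - 2; i++) {
                    c = count[substr(w, i, 3)]
                    if (min < 0 || c < min) min = c
                }
                print min, w
            }
        }' | sort -k1,1n -k2,2
}
```

Calling `trigram_scores document.txt` (with any text file of your own) lists the words built from the least common letter combinations first, which is roughly where typos tend to cluster.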
Today, typo has been replaced by more modern and accurate dictionary-based spell checkers. However, the tool is still remembered: a few years ago, an enthusiast presented an implementation of typo in Go. The repository is still being updated.
Another document tool from the 1980s is the Writer's Workbench package by Lorinda Cherry and Nina Macdonald of Bell Labs. It included tools for identifying parts of speech and document style, and for finding tautologies and needlessly complex sentences. The utilities were developed as an aid for students, and at one time they were used by students at Colorado State University in the USA. But by the early 1990s Writer's Workbench had been forgotten, because it was not included in Version 7 Unix. Still, the tool lived on through imitators, for example Grammatik for the IBM PC.
UNIX also has standard tools that make it easier to work with formulas. One of them is eqn, a preprocessor language for typesetting mathematical expressions. It is notable in that, to typeset a formula, the author only needs to describe it in simple words and symbols. Keywords let you shift mathematical symbols vertically and horizontally, change their size, and adjust other parameters. If you pass the following string to the utility:
sum from { k = 1 } to N { k sup 2 }
The output is the following formula, the sum of k squared for k from 1 to N:
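For readers more at home with LaTeX, the eqn input above corresponds roughly to the following markup. This is an equivalent written for comparison, not actual eqn output:

```latex
\[
  \sum_{k=1}^{N} k^{2}
\]
```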
In the 1980s and 1990s, eqn helped IT professionals write software manuals. It was later displaced by LaTeX, which even Habr uses. Still, eqn was the first tool of its kind and remains part of UNIX-like operating systems.
Working with files
In the thread, Hacker News readers mentioned several rarely used utilities for working with files. One of them is comm, which compares files. It is a simplified analogue of diff, tailored for use in scripts. The GNU version was written by Richard Stallman himself, together with David MacKenzie.
The program's output consists of three columns. The first column contains lines unique to the first file, the second contains lines unique to the second file, and the third contains lines common to both. For comm to work correctly, the files being compared must be lexically sorted. For this reason, one commenter suggested invoking the utility as follows:
comm <(sort fileA.txt) <(sort fileB.txt)
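In scripts it is often more convenient to keep only one of the three columns. The sketch below uses comm's standard `-1`, `-2` and `-3` options, each of which suppresses the corresponding column. The file names and their contents are made-up sample data, already sorted.

```shell
# Made-up sample data; both files are already sorted.
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/a.txt"
printf 'banana\ncherry\ndate\n'  > "$dir/b.txt"

# -23 hides columns 2 and 3: lines found only in the first file.
only_a=$(comm -23 "$dir/a.txt" "$dir/b.txt")
# -13 hides columns 1 and 3: lines found only in the second file.
only_b=$(comm -13 "$dir/a.txt" "$dir/b.txt")
# -12 hides columns 1 and 2: lines common to both files.
both=$(comm -12 "$dir/a.txt" "$dir/b.txt")

printf 'only in a: %s\n' "$only_a"
printf 'only in b: %s\n' "$only_b"
printf 'in both:\n%s\n' "$both"
rm -r "$dir"
```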
comm is also useful for spell checking: it is enough to compare a document's words against a reference dictionary. Given the subtleties around having to sort the files first, there is an opinion that Stallman and MacKenzie wrote their utility exclusively for this use case.
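A minimal sketch of that spell-checking use might look like this. The function name `misspelled` is hypothetical, and the dictionary is assumed to be a plain file with one lowercase word per line, sorted the same way `sort` sorts in the current locale.

```shell
# Hypothetical helper: print words from document $1 that do not appear
# in the sorted word list $2.
misspelled() {
    tr -cs 'A-Za-z' '\n' < "$1" |   # one word per line
        tr 'A-Z' 'a-z' |            # fold to lowercase
        sed '/^$/d' |               # drop empty lines
        sort -u |                   # comm needs sorted input
        comm -23 - "$2"             # keep lines unique to the document
}
```

For example, `misspelled report.txt words.txt` would list every word in `report.txt` that the word list does not contain; both file names here are placeholders.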
Another participant in the HN thread described capabilities of the paste utility that had not been obvious to him. It can interleave data streams or split a single stream into two columns on output.
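Both uses can be sketched as follows. The sample files and their contents are invented for illustration. The options themselves are standard paste behaviour: `-d '\n'` joins corresponding lines with a newline instead of a tab, and each `-` operand reads the next line from standard input.

```shell
# Invented sample data.
work=$(mktemp -d)
printf 'a1\na2\n' > "$work/a.txt"
printf 'b1\nb2\n' > "$work/b.txt"

# Interleave two streams line by line: a1, b1, a2, b2.
interleaved=$(paste -d '\n' "$work/a.txt" "$work/b.txt")

# Split one stream into two tab-separated columns.
two_columns=$(printf '1\n2\n3\n4\n' | paste - -)

printf '%s\n' "$interleaved"
printf '%s\n' "$two_columns"
rm -r "$work"
```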
One user noted that far from optimal solutions are often used for these simple operations, ranging from fmt and ex to mlr with jot and rs.
What standard features of UNIX-like operating systems have you discovered?