Matryoshka Si. Layer system of the program language

Let's try to imagine chemistry without Mendeleev's Periodic Table (1869). How many elements had to be kept in mind, and in no particular order... (There were 60 of them at the time.)

To do this, it is enough to think about one or several programming languages at once. The same feelings, the same creative mess.

And now we can relive the feelings of 19th-century chemists when they were offered all their knowledge, and a little more, in a single Periodic Table.


The book "Matryoshka Si. Layer system of the program language" presents all the units of the C language at a glance. This makes it possible to put them in order, correct outdated information, and even clarify the very concept of a program.

Today, information about programming needs systematizing even more than the chemical elements did 150 years ago.

The first need is teaching. Mendeleev began building his system when he faced the question of which element to start his lectures with: O, H, N, He, Au... And he had it easier: he taught chemistry to the best, the students of St. Petersburg University. Programming, meanwhile, is already taught at school and will soon be taught in kindergarten.

The second need is a scientific approach. With the help of the Periodic system, new elements were discovered and information about known ones was corrected. It helped in creating a model of the atom (1911). And so on.

The third need is to clarify the concept of a program.

Modern programming has one foot stuck in the 1950s. Back then, programs were simple, but machines and machine languages were complex, so everything revolved around machines and languages.

Now the opposite is true: programs are complex and primary, languages are simple and secondary. This is called the applied approach, and it seems to be familiar to everyone. Yet students and developers are still being persuaded that nothing has changed.

Which brings us back to Privatdozent Mendeleev's first lecture. What to say to freshmen? Where is the truth? That is the question.

The book "Matryoshka Si. Layer system of the program language" is an attempt to answer it. Moreover, it is addressed not only to students but also to experienced programmers, since it is they, that is, we, who must seek the truth and turn the worldview around.

What follows is a summary of the book.

1. Introduction

The C language dates back to 1969; it became the foundational programming language and has remained so for 50 years. Why? First of all, because C is an applied language that gave the program a human face in place of a machine one. This achievement was consolidated by the languages of the C family: C++, JavaScript, PHP, Java, C# and others. Secondly, it is a concise and beautiful language.

However, the C language itself is usually mixed up with machine assembler, which complicates and distorts its perception. The other extreme is loading the language with some "philosophy": procedural, object, functional, compiled, interpreted, typed, and so on. This adds emotion but does not help describe the language any better.

The truth is in the middle, and for the C language it is strictly in the middle between philosophical and machine perception.

The C language is not independent: it obeys ordinary written language, and at the same time it controls assembly language. This position is described by the Speech model of the program, according to which the program has three subordinate types: speech, code, command. The C language is responsible for the second, the code type.

Once the place of the language in the program has been determined, information about it can be put in order. This is what the Layer system of the program language does, presenting the C language in the spirit of Mendeleev's system: on one page.

The system is built with regard to the commonality of applied languages, which arises from their subordination to speech. A single set of C-Matryoshka units makes it possible to describe and compare different languages by creating a series of Matryoshkas: C++, PHP, JavaScript, C#, MySQL, Python, and so on. It is fitting and right that different languages are described by the units of the foundational language.

2. CHAPTER 1. Speech model of the program. Understandable C

The first chapter presents the speech model of the program, which reflects the applied approach. According to it, the program has three obvious successive types:

  1. speech - direct speech of a programmer solving a problem,
  2. code - encoding the solution into a mathematical form in the C language (or any other)
  3. and command - direct machine commands.

The speech model explains why C is a simple and understandable language. C is built in the image and likeness of human speech familiar to us.

The first type of program is the programmer's direct speech. Speech corresponds to human thinking. Novice programmers write programs with the help of speech: first in Russian, then translating the actions step by step into the language of code. And that is exactly how the C language was created.

The programmer's conclusions, expressed in speech, are converted into a coded numerical form. This transformation should be called reflection, since speech and code are of the same nature (reflection - birth - genus). This is quite obvious if we compare the speech (left) and code (right) types of the program.

(Figure: the speech type of a program on the left, the code type on the right.)

It is curious that reflection occurs very simply - with just two kinds of expressions.

However, the modern description of the C language (from 1978) does not contain a sufficient list of names either for describing the language in general, or for the reflection problem in particular. Therefore, we are forced to get creative and introduce these names.

The choice of words must be precise and clear. This required a special approach, succinctly expressed as the strict use of the native language. For the English it would be English, but we are not English. So we use what we have and we will try to speak Russian.

Reflection is performed by two kinds of expressions:

  1. computation (Vh) - reflects a change in the properties of an object. A property of an object is expressed by a number, so an action on the property is an action on the number, that is, an operation.
  2. subordination (Pch) - reflects a change in the order of actions. The prototype of Pch is the complex sentence of ordinary speech, which is why most kinds of Pch begin with the subordinating conjunctions "if", "else", "while", "for". The other kinds of Pch complement them.

By the way, can you believe that the description of C has no name for computation expressions? They are called simply "expressions". After that, the absence of a name and a grouping for the genus of subordination, and the general scarcity of names, definitions and generalizations, no longer comes as a surprise. The reason is that the famous K&R ("The C Programming Language", Kernighan and Ritchie, 1978) is not a description of the language but a guide to using it.

Still, one would like to have a description of the language. That is why the Layer system of the program language is proposed.

3. CHAPTER 2. Layer system. Short C

Any description should be precise and as brief as possible. In the case of a program language, a head-on description is difficult.

Here we have a program. It is made up of modules. Modules consist of subroutines and collections (structures). Subroutines consist of separate expressions: declarations, computations, subordinations. There are ten kinds of subordinations. Subordinations connect sublevels and subroutines. There are also several kinds of declarations. However, declarations are part not only of subroutines and sublevels, but also of modules and collections. And most expressions are made up of words that are so hard to describe that they are usually just given in two lists, the original and the derived words, which you get to know in the course of studying and using the language. Add to this punctuation marks and a number of other expressions.

In such a presentation, it is hard to make out what rests on what.

A direct hierarchical approach to describing the language would be overly complex. The search for a way around leads to a description of the language based on its speech nature and its command side. Thus the Layer System was born, which partly coincides with Mendeleev's Periodic System, itself also layered. As it turned out 42 years after its publication (1869), the periodicity of the system is connected with electron layers (1911, the Bohr-Rutherford model of the atom). The Layer and Periodic systems also share the tabular arrangement of all units on a single page.

The resulting description of the units of the language is short (only 10 kinds of expressions and 8 kinds of other units), informative and visual, although unusual at first acquaintance.

Language units are divided into 6 levels:

  1. orders - rows of the table
  2. departments - special groups of genera (parts of the first row)
  3. genera - cells (the main level of division)
  4. superspecies - dividers of species (a rare level)
  5. species - formulas of units, at the bottom of a cell or given separately
  6. samples - the units themselves (only for words)

Word samples are described by the dictionary, a separate subsystem composed of the same six levels.

The speech component of the C language is fairly obvious, although it still deserves a description. The command part of the language, for its part, has to do with controlling compilation, during which the third type of the program, the command one, is created. Here we come to the most exciting side of the C language: its beauty.

4. SUBSEQUENT CHAPTERS. Beautiful C

The C language lies at the foundation of modern programming. Why? First, because it corresponds most closely to speech. Second, because it elegantly got around the limitations of machine number processing.

What exactly did C propose? The image and the layer.

The word "image" is a translation of the English word "type", which comes from the Greek word for "prototype". In Russian, the word "type" does not convey the cornerstone of the concept being expressed; moreover, it gets confused with its auxiliary meaning, "kind".

Initially, the image solved a purely machine problem of calculation, and then became a runway for the birth of object languages.

The layer immediately solved several problems - both machine-made and applied. Therefore, consideration will begin with a single-tasking image and move on to the multi-tasking layer.

One of the unpleasant features of historical programming is that most concepts, including the basic ones, are given without definitions. "Programming language so-and-so has an integer type and a floating-point type of numbers..." and off they go. Nobody bothers to define what a "type" (image) is, because the authors themselves do not fully understand it and hush it up "for clarity". If they are pinned to the wall, they will give a vague and useless definition. It helps a great deal to hide behind foreign words: for Russian authors, behind English ones (type); for English authors, behind French (subroutine), Greek (polymorphism), Latin (encapsulation), or combinations thereof (ad-hoc polymorphism).

But this is not our destiny. Our choice is definitions given with the visor raised, in pure Russian.

Image

An image is a prototype name of a value that defines 1) the value's own properties and 2) the selection of operations for the value.

The word "type" (kind) corresponds to the first part of the definition: "intrinsic properties of the quantity". But the main meaning of the image is in the second part: "the selection of operations for quantities."

The starting point for introducing an image in the C language is a simple calculation, such as the operation of addition.

Paper mathematics, written by hand or typed, does not make much distinction between kinds of numbers, usually assuming them to be real. Therefore, their processing operations are unambiguous.

Machine mathematics strictly divides numbers into integers and fractions. The different kinds of numbers are stored in memory in different ways and are processed by different processor instructions. For example, the instructions for adding integers and for adding fractional numbers are two different instructions, corresponding to two different processor units. And there is no instruction at all for adding an integer argument to a fractional one.

Applied mathematics, that is, the C language, separates the kinds of numbers but unites the operations: addition of integers and/or fractional numbers is written with a single action sign.

A clear definition of the concept of the image lets us speak definitely about two other concepts: value and operation.

Value and operation

A value is the number being processed.

An operation is the processing of the initial values (arguments) to obtain the final number (the result).

The value and the operation are related. Each operation is a value, because it has a numeric result. And each value is the result of moving the value to or from a processor register, that is, the result of an operation. Despite this relationship, what matters is that they can be described separately, even if the same word is repeated in different sections of the dictionary, which is what happens in Ma3.

The machine approach divided all the numbers used by the programmer into commands and data. Originally both were just numbers; commands, for example, were written as numeric codes. In applied languages, however, commands ceased to be numbers and became words and action signs. Only "data" remained numbers, but it is absurd to go on calling them that, because in the move from the machine point of view to the mathematical one, numbers are values, divided into initial (given) and final (sought). "Unknown data" would sound silly.

Commands were likewise divided into two kinds of actions: mathematical and auxiliary. Mathematical actions are operations. We will get to the auxiliary ones later.

In the C-family languages, the unambiguous, or single, mathematical operations familiar from paper and machine mathematics become, almost without exception, multiple.

Multiple operations are several operations with the same name but with different kinds of arguments and different, though similar in meaning, actions.

Integer arguments correspond to an integer operation, fractional arguments to a fractional one. The difference is especially evident in the division operation, where the expression 1/2 gives a result of 0, not 0.5. Such a notation does not follow the rules of paper mathematics, but the C language does not try to follow them (unlike Fortran): it plays by its own, applied rules.

When integer and fractional numbers are mixed, the only correct solution comes into play: coercion of argument values, that is, the selective conversion of a value from one image to another. Indeed, adding an integer and a fractional number gives a fractional result, so the image of the operation brings in the operation of converting the integer argument to a fractional value.

A number of operations are not multiple but remain single. Such operations are defined for only one kind of argument: the remainder of division for integer arguments, bitwise operations for natural (unsigned) integers. Ma3 marks the multiplicity of operations with signs (#^) that indicate the images for which the operation is defined. This is an important but previously overlooked property of every operation.

All functions are arbitrary single operations. The exception is the operators, that is, the parenthesized functions built into the language (original operations).

Assistance

Assistance - an action accompanying the operation.

If we regard the operation as the main action, we can single out two accompanying actions that support the operation and differ from it: 1) variable control and 2) subordination. Such an action is called assistance.

Here we need to digress and say a word about Russian translations of programming textbooks. To record actions, the text of K&R introduced a new word, "statement" (expression), in an attempt to split the concept of a machine instruction into different actions: 1) operation, 2) declaration, and 3) subordination (called "control constructs"). This attempt was buried by the Russian translators, who replaced "statement" with the word "operator", which:

  1. became a synonym for the machine word "command",
  2. turned out to be a synonym for the phrase "action sign",
  3. and also acquired an unlimited number of additional meanings. In effect, it turned into a kind of filler word, like "uh...".

Let us consider the accompanying actions, that is, assistance.

Variable control

Variable control (UE) - creation/deletion of variable cells.
UE happens implicitly when a variable is declared, although the declaration is written for a different reason: to indicate the image of the value. Only one kind of variable is managed explicitly: additional variables, by means of the malloc() and free() functions.

It should be noted that implicit actions are more convenient to write, since nothing needs to be written at all, but harder to understand: they are harder to keep track of and to interpret.

Subordination

Subordination - connection/disconnection of layer sections.

The C language offered a different, applied way of controlling the order of actions than assembly language does: subordination. It reflects and develops the complex sentence of ordinary speech, with a clear division into the main part (the subordination expression) and the subordinate part (the sections of a sublevel or subroutine).

Both declaration and subordination are built entirely on the concept of the layer.

Layer

A layer is a bounded, single-level, selective set of expressions.

The layer explicitly and implicitly took on several tasks at once:

  1. putting the program in order,
  2. limiting the visibility of names (implicitly),
  3. managing variables (memory cells) (implicitly),
  4. defining the subordinate sections for subordination,
  5. defining functions and collections, and others.

In machine languages there was no concept of a layer, so it did not appear in K&R either; and once it was not there, introducing it in later books would have been heresy and freethinking. So the concept of the layer never appeared at all, although it is extremely useful and quite obvious.

Without the layer, it is impossible to explain briefly and clearly many of the actions and rules of the program. For example, why a goto, simple as a three-kopeck coin, is bad, while a tricky while is good. One can only scold helplessly, as Dijkstra did ("the quality of programmers is a decreasing function of the density of go to statements in the programs they produce." In short, only jerks use goto. A godlike level of justification.) True, that is no great loss if your books are under no obligation to explain anything at all, but, as we have already said, that is not our destiny.

By the way, it can be assumed that Dennis Ritchie left goto in precisely as a key to finding some unnamed concept, because there was neither need nor beauty in the goto expression. But there was a need for a simple and understandable explanation of the new principles of the language, which Ritchie himself did not care to give, and which rest precisely on the concept of the layer.

Deviation

Deviation - a change in the normal properties of a new name.

The most important deviation is tied precisely to the layered properties of the program and is described by the single word "static", which has a different meaning in each type of layer.

5. LAST CHAPTER. Generality of applied languages

Applied languages are figurative languages (having an image, "typed"). They are based on the explicit or implicit use of the image. And here a contradiction appears again: an explicit image is easier to understand but less convenient, and vice versa.

(The table layout has not yet been delivered, so the table is given as a picture.)

After C, the development of applied languages followed the path of increasing figurativeness. Most important for understanding high figurativeness is C's direct descendant, the C++ language. It develops the idea of an arbitrary selection of operations for values and embodies it on the basis of the C collection expression, which receives a new name: the object. However, C++ is not as concise and expressive as C, because it is overloaded with new collections and the rules that come with them. By the way, let's talk about "overload".

Overloading and polymorphism

The word "overload" is an obsolete machine term for creating multiple operations.

Machine (system) programmers may well have found multiple operations annoying: "What does this sign (+) mean: addition of integers, addition of fractional numbers, or even an offset?! Nobody writes like that!" Hence the negative connotation of the chosen word ("overstrain", "fatigue"). For the application programmer, multiple operations are a cornerstone, a major achievement and a legacy of the C language, so natural that it often goes unnoticed.

In C++, multiplicity was extended not only to the original operations but also to functions, both standalone ones and those combined into classes (methods). Along with multiple methods came the possibility of redefining them in extended classes, which was vaguely named "polymorphism". The combination of polymorphism and overload produced an explosive mixture that split into two polymorphisms: "true" and "ad-hoc". This can be understood only in spite of the names assigned. The road to hell ("ad" in Russian) is paved with foreign names.

A declaration of the "overload" kind is better expressed by the term additional declaration: the addition of a declaration of a function with the same name but with arguments of a different image.

A declaration of the "polymorphism" kind is better called a redeclaration: an overriding declaration, in a new extension layer, of a function with the same name and with arguments of the same image.

Then it is easy to work out that methods with the same name but different images (of the arguments) are additionally declared, while those with the same image are redeclared.

Russian words decide.

Runway

Consideration of the concepts of highly figurative languages ​​confirms the importance of a clear definition of fundamental concepts. With properly defined C, learning highly figurative languages ​​will be easy and enjoyable.

This is especially important for implicit highly figurative languages (PHP, JavaScript). In them, the importance of objects (composite images) is even higher than in C++, but the very concept of the image becomes implicit and elusive. In terms of convenience they have become simpler; in terms of understanding, harder.

Therefore, one should start learning programming languages with C and move on in the order in which the languages of the C family appeared.

The same goes for describing languages. Different languages have a similar or smaller set of unit genera than C does. The number of species and samples can differ in either direction: C++ has more species than C, JavaScript has fewer.

The MySQL language deserves special mention. It would seem to have nothing in common with C, yet it is described perfectly well by the Matryoshka, and getting to know it becomes faster and easier. This matters, given its importance to the web, the high road of modern programming. And where MySQL goes, other SQL dialects follow. As for all the Fortrans, Pascals and Pythons, they too can be described by the Matryoshka, as soon as we get around to them.

So great things await us: an applied description of the C language and a unified description of the languages that follow it. "Our goals are clear, the tasks are defined. To work, comrades!" (Stormy, prolonged applause, turning into an ovation. Everyone rises.)

Your opinions will be listened to with great attention, and your help in creating the Matryoshka site will be accepted with great gratitude. More complete information about the book is on the site, cleverly hidden in Matryoshka Si.

Source: habr.com
