Support KuMir #2759

@edukisto

Language

“КуМир” (KuMir) is a Russian-based, ALGOL-like educational programming language. The name is an acronym for “Kit of training worlds”: in KuMir you can, for example, control a robot that moves along grids, which are called worlds. The literal translation of “кумир” is “idol”.

In 1985, Andrey Yershov published a series of papers on teaching programming in Soviet schools. In them, he proposed a deliberately unnamed, Russian-based, ALGOL-like algorithmic language, which he used in his textbook the same year.

The reserved words of the language are:

  • алг (proc, a procedure name), арг (formal argument), все (esac or fi, end of condition), выбор (case in case...in), для (for), до (to in for...to), если (if), знач (the Result variable in Pascal), иначе (else), кон (end of program), кц (od, end of loop), нач (begin of program), не (not), нц (do, begin of loop), от (from in for...from...to), пока (while), при (in in case...in and until in for...until), рез (proc in arguments, a pass-by-reference parameter for storing a result), то (then in if...then), шаг (step in for...step...until);
  • data type names: вещ (real), вещ таб (array of real), дроб (fractional), дроб таб, лит (string), лит таб, лог (Boolean), лог таб, нат (natural [number]), нат таб, цел (integer), and цел таб.

Operators are и (and), или (or), *, ** (power), +, -, /, <, ≤ (later <=), =, ≠ (later <>), >, ≥ (later >=).

Punctuation marks are ", (, ), ,, ;, :, :=, [, ].

In 1990, Yershov’s algorithmic language was used in a textbook by Anatoly Koushnirenko, Gennady Lebedev, and Rudolf Svoren. The authors

  • added several reserved words: ввод (input), вывод (output), дано (precondition), исп (begin module), надо (postcondition), нс (newline), раз (an alias for for...step 1), утв (assertion);
  • added | as a comment operator;
  • added Boolean values да (true) and нет (false);
  • added ' (same as ");
  • added сим (character) and сим таб data types;
  • dropped дроб, дроб таб, нат, and нат таб data types.

Koushnirenko’s dialect is called the “School algorithmic language”. Between 1990 and 1996, InfoMir Co. created the KuMir IDE for several platforms, including DOS and Mac OS 7. These implementations

  • support the выход reserved word (break in Pascal);
  • don’t require space before таб.

Since the mid-2000s, KuMir (along with C, Basic, Pascal, and Python) has been used in the State exams for junior (9th grade) and senior (11th grade) middle schools in Russia. Although the exam materials call it the [Yershov’s/Koushnirenko’s] algorithmic language (“алгоритмический язык”), it is clearly KuMir, because it is written without a space before таб. See the 2021 exam specification (pages 18, 21, 24, 25).

NIISI RAS supports a Qt-based IDE for KuMir. The IDE is included in “Альт Образование”, a Linux distribution for middle and vocational schools.

NIISI RAS

  • added аргрез (an alias for арг рез), ВКЛЮЧИТЬ (include), всё (an alias for все), дс, использовать (use), кон_исп (an alias for кон исп), кц_при (an alias for кц при), пауза (breakpoint), стоп (exit in Pascal);
  • added data type names: компл (complex number), сканкод (scan code), файл (file), цвет (color).

General rules

KuMir is case-sensitive.

Source code files use the .kum extension.

Since this is an educational language, it is desirable to retain the ability to use the default color scheme:

  • comments should be highlighted in gray;
  • constants (numbers, strings and Boolean values) should be highlighted in light-blue;
  • data type names should be highlighted in brown;
  • names should be highlighted in blue.

Names

Identifiers can consist of digits, @, _, Latin and Cyrillic letters of any case, ! (undocumented), and whitespace (/\s/) other than \n and \r. Leading and trailing spaces are trimmed, and runs of adjacent inner spaces are merged into one. Identifiers with and without inner spaces are not equivalent. That is, a␣b, a␣␣b, and a␣␣␣b are equivalent, whereas a␣b and ab are not (␣ represents a single space).
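The normalization rules above can be sketched in a few lines of Python. This is only an illustration: the function name normalize_identifier is hypothetical and is not part of KuMir or of any highlighter.

```python
def normalize_identifier(raw: str) -> str:
    """Trim outer whitespace and merge runs of inner whitespace into one space.

    Hypothetical helper: it mirrors the rules described in the text,
    not the actual KuMir compiler code.
    """
    # str.split() with no arguments drops leading/trailing whitespace
    # and treats any run of whitespace as a single separator.
    return " ".join(raw.split())


print(normalize_identifier("a   b") == normalize_identifier("a b"))  # equivalent
print(normalize_identifier("a b") == normalize_identifier("ab"))     # not equivalent
```

Two spellings refer to the same name exactly when their normalized forms are equal.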

In fact, the modern KuMir compiler allows any character inside an identifier except those in [\x00-\x1f\x22-\x2f\x3a-\x3f\x5b-\x5e\x60\x7b-\x7e]. Because an identifier cannot begin with a digit or a space, the first character can be represented as [^\x00-\x20\x22-\x3f\x5b-\x5e\x60\x7b-\x7e]. The subsequent part can be represented as [^\x00-\x1f\x22-\x2f\x3a-\x3f\x5b-\x5e\x60\x7b-\x7e]+.
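As a quick check, the two character classes above can be combined into one pattern. The sketch below uses Python's re module; the names FIRST, REST, and IDENTIFIER are illustrative, and the tail is made optional (*) on the assumption that single-character names such as a are valid.

```python
import re

# Character classes copied from the text above.
FIRST = r"[^\x00-\x20\x22-\x3f\x5b-\x5e\x60\x7b-\x7e]"
REST = r"[^\x00-\x1f\x22-\x2f\x3a-\x3f\x5b-\x5e\x60\x7b-\x7e]"

# Assumption: the tail is optional so that one-character names match.
IDENTIFIER = re.compile(FIRST + REST + "*")

for sample in ["a b", "a1", "@x_1!", "1a", "a+b"]:
    # fullmatch() succeeds only if the whole sample is one identifier.
    print(repr(sample), bool(IDENTIFIER.fullmatch(sample)))
```

Because a space is allowed in the tail, a match can end with trailing spaces (for example, before :=), so a highlighter would still need to trim the matched text.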

It is impossible to distinguish a non-parameterized function call from a variable name.

A single не keyword can interrupt a Boolean variable name at any position except the last. For example, each of не a b c, a не b c, and a b не c is equivalent to the negation of the a b c variable.
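One way to model this rule is to pull the не word out of the name and report whether the value is negated. The sketch below is a guess at the behavior described above; split_negation is a hypothetical helper, and treating a name with zero or several не words as not negated is an assumption, since the text only covers the single-не case.

```python
def split_negation(name: str) -> tuple:
    """Return (negated, plain_name) for a space-separated Boolean name."""
    words = name.split()
    # Per the text, "не" counts only when it is not the last word.
    hits = [i for i, word in enumerate(words[:-1]) if word == "не"]
    if len(hits) != 1:
        # Assumption: anything other than exactly one "не" is left untouched.
        return False, " ".join(words)
    i = hits[0]
    return True, " ".join(words[:i] + words[i + 1:])


for sample in ["не a b c", "a не b c", "a b не c", "a b c не"]:
    print(sample, "->", split_negation(sample))
```

The first three samples resolve to the same name with the negation flag set, while the last one keeps не as part of the name.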

Strings

Strings must be surrounded by single (') or double (") quotes. The two kinds are not interchangeable: a string must be closed with the same kind of quote that opened it.

Neither escaping nor interpolation is supported. Newlines (\n and \r; U+2028 and U+2029 do not count as newlines) are prohibited inside strings. To insert a newline into a string, concatenate the string with the нс constant.
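Since there is no escaping, a string literal can be matched with a simple pattern. The following Python sketch assumes that the opening and closing quotes must be of the same kind and that the other kind may appear freely inside; STRING is an illustrative name.

```python
import re

# A double-quoted or single-quoted literal with no \n or \r inside.
# No escape handling is needed because the language has none.
STRING = re.compile(r'"[^"\r\n]*"|\'[^\'\r\n]*\'')

line = 'вывод "a\'b", \'c"d\''
print(STRING.findall(line))

# A literal broken by a newline is not matched at all.
print(STRING.search('"abc\n"'))
```

U+2028 and U+2029 are not excluded by the pattern, in line with the rule that they are not newlines.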

Data type names

вещ, лит, лог, сим, and цел are the common data type names. You can combine any of these reserved words with the таб reserved word to make an array. Only zero or more space characters ([\x20]*) are allowed between a type name and таб, i.e. вещтаб, вещ␣таб, вещ␣␣таб, etc.
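The type-name rule can be written as a single pattern. This Python sketch covers only the five common types and the optional таб suffix; TYPE is an illustrative name, and the pattern deliberately ignores the question of where a type name ends and a following identifier begins.

```python
import re

# One of the five common type names, optionally followed by
# zero or more U+0020 spaces and the "таб" keyword.
TYPE = re.compile(r"(?:вещ|лит|лог|сим|цел)(?:\x20*таб)?")

for sample in ["вещ", "вещтаб", "вещ таб", "вещ  таб", "вещ\tтаб", "компл таб"]:
    print(repr(sample), bool(TYPE.fullmatch(sample)))
```

A tab between the type name and таб is rejected, as is таб after a module-defined type such as компл.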

компл, сканкод, файл, and цвет are data types defined in standard modules (aka executors). You can’t combine them with таб. These words are not reserved until you import the corresponding module.

Numeric constants

Numeric constants follow C-like syntax, with one difference: a hexadecimal number is introduced by the $ prefix. A highlighter should match names before numbers, because a name can contain digits surrounded by spaces.
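A rough pattern for numeric constants is sketched below in Python. It assumes that "C-like" means decimal integers and decimals with an optional exponent, and that $ is followed by hexadecimal digits; the exact grammar accepted by the KuMir compiler may differ. NUMBER is an illustrative name.

```python
import re

# $-prefixed hexadecimal, or a decimal number with an optional exponent.
# Assumption: the exact set of accepted forms is not given in the text.
NUMBER = re.compile(r"\$[0-9A-Fa-f]+|(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")

for sample in ["$FF", "42", "3.14", "1e-5", "0xFF"]:
    print(repr(sample), bool(NUMBER.fullmatch(sample)))

# Scanning for numbers first finds the digit inside the name "x 1",
# which is why names have to be matched before numbers.
print(NUMBER.findall("x 1 := 2"))
```

The last line shows the ordering problem: the 1 that belongs to the name x 1 is reported as a number when numbers are searched first.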
