Name

console-terminal-emulator — emulate a real terminal using a pseudo-terminal

Synopsis

console-terminal-emulator [--linux] [--sco] [--freebsd] [--netbsd] [--decvt] [--vcsa] [--inverted] {directory}

Description

console-terminal-emulator is a utility that expects file descriptor 4 to be the back end (called "master side" in older documentation) of a pseudo-terminal and the TTY environment variable to be the full device filename of the front end (called "slave side" in older documentation). (pty-get-tty(1) can be used to set this process state up.)

First it sets up the various files in directory, opening the input FIFO for reading and the display buffers for reading and writing. The FIFO is created unconditionally, and opened in non-blocking mode. The display buffers are created if they do not exist.

It then enters a loop where it simultaneously:

  • processes all data received from the pseudo-terminal back end as terminal output, handling printing characters, control characters, escape sequences, and control sequences.

  • processes all input events from the input FIFO, sending terminal character and escape sequences to the back end.

It finishes when the back end signals hangup (i.e. when the line discipline would set modem control lines on a real serial device to signal hangup because the last front-end file descriptor is closed) and there are no more received data to output. At termination, it unlinks the file for the front end, and erases the display as if by a Clear Display control sequence.

Files

The files used are as follows:

directory/tty

A stable and predictable name for the front end of the pseudo-terminal device. If directory is /run/dev/vc1, for example, then the stable name for the terminal, for use in login services, will be /run/dev/vc1/tty. This is a hard link to the device, if possible, or a symbolic link if not.

Note

A hard link is usually not possible. On Linux, the pseudo-terminal devices are on a filesystem of their own, which would make a link a cross-filesystem link; and on BSDs the devfs driver disallows the creation of directories under /dev, forcing directory to be on another filesystem anyway.

directory/input

The input FIFO, through which realizer processes send keyboard and mouse events. Events are in a uniform packet format, which the terminal emulator converts into appropriate escape sequences.

directory/display

The UTF-32, 24-bit colour, display buffer.

directory/vcsa

An 8-bit character set, 3-bit colour, display buffer that is compatible with the Linux vcsa devices for kernel virtual terminals. This is provided for compatibility with screen reader softwares.

vcsa buffer

The --vcsa command line option enables output to the directory/vcsa file.

This file is structured as per the Linux character device file of that name, as a 4-byte header giving size and cursor position information followed by a series of 2-byte character and attribute pairs in IBM PC CGA format. It is implemented mainly as a compatibility mechanism for the benefit of terminal realizing softwares that expect Linux vcsa devices, such as screen readers for the blind or partially sighted.

The terminal emulator does not recognize nor handle CSI sequences that deal with 8-bit character sets. The vcsa screen buffer is always ISO 8859-1; with Unicode code points outwith that range converted to code point 255. To handle a wider range of characters, terminal realizing softwares should use the Unicode screen buffer. Code points in the ranges 0x00 to 0x1F and 0x80 to 0x9F may appear, and terminal realizing softwares are expected to render these with some form of ordinary printing graphic.

The 24-bit RGB foreground and background colour values for character cells are mapped to 3-bit CGA by comparing the relative intensities of red, green, and blue. The "bright foreground" and "bright background" CGA attribute bits are taken from the boldface and blink terminal attributes, respectively. No attempt is made to provide MDA or VGA in monochrome mode attributes, such as underline. To handle the full range of attributes, terminal realizing softwares should use the Unicode screen buffer.
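As an illustration of the layout described above, a realizer might decode the vcsa buffer roughly as follows. This is only a sketch: the function name parse_vcsa is hypothetical, the header byte order (rows, columns, cursor x, cursor y) is taken from the Linux vcsa convention, and the character-byte-then-attribute-byte ordering within each pair is an assumption that matches little-endian Linux hosts.

```python
import struct

def parse_vcsa(data):
    """Split a vcsa-style buffer into a (rows, cols, x, y) header and a list of cells."""
    # Assumed header order, following Linux vcsa: rows, columns, cursor x, cursor y.
    rows, cols, x, y = struct.unpack_from("4B", data, 0)
    cells = []
    for i in range(rows * cols):
        # Assumed pair order: character byte first, attribute byte second.
        char, attr = struct.unpack_from("2B", data, 4 + 2 * i)
        cells.append({
            "char": bytes([char]).decode("latin-1"),  # the buffer is always ISO 8859-1
            "fg": attr & 0x07,                        # 3-bit CGA foreground colour
            "bright_fg": bool(attr & 0x08),           # derived from boldface
            "bg": (attr >> 4) & 0x07,                 # 3-bit CGA background colour
            "bright_bg": bool(attr & 0x80),           # derived from blink
        })
    return (rows, cols, x, y), cells
```

Code point 255 in a cell may stand for any Unicode character outwith ISO 8859-1, so a reader that needs the real character has to consult the Unicode buffer instead.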

Unicode buffer

The directory/display file begins with a 16-byte header:

  1. 4-byte UCS-4 Byte Order Mark in host byte order.

  2. 2-byte width in host byte order.

  3. 2-byte height in host byte order.

  4. 2-byte cursor X position in host byte order.

  5. 2-byte cursor Y position in host byte order.

  6. cursor glyph type byte (upper nybble reserved).

  7. cursor attributes byte (upper nybble reserved).

  8. screen flags and pointer attributes byte, upper and lower nybble.

  9. Reserved byte.

That is followed by a series of 16-byte records, one per character cell, containing:

  1. Foreground alpha value byte.

  2. Foreground red value byte.

  3. Foreground green value byte.

  4. Foreground blue value byte.

  5. Background alpha value byte.

  6. Background red value byte.

  7. Background green value byte.

  8. Background blue value byte.

  9. 4-byte UCS-4 value in host byte order.

  10. 2-byte attributes.

  11. Reserved bytes.

Unassigned code points, reserved code points, control code points, combining code points, and zero-width code points may appear, and terminal realizing softwares are expected to render these with some form of ordinary printing graphic. For forwards and backwards compatibility, reserved bits in records should be written as zeroes, ignored when read, and preserved when copied.
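The header and record layouts above can be read with fixed-size structures. The following is a minimal sketch, not the emulator's own code; the names HEADER, CELL, and parse_display are hypothetical, and the meanings of the attribute and flag bits are deliberately left uninterpreted because they are not given here.

```python
import struct

# "=" selects host byte order with no padding, so both structures are exactly 16 bytes.
HEADER = struct.Struct("=I4H4B")   # BOM, width, height, cursor X, cursor Y, then four single bytes
CELL = struct.Struct("=8BIH2x")    # fg ARGB, bg ARGB, UCS-4 code point, attributes, 2 reserved bytes

def parse_display(data):
    """Return (header dict, list of cell dicts) for an image of the display buffer."""
    bom, width, height, x, y, glyph, cursor_attrs, flags, _reserved = HEADER.unpack_from(data, 0)
    if bom != 0xFEFF:
        raise ValueError("unexpected byte order mark; buffer written with another byte order?")
    header = {
        "width": width, "height": height, "cursor": (x, y),
        "glyph": glyph & 0x0F,                # upper nybble is reserved, so ignore it
        "cursor_attrs": cursor_attrs & 0x0F,  # upper nybble is reserved, so ignore it
        "flags": flags,                       # screen flags and pointer attributes, uninterpreted
    }
    cells = []
    for i in range(width * height):
        fa, fr, fg, fb, ba, br, bg, bb, code, attrs = CELL.unpack_from(
            data, HEADER.size + i * CELL.size)
        cells.append({"code": code, "fg": (fr, fg, fb), "bg": (br, bg, bb), "attrs": attrs})
    return header, cells
```

The code point is kept as an integer rather than converted to a character, since unassigned, reserved, and control code points may all appear.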

FIFO input protocol

The directory/input FIFO receives a sequence of 4-byte messages. To avoid message tearing, realizers must ensure that they do not write messages using multiple system calls. A message is a 32-bit word in host byte order. The most significant byte denotes the message type and the interpretation of the remainder of the message.

0x00nnnnnn

A null message. This is ignored.

0x01cccccc

Unicode character. The UCS-4 code point is cccccc. This is UTF-8 encoded and sent through to the terminal line discipline as terminal input. If bracketed paste has been switched on, the control sequence denoting end of paste is sent beforehand.

0x02kkkkmm

A system keypad key. The key number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. System keys are ignored by the terminal emulator.

0x04kkkkmm

The absolute horizontal (x axis) position of the mouse. The column number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. If bracketed paste has been switched on, the control sequence denoting end of paste is sent before any generated mouse report.

0x05kkkkmm

The absolute vertical (y axis) position of the mouse. The row number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. If bracketed paste has been switched on, the control sequence denoting end of paste is sent before any generated mouse report.

0x06kkkkmm

The absolute depth (z axis) position of the mouse. The depth position is kkkk and mm is a set of bitflags indicating the current state of modifier keys. If bracketed paste has been switched on, the control sequence denoting end of paste is sent before any generated mouse report.

0x07kkbbmm

A mouse button. The button number is kk, its state is bb, and mm is a set of bitflags indicating the current state of modifier keys. The well-known buttons are numbered in DEC VT order: left, middle, right, side. If bracketed paste has been switched on, the control sequence denoting end of paste is sent before any generated mouse report.

0x08kknnmm

A mouse wheel motion. The wheel number is kk, the (signed) amount by which the wheel is scrolled is nn, and mm is a set of bitflags indicating the current state of modifier keys. If bracketed paste has been switched on, the control sequence denoting end of paste is sent before any generated mouse report.

0x09cccccc

Pasted Unicode character. The UCS-4 code point is cccccc. This is UTF-8 encoded and sent through to the terminal line discipline as terminal input. If bracketed paste has been switched on, the control sequence denoting start of paste is sent beforehand. If the character is ESC or CSI, the control sequence denoting end of paste is sent afterward.

0x0Ckkkkmm

A consumer device key. The key number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. Consumer device keys are ignored by the terminal emulator, because of a lack of appropriate escape sequences and control sequences for representing them.

0x0Ekkkkmm

An extended key. The key number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. The terminal emulator sends the equivalent control sequence through to the line discipline as terminal input. If bracketed paste has been switched on, the control sequence denoting end of paste is sent beforehand.

0x0Fkkkkmm

A function key. The function key number is kkkk and mm is a set of bitflags indicating the current state of modifier keys. The terminal emulator sends the equivalent control sequence through to the line discipline as terminal input. If bracketed paste has been switched on, the control sequence denoting end of paste is sent beforehand.

0x11cccccc

Unicode accelerator character. The UCS-4 code point is cccccc. This is UTF-8 encoded and sent through to the terminal line discipline as terminal input, prefixed with ESC. If bracketed paste has been switched on, the control sequence denoting end of paste is sent beforehand.

The modifier flags represent abstract level2, level3, control, group2, and super modifiers. (See ISO 9995 for the concepts of levels and groups; super is an abstract "Command" or "GUI" modifier.) The input protocol does not comprise standalone events for modifier keys or associated modifier lock keys.

The numbering of system, extended, and consumer keys largely follows the ID numbers used for keys in the USB HID protocol, with some exceptions, and a number of additions for keys that are not present in USB. Importantly, keys that generate ordinary Unicode characters are sent as a Unicode character message and not as an extended key with the USB keycode.
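A realizer could build and send these messages along the following lines. This is a sketch under stated assumptions: all of the function names are hypothetical, the bit assignments within the modifier byte are not given here and so are passed through as an opaque value, and fd is assumed to be a descriptor already open for writing on directory/input.

```python
import os
import struct

def message(msg_type, payload):
    """Pack one 4-byte message: type in the most significant byte, 24-bit payload below it."""
    if not 0 <= msg_type <= 0xFF or not 0 <= payload <= 0xFFFFFF:
        raise ValueError("message field out of range")
    return struct.pack("=I", (msg_type << 24) | payload)  # "=I" is a 32-bit word in host byte order

def unicode_char(code_point):
    """0x01cccccc: an ordinary Unicode character."""
    return message(0x01, code_point)

def pasted_char(code_point):
    """0x09cccccc: a pasted Unicode character."""
    return message(0x09, code_point)

def keyed(msg_type, key, modifiers):
    """Messages of the form 0xTTkkkkmm, e.g. 0x0E (extended key) or 0x0F (function key)."""
    return message(msg_type, ((key & 0xFFFF) << 8) | (modifiers & 0xFF))

def send(fd, messages):
    """Write all the given messages with one system call, so that none can be torn."""
    os.write(fd, b"".join(messages))
```

For example, send(fd, [keyed(0x0F, 9, 0)]) would report an unmodified function key 9. Batches should be kept small; POSIX only guarantees that writes to a FIFO are atomic up to PIPE_BUF bytes.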

Security

console-terminal-emulator requires no superuser privileges and is designed to be run entirely under the aegis of a dedicated unprivileged user account. It only requires write and search access to directory and need not have owner access to it. Conversely, only the emulator process needs write access to directory, as it is the only thing expected to create files there.

All created display buffer files have permissions rw-r-----. All created input FIFO files have permissions rw--w----. All display buffer files and the input FIFO file have their group IDs explicitly set to the effective GID of the emulator process. The emulator process itself has owner access to these files, and their owner ID is the effective UID of the emulator process.

Usually directory will be set-group-ID to a group different to the effective group ID of the emulator process. Changing the groups of the display buffer files and input FIFO file to the effective group ID of the emulator process thus distinguishes group access to those files in particular, allowing one to add ordinary users to the effective GID of the emulator process in order to give them direct realizer access to the emulated terminal without (thereby) granting them (group) access to anything else in directory.

Erasing the display buffer files at (non-abend) termination ensures that (absent system backups, log-structured filesystems, and low-level data recovery) old terminal display content cannot be read out of a display buffer. For best results, place these files on a temporary filesystem, set whatever options the temporary filesystem has (if any) for erasing backing storage at unmount, and exclude the temporary filesystem from backups.

Terminal output

console-terminal-emulator emulates the character sequence processing logic of a hardware terminal, taking the data received from the back end of the pseudo-terminal and translating them into modifications to the screen buffer data files.

All character data are first decoded from UTF-8 to a stream of Unicode code points. The decoding is directly from UTF-8 to the code points; in particular, UTF-16 is not employed as an intermediary step. If a UTF-8 sequence decodes to one of the reserved code points used for UTF-16, no special treatment is given, and the code point is treated as any other.

The decoder has to do something with invalid UTF-8 encodings, including overlong and incomplete character sequences. Whatever code point they decode to, such characters are not processed as control characters or as parts of control/escape sequences. They abort any control/escape sequence that they interrupt, and if not incomplete encodings they are printed as ordinary printing characters even if they are the code points for control characters. This behaviour should not be relied upon, and programs should not send such UTF-8 sequences to a terminal.

Control character processing

Characters in the "Cc" ("Other, Control") Unicode code point category (a.k.a. "C0" and "C1" characters) are control characters. They are always processed, even in the middle of escape or control sequences. All control characters are no-ops except for the following:

CR

Carriage return. Move to column 0.

NEL

Newline. Move to column 0 and one row down.

LF, VT, FF, and IND

Linefeed/Vertical tab/Form feed/Index. Move one row down, remaining in the current column.

RI

Reverse index. Move one row up, remaining in the current column.

TAB

Horizontal tab. Move to the next tabstop, or the last column.

BS

Backspace. Nondestructively move to the previous column, stopping at the first column.

DEL

Delete. Delete the character at the cursor position, moving the remainder of the row to the left and padding the final column with a space.

HTS

Horizontal tab set. Set a tabstop at the current column.

CAN

Cancel. Cancel any control/escape sequence currently in progress.

ESC

Escape. Cancel any control/escape sequence currently in progress and begin an escape sequence.

CSI

Control sequence introducer. Cancel any control/escape sequence currently in progress and begin a control sequence.

Note

BPH is a no-op because the terminal emulator does not have enough information about word breaks or context to handle soft hyphenation. CCH is a no-op because ECMA-48 leaves it unspecified what the "data preceding it in the data stream" are, and thus it is possible to interpret this in several ways, from BS plus ECH 1 (that only considers printable output as "preceding data") to CAN (because control sequences are data, too).

Escape sequences

Escape sequences are multiple-character sequences comprising:

  • the ESC character, an optional intermediate character in the range U+0020 to U+002F, and a single final character in the range U+0030 to U+007E.

    Most such escape sequences for real terminals are ISO 2022 character set switching sequences, which the terminal emulator has no need of since it uses UTF-8 natively, and so does not support. The only supported escape sequences are:

    DECALN

    the DEC VT extension that fills the display with the letter "E" as a screen alignment test

    S7C1T

    set the terminal emulator to send C1 characters in control sequences encoded with the ECMA-48 7-bit extensions (as below)

    S8C1T

    set the terminal emulator to send C1 characters in control sequences as plain single C1 code points (but UTF-8 encoded)

  • the ESC character and a single final character in the range U+0040 to U+005F.

    This is the two-character 7-bit mechanism of ECMA-48 section 5. It is strictly speaking unneeded given that the terminal emulator is 8-bit clean and employs UTF-8 as standard. The entire U+0080 to U+009F control character range is accessible via this mechanism. So NEL can be emitted as either the single character U+0085 or as the two-character sequence U+001B U+0045. (Of course, U+0085 encodes to two characters in UTF-8.) Similarly, CSI can be emitted as either the single character U+009B or as the two-character sequence U+001B U+005B.
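The correspondence between a C1 control character and its two-character 7-bit form is a fixed offset of 0x40, which can be sketched as follows. The function names are hypothetical.

```python
ESC = 0x1B

def c1_to_7bit(code_point):
    """Return the (ESC, final) pair equivalent to a C1 control character."""
    if not 0x80 <= code_point <= 0x9F:
        raise ValueError("not a C1 control character")
    return (ESC, code_point - 0x40)

def seven_bit_to_c1(final):
    """Return the C1 control character denoted by ESC followed by final."""
    if not 0x40 <= final <= 0x5F:
        raise ValueError("final character outside U+0040 to U+005F")
    return final + 0x40
```

So c1_to_7bit(0x85) gives ESC followed by U+0045 for NEL, and c1_to_7bit(0x9B) gives ESC followed by U+005B for CSI.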

Control sequences

Per ECMA-48, control sequences are multiple-character sequences comprising the CSI character followed by parameter characters in the range U+0030 to U+003F followed optionally by an intermediate character in the range U+0020 to U+002F and terminated by a single final character in the range U+0040 to U+007E.

Parameters comprise the digits, semi-colon (U+003B), and colon (U+003A), forming up to 16 semi-colon separated sub-sequences; with an initial character in the range U+003C to U+003F denoting a DEC vendor-private control sequence that is an extension to ECMA-48. Each sub-sequence is up to 16 colon separated digit sequences. More than 16 parameters or sub-parameters simply causes trailing ones to be discarded. As an extension to DEC, the terminal allows vendor-private characters anywhere in the parameter character sequence, rather than only in the initial position, ignoring all but the last one. However, this is largely to simplify implementation and should not be relied upon.

The digit sequences are parameters to the action, and zero-length digit sequences are a parameter with the value zero (or, in some cases, 1 or 2). In most cases, a control sequence with N parameters is equivalent to N such control sequences in order each with one of the parameters.

ISO 8613-6/ITU T.416 SGR colour extensions make use of this mechanism, with colour values being encoded as sub-parameters behind a leading 38 or 48 sub-parameter. That standard explicitly states in section 13.1.8 that "Pe" values are separated by character "3/10" (i.e. U+003A). Such SGR sequences are unambiguous, since it is not possible to mistake a sub-parameter specifying a colour value for an SGR attribute code: CSI 1;4;38:5:14;48:2:0:224:3:7m.
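The parameter structure can be illustrated with a small parser. This is a sketch only, with a hypothetical function name; it assumes that any vendor-private marker characters have already been removed, and it always treats an empty digit sequence as zero, leaving the per-sequence defaults of 1 or 2 to whatever interprets the result.

```python
MAX_ITEMS = 16  # limit on both parameters and sub-parameters; trailing ones are discarded

def parse_parameters(text):
    """Split parameter characters into a list of parameters, each a list of sub-parameter values."""
    parameters = []
    for sub_sequence in text.split(";")[:MAX_ITEMS]:
        values = [int(digits) if digits else 0
                  for digits in sub_sequence.split(":")[:MAX_ITEMS]]
        parameters.append(values)
    return parameters
```

Applied to the parameter characters of the SGR example above, parse_parameters("1;4;38:5:14;48:2:0:224:3:7") yields four parameters: [1], [4], [38, 5, 14], and [48, 2, 0, 224, 3, 7]. The colour values stay attached to their 38 or 48 and cannot be mistaken for attribute codes.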

As extensions, the row and column counts to the DTTerm DECSLPP 8 control sequence can also be expressed using sub-parameters.

As explained at length in the console-control-sequence(1) user manual, the RIS control sequence has been proscribed for decades. This terminal emulator implements the DECSTR (Soft Terminal Reset) control sequence, which should always be used instead of RIS.

There is one control sequence that is peculiar to this terminal emulator. DEC Private Mode #1369 is so-called "square" mode. If it is off, all Unicode characters classified as "full-width" or "wide" have an extra space printed after them. This is for the benefit of realizers that have oblong cells that cannot hold full-width or wide glyphs. It defaults to on, where no such special processing occurs.
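The effect of "square" mode on printing can be sketched as follows. The function name is hypothetical, and it is an assumption that "full-width" and "wide" correspond to the Unicode East Asian Width property values F and W.

```python
import unicodedata

def cells_for(char, square_mode=True):
    """Return the characters written to successive cells when char is printed."""
    wide = unicodedata.east_asian_width(char) in ("F", "W")
    if wide and not square_mode:
        return [char, " "]   # pad with an extra space for realizers with oblong cells
    return [char]            # square mode on (the default): no special processing
```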

Colours and attributes

The terminal emulator maintains a current set of character attributes, a 24-bit RGB foreground colour, and a 24-bit RGB background colour. These are combined with character code points when writing to the character cells of each screen buffer. Each character cell thus has its own, independent, attributes and 24-bit foreground and background colours.

How indexed and direct colours are implemented

The 8 ECMA-48 standard colours set by the SGR control sequences 30–37 and 40–47, and the 8 AIXTerm colours set by the SGR control sequences 90–97 and 100–107, map from their 3-bit RGB colours to 24-bit RGB in the conventional manner; which is mostly an RGB 1-bit value mapping to an RGB 8-bit value 127 (standard colours being "dark") or 255 (AIXTerm colours being "bright") when set on and to 0 when set off. Special exceptions are made for bright black (which would be the same as dark black otherwise) mapping to #7F7F7F and dark white (which would in turn be the same as bright black) mapping to #BFBFBF; and for dark blue mapping to #4B0082 ("Web Indigo").
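The mapping just described can be sketched as a small function. The function name is hypothetical; the colour index is the SGR code minus 30, 40, 90, or 100, with bit 0 red, bit 1 green, and bit 2 blue as in ECMA-48.

```python
def sgr_colour_to_rgb(index, bright):
    """Map a 3-bit colour index (0 to 7) to 24-bit RGB; bright selects the AIXTerm variant."""
    if not 0 <= index <= 7:
        raise ValueError("colour index out of range")
    if bright and index == 0:
        return (0x7F, 0x7F, 0x7F)   # bright black, otherwise identical to dark black
    if not bright and index == 7:
        return (0xBF, 0xBF, 0xBF)   # dark white, otherwise identical to bright black
    if not bright and index == 4:
        return (0x4B, 0x00, 0x82)   # dark blue is "Web Indigo"
    level = 255 if bright else 127
    return tuple(level if index & bit else 0 for bit in (1, 2, 4))
```

For instance, SGR 31 (dark red) corresponds to sgr_colour_to_rgb(1, False) and SGR 93 (bright yellow) to sgr_colour_to_rgb(3, True).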

The terminal emulator also supports the ISO 8613-6/ITU T.416 SGR colour extensions, that are often misattributed to xterm:

  • The indexed colour extension (SGR 38:5 and SGR 48:5) has the conventional mapping of 16 "standard" colours, a 24 value grayscale, and a 6×6×6 colour cube.

    Note

    Colours #000000 and #FFFFFF occur three times each in this conventional map, and the saturated reds and greens occur twice.

  • The RGB direct colour extension (SGR 38:2 and SGR 48:2) simply sets the 24-bit RGB values directly.

Erasure

When characters or lines are deleted (resulting in erased characters or lines being scrolled in), or characters or lines or the screen are explicitly erased, the background colour used for the "filler" that is placed in the newly blanked cells defaults to the current background colour. This is "background colour erase" and is the default for DEC VT520 terminals as well as the only option on Linux and BSD kernel virtual terminals. By changing the "DECECM" private mode setting, it can be switched to "screen colour erase", which uses the default terminal background colour (as set by SGR 0 and SGR 49) rather than the current one.

Erasure is not the same as writing a space with the screen/background colour. Writing SPC is affected by the current attribute; and whilst boldface, italic, and faint have no effect on a SPC glyph (it having no foreground pixels), underline and reverse video do. Erasure always has all attributes set off, and thus erased cells are never underlined or in reverse video.

Note

For why this matters, see the explanation of faulty_inverse_erase in TerminalCapabilities(3).

Screen mode ("dark"/"light")

The --inverted command line option determines the initial state of the "DECSCNM" private mode setting, defaulting to off (i.e. "dark"). Use this for a default colour pair akin to a Sun SPARC machine's virtual terminal console, or GUI terminal emulators.

Note

Exactly how this takes effect depends on what realizer is being used. This is a separate flag in the Unicode output buffer, left up to realizers to handle. Commonly, realizers will XOR it with the reverse video attribute, inverting the sense of reverse video for the whole screen. However, for other choices, see the explanation of faulty_inverse_erase in TerminalCapabilities(3) for what other terminal emulators do with the DECSCNM flag.

Printing characters

All other characters are "printing" characters.

Characters in the "Cf" ("Other, Format") and "Mn" ("Mark, Non-spacing") Unicode code point categories are simply discarded.

Note

This includes U+00AD (soft hyphen). See the discussion of the BPH control character.

Characters in the "Me" ("Mark, Enclosing") Unicode code point category overstrike the character at the current cursor position without advancing.

All other printing characters are printed as-is, using the currently set attributes, foreground colour, and background colour.
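The classification of code points described in this section and in the control character section can be summarized with the Unicode general category, for example as follows. The function name and the returned labels are hypothetical.

```python
import unicodedata

def classify(char):
    """Say how a decoded code point is treated, going by its Unicode general category."""
    category = unicodedata.category(char)
    if category == "Cc":
        return "control"      # C0 and C1 control characters
    if category in ("Cf", "Mn"):
        return "discard"      # format and non-spacing mark characters
    if category == "Me":
        return "overstrike"   # enclosing marks overstrike without advancing
    return "print"            # everything else is printed as-is
```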

Pending line wrap, and automatic scrolling

After each character is printed, the cursor position is advanced.

By default, the emulator has automatic right margins turned on. Conceptually, automatic right margins means that writing a character in the last column automatically returns to the first column and moves down a row, a line wrap. If automatic right margins are turned off, writing a character in the last column does not move down a row or return to the first column. In practice, things are slightly more complex if automatic margins are turned on. Rather than a wrap happening there and then when the character is written to the last column, instead a pending wrap is flagged. If the cursor is then explicitly moved, the pending wrap is cancelled. Otherwise, when the next graphic is to be printed the pending wrap is enacted immediately beforehand.

The purpose of pending wrap is to allow full-screen TUI programs to write right up to the lower-right-hand corner without scrolling the screen. Programs must be aware of and careful about its effect. In the pending state, the terminal cursor is in a different position to where an application would have expected it to immediately wrap to; and using relative cursor motions in that state, including BackSpace and Next Line, needs to account for this.
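The pending wrap mechanism can be modelled with a single flag. The following is a simplified sketch with hypothetical names; it tracks only the cursor, and where a real emulator would scroll at the last row it simply stays put.

```python
class Cursor:
    """A cursor with automatic right margins and a pending wrap flag."""

    def __init__(self, width, height, auto_margin=True):
        self.width, self.height = width, height
        self.auto_margin = auto_margin
        self.x = self.y = 0
        self.pending_wrap = False

    def print_char(self):
        """Print one character and return the (x, y) cell that it was written to."""
        if self.pending_wrap:
            # Enact the wrap immediately before printing the next graphic.
            self.pending_wrap = False
            self.x = 0
            if self.y < self.height - 1:
                self.y += 1          # scrolling at the last row is omitted from this sketch
        written_at = (self.x, self.y)
        if self.x < self.width - 1:
            self.x += 1
        elif self.auto_margin:
            self.pending_wrap = True  # last column: flag the wrap rather than wrapping now
        return written_at

    def move_to(self, x, y):
        """Explicit cursor motion cancels any pending wrap."""
        self.pending_wrap = False
        self.x, self.y = x, y
```

After printing into the last column of a 3-column row, the cursor is still reported at column 2 with the flag set, not at column 0 of the next row, which is the discrepancy that programs using relative motions need to account for.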

By default, the emulator also has scrolling turned on, so that moving down from the last row scrolls the buffer up and moving up from the first row scrolls the buffer down. If scrolling is turned off, moving down from the last row or moving up from the first row have no effect. Scrolling only applies to cursor advancement by printing characters or the Newline, Index, or Reverse Index control characters. It does not apply to cursor motion control sequences.

Terminal input

Terminal input operates in terms of a stream of input events, comprising Unicode characters or special keys.

The abstract keyboard

The terminal emulator has no dealing in keyboard maps and exactly how keystrokes translate to Unicode characters; which are the province of a realizer.

The terminal emulator has no dealing in keyboard modifier state tracking; nor, similarly, does it deal in using modifiers to change dual-mode keypads. It has no dealings in numlock, capslock, or shiftlock, and these are not part of the abstract keyboard that is used in the input protocol. Rather, dealing with these is entirely and solely within the remit of realizers. For capslock, numlock, and shiftlock, the interactions of the locks and shifts are handled before the input stream is translated into the abstract keyboard.

Consequently, for the calculator keypad keys and the cursor/editing keypad keys on the abstract keyboard there are exactly two behaviours: "application mode" and "normal mode". There are no sub-states of "normal mode" controlled by numlock, albeit that in "normal mode" whatever the realizer transmits as an accompanying modifier state may be reflected in generated control sequences using the DEC VT augmentations to DECFNK employed by the VT420 onwards. (Numeric lock is a function internal to most keyboard hardwares anyway on PS/2 and USB keyboards, with many keyboards sending different scan codes according to the value of the NumLock status LED. Dealing with it lies in the part of the system that speaks to hardware.) These sub-modes are provided by realizers and their various keyboard mapping systems, and numlock and the associated shifting are handled before the input stream is translated into the abstract keyboard.

Realizers also perform any substitution of PF1 to PF5 on the calculator keypad (which are distinct, extended, keys) for the F1 to F5 function keys. This is a substitution that originates with DEC VTs, where at reset F1 to F5 were in a "local" mode and did not transmit characters, resulting in applications developers relying upon the calculator keypad keys instead, and hence terminal emulator softwares performing the same substitution for application compatibility. (In later model DEC VTs these function keys could all be reconfigured to "host mode" and would transmit DECFNK control sequences like any other function key.) The input protocol distinguishes the function keys from the calculator keypad keys, and a message denoting F1 to F5 denotes those function keys in their "host" mode, not the calculator keypad keys.

Control sequences for mouse

The DEC status reports report that a locator is always present and available, and that it is a mouse device.

Two mouse protocols are supported, the xterm Private Mode #1006 protocol and the DEC VT Locator protocol. These correspond to the ttymouse=sgr and ttymouse=dec settings in vim, respectively. These are superior to the unsupported xterm Private Mode #9 ("xterm.X10"), #1005 ("xterm.UTF-8"), and #1015 ("urxvt") protocols, for several reasons including avoiding encoding ambiguities and display dimension limits, and supersede them.

In the DECPM #1006 protocol, the click-only (DECPM #1000), button motion (DECPM #1002), and all events (DECPM #1003) modes are all available. The mouse grabber mode (DECPM #1001) is not.
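For reference, the general shape of a Private Mode #1006 report, as documented for xterm, can be sketched as follows. The function name is hypothetical, and it is an assumption that this emulator's reports match xterm's documented encoding in every detail, including the use of the 7-bit form of CSI.

```python
def sgr_mouse_report(button, column, row, pressed,
                     shift=False, meta=False, control=False, motion=False):
    """Build a report of the form CSI < code ; column ; row followed by M or m.

    button is 0, 1, or 2 for left, middle, or right; column and row are 1-based.
    """
    code = button
    if shift:
        code += 4
    if meta:
        code += 8
    if control:
        code += 16
    if motion:
        code += 32
    final = "M" if pressed else "m"   # lower case marks a button release
    return "\x1b[<{};{};{}{}".format(code, column, row, final)
```

Because the coordinates are sent as decimal numbers and releases have their own final character, this format has none of the encoding ambiguities or dimension limits of the older protocols.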

In the DEC VT Locator protocol, only character coordinates are available, the terminal emulator having no dealings in displays; they being the province of a realizer and possibly not even being pixel-addressable.

Control sequences for keyboard

The DEC status reports report that a keyboard is always present, with an unknown country layout.

Extended keys and function keys cause the terminal emulator to send control sequences through the line discipline to terminal input. What control sequences are sent depends on what emulation type the terminal emulator is set to: SCO, Linux, FreeBSD, NetBSD, or DEC VT. No emulated terminal type defines control sequences for all keys, or distinct control sequences where it does define them.

  • DEC VTs only define control sequences for function keys from 1 to 20 and have no defined control sequences for function keys outwith that range. These function keys, and keys on the editing keypad, are encoded as the DECFNK control sequence. In particular, note that function keys F1 to F5 have documented DECFNK numbers (given in the VT420 and VT525 programmers' references), should a realizer actually send those keys.

    DEC VTs employ SS3 sequences for the PF1 to PF5 keys on the calculator keypad. Structurally, a single shift cannot carry parameterized modifier information. (XTerm is faulty in this regard.)

  • The SCO kernel virtual terminal defines SCO FNK control sequences for function keys from 1 to 48. (These clash with some control sequences defined by other terminal type families, in particular XTerm.)

    • PF1 to PF5 do not exist in a real SCO KVT, which does not replace F1 to F5.

    • For F1 to F12 the level 2 shift and control modifiers are encoded into the function key number, and the SCO KVT's original SCO FNK control sequences (which have no parameters) are emitted. For example: CSI U is F9. CSI ^ is Shift+Control+F9.

    • For F13 to F48 the SCO FNK control sequences are extended to have parameters. The function key numbers are used as-is, and the modifiers are instead encoded into the second sub-parameter of the control sequence, in similar form to DECFNK (except zero-based). The first sub-parameter is a (usually 1) repeat count, again similar to DECFNK. For example: CSI 1:5U is Shift+Control+F9.

  • The FreeBSD kernel virtual terminal only has function keys from 1 to 12. Because of incomplete functionality in its Teken library, it uses a mongrel admixture of DECFNK and SCO FNK sequences.

    • PF1 to PF4 (which replace F1 to F4, of course) always generate SS3 sequences, irrespective of modifiers.

    • Unmodified F1 to F12 are transmitted as DECFNK control sequences, the result of a hardwired map in the Teken library itself.

    • Modified F1 to F12 are transmitted as SCO FNK control sequences, the result of the default contents of keyboard maps set up by the kbdcontrol(1) tool.

  • The Linux kernel virtual terminal only defines control sequences for function keys from 1 to 20 and has no defined control sequences for function keys outwith that range. It employs DECFNK for actual function keys.

    The Linux KVT employs Linux FNK for the PF1 to PF5 keys on the calculator keypad. Linux FNK is not conformant with the ECMA-48 rules for control sequences and is easy to confuse with the CSI control sequences for cursor keys, and should be avoided if possible.
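To illustrate how the SCO scheme for F1 to F12 folds modifiers into the function key number, here is a sketch. The names are hypothetical, and the order of final characters and the offsets of 12 for shift and 24 for control follow the conventional SCO console numbering, which is an assumption here beyond the two examples given in the list above.

```python
# Final characters for SCO function key numbers 1 to 48, in order.
SCO_FINALS = "MNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz@[\\]^_`{"

def sco_fnk(function_key, shift=False, control=False):
    """Return the parameterless SCO FNK sequence for F1 to F12 with optional modifiers."""
    if not 1 <= function_key <= 12:
        raise ValueError("only F1 to F12 fold modifiers into the key number")
    number = function_key + (12 if shift else 0) + (24 if control else 0)
    return "CSI " + SCO_FINALS[number - 1]
```

This reproduces the examples given for the SCO kernel virtual terminal: sco_fnk(9) is CSI U, and sco_fnk(9, shift=True, control=True) is CSI ^.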

The terminal emulator may fall back to the ECMA-48 FNK sequences, for function key numbers and modifier chords which have no DECFNK, Linux FNK, or SCO FNK control sequences. As an extension, the first sub-parameter to FNK encodes modifiers, in similar form to DECFNK (except zero-based). For example: FNK 1:0 is F1. FNK 9:5 is Shift+Control+F9.

Control sequences for pasted input

The Private Mode #2004 controls whether bracketed paste is switched on. When it is, sequences of successive pasted characters are bracketed by DECFNK sequences denoting paste on and paste off.

To prevent actual pasted character sequences from resembling DECFNK, paste is switched off after every pasted ESC or CSI character. If the next character is a pasted one, paste will then be switched back on again.
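The bracketing rules can be sketched as a small state machine over input events. This is an illustration only: the names are hypothetical, the paste on and paste off sequences are assumed to be the conventional CSI 200 ~ and CSI 201 ~ in 7-bit form, and the end of paste before a non-pasted character is assumed to be sent only when a paste is actually in progress.

```python
PASTE_ON = "\x1b[200~"    # assumed: the conventional bracketed paste start sequence
PASTE_OFF = "\x1b[201~"   # assumed: the conventional bracketed paste end sequence
ESC, CSI = 0x1B, 0x9B

def translate(events, bracketed=True):
    """events is a sequence of (pasted, code_point) pairs; return the resulting terminal input."""
    out = []
    pasting = False
    for pasted, code_point in events:
        if bracketed and pasted and not pasting:
            out.append(PASTE_ON)       # start of a run of pasted characters
            pasting = True
        elif bracketed and not pasted and pasting:
            out.append(PASTE_OFF)      # an ordinary character ends the run
            pasting = False
        out.append(chr(code_point))
        if pasting and code_point in (ESC, CSI):
            out.append(PASTE_OFF)      # never leave a pasted ESC or CSI inside the brackets
            pasting = False
    return "".join(out)
```

A pasted ESC followed by a pasted "a" thus produces paste on, ESC, paste off, paste on, "a", with the final paste off deferred until the next non-pasted event arrives.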

Differences from terminals and documented standards

The terminal emulator does not replicate all features of a real hardware terminal. Its goal is to provide a workalike for (the TUI parts of) the virtual terminals that are/were built in to the Linux and BSD operating system kernels. There is no support for the historical features of real terminal hardware such as attached printers, page switching, status lines, XON/XOFF modem flow control, programmable function keys, alternative (WYSE/TVI) control/escape sequences, direct auxiliary serial device I/O, and graphics modes.

Rather, the terminal emulator is aimed at handling the outputs of TUI programs that use the linux terminal type (for the Linux kernel virtual terminals), the pcvtXX terminal type (for NetBSD kernel virtual terminals), the pccon terminal type (for OpenBSD kernel virtual terminals), or the cons25 and teken terminal types (for FreeBSD kernel virtual terminals). (These are the terminal types set by vc-get-tty(1).)

The terminal emulator also has no dealing in the things that are the domain of separate realizing tools. The modular nature of user-space virtual terminals means that the terminal emulator has no knowledge of the actual devices used to realize the terminal, and that there can be zero or many realizers for any given virtual terminal. There is no support for font definitions, window titles, 256-colour VGA palettes, VGA-specific overscan and underline, VESA power, screen-savers, or multi-mode keypads. In response to device status report requests, it always responds with fixed information about a set of pseudo-devices.

The handling of U+00AD differs from some other terminal emulation programs, which are based upon code written by Markus Kuhn that treats U+00AD unconditionally as a spacing character and explicitly overrides its actual "Mn" (non-spacing) categorization in Unicode.

The terminal emulator is 8-bit clean and employs UTF-8 as standard. Therefore, the terminal emulator has no need for mechanisms to switch 8-bit code pages amongst multiple character sets. There is no ISO 2022 support.

The two-character 7-bit mechanism of ECMA-48 (section 5) is not only present, but is more completely implemented than in several kernel virtual terminal emulators, which usually only implement the 7-bit aliases for CSI and OSC.
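
For readers unfamiliar with the mechanism, here is a minimal Python sketch of the 7-bit alias rule: each C1 control character in the range 0x80 to 0x9F can be represented by ESC followed by the character 0x40 lower. The function name seven_bit_alias is hypothetical and is used only for this illustration.

```python
ESC = 0x1B


def seven_bit_alias(c1: int) -> bytes:
    """Return the two-byte 7-bit representation of a C1 control code."""
    if not 0x80 <= c1 <= 0x9F:
        raise ValueError("not a C1 control character")
    # The second byte lies in the range 0x40 to 0x5F.
    return bytes([ESC, c1 - 0x40])
```

With this rule CSI (0x9B) becomes ESC [ and OSC (0x9D) becomes ESC ], which are the two aliases most emulators handle; SS3 (0x8F) likewise becomes ESC O.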

Both ECMA-48 and the DEC VT520 Video Terminal Programmer Information [EK-VT520-RM] reference are straightforward and clear about numeric parameters to cursor motion and screen editing control sequences: a value of 0 is given no special meaning and just means zero rows/columns/repeats/whatever. There is a full explanation of this in Annex E of ECMA-48:1986 and Annex F of ECMA-48:1991. Strictly speaking, that is not the behaviour of the Linux or FreeBSD kernel virtual terminal emulators, both of which apply the old 1976 rule; nor is it the behaviour of old DEC VTs.

This is one place where the terminal emulator deviates from the aforementioned goal, in favour of conforming with the ECMA-48 standard. Per the standard, it defaults to zero meaning zero. It allows this to be controlled with Mode #22, even though using this mode to switch back to the old behaviour was deprecated in ECMA-48:1991. Nowadays, several common GUI terminal emulators obey the zero-means-zero rule; and at the time of writing this manual it has been 34 years since the standard was changed.
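
The difference between the two rules can be illustrated with a small Python sketch. It is an assumption-laden illustration rather than the emulator's logic: the function name effective_count and the zero_means_default flag are hypothetical, the default of 1 is the usual one for cursor motion sequences, and no claim is made here about which state of Mode #22 corresponds to which value of the flag.

```python
from typing import Optional


def effective_count(param: Optional[int],
                    zero_means_default: bool = False,
                    default: int = 1) -> int:
    """Resolve a numeric control sequence parameter to a count.

    param is None when the parameter was omitted entirely.
    """
    if param is None:
        return default          # omitted parameter: always the default
    if param == 0 and zero_means_default:
        return default          # old rule: zero stands for the default
    return param                # current rule: zero really means zero
```

Under the current rule a cursor motion with an explicit parameter of 0 moves nowhere; under the old rule it moves by one.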

Neither ECMA-48 nor the DEC VT520 Video Terminal Programmer Information [EK-VT520-RM] reference documents a quirk of most DEC VT-family emulators: holding line wrap pending. (This was only ever documented by DEC in internal documentation that was marked "company confidential".) Actual implementations, including the FreeBSD kernel virtual terminal emulator, have this mechanism. This is largely undocumented behaviour, but it is behaviour that many programs (including the prompt display of the Z Shell, for example) rely upon. So it is implemented here.
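
As a rough model of what "holding line wrap pending" means, consider the following Python sketch. It is a simplification under stated assumptions: automatic wrapping is taken to be always on, scrolling at the bottom of the screen is ignored, and the class name Cursor and its attributes are hypothetical.

```python
class Cursor:
    """Toy model of a cursor with a pending-wrap flag."""

    def __init__(self, width: int) -> None:
        self.width = width
        self.row = 0
        self.col = 0
        self.wrap_pending = False
        self.cells = {}  # maps (row, col) to the character stored there

    def put(self, ch: str) -> None:
        # A wrap held over from the previous character happens only now,
        # when another printing character actually arrives.
        if self.wrap_pending:
            self.row += 1
            self.col = 0
            self.wrap_pending = False
        self.cells[(self.row, self.col)] = ch
        if self.col == self.width - 1:
            # Last column: stay put and remember that a wrap is owed.
            self.wrap_pending = True
        else:
            self.col += 1
```

The practical effect is that a program can fill the last column of a line without the cursor immediately moving to the next line, which is what prompt-drawing code tends to depend on.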

There is a long history of terminal emulators getting ISO 8613-6/ITU T.416 SGR 38 and SGR 48 wrong. Terminal emulators have variously forgotten the colour space selector in sub-parameter 2 of SGR 38:2 and SGR 48:2. A long-standing misunderstanding/misreading on the parts of many people has led, moreover, to most programs (from terminfo to the Linux kernel terminal emulator) sending and expecting semi-colons instead of colons, leading to ambiguous SGR sequences: CSI 1;4;38;5;14;48;2;0;224;3;7m. For compatibility, the terminal emulator understands these non-standard and ambiguous control sequences, converting (all remaining) parameters into sub-parameters to resolve ambiguity; but the standard control sequences are preferred. Please fix this 25-year-old bug if your application has it.
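
For application authors, a minimal Python sketch of emitting the standard colon-separated forms is given below. It assumes that the colour space sub-parameter may be left empty to take its default, and the function names sgr_rgb and sgr_indexed are hypothetical.

```python
CSI = "\x1b["  # 7-bit form of the CSI control character


def sgr_rgb(r: int, g: int, b: int, background: bool = False) -> str:
    """Direct colour: SGR 38:2 or 48:2 with an empty colour space selector."""
    selector = 48 if background else 38
    return f"{CSI}{selector}:2::{r}:{g}:{b}m"


def sgr_indexed(index: int, background: bool = False) -> str:
    """Indexed colour: SGR 38:5 or 48:5."""
    selector = 48 if background else 38
    return f"{CSI}{selector}:5:{index}m"
```

Because each colour is a single parameter with sub-parameters, it can be combined with other attributes using semi-colons without ambiguity, for example bold and underline followed by an indexed foreground colour: CSI 1;4;38:5:14m.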

There is a degree of variance amongst kernel virtual terminals that can cause problems if the TERM environment variable (and hence termcap/terminfo terminal type) used by applications connected to the front end of the terminal does not match the emulation being employed by the terminal emulator. As explained in TERM(5), it is an error to use a value for this variable that does not match the terminal emulator's emulation. (This is an error called out in the XTerm FAQ.)

Note

None of the emulator modes in this terminal emulator, and none of the kernel virtual terminals being emulated, are the "xterm" terminal type.

DEC VT

This mode is selected by the --decvt command-line option. It matches (a subset of) a DEC VT in its native VT mode.

The termcap/terminfo records being matched are the "vt520" terminal type family. Variances:

  • A DEC VT sets no tabstops in response to the RIS and DECSTR control sequences. The Linux and FreeBSD kernel terminal emulators both do, however; so for compatibility so too does this terminal emulator. (As explained earlier, do not use RIS.)

  • A DEC VT does not produce distinct application mode control sequences for the asterisk, minus, and slash graphic keys on the calculator keypad. This terminal emulator does, giving them SS3-shifted j, m, and o in application calculator keypad mode.
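
The keypad variance above can be pictured with a short Python sketch. It is illustrative only: it uses the 7-bit form of SS3 (ESC O), covers just the three keys named in the text, and the names APPLICATION_KEYPAD and keypad_output are hypothetical.

```python
SS3 = "\x1bO"  # 7-bit form of the SS3 control character

# Final characters given in the text for application keypad mode.
APPLICATION_KEYPAD = {"*": "j", "-": "m", "/": "o"}


def keypad_output(key: str, application_mode: bool) -> str:
    """Return what a calculator keypad graphic key sends."""
    if application_mode and key in APPLICATION_KEYPAD:
        return SS3 + APPLICATION_KEYPAD[key]
    return key  # normal mode: the plain graphic character
```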

SCO

This mode is selected by the --sco command-line option. With variances as laid out here, it matches the SCO kernel virtual terminal, the FreeBSD KVT with the Teken library in CONS25 mode, the old WSCONS FreeBSD KVT from version 8 and earlier, and (a subset of) a DEC VT in its SCO Console mode.

For F1 to F12 the level 2 shift and control modifiers are encoded into the function key number, and the SCO KVT's original SCO FNK control sequences (which have no parameters) are emitted. For F13 to F48 the SCO FNK control sequences are extended to have parameters. The function key numbers are used as-is, and the modifiers are instead encoded into the second sub-parameter of the control sequence, in similar form to DECFNK (except zero-based). The first sub-parameter is a (usually 1) repeat count, again similar to DECFNK.

The termcap/terminfo records being matched are the "cons" terminal type family (including its height variants). Variances:

  • The SCO KVT does not implement application/normal modes for the calculator or cursor keypads. This terminal emulator does.

  • The SCO KVT does not make a distinction between the calculator keypad keys (with numeric lock off) and the matching editing/cursor keypad keys. This terminal emulator follows suit in its SCO mode, even though the input protocol actually does.

  • The SCO KVT does not ever produce control sequences for the graphic keys on the calculator keypad (i.e. plus, minus, enter, slash, and asterisk). It always produces the graphic characters. This terminal emulator produces SS3-shifted characters when that keypad is in application mode.

  • The SCO KVT always yields the DEL character for the Del keys on the editing and calculator keypads. In addition to providing application mode for the calculator keypad, this terminal emulator provides in normal keypad mode the switchable Del behaviour of a DEC VT (via DEC Private Mode #1037).

  • The SCO KVT always yields the BS character for the Backspace key on the main keypad. This terminal emulator provides the switchable Backspace behaviour of a DEC VT (via DEC Private Mode #67).

  • The SCO KVT does not produce control sequences when calculator keypad keys (in normal mode) or cursor keypad keys are used with the Alt modifier. This terminal emulator adds modifier parameters to the CSI control sequences.

  • The SCO KVT does not support equals or (ABNT2) comma on the calculator keypad. This terminal emulator does, giving them SS3-shifted X and l in application calculator keypad mode.

  • The SCO KVT does not support actual F13 and upwards function keys. A DEC VT in its SCO Console mode falls back to its native VT mode, using DECFNK, for F13 to F20. This terminal emulator instead falls back to an extended SCO FNK mechanism, as described earlier.

FreeBSD (Teken)

This mode is selected by the --freebsd command-line option, and is the default when the terminal emulator is compiled for FreeBSD. With variances as laid out here, it matches the FreeBSD 9 and later kernel virtual terminal, that uses the Teken library, in its native Teken mode.

The termcap/terminfo records being matched are the "teken" terminal type family (including its colouring variants). This is now the proper terminal type for the FreeBSD kernel virtual terminal. Importantly, it is not the "xterm" or "cons25" terminal type families; and it is an error to think that Teken is XTerm or even a subset of it.

Variances:

  • The FreeBSD KVT does not report modifiers for keys on the editing, cursor, or calculator keypads. This terminal emulator does.

  • The FreeBSD KVT does not implement application/normal modes for the calculator keypad. This terminal emulator does.

  • Except for Del, the FreeBSD KVT does not make a distinction between the calculator keypad keys (with numeric lock off) and the matching editing/cursor keypad keys. This terminal emulator follows suit in its Teken mode, even though the input protocol actually does.

  • The FreeBSD KVT does not ever produce control sequences for the graphic keys on the calculator keypad (i.e. plus, minus, enter, slash, and asterisk). It always produces the graphic characters. This terminal emulator produces SS3-shifted characters when that keypad is in application mode.

  • The FreeBSD KVT always yields the DEL character for the Del key on the calculator keypad, and always yields DECFNK for the Del key on the editing keypad. In addition to providing application mode for the calculator keypad, this terminal emulator provides in normal keypad mode the switchable Del behaviour of a DEC VT (via DEC Private Mode #1037).

  • The FreeBSD KVT always yields the BS character for the Backspace key on the main keypad. This terminal emulator provides the switchable Backspace behaviour of a DEC VT (via DEC Private Mode #67).

  • The FreeBSD KVT does not produce control sequences when calculator keypad keys (in normal mode) or cursor keypad keys are used with the Alt modifier. This terminal emulator adds modifier parameters to the CSI control sequences.

  • The FreeBSD KVT does not support equals or (ABNT2) comma on the calculator keypad. This terminal emulator does, giving them SS3 sequences with final characters X and l in application calculator keypad mode.

NetBSD ("vt100")

This mode is selected by the --netbsd command-line option, and is the default when the terminal emulator is compiled for NetBSD. With variances as laid out here, it matches the NetBSD kernel virtual terminal in its "vt100" mode.

Note

It is a misnomer to name this "vt100", as NetBSD does. A true DEC VT100 series does not match this emulation. It did not do ECMA-48 colour, for example.

The termcap/terminfo records being matched are the "wsvt" terminal type family (including its height variants). Variances:

  • The NetBSD KVT does not report modifiers for keys on the editing, cursor, or calculator keypads. This terminal emulator does.

  • The NetBSD KVT does not implement application/normal modes for the cursor and calculator keypads. This terminal emulator does.

  • The NetBSD KVT does not make a distinction between the calculator keypad keys (with numeric lock off) and the matching editing/cursor keypad keys. This terminal emulator follows suit in its NetBSD mode, even though the input protocol actually does.

  • The NetBSD KVT does not ever produce control sequences for the graphic keys on the calculator keypad (i.e. plus, minus, enter, slash, and asterisk). It always produces the graphic characters. This terminal emulator produces SS3-shifted characters when that keypad is in application mode.

  • The NetBSD KVT always yields the DEL character for the Del key on the calculator keypad, and always yields DECFNK for the Del key on the editing keypad. In addition to providing application mode for the calculator keypad, this terminal emulator provides in normal keypad mode the switchable Del behaviour of a DEC VT (via DEC Private Mode #1037).

  • The NetBSD KVT always yields the BS character for the Backspace key on the main keypad. This terminal emulator provides the switchable Backspace behaviour of a DEC VT (via DEC Private Mode #67).

  • The NetBSD KVT does not produce control sequences when calculator keypad keys (in normal mode) or cursor keypad keys are used with the Alt modifier. This terminal emulator adds modifier parameters to the CSI control sequences.

  • The NetBSD KVT does not support equals or (ABNT2) comma on the calculator keypad. This terminal emulator does, giving them SS3 sequences with final characters X and l in application calculator keypad mode. The latter is defined by DEC VTs.

Linux

This mode is selected by the --linux command-line option, and is the default when the terminal emulator is compiled for Linux.

The termcap/terminfo records being matched are the "linux" terminal type family (including its colouring variants). Variances:

  • The Linux kernel virtual terminal sends the DECFNK sequences for FIND and SELECT for the HOME and END keys, rather than the (different) proper control sequences for those keys.

    This has two major consequences: If the terminal emulator is in Linux emulation mode, HOME and END will not be correctly recognized if the TERM environment variable does not also specify "linux"; and if the terminal emulator is in another mode but the TERM environment variable is set to "linux", FIND and SELECT will be incorrectly recognized and HOME and END will not be recognized at all.

  • The Linux KVT does not report modifiers for keys on the editing, cursor, or calculator keypads. This terminal emulator does.

  • The Linux KVT does not make a distinction between the calculator keypad keys (with numeric lock off) and the matching editing/cursor keypad keys. This terminal emulator follows suit in its Linux mode, even though the input protocol actually does.

  • The Linux KVT does not ever produce control sequences for the graphic keys on the calculator keypad (i.e. plus, minus, enter, slash, and asterisk). It always produces the graphic characters. This terminal emulator produces SS3-shifted characters when that keypad is in application mode.

  • The Linux KVT does not produce control sequences when calculator keypad keys (in normal mode) or cursor keypad keys are used with the Alt modifier. This terminal emulator adds modifier parameters to the CSI control sequences.

  • The Linux KVT does not support equals or (ABNT2) comma on the calculator keypad. This terminal emulator does, giving them SS3 sequences with final characters X and l in application calculator keypad mode. The latter is defined by DEC VTs.
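
The HOME and END variance described in the first item of this list can be made concrete with a small Python sketch of how an application's key decoding depends on TERM. It assumes the conventional DECFNK numbers 1 and 4 for FIND and SELECT; the table and function names are hypothetical, and a real application would take these mappings from termcap/terminfo rather than hard-wire them.

```python
# The same two sequences, as interpreted by a DEC-style terminal type
# and by the "linux" terminal type.
DEC_KEYS = {"\x1b[1~": "FIND", "\x1b[4~": "SELECT"}
LINUX_KEYS = {"\x1b[1~": "HOME", "\x1b[4~": "END"}


def decode(sequence: str, term: str) -> str:
    """Name the key that an application would infer from a sequence."""
    table = LINUX_KEYS if term == "linux" else DEC_KEYS
    return table.get(sequence, "unknown")
```

If the emulator is in Linux mode but TERM names a DEC-style type, pressing HOME is decoded as FIND; this is why the emulation mode and TERM must agree.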

See also

pty-run(1)

an I/O pump that pumps data in both directions between the back end of a pseudo-terminal and its own standard I/O

console-control-sequence(1)

a utility for emitting a range of useful control sequences

console-multiplexor(1), console-input-method(1)

mechanisms that layer on top of a terminal emulator

Author

Jonathan de Boyne Pollard