Standard
Rules for Editing This Page
- Only registered users can edit. You can register here.
- If you have a change you want to perform, simply perform it.
- If you suspect the change requires explanation, place it in the discussion page.
- This page is for the actual standard. Discussions about the standard, open questions, etc., go in the discussion page.
Introduction
This document outlines guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It assumes that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. Readers are assumed to have a working knowledge of the UBA, which is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).
When designing these guidelines, the following objectives were set, in order of decreasing priority:
- Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).
- Make the interface efficient.
- Keep the interface easy to implement.
Terms and Definitions
Acronym | Meaning |
---|---|
Bidi | bidirectional |
LTR | left-to-right |
RTL | right-to-left |
UBA | Unicode Bidi Algorithm |
UI | User Interface |
Bidi Embedding Levels | The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation. Level 0 corresponds to base LTR text. Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text. Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text. And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text. |
Caret (aka Text Cursor) | Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar. The text cursor should not be confused with the mouse cursor. Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor. |
Cursor Level | For the needs of the UI, a Bidi level is also assigned to the caret. This level reflects the Bidi level that is expected to be assigned to the next character entered (in some cases the actual level of the entered character will be different). The level of the caret is manipulated by UI functions, such as changing the keyboard language. It may also be affected by any function that changes the position of the caret. |
Keyboard Language | Language of the next character that will be entered from the keyboard. |
Logical Buffer | Buffer containing the text data in logical sequence (as opposed to visual sequence). |
Paragraph Embedding Level | Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL. Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph. |
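
The parity rule for embedding levels can be illustrated with a minimal Python sketch. The function names `direction_of_level` and `base_direction` are hypothetical and are not defined by this standard or by the UBA; they only restate the definitions in the table above.

```python
def direction_of_level(level: int) -> str:
    """Return "LTR" for even embedding levels and "RTL" for odd ones."""
    if level < 0:
        raise ValueError("embedding levels are non-negative")
    return "LTR" if level % 2 == 0 else "RTL"


def base_direction(paragraph_level: int) -> str:
    """Map a paragraph embedding level (0 or 1) to its base direction."""
    if paragraph_level not in (0, 1):
        raise ValueError("paragraph embedding level must be 0 or 1")
    return direction_of_level(paragraph_level)


if __name__ == "__main__":
    for level in range(4):
        print(level, direction_of_level(level))
```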
Caret
The caret's role is to show the user where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs indicates insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "line caret". The other, typically indicating overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as a "block caret".
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "caret location". Although the caret location logically has zero width, the caret itself needs some actual width in order to be displayed. See open issues for discussion.
Logical vs. Visual Operations
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.
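
The ambiguity can be made concrete with a small Python sketch. It rests on several simplifying assumptions that are not part of this standard: embedding levels are supplied by the caller rather than computed by the full UBA, every character occupies exactly one glyph column, and the text fits on one line. The names `visual_order` and `visual_candidates` are hypothetical. Reordering follows the run-reversal rule of the UBA (rule L2).

```python
def visual_order(levels):
    """Return logical indices in left-to-right visual order (UBA rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return order  # pure LTR text (or empty): nothing to reverse
    # From the highest level down to the lowest odd level, reverse every
    # contiguous run of characters at that level or higher.
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def visual_candidates(levels, logical_pos):
    """Return the visual edge offsets (0..n) a logical position may map to."""
    order = visual_order(levels)
    column = {logical: visual for visual, logical in enumerate(order)}
    candidates = set()
    if logical_pos > 0:
        prev = logical_pos - 1
        # "After the previous character": right edge if LTR, left edge if RTL.
        candidates.add(column[prev] + (1 if levels[prev] % 2 == 0 else 0))
    if logical_pos < len(levels):
        nxt = logical_pos
        # "Before the next character": left edge if LTR, right edge if RTL.
        candidates.add(column[nxt] + (0 if levels[nxt] % 2 == 0 else 1))
    return sorted(candidates)


if __name__ == "__main__":
    levels = [0, 0, 0, 1, 1, 1]  # three LTR characters, then three RTL ones
    print(visual_candidates(levels, 3))  # [3, 6]: two visual positions
    print(visual_candidates(levels, 6))  # [3]: shared with logical position 3
```

In this example, logical position 3 maps to visual offsets 3 and 6, while visual offset 3 is reachable from both logical positions 3 and 6, which shows the ambiguity in both directions.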
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is derived from a variety of considerations. We call such a caret a "logical caret".
This document also defines a new type of caret, called a "visual caret". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "logical operation", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "visual operation", and the caret after it is a visual caret.
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret after pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret after pressing "LEFT" is a visual caret.
Caret Movement
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the text buffer.
Positioning Using a Pointing Device
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location that is nearest to the point indicated by the pointing device. Implementations should not care whether the rounding is to the left or to the right.
This is a visual operation.
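
The rounding rule can be sketched in Python as follows. The sketch assumes that the implementation already knows the pixel x-offset of every caret location on the clicked line, given as a sorted list; the name `nearest_caret_location` and this representation are hypothetical. Ties are broken toward the left here, which is an arbitrary choice, since the rule above does not prefer either side.

```python
from bisect import bisect_left


def nearest_caret_location(boundaries, x):
    """Return the index of the caret location whose x-offset is nearest to x.

    boundaries: sorted pixel x-offsets of the caret locations on one line.
    x: pixel x-coordinate reported by the pointing device.
    """
    if not boundaries:
        raise ValueError("a line has at least one caret location")
    i = bisect_left(boundaries, x)
    if i == 0:
        return 0  # click at or left of the first caret location
    if i == len(boundaries):
        return len(boundaries) - 1  # click right of the last caret location
    left, right = boundaries[i - 1], boundaries[i]
    # Pick the closer neighbour; an exact tie goes left (arbitrary).
    return i - 1 if x - left <= right - x else i


if __name__ == "__main__":
    offsets = [0, 10, 20, 35]  # four caret locations around three glyphs
    print(nearest_caret_location(offsets, 6))
```

A click at x = 6 with the offsets above selects the caret location at index 1 (offset 10), because 6 is closer to 10 than to 0.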
Left and Right Arrow Keys
Up and Down Arrow Keys
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.
Implementations may use the pixel location of the caret in the current line as the point by which the new caret location is calculated.
This is a visual operation.
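
As an illustration, vertical movement that reuses the caret's current pixel location might look like the Python sketch below. It assumes each line is described by a sorted list of caret-location x-offsets; the name `move_vertically` is hypothetical. Keeping the caret in place when there is no line above or below is also an assumption, since this section does not specify that case.

```python
def move_vertically(lines, line_index, caret_index, delta):
    """Move the caret delta lines (-1 for up, +1 for down).

    lines: list of lines, each a sorted list of caret-location x-offsets.
    Returns (new_line_index, new_caret_index).
    """
    target = line_index + delta
    if not 0 <= target < len(lines):
        # Assumption: at the first or last line the caret stays in place.
        return line_index, caret_index
    # Use the caret's current pixel location as the reference point.
    x = lines[line_index][caret_index]
    new_offsets = lines[target]
    # Nearest caret location on the target line; ties resolve to the left.
    new_index = min(range(len(new_offsets)),
                    key=lambda i: abs(new_offsets[i] - x))
    return target, new_index


if __name__ == "__main__":
    lines = [[0, 10, 20, 30], [0, 8, 16, 24, 32]]
    print(move_vertically(lines, 0, 2, +1))
```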
Page Up and Down Keys
Home and End Keys
Index of Operations
This section lists all of the operations defined in the document above. Each entry gives the section in which the operation is defined, as well as whether it is a logical or visual operation (i.e., whether the caret after performing it is a logical or a visual caret).
L/V | Operation | Section | Description |
---|---|---|---|
L | Filler | name | Just a filler operation, until we have an actual operation to place here |