https://bidiedit.lingnu.com/api.php?action=feedcontributions&user=Shachar&feedformat=atomBidiEdit - User contributions [en]2024-03-28T09:15:17ZUser contributionsMediaWiki 1.31.1https://bidiedit.lingnu.com/index.php?title=Standard&diff=70Standard2016-06-04T17:03:51Z<p>Shachar: logical to visual conversion</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document sets out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories of programs for editing bidirectional text. One is the text editor, the other is the word processor. The main difference between the two, for our purposes, is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in a more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or in any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, they MUST carry out the standard so that the end result is as if they did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the caret is manipulated by UI functions, like changing the keyboard language. It may also be affected by any function which changes the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display does need some actual width in order to draw the caret. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer. A visual caret shall, sometimes, also have a logical position.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret. Certain operations are visual operations, which result in a visual caret, but also set the logical position.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
logical placement. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left", "right", "up" and "down" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
==Ambiguous Caret==<br />
An ambiguous caret is a caret, either logical or visual, that is not located on an unambiguous position. The practical upshot of this is that the implementation needs to choose where on screen to draw a logical caret, or where in the buffer to perform the operation for a visual caret. This section deals with how to determine the candidate positions, and explains how to resolve the ambiguity.<br />
<br />
An ambiguous caret will happen if and only if the character before/to the left of the caret has a different BiDi level than the character after/to the right of the caret. If the caret is before the first or after the last character of the paragraph, the missing character's BiDi level is ''sor'' or ''eor'' respectively, where sor and eor are defined by the UBA. For all practical purposes, sor and eor are at either level 0 or 1, depending on the paragraph base direction (0 for left to right, 1 for right to left). The difference between the BiDi levels on both sides of the caret is called the '''level differential'''.<br />
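<br />
The test above can be sketched in a few lines of Python. This is only an illustration, not part of the standard: the function names are hypothetical, the levels are assumed to have been resolved by a UBA implementation already, the paragraph embedding level stands in for ''sor'' and ''eor'' as described above, and the differential is taken as an absolute value, which is an assumption.<br />

```python
def level_differential(levels, caret, paragraph_level):
    """Return the difference between the Bidi levels on both sides of a caret.

    levels          -- resolved embedding level of each character, in logical order
    caret           -- logical caret position in 0..len(levels); the caret sits
                       between characters caret-1 and caret
    paragraph_level -- 0 for an LTR paragraph, 1 for an RTL paragraph; used in
                       place of sor/eor at the paragraph edges (a simplification)
    """
    if not 0 <= caret <= len(levels):
        raise ValueError("caret is outside the paragraph")
    before = levels[caret - 1] if caret > 0 else paragraph_level
    after = levels[caret] if caret < len(levels) else paragraph_level
    # Assumption: the differential is reported as an absolute value.
    return abs(before - after)


def is_ambiguous(levels, caret, paragraph_level):
    """A caret is ambiguous exactly when the level differential is non-zero."""
    return level_differential(levels, caret, paragraph_level) != 0


# "english ARABIC more english." in an LTR paragraph.
levels = [0] * 8 + [1] * 6 + [0] * 14
print(is_ambiguous(levels, 8, 0))  # True: between "english " and "ARABIC"
print(is_ambiguous(levels, 3, 0))  # False: inside "english"
```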
<br />
===Ambiguous Logical Caret===<br />
An ambiguous logical caret will have two visual positions. One position is right after the character before the caret, and the other is right before the character after the caret. In this sentence, "after" and "before" do not bear their usual meaning, as visual positions are involved. Instead, "after a character" means to the right of the character if that character's BiDi level is LTR (even), or to the left of the character if that character's BiDi level is RTL (odd).<br />
<br />
Examples:<br />
<pre>english |ARABIC more english.<br />
00000000 11111100000000000000</pre><br />
<br />
Where the caret is represented by the vertical bar. The caret might, visually, be in the following positions:<br />
<pre>english |CIBARA| more english.</pre><br />
<br />
Another example. In an RTL paragraph, the following logical text:<br />
<pre>|english HEBREW more english.<br />
 2222222111111112222222222221</pre><br />
<br />
The caret may be in the following positions:<br />
<br />
<pre style="text-align: right;">.more english WERBEH |english|</pre><br />
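<br />
The two candidate positions can be computed mechanically once the levels are known. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the levels are supplied by hand rather than by a UBA implementation, the whole paragraph is treated as a single line, and a visual position is expressed as a boundary index (0 is the left edge of the line, n is the right edge). At the paragraph edges the missing neighbour is replaced by the edge where the paragraph starts or ends.<br />

```python
def visual_order(levels):
    """Return logical indices in left-to-right visual order (UBA rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return order  # pure LTR text is displayed in logical order
    # From the highest level down to the lowest odd level, reverse every
    # contiguous run of characters at that level or higher.
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def caret_visual_candidates(levels, caret, paragraph_level):
    """Return (after_previous, before_next) as visual boundary indices."""
    n = len(levels)
    column = {logical: visual for visual, logical in enumerate(visual_order(levels))}
    rtl_paragraph = paragraph_level % 2 == 1
    if caret > 0:
        prev = caret - 1
        # "After" an LTR character is its right edge, "after" an RTL one its left edge.
        after_previous = column[prev] + (0 if levels[prev] % 2 else 1)
    else:
        after_previous = n if rtl_paragraph else 0  # edge where the paragraph starts
    if caret < n:
        # "Before" an LTR character is its left edge, "before" an RTL one its right edge.
        before_next = column[caret] + (1 if levels[caret] % 2 else 0)
    else:
        before_next = 0 if rtl_paragraph else n  # edge where the paragraph ends
    return after_previous, before_next


# First example: "english |ARABIC more english." in an LTR paragraph.
print(caret_visual_candidates([0] * 8 + [1] * 6 + [0] * 14, 8, 0))          # (8, 14)
# Second example: "|english HEBREW more english." in an RTL paragraph.
print(caret_visual_candidates([2] * 7 + [1] * 8 + [2] * 12 + [1], 0, 1))    # (28, 21)
```

When both returned values are equal, the position is unambiguous.<br />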
<br />
====Choosing Correct Visual Position====<br />
The visual location that a specific logical position translates to MUST be one of the locations determined above. Selecting which one should be done according to the following criteria, in descending order of precedence:<br />
# If the last operation changed the buffer, we choose the visual position such that, if the next operation will be the same operation, the caret will be displayed where that operation takes place.<br />
#* If the last operation was to type something, position the caret where the next character will appear assuming it is of the same BiDi class as the previous one.<br />
#* If the last operation was to delete a character with backspace, position the caret so that it is next to the character to be deleted next if a backspace is hit again.<br />
# If the last operation did a logical movement with no change to the buffer (HOME, END etc.), position the caret next to the character the caret skipped over.<br />
#* If the move was to an earlier (logical) position, position the caret next to the character immediately after the caret.<br />
#* If the move was to a later position, position the caret next to the character immediately before the caret.<br />
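<br />
One possible reading of the criteria above, sketched in Python. The enum and function names are hypothetical, and the two candidate positions are assumed to have been computed already (one right after the character before the caret, one right before the character after it). Under this reading, typing, backspace and a move to a later position all keep the caret attached to the character before it, while a move to an earlier position attaches it to the character after it.<br />

```python
from enum import Enum, auto


class LastOperation(Enum):
    TYPED = auto()          # a character was inserted before the caret
    BACKSPACE = auto()      # the character before the caret was deleted
    MOVED_EARLIER = auto()  # logical move towards the start (e.g. HOME)
    MOVED_LATER = auto()    # logical move towards the end (e.g. END)


def choose_visual_position(after_previous, before_next, last_operation):
    """Pick one of the two candidate visual positions of an ambiguous logical caret."""
    if last_operation is LastOperation.MOVED_EARLIER:
        # The caret skipped over the character that now follows it.
        return before_next
    # TYPED: the next character of the same class appears next to the one just typed.
    # BACKSPACE: the next character to be deleted is the one before the caret.
    # MOVED_LATER: the caret skipped over the character that now precedes it.
    return after_previous


# Candidates for "english |ARABIC more english." are boundaries 8 and 14.
print(choose_visual_position(8, 14, LastOperation.TYPED))          # 8
print(choose_visual_position(8, 14, LastOperation.MOVED_EARLIER))  # 14
```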
<br />
'''''See [[Talk:Standard#Logical_Caret_Location|open issues]] for discussion'''''<br />
<br />
===Ambiguous Visual Caret===<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has a pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point where the pointing device asked to locate the caret. Implementations <br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
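<br />
A minimal Python sketch of the rounding described above, assuming the implementation already knows the x coordinate of every caret location on the clicked line. The function name and the coordinate list are hypothetical. Ties are resolved in favour of whichever location comes first in the list, which is acceptable because the direction of rounding does not matter.<br />

```python
def nearest_caret_location(boundaries_x, click_x):
    """Return the index of the caret location closest to the clicked x coordinate.

    boundaries_x -- x coordinate of each caret location on the line, left to right
    click_x      -- x coordinate reported by the pointing device
    """
    if not boundaries_x:
        raise ValueError("the line has no caret locations")
    return min(range(len(boundaries_x)), key=lambda i: abs(boundaries_x[i] - click_x))


# Four caret locations, 10 pixels apart.
print(nearest_caret_location([0, 10, 20, 30], 14))  # 1
print(nearest_caret_location([0, 10, 20, 30], 16))  # 2
```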
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When the left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line below the current line.<br />
<br />
The situation for the right arrow on the rightmost character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
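<br />
The four line-boundary cases above reduce to a small rule, sketched here in Python. The function name and the returned tuple are hypothetical; the sketch only says which neighbouring line is targeted and on which side the caret lands, not how lines are laid out.<br />

```python
def wrap_target(key, paragraph_is_rtl):
    """Return (line_delta, edge) for an arrow key pressed at the end of a line.

    key              -- "left" or "right"
    paragraph_is_rtl -- True for an RTL paragraph, False for an LTR paragraph
    line_delta       -- -1 for the line above, +1 for the line below
    edge             -- side of the target line on which the caret lands
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # The right arrow follows the reading direction in LTR paragraphs,
    # the left arrow follows it in RTL paragraphs.
    moves_with_reading_direction = (key == "right") != paragraph_is_rtl
    line_delta = 1 if moves_with_reading_direction else -1
    # The caret re-enters from the side opposite to the one it left through.
    edge = "right" if key == "left" else "left"
    return line_delta, edge


print(wrap_target("left", paragraph_is_rtl=False))  # (-1, 'right')
print(wrap_target("left", paragraph_is_rtl=True))   # (1, 'right')
```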
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
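<br />
The command-arrow mapping described above can be written as a lookup. This Python sketch uses a hypothetical function name and string labels; it only restates the mapping in the text and makes no claim about how any particular platform exposes these keys.<br />

```python
def command_arrow_to_logical_key(arrow, paragraph_is_rtl):
    """Map command-left/right to the logical "Home"/"End" behaviour.

    arrow            -- "left" or "right"
    paragraph_is_rtl -- True for an RTL paragraph, False for an LTR paragraph
    """
    if arrow == "left":
        return "End" if paragraph_is_rtl else "Home"
    if arrow == "right":
        return "Home" if paragraph_is_rtl else "End"
    raise ValueError("arrow must be 'left' or 'right'")


print(command_arrow_to_logical_key("left", paragraph_is_rtl=False))  # Home
print(command_arrow_to_logical_key("left", paragraph_is_rtl=True))   # End
```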
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real-world text is almost guaranteed to exceed any assumed limit. If an upper bound must be used, then the following formula should be used:<br />
* The first and last lines of the selection might each contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus the automatic increase applied by the "I" rules defined in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice that number of highlight blocks less one (i.e. - a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 63 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
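<br />
The way a logically continuous selection breaks into several highlight blocks can be reproduced with a short Python sketch. It is an illustration only: the function names are hypothetical, the levels are written by hand instead of being resolved by a UBA implementation, and the sample is a shortened stand-in for the text above, with one letter per word ("E" for English at level 0, "H" for Hebrew at level 1, and the digits at level 2).<br />

```python
def visual_order(levels):
    """Return logical indices in left-to-right visual order (UBA rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return order
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def highlight_blocks(levels, start, end):
    """Return the visual column ranges covered by the logical selection [start, end)."""
    order = visual_order(levels)
    blocks = []
    run_start = None
    for column, logical in enumerate(order):
        selected = start <= logical < end
        if selected and run_start is None:
            run_start = column                   # a new highlight block begins
        elif not selected and run_start is not None:
            blocks.append((run_start, column))   # the current block ends
            run_start = None
    if run_start is not None:
        blocks.append((run_start, len(order)))
    return blocks


# Logical text "E H 1234 H E H 1234 H E" without the spaces.
levels = [0, 1, 2, 2, 2, 2, 1, 0, 1, 2, 2, 2, 2, 1, 0]
# Select from between "12" and "34" in the first number to the same point in the second.
blocks = highlight_blocks(levels, 4, 11)
print(len(blocks))  # 5
print(blocks)       # [(1, 2), (4, 6), (7, 8), (9, 11), (13, 14)]
```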
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. - press the mouse button on the start point, drag the mouse to the end point, and release the button) or by a special movement modifier (such as shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual, and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret following performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Talk:Standard&diff=69Talk:Standard2016-06-04T16:53:59Z<p>Shachar: /* Open Issues */</p>
<hr />
<div><br />
== Open Issues ==<br />
<br />
<br />
=== Visual vs. Logical Selection ===<br />
Logical selection is the norm today. This can get very confusing very quickly. Some think that a visual selection will work better. Pro reasons:<br />
<br />
* Much less confusing.<br />
* Very very predictable.<br />
* With logical selection, there are occasions where the visual cues push you to move the mouse in the wrong direction to complete the operation you want to complete. This is true even if the end selection is unambiguous.<br />
<br />
Con reasons:<br />
<br />
* Not useful. In any case where visual and logical selections produce different results, it is the logical selection people want.<br />
* Some features are unimplementable with visual selection<br />
<br />
=== Moving characters in the logical buffer ===<br />
Some rules in the standard call for changing the logical sequence of characters "unrelated" to the current operation. The details of the move need to be worked out.<br />
<br />
We also need to decide whether such a movement can be done in a way that will not be considered, in and of itself, destructive.<br />
<br />
=== Visual display of logical caret ===<br />
We need to work out the correct algorithm to place (for display purposes) a logical caret on screen.</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Talk:Standard&diff=68Talk:Standard2016-06-04T16:52:22Z<p>Shachar: Open issues</p>
<hr />
<div><br />
== Open Issues ==<br />
<br />
<br />
=== Visual vs. Logical Selection ===<br />
Logical selection is the norm today. This can get very confusing very quickly. Some think that a visual selection will work better. Pro reasons:<br />
<br />
* Much less confusing.<br />
* Very very predictable.<br />
* With logical selection, there are occasions where the visual cues push you to move the mouse in the wrong direction to complete the operation you want to complete. This is true even if the end selection is unambiguous.<br />
<br />
Con reasons:<br />
<br />
* Not useful. In any case where visual and logical selections produce different results, it is the logical selection people want.<br />
* Some features are unimplementable with visual selection<br />
<br />
=== Moving characters in the logical buffer ===<br />
Some rules in the standard call for changing the logical sequence of characters "unrelated" to the current operation. The details of the move need to be worked out.<br />
<br />
We also need to decide whether such a movement can be done in a way that will not be considered, in and of itself, destructive.</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=67Standard2016-06-04T16:42:50Z<p>Shachar: Introduce logical positions of visual carets</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document sets out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories of programs for editing bidirectional text. One is the text editor, the other is the word processor. The main difference between the two, for our purposes, is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in a more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or in any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, they MUST carry out the standard so that the end result is as if they did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display does need some actual width in order to draw the caret. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer. A visual caret sometimes also has a logical position.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret. Certain operations are visual operations, which result in a visual caret, but also set the logical position.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
logical placement. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left", "right", "up" and "down" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
==Ambiguous Caret==<br />
An ambiguous caret is a caret, either logical or visual, that is not located on an unambiguous position. The practical upshot of this is that the implementation needs to choose where on screen to draw the logical caret, or where in the buffer to perform the operation for a visual caret. This section deals with how to determine the different points, and explains how to resolve the ambiguity.<br />
<br />
An ambiguous caret will happen if and only if the character before/to the left of the caret has a different BiDi level than the character after/to the right of the caret. If the caret is before the first or after the last character of the paragraph, the missing character's BiDi level is ''sor'' or ''eor'' respectively, where sor and eor are defined by the UBA. For all practical purposes, sor and eor are at either level 0 or 1, depending on the paragraph base direction (0 for left to right, 1 for right to left). The difference between the BiDi levels on both sides of the caret is called the '''level differential'''.<br />
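<br />
The test described above can be sketched in a few lines of Python. This is only an illustration for a logical caret, not part of the standard: the function names are hypothetical, the resolved levels are assumed to have been computed by the UBA already, the paragraph embedding level stands in for ''sor'' and ''eor'', and the level differential is assumed to be the absolute difference of the two levels.<br />
<br />
```python
def neighbour_levels(levels, caret, paragraph_level):
    """Return the Bidi levels on both sides of a logical caret position.

    levels          -- resolved Bidi level of each character, in logical order
    caret           -- logical position, 0 <= caret <= len(levels)
    paragraph_level -- 0 for an LTR paragraph, 1 for an RTL paragraph;
                       used here in place of sor/eor (an assumption)
    """
    before = levels[caret - 1] if caret > 0 else paragraph_level
    after = levels[caret] if caret < len(levels) else paragraph_level
    return before, after


def level_differential(levels, caret, paragraph_level):
    """Absolute difference between the levels on both sides of the caret."""
    before, after = neighbour_levels(levels, caret, paragraph_level)
    return abs(before - after)


def is_ambiguous(levels, caret, paragraph_level):
    """A caret is ambiguous when the levels on its two sides differ."""
    return level_differential(levels, caret, paragraph_level) != 0
```

With the levels of an LTR line made of eight level 0 characters, six level 1 characters and fourteen level 0 characters, the positions on both sides of the level 1 run are ambiguous, and every other position is not.<br />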
<br />
===Ambiguous Logical Caret===<br />
An ambiguous logical caret will have two visual positions. One position is right after the character before the caret, and the other is right before the character after the caret. In this sentence, "after" and "before" do not bear their usual meaning, as visual positions are involved. Instead, "after a character" means to the right of the character if that character's BiDi level is LTR (even), or to the left of the character if that character's BiDi level is RTL (odd).<br />
<br />
Examples:<br />
<pre>english |ARABIC more english.<br />
00000000 11111100000000000000</pre><br />
<br />
Where the caret is represented by the vertical bar. The caret might, visually, be in either of the following positions:<br />
<pre>english |CIBARA| more english.</pre><br />
<br />
Another example. In an RTL paragraph, the following logical text:<br />
<pre>|english HEBREW more english.<br />
2222222111111112222222222221</pre><br />
<br />
The caret may be in the following positions:<br />
<br />
<pre style="text-align: right;">.more english WERBEH |english|</pre><br />
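<br />
The two examples above can be reproduced with the following Python sketch. It is an illustration under stated assumptions, not a normative algorithm: the function names are hypothetical, the resolved levels are given as input, the whole paragraph is assumed to fit on one line, and a visual position is expressed as the number of characters displayed to the left of the caret. The reordering follows rule L2 of the UBA (reverse every run of characters at a given level or higher, from the highest level down to the lowest odd level).<br />
<br />
```python
def visual_order(levels):
    """Return the logical indices in left-to-right display order (UBA rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [level for level in levels if level % 2]
    if not odd_levels:
        return order  # pure LTR text is displayed in logical order
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = order[i:j][::-1]  # reverse this run
                i = j
            else:
                i += 1
    return order


def caret_visual_positions(levels, caret, paragraph_level):
    """Return the sorted visual positions of a logical caret (one or two)."""
    order = visual_order(levels)
    visual_index = {logical: visual for visual, logical in enumerate(order)}
    n = len(levels)
    positions = set()

    # "Right after the character before the caret": to its right if that
    # character is LTR (even level), to its left if it is RTL (odd level).
    if caret > 0:
        v = visual_index[caret - 1]
        positions.add(v + 1 if levels[caret - 1] % 2 == 0 else v)
    else:  # no previous character: the line starts at the paragraph's edge
        positions.add(0 if paragraph_level % 2 == 0 else n)

    # "Right before the character after the caret": the mirror image.
    if caret < n:
        v = visual_index[caret]
        positions.add(v if levels[caret] % 2 == 0 else v + 1)
    else:  # no next character: the line ends at the opposite edge
        positions.add(n if paragraph_level % 2 == 0 else 0)

    return sorted(positions)
```

For the first example, <code>caret_visual_positions([0]*8 + [1]*6 + [0]*14, 8, 0)</code> gives <code>[8, 14]</code>, the two sides of the displayed Arabic word. For the second, <code>caret_visual_positions([2]*7 + [1]*8 + [2]*12 + [1], 0, 1)</code> gives <code>[21, 28]</code>, the two sides of the displayed word "english". An unambiguous position yields a single value.<br />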
<br />
====Choosing Correct Visual Position====<br />
The visual location a specific logical position translates to MUST be one of the locations determined above. Selecting which one should be done according to the following criteria in descending order of precedence:<br />
'''''See [[Talk:Standard#Logical_Caret_Location|open issues]] for discussion'''''<br />
<br />
===Ambiguous Visual Caret===<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
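<br />
The rounding can be sketched as follows. This is a minimal Python illustration; the function name and the idea of passing the x coordinates of all caret locations on the line as a list are assumptions made for the example, not something this standard prescribes.<br />
<br />
```python
def nearest_caret_location(caret_xs, pointer_x):
    """Return the index of the caret location closest to pointer_x.

    caret_xs  -- x coordinate (in pixels) of every caret location on the
                 line, ordered from left to right
    pointer_x -- x coordinate indicated by the pointing device
    """
    if not caret_xs:
        raise ValueError("the line has no caret locations")
    # On an exact tie min() keeps the first (leftmost) candidate; either
    # choice is acceptable, since rounding direction is not significant.
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - pointer_x))
```

The function works purely on screen coordinates and never consults the logical buffer, which is what makes the result a visual position.<br />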
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When the left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost character of the line above or below, depending on the paragraph direction. In an LTR paragraph, the caret SHOULD move to the rightmost character of the line above. In an RTL paragraph, the caret SHOULD move to the rightmost character of the line below the current line.<br />
<br />
The situation for the right arrow on the rightmost character is the reverse of the above. In an LTR paragraph, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below the current line. In an RTL paragraph, it SHOULD move the caret to the leftmost character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
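<br />
The line boundary behavior recommended in this section can be summarized in a small Python sketch. The function name, the string values and the return convention are hypothetical and chosen only for illustration.<br />
<br />
```python
def wrap_target(key, paragraph_is_rtl):
    """Where the caret goes when an arrow key is pressed at the line's edge.

    key              -- "left" (caret at the leftmost character) or
                        "right" (caret at the rightmost character)
    paragraph_is_rtl -- True for an RTL paragraph, False for an LTR one

    Returns (line_delta, side): line_delta is -1 for the line above and
    +1 for the line below; side is the edge of that line to land on.
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # Left in an LTR paragraph and right in an RTL paragraph both move
    # towards the start of the paragraph, i.e. to the line above.
    towards_start = (key == "left") != paragraph_is_rtl
    line_delta = -1 if towards_start else 1
    # The caret always reappears on the side opposite to the arrow pressed.
    side = "right" if key == "left" else "left"
    return line_delta, side
```

For example, <code>wrap_target("left", False)</code> returns <code>(-1, "right")</code>: in an LTR paragraph the caret moves to the rightmost character of the line above.<br />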
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier with the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except that the mapping from<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
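<br />
The mapping for modifier based platforms can be written as a short Python sketch. The key names and the function name are illustrative assumptions and do not correspond to any platform API.<br />
<br />
```python
def command_arrow_operation(key, paragraph_is_rtl):
    """Translate command-left/command-right into "Home" or "End".

    In an LTR paragraph the line begins on the left, so command-left is
    "Home". In an RTL paragraph the line begins on the right, so the
    mapping is swapped.
    """
    if key == "command-left":
        return "End" if paragraph_is_rtl else "Home"
    if key == "command-right":
        return "Home" if paragraph_is_rtl else "End"
    raise ValueError("key must be 'command-left' or 'command-right'")
```

Once the key is translated, the resulting operation is handled exactly like "Home" or "End" on other platforms, so the caret it leaves is a logical caret.<br />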
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. Implementations that do so MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus automatic increase defined by the "I" rules defined in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 63 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
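<br />
The arithmetic behind the bounds in the list above is small enough to write down as a Python sketch. The function name and parameters are hypothetical, and the default of 63 levels is simply the figure quoted above.<br />
<br />
```python
def max_highlight_blocks(bidi_levels=63, single_line=True):
    """Upper bound on the highlight blocks in one line of a selection.

    bidi_levels -- number of distinct Bidi levels assumed to be in use
    single_line -- True when the selection starts and ends on the same
                   line; False for the first or last line of a longer
                   selection
    """
    if bidi_levels < 1:
        raise ValueError("at least one Bidi level is always in use")
    if single_line:
        # Both ends of the selection fall on this line: twice the number
        # of levels, less one.
        return 2 * bidi_levels - 1
    # Only one end of the selection falls on this line.
    return bidi_levels
```

With the default of 63 levels this gives 125 blocks for a single line selection, and with the minimum of three levels it gives five.<br />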
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
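<br />
The count of five can be checked with the Python sketch below. It is an illustration only: the function names are hypothetical, the Hebrew words are replaced by uppercase placeholders, and the levels are assigned by hand (0 for English, 1 for Hebrew and the spaces around the numbers, 2 for the digits) instead of being computed by a UBA implementation. The reordering follows rule L2 of the UBA.<br />
<br />
```python
def visual_order(levels):
    """Return the logical indices in left-to-right display order (UBA rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [level for level in levels if level % 2]
    if not odd_levels:
        return order
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = order[i:j][::-1]  # reverse this run
                i = j
            else:
                i += 1
    return order


def highlight_blocks(levels, start, end):
    """Count the visually continuous blocks of the selection [start, end)."""
    blocks = 0
    previous_selected = False
    for logical in visual_order(levels):
        selected = start <= logical < end
        if selected and not previous_selected:
            blocks += 1  # a new highlighted block begins here
        previous_selected = selected
    return blocks


# Hand-assigned levels for a placeholder version of the sentence above.
segments = [
    ("English ", 0), ("HEBREW ", 1), ("12", 2), ("34", 2), (" NUMBERS", 1),
    (" more English ", 0), ("MORE HEBREW ", 1), ("12", 2), ("34", 2),
    (" NUMBERS", 1), (" and that's it.", 0),
]
levels = [level for text, level in segments for _ in text]
start = sum(len(text) for text, _ in segments[:3])  # between "12" and "34"
end = sum(len(text) for text, _ in segments[:8])    # between "12" and "34"
block_count = highlight_blocks(levels, start, end)
```

Here <code>block_count</code> is 5, matching the image: a single logically continuous selection is displayed as five separate highlighted blocks.<br />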
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. - press the mouse button on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret it leaves behind is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Main_Page&diff=66Main Page2014-09-27T10:49:22Z<p>Shachar: Update stale external link</p>
<hr />
<div>This site is mainly aimed at a collaborative effort to create an updated version of [https://portal.sii.org.il/heb/standardization/teken/?tid=df282b39-7281-4caa-a324-efb781ee2c93 SI5194]. The existing version can be viewed at http://imagic.weizmann.ac.il/~dov/Hebrew/logicUI24.htm<br />
<br />
A work in progress copy of the new standard can be found [[Standard|here]].<br />
<br />
==Version 0.02 Released==<br />
Version 0.02, dubbed "Still misleading release", is now available for download from our [http://sourceforge.net/projects/bidiedit/files/Version-0.02/ download page] on sourceforge. This version is a bug fix version only, fixing the Windows installer problem described for Version 0.01, as well as a few other odd bugs. It does not, yet, support BiDi, and is just a basic text editor.<br />
<br />
==Version 0.01 Released==<br />
Version 0.01, dubbed "Totally misleading release", <del>is now available for download from our download page on sourceforge</del>. This version does not, yet, support BiDi, and is just a basic text editor. It is mostly intended to make sure the core text editing functionality is complete, and to start receiving bug reports.<br />
<br />
'''WARNING''': The installer for Windows of this version takes control over the .txt file extension, which is, almost certainly, not what you want. Install version 0.02 instead!</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=File:Sf_120x30_white.gif&diff=65File:Sf 120x30 white.gif2012-04-03T19:23:33Z<p>Shachar: Sourceforge logo</p>
<hr />
<div>Sourceforge logo</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=File:Multi-highlight.png&diff=64File:Multi-highlight.png2012-04-03T19:22:24Z<p>Shachar: Image showing five dis-continuous highlight zones over one line of text with no BiDi control characters. The selection begins between the 12 and 34 on the left, and ends between the 12 and 34 on the right.</p>
<hr />
<div>Image showing five dis-continuous highlight zones over one line of text with no BiDi control characters. The selection begins between the 12 and 34 on the left, and ends between the 12 and 34 on the right.</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=63Standard2012-04-03T03:57:07Z<p>Shachar: 51 revisions</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way; only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories of applications for editing bidirectional text: text editors and word processors. The main difference between the two, for our purposes, is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in a more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or in any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, they MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display does need some actual width in order to draw the caret. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
==Ambiguous Caret==<br />
An ambiguous caret is a caret, either logical or visual, that is not located on an unambiguous position. The practical upshot of this is that the implementation needs to choose where on screen to draw the logical caret, or where in the buffer to perform the operation for a visual caret. This section deals with how to determine the different points, and explains how to resolve the ambiguity.<br />
<br />
An ambiguous caret will happen if and only if the character before/to the left of the caret has a different BiDi level than the character after/to the right of the caret. If the caret is before the first or after the last character of the paragraph, the missing character's BiDi level is ''sor'' or ''eor'' respectively, where sor and eor are defined by the UBA. For all practical purposes, sor and eor are at either level 0 or 1, depending on the paragraph base direction (0 for left to right, 1 for right to left). The difference between the BiDi levels on both sides of the caret is called the '''level differential'''.<br />
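<br />
The condition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the standard: the function names are hypothetical, <code>levels</code> is assumed to hold the already resolved UBA level of every character of the paragraph, and <code>para_level</code> stands in for both sor and eor, following the simplification described above.<br />
<br />
```python
def level_differential(levels, para_level, pos):
    """Return the level differential at logical caret position pos.

    levels     -- resolved BiDi level of each character (assumed given)
    para_level -- 0 for an LTR paragraph, 1 for RTL; used for sor and eor
    pos        -- caret index in the logical buffer, 0..len(levels)
    """
    # A missing neighbour at either paragraph edge takes the paragraph level.
    before = levels[pos - 1] if pos > 0 else para_level
    after = levels[pos] if pos < len(levels) else para_level
    return abs(after - before)


def is_ambiguous(levels, para_level, pos):
    """A caret is ambiguous exactly when the level differential is non-zero."""
    return level_differential(levels, para_level, pos) != 0
```

With the levels <code>[0] * 8 + [1] * 6 + [0] * 14</code> in an LTR paragraph, position 8 is ambiguous, while position 3 is not.<br />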
<br />
===Ambiguous Logical Caret===<br />
An ambiguous logical caret will have two visual positions. One position is right after the character before the caret, and the other is right before the character after the caret. In this sentence, "after" and "before" do not bear their usual meaning, as visual positions are involved. Instead, "after a character" means to the right of the character if that character's BiDi level is LTR (even), or to the left of the character if that character's BiDi level is RTL (odd).<br />
<br />
Examples:<br />
<pre>english |ARABIC more english.<br />
00000000 11111100000000000000</pre><br />
<br />
Where the caret is represented by the vertical bar. The caret might, visually, be in the following positions:<br />
<pre>english |CIBARA| more english.</pre><br />
<br />
Another example. In an RTL paragraph, the following logical text:<br />
<pre>|english HEBREW more english.<br />
 2222222111111112222222222221</pre><br />
<br />
The caret may be in the following positions:<br />
<br />
<pre style="text-align: right;">.more english WERBEH |english|</pre><br />
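<br />
The two examples can be reproduced with the following sketch. It assumes the whole paragraph fits on one line and that <code>levels</code> already holds the resolved UBA levels; <code>visual_order</code> applies the reversal step of the UBA (rule L2), and <code>logical_caret_slots</code> is a hypothetical helper that returns the two candidate visual slots, where slot ''s'' lies between displayed characters ''s''-1 and ''s'' counted from the left.<br />
<br />
```python
def visual_order(levels):
    """Return logical indices in left-to-right display order (UBA rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [lvl for lvl in levels if lvl % 2 == 1]
    if not odd_levels:
        return order  # pure LTR text is displayed in logical order
    # From the highest level down to the lowest odd level, reverse every
    # contiguous run of characters at that level or higher.
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] < level:
                i += 1
                continue
            j = i
            while j < len(order) and levels[order[j]] >= level:
                j += 1
            order[i:j] = order[i:j][::-1]
            i = j
    return order


def logical_caret_slots(levels, para_level, pos):
    """Return (after_previous, before_next) visual slots for a logical caret."""
    n = len(levels)
    where = {logical: visual for visual, logical in enumerate(visual_order(levels))}
    rtl_para = para_level % 2 == 1

    # "After" the previous character: right of an LTR character, left of an
    # RTL one. At the paragraph start the previous "character" is sor.
    if pos == 0:
        after_prev = n if rtl_para else 0
    else:
        prev = pos - 1
        after_prev = where[prev] + (0 if levels[prev] % 2 else 1)

    # "Before" the next character: left of an LTR character, right of an
    # RTL one. At the paragraph end the next "character" is eor.
    if pos == n:
        before_next = 0 if rtl_para else n
    else:
        before_next = where[pos] + (1 if levels[pos] % 2 else 0)

    return after_prev, before_next
```

For the first example, logical position 8 yields slots 8 and 14, the two bars around "CIBARA". For the second, logical position 0 yields slots 28 and 21, the two bars around "english". When both slots are equal, the position is unambiguous.<br />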
<br />
====Choosing Correct Visual Position====<br />
The visual location that a specific logical position translates to MUST be one of the locations determined above. The choice between them should be made according to the following criteria, in descending order of precedence:<br />
'''''See [[Talk:Standard#Logical_Caret_Location|open issues]] for discussion'''''<br />
<br />
===Ambiguous Visual Caret===<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
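<br />
A minimal sketch of the rounding, assuming the layout engine can supply the x coordinate of every caret location on the clicked line (the name <code>caret_xs</code> and the function itself are hypothetical):<br />
<br />
```python
def nearest_caret_location(caret_xs, pointer_x):
    """Return the index of the caret location closest to pointer_x.

    caret_xs  -- x coordinate of each caret location on the line, listed
                 from left to right (must not be empty)
    pointer_x -- x coordinate reported by the pointing device
    """
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - pointer_x))
```

On an exact tie this sketch happens to pick the leftmost location; picking the other one would conform equally well, since the direction of rounding does not matter.<br />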
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
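<br />
The line-boundary rule can be summarised in a short sketch. The function name and return convention are hypothetical, and what happens when there is no line above or below is left to the implementation; here it is signalled by returning <code>None</code>.<br />
<br />
```python
def wrap_horizontal_move(key, para_is_rtl, line, line_count):
    """Resolve an arrow key pressed at the matching edge of a line.

    Returns (target_line, edge), where edge names the side of the target
    line on which the caret lands, or None if no such line exists.
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # LEFT in an LTR paragraph and RIGHT in an RTL paragraph both lead to
    # the line above; the other two combinations lead to the line below.
    goes_up = (key == "left") != para_is_rtl
    target = line - 1 if goes_up else line + 1
    if not 0 <= target < line_count:
        return None
    edge = "rightmost" if key == "left" else "leftmost"
    return target, edge
```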
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
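<br />
The mapping for platforms that use modified arrow keys can be written as a small sketch (the function name is hypothetical, and the mapping itself is still subject to the open issue referenced above):<br />
<br />
```python
def command_arrow_operation(key, para_is_rtl):
    """Map command-left/right to the logical "home" or "end" operation."""
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # The line start is on the left in an LTR paragraph and on the right
    # in an RTL paragraph.
    toward_line_start = (key == "left") != para_is_rtl
    return "home" if toward_line_start else "end"
```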
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus automatic increase defined by the "I" rules defined in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 63 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
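<br />
The five highlight sequences can be checked with a small sketch. It is an illustration only: the level list below is a shortened, hypothetical stand-in with the same structure as the sample sentence (0 for English, 1 for Hebrew, 2 for digits), the text is assumed to fit on one line, and <code>visual_order</code> implements only the reversal step of the UBA (rule L2).<br />
<br />
```python
def visual_order(levels):
    """Return logical indices in left-to-right display order (UBA rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [lvl for lvl in levels if lvl % 2 == 1]
    if not odd_levels:
        return order
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] < level:
                i += 1
                continue
            j = i
            while j < len(order) and levels[order[j]] >= level:
                j += 1
            order[i:j] = order[i:j][::-1]  # reverse this run
            i = j
    return order


def count_highlight_blocks(levels, start, end):
    """Count visually contiguous blocks of the logical selection [start, end)."""
    blocks = 0
    previous_selected = False
    for logical in visual_order(levels):
        selected = start <= logical < end
        if selected and not previous_selected:
            blocks += 1  # a new highlighted run begins here
        previous_selected = selected
    return blocks


def max_highlight_blocks(level_count, single_line=True):
    """Upper bound from the list above: 2n - 1 blocks on a single line,
    n blocks on each of the first and last lines otherwise."""
    return 2 * level_count - 1 if single_line else level_count


# English, Hebrew, digits, Hebrew, English, Hebrew, digits, Hebrew, English
sample = ([0] * 2 + [1] * 2 + [2] * 4 + [1] * 2 + [0] * 2
          + [1] * 2 + [2] * 4 + [1] * 2 + [0] * 2)
# The selection runs from the middle of the first digit block (logical
# position 6) to the middle of the second digit block (logical position 16).
blocks = count_highlight_blocks(sample, 6, 16)
```

Here <code>blocks</code> comes out as 5, which equals <code>max_highlight_blocks(3)</code>, the bound for three BiDi levels on a single line.<br />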
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. - mouse press on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret following performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=MediaWiki:Sf-dl-desc&diff=9MediaWiki:Sf-dl-desc2012-04-03T03:50:05Z<p>Shachar: Created page with "Download Page"</p>
<hr />
<div>Download Page</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=MediaWiki:Sf-dl-url&diff=8MediaWiki:Sf-dl-url2012-04-03T03:49:45Z<p>Shachar: Created page with "http://sourceforge.net/projects/bidiedit/files/"</p>
<hr />
<div>http://sourceforge.net/projects/bidiedit/files/</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=MediaWiki:Sf-proj-desc&diff=7MediaWiki:Sf-proj-desc2012-04-03T03:49:10Z<p>Shachar: Created page with "BidiEdit at SourceForge"</p>
<hr />
<div>BidiEdit at SourceForge</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=MediaWiki:Sf-url&diff=6MediaWiki:Sf-url2012-04-03T03:48:48Z<p>Shachar: Created page with "http://sourceforge.net/projects/bidiedit"</p>
<hr />
<div>http://sourceforge.net/projects/bidiedit</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=MediaWiki:Sidebar&diff=5MediaWiki:Sidebar2012-04-03T03:48:12Z<p>Shachar: Created page with "* navigation ** mainpage|mainpage-description ** standard|Proposed standard ** sf-dl-url|sf-dl-desc ** sf-url|sf-proj-desc ** recentchanges-url|recentchanges ** randompage-url..."</p>
<hr />
<div>* navigation<br />
** mainpage|mainpage-description<br />
** standard|Proposed standard<br />
** sf-dl-url|sf-dl-desc<br />
** sf-url|sf-proj-desc<br />
** recentchanges-url|recentchanges<br />
** randompage-url|randompage<br />
** helppage|help<br />
* SEARCH<br />
* TOOLBOX<br />
* LANGUAGES</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=MediaWiki:Navigation&diff=4MediaWiki:Navigation2012-04-03T03:47:41Z<p>Shachar: Created page with "Navigation Menu"</p>
<hr />
<div>Navigation Menu</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=MediaWiki:Lingnu-url&diff=3MediaWiki:Lingnu-url2012-04-03T03:46:24Z<p>Shachar: Created page with "http://www.lingnu.com"</p>
<hr />
<div>http://www.lingnu.com</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Main_Page&diff=2Main Page2012-04-03T03:44:23Z<p>Shachar: Transfer content from previous site</p>
<hr />
<div>This site is mainly aimed at a collaborative effort to create an updated version of [http://www.sii.org.il/488-he/SII.aspx?standard_num=1051940000 SI5194]. The existing version can be viewed at http://imagic.weizmann.ac.il/~dov/Hebrew/logicUI24.htm<br />
<br />
A work in progress copy of the new standard can be found [[Standard|here]].<br />
<br />
==Version 0.02 Released==<br />
Version 0.02, dubbed "Still misleading release", is now available for download from our [http://sourceforge.net/projects/bidiedit/files/Version-0.02/ download page] on sourceforge. This version is a bug fix version only, fixing the Windows installer problem described for Version 0.01, as well as a few other odd bugs. It does not, yet, support BiDi, and is just a basic text editor.<br />
<br />
==Version 0.01 Released==<br />
Version 0.01, dubbed "Totally misleading release", <del>is now available for download from our download page on sourceforge</del>. This version does not, yet, support BiDi, and is just a basic text editor. It is mostly intended to make sure the core text editing functionality is complete, and to start receiving bug reports.<br />
<br />
'''WARNING''': The installer for Windows of this version takes control over the .txt file extension, which is, almost certainly, not what you want. Install version 0.02 instead!</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=62Standard2010-08-19T12:23:49Z<p>Shachar: Spelling mistake at title</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if implementors did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined<br />
visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
==Ambiguous Caret==<br />
An ambiguous caret is a caret, either logical or visual, that is not located on an unambiguous position. The practical upshot of this is that the implementation needs to choose where on screen to draw the logical caret, or where in the buffer to perform the operation for a visual caret. This section deals with how to determine the different points, and explains how to resolve the ambiguity.<br />
<br />
An ambiguous caret will happen if and only if the character before/to the left of the caret has a different BiDi level than the character after/to the right of the caret. If the caret is before the first or after the last character of the paragraph, the missing character's BiDi level is ''sor'' or ''eor'' respectively, where sor and eor are defined by the UBA. For all practical purposes, sor and eor are at either level 0 or 1, depending on the paragraph base direction (0 for left to right, 1 for right to left). The difference between the BiDi levels on both sides of the caret is called the '''level differential'''.<br />
<br />
===Ambiguous Logical Caret===<br />
An ambiguous logical caret will have two visual positions. One position is right after the character before the caret, and the other is right before the character after the caret. In this sentence, "after" and "before" do not bear their usual meaning, as visual positions are involved. Instead, "after a character" means to the right of the character if that character's BiDi level is LTR (even), or to the left of the character if that character's BiDi level is RTL (odd).<br />
<br />
Examples:<br />
<pre>english |ARABIC more english.<br />
00000000 11111100000000000000</pre><br />
<br />
Where the caret is represented by the vertical bar. The caret might, visually, be in the following positions:<br />
<pre>english |CIBARA| more english.</pre><br />
<br />
Another example. In an RTL paragraph, the following logical text:<br />
<pre>|english HEBREW more english.<br />
 2222222111111112222222222221</pre><br />
<br />
The caret may be in the following positions:<br />
<br />
<pre style="text-align: right;">.more english WERBEH |english|</pre><br />
<br />
====Choosing Correct Visual Position====<br />
The visual location that a specific logical position translates to MUST be one of the locations determined above. The choice between them should be made according to the following criteria, in descending order of precedence:<br />
'''''See [[Talk:Standard#Logical_Caret_Location|open issues]] for discussion'''''<br />
<br />
===Ambiguous Visual Caret===<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
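<br />
The line-boundary rule above can be sketched as follows. This is only an illustration in Python, not part of the standard: the function name and the string values are hypothetical, and the sketch does not handle the first or last line of the buffer.<br />

```python
from typing import Tuple


def line_after_horizontal_exit(line: int, key: str, paragraph_rtl: bool) -> Tuple[int, str]:
    """Return (target line, edge the caret lands on) for an arrow key
    pressed while the caret is already at that edge of the line.

    line          -- index of the current line, counted from the top
    key           -- "left" or "right" (hypothetical key names)
    paragraph_rtl -- True for an RTL paragraph, False for an LTR one
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # In an LTR paragraph the left edge is where the line starts, so leaving
    # through it goes to the line above. In an RTL paragraph it is where the
    # line ends, so leaving through it goes to the line below.
    moves_up = (key == "left") != paragraph_rtl
    target = line - 1 if moves_up else line + 1
    # The caret always re-enters on the opposite side of the new line.
    edge = "right" if key == "left" else "left"
    return target, edge
```

For example, <code>line_after_horizontal_exit(3, "left", False)</code> gives line 2 and the right edge, while the same key in an RTL paragraph gives line 4 and the right edge.<br />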
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
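<br />
The mapping described above is small enough to state as code. The following Python sketch is illustrative only; the function name and the returned strings are hypothetical.<br />

```python
def command_arrow_to_operation(arrow: str, paragraph_rtl: bool) -> str:
    """Map command-left / command-right to the logical "Home" or "End"
    operation, depending on the paragraph direction."""
    if arrow not in ("left", "right"):
        raise ValueError("arrow must be 'left' or 'right'")
    # The arrow points at the start of the line when it points left in an
    # LTR paragraph, or right in an RTL paragraph.
    points_to_start = (arrow == "left") != paragraph_rtl
    return "Home" if points_to_start else "End"
```

Once the key has been mapped, the resulting operation is the same logical operation as on platforms with dedicated keys.<br />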
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above. If that is the case, the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus automatic increase defined by the "I" rules defined in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 63 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
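<br />
The arithmetic behind these bounds can be written out as a short Python sketch. The function name is hypothetical; the formula is simply the one stated in the list above.<br />

```python
def max_highlight_blocks(levels: int, single_line: bool) -> int:
    """Upper bound on visual highlight blocks on one line of a selection.

    levels      -- number of BiDi levels assumed to be in use
    single_line -- True if the whole selection sits on one line; False for
                   the first or the last line of a multi-line selection
    """
    if levels < 1:
        raise ValueError("at least one BiDi level is always in use")
    # A boundary line of a multi-line selection can hold one block per level.
    # A single-line selection has both of its ends on the same line, which
    # gives twice that number less one.
    return 2 * levels - 1 if single_line else levels
```

With 63 levels this gives 125 blocks for a single-line selection, and with three levels it gives five.<br />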
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e., press the mouse button on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as Shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e., whether the caret that results from performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=61Standard2010-08-17T12:11:11Z<p>Shachar: Place holder further discussion</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, they MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well<br />
defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
==Ambiguous Caret==<br />
An ambiguous caret is a caret, either logical or visual, that is not located on an unambiguous position. The practical upshot of this is that the implementation needs to choose where on screen to draw the logical caret, or where in the buffer to perform the operation for a visual caret. This section deals with how to determine the different points, and explains how to resolve the ambiguity.<br />
<br />
An ambiguous caret will happen if and only if the character before/to the left of the caret has a different BiDi level than the character after/to the right of the caret. If the caret is before the first or after the last character of the paragraph, the missing character's BiDi level is ''sor'' or ''eor'' respectively, where sor and eor are defined by the UBA. For all practical purposes, sor and eor are at either level 0 or 1, depending on the paragraph base direction (0 for left to right, 1 for right to left). The difference between the BiDi levels on both sides of the caret is called the '''level differential'''.<br />
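<br />
As an illustration, the level differential can be computed from the resolved BiDi levels of a paragraph. The Python sketch below is not part of the standard. The function names are hypothetical, and it assumes, as the paragraph above suggests for practical purposes, that ''sor'' and ''eor'' equal the paragraph embedding level.<br />

```python
from typing import List


def level_differential(levels: List[int], caret: int, paragraph_level: int) -> int:
    """Difference between the BiDi levels on the two sides of the caret.

    levels          -- resolved level of every character in the paragraph,
                       in logical order
    caret           -- logical caret position, 0..len(levels); position i
                       sits between characters i-1 and i
    paragraph_level -- 0 for an LTR paragraph, 1 for an RTL paragraph;
                       used here in place of sor/eor (an assumption)
    """
    if not 0 <= caret <= len(levels):
        raise ValueError("caret is outside the paragraph")
    before = levels[caret - 1] if caret > 0 else paragraph_level
    after = levels[caret] if caret < len(levels) else paragraph_level
    return abs(after - before)


def is_ambiguous(levels: List[int], caret: int, paragraph_level: int) -> bool:
    """A caret is ambiguous exactly when the level differential is non-zero."""
    return level_differential(levels, caret, paragraph_level) != 0
```

For a paragraph whose levels are eight 0s, six 1s and fourteen 0s, the caret at logical position 8 has a level differential of 1 and is therefore ambiguous, while the caret at position 3 is not.<br />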
<br />
===Ambiguous Logical Caret===<br />
An ambiguous logical caret will have two visual positions. One position is right after the character before the caret, and the other is right before the character after the caret. In this sentence, "after" and "before" do not bear their usual meaning, as visual positions are involved. Instead, "after a character" means to the right of the character if that character's BiDi level is LTR (even), or to the left of the character if that character's BiDi level is RTL (odd).<br />
<br />
Examples:<br />
<pre>english |ARABIC more english.<br />
00000000 11111100000000000000</pre><br />
<br />
Here the caret is represented by the vertical bar. Visually, the caret might be in either of the following positions:<br />
<pre>english |CIBARA| more english.</pre><br />
<br />
Another example. In an RTL paragraph, the following logical text:<br />
<pre>|english HEBREW more english.<br />
2222222111111112222222222221</pre><br />
<br />
The caret may be in the following positions:<br />
<br />
<pre style="text-align: right;">.more english WERBEH |english|</pre><br />
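<br />
The two examples above can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the levels are taken as already resolved, the whole paragraph is assumed to fit on one line, and a missing neighbour at the start or end of the paragraph is assumed to place the caret at the corresponding edge of the line. The reordering is the run reversal of UBA rule L2. Visual positions are expressed as slots, where slot ''k'' lies to the left of the ''k''-th displayed character and the last slot lies to the right of the whole line.<br />

```python
from typing import List, Set


def visual_order(levels: List[int]) -> List[int]:
    """Logical indices in left-to-right display order (UBA rule L2)."""
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    # From the highest level down to the lowest odd level, reverse every
    # contiguous run of characters at that level or higher.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = order[i:j][::-1]
                i = j
            else:
                i += 1
    return order


def caret_slots(levels: List[int], caret: int, paragraph_level: int) -> Set[int]:
    """Visual slots a logical caret may occupy (one slot if unambiguous)."""
    n = len(levels)
    column = {logical: visual for visual, logical in enumerate(visual_order(levels))}
    rtl_paragraph = paragraph_level % 2 == 1
    slots = set()
    if caret > 0:
        # "After" the previous character: right of an LTR character,
        # left of an RTL character.
        prev = caret - 1
        slots.add(column[prev] + (0 if levels[prev] % 2 else 1))
    else:
        # Assumption: the start of the paragraph is the right edge of the
        # line in an RTL paragraph and the left edge in an LTR paragraph.
        slots.add(n if rtl_paragraph else 0)
    if caret < n:
        # "Before" the next character: left of an LTR character,
        # right of an RTL character.
        slots.add(column[caret] + (1 if levels[caret] % 2 else 0))
    else:
        slots.add(0 if rtl_paragraph else n)
    return slots
```

For the first example (LTR paragraph, caret at logical position 8) this yields slots 8 and 14, the two sides of the reversed Arabic run. For the second example (RTL paragraph, caret at logical position 0) it yields slots 21 and 28, the two sides of the rightmost English word.<br />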
<br />
====Choosing Correct Visual Position====<br />
The visual location to which a specific logical position translates MUST be one of the locations determined above. Selecting which one should be done according to the following criteria, in descending order of precedence:<br />
'''''See [[Talk:Standard#Logical_Caret_Location|open issues]] for discussion'''''<br />
<br />
===Ambiguous Visual Caret===<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
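<br />
A minimal Python sketch of this rounding is given below. It is illustrative only: the function name is hypothetical, and it assumes the implementation already knows the horizontal pixel coordinate of every caret location on the clicked line.<br />

```python
from typing import Sequence


def nearest_caret_location(boundaries: Sequence[float], x: float) -> int:
    """Index of the caret location closest to the clicked x coordinate.

    boundaries -- x coordinates of the caret locations on the clicked line
    x          -- x coordinate reported by the pointing device
    """
    if not boundaries:
        raise ValueError("the line has no caret locations")
    # On an exact tie the first candidate wins; the text above says that
    # implementations should not care which way the rounding goes.
    return min(range(len(boundaries)), key=lambda i: abs(boundaries[i] - x))
```

With caret locations at 0, 10, 20 and 30 pixels, a click at x = 14 selects the location at 10 and a click at x = 16 selects the location at 20.<br />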
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above. If that is the case, the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus automatic increase defined by the "I" rules defined in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 63 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e., press the mouse button on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as Shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e., whether the caret that results from performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=60Standard2010-08-17T09:46:28Z<p>Shachar: Ambiguous Logical Caret</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories of tools for editing bidirectional text. One is the text editor, and the other is the word processor. The main difference between the two, for our purposes, is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi Algorithm, while a word processor is allowed to format the text in a more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or in any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
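
As a minimal sketch of the plain text export case described above, an RTL run can be wrapped in an RLE/PDF pair as shown below. The function name <code>embed_rtl</code> is hypothetical and not part of this standard; the code points U+202B (RLE) and U+202C (PDF) are the ones assigned by Unicode.

```python
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def embed_rtl(text: str) -> str:
    """Wrap text so that plain text renderers treat it as an RTL embedded sequence.

    Hypothetical helper for illustration only.
    """
    return f"{RLE}{text}{PDF}"
```

A word processor that uses its own internal markup would apply an equivalent mechanism internally and only emit the control characters when exporting pure text.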
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret (the text cursor). This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the caret is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
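
The parity rule for embedding levels, and the correspondence between base direction and paragraph embedding level, can be sketched as follows. Both function names are assumptions made for illustration.

```python
def direction_of_level(level: int) -> str:
    """Even levels correspond to LTR text, odd levels to RTL text."""
    return "RTL" if level % 2 else "LTR"


def paragraph_embedding_level(base_direction: str) -> int:
    """Map a base direction ("LTR" or "RTL") to the paragraph embedding level."""
    if base_direction not in ("LTR", "RTL"):
        raise ValueError("base_direction must be 'LTR' or 'RTL'")
    return 1 if base_direction == "RTL" else 0
```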
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
==Ambiguous Caret==<br />
An ambiguous caret is a caret, either logical or visual, that is not located on an unambiguous position. The practical upshot of this is that the implementation needs to choose where on screen to draw a logical caret, or where in the buffer to perform the operation for a visual caret. This section deals with how to determine the different points, and explains how to resolve the ambiguity.<br />
<br />
An ambiguous caret will happen if and only if the character before/to the left of the caret has a different BiDi level than the character after/to the right of the caret. If the caret is before the first or after the last character of the paragraph, the missing character's BiDi level is ''sor'' or ''eor'' respectively, where sor and eor are defined by the UBA. For all practical purposes, sor and eor are at either level 0 or 1, depending on the paragraph base direction (0 for left to right, 1 for right to left). The difference between the BiDi levels on both sides of the caret is called the '''level differential'''.<br />
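
The ambiguity condition can be sketched in a few lines. This is an illustration under stated assumptions: <code>levels</code> is the list of resolved BiDi levels of one paragraph in logical order, <code>caret</code> is an index between 0 and <code>len(levels)</code>, and ''sor''/''eor'' are approximated by the paragraph embedding level, as the text above suggests for practical purposes. All names are hypothetical.

```python
from typing import Sequence, Tuple


def neighbour_levels(levels: Sequence[int], paragraph_level: int, caret: int) -> Tuple[int, int]:
    """Return the BiDi levels on both sides of a logical caret position.

    A missing neighbour (caret at the paragraph start or end) takes the
    paragraph level, standing in for sor/eor.
    """
    before = levels[caret - 1] if caret > 0 else paragraph_level
    after = levels[caret] if caret < len(levels) else paragraph_level
    return before, after


def level_differential(levels: Sequence[int], paragraph_level: int, caret: int) -> int:
    """Difference between the BiDi levels on both sides of the caret."""
    before, after = neighbour_levels(levels, paragraph_level, caret)
    return abs(before - after)


def is_ambiguous(levels: Sequence[int], paragraph_level: int, caret: int) -> bool:
    """A caret is ambiguous if and only if the level differential is non-zero."""
    return level_differential(levels, paragraph_level, caret) != 0
```

For an LTR paragraph whose levels are eight 0s, six 1s and fourteen 0s, the caret at index 8 is ambiguous with a level differential of 1, while the caret at index 3 is not.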
<br />
===Ambiguous Logical Caret===<br />
An ambiguous logical caret will have two visual positions. One position is right after the character before the caret, and the other is right before the character after the caret. In this sentence, "after" and "before" do not bear their usual meaning, as visual positions are involved. Instead, "after a character" means to the right of the character if that character's BiDi level is LTR (even), or to the left of the character if that character's BiDi level is RTL (odd).<br />
<br />
Examples:<br />
<pre>english |ARABIC more english.<br />
00000000 11111100000000000000</pre><br />
<br />
Where the caret is represented by the vertical bar. Visually, the caret might be in either of the following positions:<br />
<pre>english |CIBARA| more english.</pre><br />
<br />
Another example. In an RTL paragraph, the following logical text:<br />
<pre>|english HEBREW more english.<br />
2222222111111112222222222221</pre><br />
<br />
The caret may be in the following positions:<br />
<br />
<pre style="text-align: right;">.more english WERBEH |english|</pre><br />
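
The two examples above can be reproduced with a short sketch. It assumes a single unwrapped line, takes the resolved BiDi levels as input, and reorders them with the reversal rule of the UBA (rule L2). Visual positions are numbered as boundaries between display slots, 0 being the left edge. Placing a caret with a missing neighbour at the paragraph's start or end edge is an assumption of this sketch, chosen to match the second example. All function names are hypothetical.

```python
from typing import List, Sequence


def visual_order(levels: Sequence[int]) -> List[int]:
    """Return order such that order[v] is the logical index shown in visual slot v."""
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    # From the highest level down to the lowest odd level, reverse every
    # contiguous run of characters at that level or higher.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def logical_caret_visual_positions(levels: Sequence[int], paragraph_level: int, caret: int) -> List[int]:
    """Return the one or two visual boundaries a logical caret may be drawn at."""
    n = len(levels)
    slot = {logical: visual for visual, logical in enumerate(visual_order(levels))}
    rtl_paragraph = paragraph_level % 2 == 1
    positions = set()
    # "After" the previous character: its right side if LTR, its left side if RTL.
    if caret > 0:
        prev = caret - 1
        positions.add(slot[prev] + (0 if levels[prev] % 2 else 1))
    else:
        positions.add(n if rtl_paragraph else 0)  # assumption: paragraph start edge
    # "Before" the next character: its left side if LTR, its right side if RTL.
    if caret < n:
        positions.add(slot[caret] + (1 if levels[caret] % 2 else 0))
    else:
        positions.add(0 if rtl_paragraph else n)  # assumption: paragraph end edge
    return sorted(positions)
```

With the levels of the first example and the caret at logical index 8, the sketch yields the boundaries 8 and 14, which are the two bars drawn around "CIBARA". With the levels of the second example, an RTL paragraph and the caret at logical index 0, it yields 21 and 28, the two bars drawn around "english".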
<br />
====Choosing Correct Visual Position====<br />
<br />
===Ambiguous Visual Caret===<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. All operations under this section do not change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has a pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point where the pointing device asked to locate the caret. Implementations <br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
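
The rounding rule can be sketched as a nearest-neighbour search over the caret locations of the clicked line. The parameter <code>boundary_xs</code> (the x coordinate of every caret location on the line, left to right) is an assumption about how an implementation exposes its layout. Ties resolve to the leftmost candidate here only because that is how <code>min</code> behaves; the standard does not prefer either side.

```python
from typing import Sequence


def nearest_caret_location(click_x: float, boundary_xs: Sequence[float]) -> int:
    """Return the index of the caret location closest to the clicked x coordinate."""
    if not boundary_xs:
        raise ValueError("a line has at least one caret location")
    return min(range(len(boundary_xs)), key=lambda i: abs(boundary_xs[i] - click_x))
```

The returned index is a visual position; mapping it to the logical buffer is a separate step.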
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
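
The four line boundary cases described in this section reduce to one small decision. The sketch below is illustrative only; <code>wrap_target</code> and its return convention (a line offset, -1 for the line above and +1 for the line below, plus the edge of the target line) are assumptions.

```python
from typing import Tuple


def wrap_target(key: str, paragraph_rtl: bool) -> Tuple[int, str]:
    """Decide where the caret goes when an arrow key is pressed at the line edge.

    key is "left" or "right". Returns (line_delta, edge), where edge is the
    side of the target line on which the caret lands.
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # Left in an LTR paragraph and right in an RTL paragraph both lead to
    # the line above; the other two combinations lead to the line below.
    goes_up = (key == "left") != paragraph_rtl
    line_delta = -1 if goes_up else 1
    edge = "right" if key == "left" else "left"
    return line_delta, edge
```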
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
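
The mapping from the command-arrow combinations to the logical Home and End operations can be sketched as follows. The function name and the string values are hypothetical.

```python
def mac_command_arrow(key: str, paragraph_rtl: bool) -> str:
    """Translate command-left/command-right into the logical operation to perform.

    key is "left" or "right". Returns "Home" or "End".
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # command-left is Home in an LTR paragraph and End in an RTL one;
    # command-right is the mirror image.
    return "Home" if (key == "left") != paragraph_rtl else "End"
```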
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above. If that is the case, the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be visually non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real-life text is almost guaranteed to exceed any small limit. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus automatic increase defined by the "I" rules defined in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 63 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
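
The upper bound described by the list above can be written as a small function. This is a sketch under the assumptions that the maximum of 63 BiDi levels stated above applies and that every fully selected middle line contributes exactly one highlight block. The names are hypothetical.

```python
MAX_BIDI_LEVELS = 63  # figure taken from the text above


def max_highlight_blocks(line_count: int, bidi_levels: int = MAX_BIDI_LEVELS) -> int:
    """Upper bound on the number of visual highlight blocks of a linear selection."""
    if line_count <= 0:
        return 0
    if line_count == 1:
        # Start and end share one line: twice the number of levels, less one.
        return 2 * bidi_levels - 1
    # First and last line may each hold one block per level; every line in
    # between is highlighted completely and continuously.
    return 2 * bidi_levels + (line_count - 2)
```

With the default of 63 levels a single-line selection is bounded by 125 blocks; with only three levels, the case reachable without control characters, the bound is five.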
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. press the mouse button on the start point, drag the mouse to the end point, and release the button) or by a special movement modifier (such as Shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry names the section in which the operation is defined, and states whether it is a logical or a visual operation (i.e. whether the caret is a logical or a visual caret after the operation is performed).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=59Standard2010-08-17T08:57:38Z<p>Shachar: /* Ambiguous Caret */</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories of tools for editing bidirectional text. One is the text editor, and the other is the word processor. The main difference between the two, for our purposes, is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi Algorithm, while a word processor is allowed to format the text in a more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or in any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret (the text cursor). This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the caret is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
==Ambiguous Caret==<br />
An ambiguous caret is a caret, either logical or visual, that is not located at an unambiguous position. The practical upshot is that the implementation needs to choose where on screen to draw a logical caret, or where in the buffer to perform the operation for a visual caret. This section deals with how to determine the candidate positions, and explains how to resolve the ambiguity.<br />
<br />
A caret is ambiguous if and only if the character before/to the left of the caret has a different BiDi level than the character after/to the right of the caret. If the caret is before the first or after the last character of the paragraph, the missing character's BiDi level is ''sor'' or ''eor'' respectively, where sor and eor are defined by the UBA. For all practical purposes, sor and eor are at either level 0 or 1, depending on the paragraph base direction (0 for left to right, 1 for right to left). The difference between the BiDi levels on the two sides of the caret is called the '''level differential'''.<br />
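<br />
The condition above can be sketched in a few lines. This is only an illustration, not part of the standard: the function names are hypothetical, the caret is taken to be a logical index into a list of already-resolved levels, the paragraph embedding level stands in for sor/eor, and the level differential is assumed to be an absolute difference.<br />

```python
def level_differential(levels, caret, paragraph_level):
    """Difference between the Bidi levels on the two sides of a caret.

    levels          -- resolved level of each character, in logical order
    caret           -- logical index 0..len(levels); the caret sits just
                       before levels[caret]
    paragraph_level -- 0 (LTR) or 1 (RTL); used in place of sor/eor
                       (assumption: sor/eor equal the paragraph level)
    """
    if not 0 <= caret <= len(levels):
        raise ValueError("caret outside the paragraph")
    before = levels[caret - 1] if caret > 0 else paragraph_level
    after = levels[caret] if caret < len(levels) else paragraph_level
    # Assumption: the differential is reported as an absolute value.
    return abs(after - before)


def is_ambiguous(levels, caret, paragraph_level):
    """A caret is ambiguous exactly when the level differential is non-zero."""
    return level_differential(levels, caret, paragraph_level) != 0


# Three LTR characters followed by three RTL characters, LTR paragraph.
levels = [0, 0, 0, 1, 1, 1]
print(is_ambiguous(levels, 1, 0))  # inside the LTR run
print(is_ambiguous(levels, 3, 0))  # on the LTR/RTL boundary
print(is_ambiguous(levels, 6, 0))  # after the last character (eor = 0)
```

In this sketch, the caret at the end of the paragraph is ambiguous because the last character is RTL while eor is at level 0.<br />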
<br />
===Ambiguous Logical Caret===<br />
<br />
===Ambiguous Visual Caret===<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
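<br />
As a rough sketch of the rounding rule, assume the implementation already knows the horizontal coordinate of every caret location on the line that was clicked. The function name and the list-of-coordinates representation are hypothetical, chosen only for illustration.<br />

```python
def nearest_caret_location(boundaries, x):
    """Index of the caret location closest to horizontal coordinate x.

    boundaries -- x coordinates of the caret locations on the line, sorted
                  from left to right (hypothetical representation)
    x          -- horizontal coordinate reported by the pointing device
    """
    if not boundaries:
        raise ValueError("a line has at least one caret location")
    # The nearest location wins regardless of direction. On an exact tie
    # min() happens to return the left one; the text above does not
    # require either choice.
    return min(range(len(boundaries)), key=lambda i: abs(boundaries[i] - x))


# Four caret locations around three glyphs that are 10 pixels wide each.
locations = [0, 10, 20, 30]
print(nearest_caret_location(locations, 14))   # rounds left, to index 1
print(nearest_caret_location(locations, 16))   # rounds right, to index 2
print(nearest_caret_location(locations, 100))  # past the end of the line
```

The result is a visual position; mapping it to the logical buffer is a separate step.<br />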
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When the left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line below the current line.<br />
<br />
The situation for the right arrow on the rightmost character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
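<br />
The four line-boundary cases described in this section collapse into one rule: an arrow that points in the paragraph's reading direction continues on the line below, and an arrow that points against it continues on the line above. The following sketch illustrates this; the function name and the return convention are hypothetical.<br />

```python
def wrap_target(key, paragraph_is_rtl):
    """Where the caret goes when an arrow key is pressed at a line edge.

    key              -- "left" or "right"
    paragraph_is_rtl -- True for an RTL paragraph, False for LTR
    Returns (line_delta, edge): line_delta is -1 for the line above and
    +1 for the line below; edge names the side of that line on which the
    caret lands.
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # "right" is forward in LTR text and backward in RTL text.
    moves_forward = (key == "right") != paragraph_is_rtl
    line_delta = 1 if moves_forward else -1
    # The caret always enters the new line from the opposite side.
    edge = "rightmost" if key == "left" else "leftmost"
    return line_delta, edge


print(wrap_target("left", paragraph_is_rtl=False))   # LTR: line above
print(wrap_target("left", paragraph_is_rtl=True))    # RTL: line below
print(wrap_target("right", paragraph_is_rtl=False))  # LTR: line below
print(wrap_target("right", paragraph_is_rtl=True))   # RTL: line above
```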
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
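<br />
For platforms that use command-left and command-right, the mapping to the logical "Home" and "End" operations described above can be written as a small lookup. This is a sketch only; the function name is hypothetical.<br />

```python
def command_arrow_to_operation(arrow, paragraph_is_rtl):
    """Translate command-left/right into the logical Home/End operation.

    arrow            -- "left" or "right"
    paragraph_is_rtl -- True for an RTL paragraph, False for LTR
    """
    if arrow not in ("left", "right"):
        raise ValueError("arrow must be 'left' or 'right'")
    # The beginning of the line is on the left in an LTR paragraph and
    # on the right in an RTL paragraph.
    points_at_line_start = (arrow == "left") != paragraph_is_rtl
    return "Home" if points_at_line_start else "End"


print(command_arrow_to_operation("left", paragraph_is_rtl=False))
print(command_arrow_to_operation("left", paragraph_is_rtl=True))
```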
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following rules should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus the automatic increase defined by the "I" rules in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice that number of highlight blocks, less one (i.e., a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 63 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
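<br />
The bounds listed above reduce to simple arithmetic. The sketch below computes the largest number of highlight blocks that any single line of a selection may need; the function name is hypothetical, and the figure of 63 levels is the one quoted in the list.<br />

```python
def max_highlight_blocks_per_line(bidi_levels, lines_spanned):
    """Upper bound on highlight blocks on any one line of a selection.

    bidi_levels   -- number of distinct Bidi levels assumed possible
    lines_spanned -- number of lines the selection touches
    """
    if bidi_levels < 1 or lines_spanned < 1:
        raise ValueError("both arguments must be positive")
    if lines_spanned == 1:
        # Both selection ends fall on the same line.
        return 2 * bidi_levels - 1
    # First and last lines are bounded by the number of levels; lines in
    # between are fully highlighted and need a single block.
    return bidi_levels


print(max_highlight_blocks_per_line(63, 1))  # all levels, one line
print(max_highlight_blocks_per_line(3, 1))   # no control characters
```

With 63 levels this gives 125 blocks for a single-line selection, and with three levels it gives five.<br />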
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
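<br />
The effect shown in the image can be reproduced on a scaled-down model. The sketch below does not run the full UBA: the levels are assigned by hand (one character per English or Hebrew run, two digits per number), and only the reversal step of UBA rule L2 is implemented. All names are hypothetical.<br />

```python
def visual_order(levels):
    """Logical indexes in left-to-right visual order (UBA rule L2 only).

    levels -- hand-assigned resolved level of each character on one line
    """
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    # From the highest level down to the lowest odd level, reverse every
    # contiguous sequence of characters at that level or higher.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def count_highlight_blocks(levels, start, end):
    """Visually contiguous blocks for the logical selection [start, end)."""
    blocks = 0
    previous_selected = False
    for logical_index in visual_order(levels):
        selected = start <= logical_index < end
        if selected and not previous_selected:
            blocks += 1
        previous_selected = selected
    return blocks


# Logical text: e H 1 2 H e H 3 4 H e  (e = English, H = Hebrew).
levels = [0, 1, 2, 2, 1, 0, 1, 2, 2, 1, 0]
# Select from the middle of "12" to the middle of "34".
print(count_highlight_blocks(levels, 3, 8))
```

The selection is a single contiguous logical range, yet it is drawn as five separate highlight blocks, which matches the bound for three BiDi levels.<br />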
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e., press the mouse button on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret following performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=58Standard2010-08-17T08:55:56Z<p>Shachar: When is the caret ambiguous</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
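<br />
For the pure text case, wrapping a fragment in an RLE/PDF pair amounts to adding two Unicode control characters, U+202B (RIGHT-TO-LEFT EMBEDDING) and U+202C (POP DIRECTIONAL FORMATTING). The sketch below shows this; the function name is hypothetical, and a word processor remains free to use any other mechanism with the same effect.<br />

```python
import unicodedata

RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def embed_rtl(fragment):
    """Wrap a fragment so that it is treated as an RTL embedded sequence."""
    return RLE + fragment + PDF


# Hebrew text taken from an RTL paragraph, pasted into an LTR paragraph.
pasted = embed_rtl("\u05e2\u05d1\u05e8\u05d9\u05ea 1234")
print(unicodedata.name(pasted[0]))
print(unicodedata.name(pasted[-1]))
print(len(pasted))
```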
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
==Ambiguous Caret==<br />
An ambiguous caret is a caret, either logical or visual, that is not located at an unambiguous position. The practical upshot is that the implementation needs to choose where on screen to draw a logical caret, or where in the buffer to perform the operation for a visual caret. This section deals with how to determine the candidate positions, and explains how to resolve the ambiguity.<br />
<br />
A caret is ambiguous if and only if the character before/to the left of the caret has a different BiDi level than the character after/to the right of the caret. If the caret is before the first or after the last character of the paragraph, the missing character's BiDi level is sor or eor respectively, where sor and eor are defined by the UBA. For all practical purposes, sor and eor are at either level 0 or 1, depending on the paragraph base direction (0 for left to right, 1 for right to left). The difference between the BiDi levels on the two sides of the caret is called the '''level differential'''.<br />
<br />
===Ambiguous Logical Caret===<br />
<br />
===Ambiguous Visual Caret===<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When the left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line below the current line.<br />
<br />
The situation for the right arrow on the rightmost character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following rules should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus the automatic increase defined by the "I" rules in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice that number of highlight blocks, less one (i.e., a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementation SHOULD assume that the entire 64 levels might be used. In any case, implementation MUST NOT assume that less than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. - press the mouse button on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as Shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret left after performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=57Standard2010-08-17T08:47:11Z<p>Shachar: Write section intro</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document aims to lay out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories of tools for editing bidirectional text. One type is a text editor, and the other is a word processor. The main difference between the two, for our purposes, is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in a more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if implementors did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
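<br />
The relationship between embedding levels and direction described in the definitions above can be sketched in a few lines of Python. This is only an illustration; the function names are hypothetical and are not part of this standard or of the UBA.<br />

```python
def direction_of_level(level):
    """Return the direction implied by a Bidi embedding level.

    Even levels correspond to LTR text, odd levels to RTL text.
    """
    if level < 0:
        raise ValueError("embedding levels are non-negative")
    return "RTL" if level % 2 else "LTR"


def paragraph_embedding_level(base_direction):
    """Return 0 for an LTR paragraph and 1 for an RTL paragraph."""
    if base_direction not in ("LTR", "RTL"):
        raise ValueError("base direction must be 'LTR' or 'RTL'")
    return 0 if base_direction == "LTR" else 1


for level in range(4):
    print(level, direction_of_level(level))
```
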
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well<br />
defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
==Ambiguous Caret==<br />
An ambiguous caret is a caret, either logical or visual, that is not located on an unambiguous position. The practical upshot of this is that the implementation needs to choose where on screen to draw the logical caret, or where in the buffer to perform the operation for a visual caret. This section deals with how to determine the different points, and explains how to resolve the ambiguity.<br />
<br />
===Ambiguous Logical Caret===<br />
<br />
===Ambiguous Visual Caret===<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. All operations under this section do not change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has a pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point where the pointing device asked to locate the caret. Implementations <br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
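<br />
The rounding rule above can be sketched as follows. This is a minimal illustration in Python, assuming the implementation already knows the horizontal pixel coordinate of every caret location on the clicked line; the function name and the coordinate list are hypothetical.<br />

```python
def nearest_caret_location(click_x, caret_xs):
    """Return the index of the caret location closest to click_x.

    caret_xs holds the x coordinate of every caret location on the line,
    in visual (left to right) order. On a tie the leftmost candidate is
    returned, which is an arbitrary choice: the standard says the
    rounding direction should not matter.
    """
    if not caret_xs:
        raise ValueError("a line always has at least one caret location")
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - click_x))


# Three glyphs, each 10 pixels wide, give four caret locations.
locations = [0, 10, 20, 30]
print(nearest_caret_location(13, locations))  # 1
print(nearest_caret_location(27, locations))  # 3
```

The returned index identifies a visual position only. Mapping it to a position in the logical buffer is a separate step and is not shown here.<br />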
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line below the current line.<br />
<br />
The situation for a right arrow on the rightmost character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''.<br />
<br />
This is a visual operation.<br />
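<br />
The line boundary behavior described above can be summarized in a short Python sketch. It assumes lines are numbered from top to bottom and deliberately ignores the case where there is no line above or below; the function name and return shape are hypothetical.<br />

```python
def wrap_at_line_edge(key, line_index, paragraph_is_rtl):
    """Return (new_line_index, edge) for an arrow key pressed at a line edge.

    key is "left" (caret at the left edge of the line) or "right"
    (caret at the right edge). Lines are numbered top to bottom.
    edge names the side of the new line on which the caret lands.
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # A left arrow in an LTR paragraph and a right arrow in an RTL
    # paragraph both move towards the start of the paragraph (line above).
    towards_start = (key == "left") != paragraph_is_rtl
    new_line = line_index - 1 if towards_start else line_index + 1
    edge = "rightmost" if key == "left" else "leftmost"
    return new_line, edge


print(wrap_at_line_edge("left", 3, paragraph_is_rtl=False))   # (2, 'rightmost')
print(wrap_at_line_edge("left", 3, paragraph_is_rtl=True))    # (4, 'rightmost')
```
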
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
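<br />
For platforms that use modified arrow keys, the mapping described above can be sketched as follows. This is an illustration only; the key names and the function are hypothetical and do not correspond to any real platform API.<br />

```python
def command_arrow_to_operation(key, paragraph_is_rtl):
    """Map command-left / command-right to the logical Home / End operation.

    In an LTR paragraph command-left means Home and command-right means
    End. In an RTL paragraph the mapping is reversed.
    """
    if key not in ("command-left", "command-right"):
        raise ValueError("unsupported key: " + key)
    points_to_line_start = (key == "command-left") != paragraph_is_rtl
    return "Home" if points_to_line_start else "End"


print(command_arrow_to_operation("command-left", paragraph_is_rtl=False))  # Home
print(command_arrow_to_operation("command-left", paragraph_is_rtl=True))   # End
```
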
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above. In that case, the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be set, it should be derived as follows:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus the automatic increase defined by the "I" rules in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice that number of highlight blocks, less one (i.e. - a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 63 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
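<br />
The arithmetic behind these bounds can be sketched in Python. The sketch takes the number of BiDi levels as a parameter, so it covers both the figure of 63 levels quoted above and the minimum of three levels; the function name is hypothetical.<br />

```python
def max_highlight_blocks(levels, single_line):
    """Upper bound on highlight blocks for the first or last selection line.

    levels is the number of BiDi levels assumed to be in use.
    If the whole selection sits on one line, that line plays the role of
    both the first and the last line, giving twice the bound less one.
    """
    if levels < 1:
        raise ValueError("at least one BiDi level is always in use")
    return 2 * levels - 1 if single_line else levels


print(max_highlight_blocks(63, single_line=False))  # 63
print(max_highlight_blocks(63, single_line=True))   # 125
print(max_highlight_blocks(3, single_line=True))    # 5
```

Lines strictly between the first and last line of the selection are fully highlighted, so they always contribute exactly one block each.<br />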
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. - press the mouse button on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as Shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret left after performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=56Standard2010-08-17T08:44:27Z<p>Shachar: Add ambiguous caret section</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document aims to lay out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories of tools for editing bidirectional text. One type is a text editor, and the other is a word processor. The main difference between the two, for our purposes, is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in a more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if implementors did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
==Ambiguous Caret==<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
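<br />
The rounding rule above can be sketched as follows. This is a minimal illustration, not part of the standard: the function name and the idea of passing a list of caret x coordinates (in pixels, in visual left-to-right order) are assumptions made for the example.<br />

```python
def nearest_caret_location(caret_xs, click_x):
    """Return the index of the caret location nearest to click_x.

    caret_xs is a hypothetical list holding the x coordinate of every
    caret location on the clicked line, in visual left-to-right order.
    """
    if not caret_xs:
        raise ValueError("the line has no caret locations")
    # Pick the location with the smallest horizontal distance to the click.
    # On an exact tie min() keeps the leftmost candidate; the standard does
    # not care which side wins, so any tie-break is acceptable.
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - click_x))
```

The result is an index into the visual caret locations of the line, which matches the statement that this is a visual operation.<br />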
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When the left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line below the current line.<br />
<br />
The situation for the right arrow on the rightmost character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
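<br />
The four line-boundary cases above reduce to a single rule, sketched below. The function name, the string key names and the returned tuple are hypothetical and exist only for this illustration; checking whether the target line exists is left out.<br />

```python
def wrap_at_line_boundary(line_index, key, paragraph_is_rtl):
    """Return (new_line_index, edge) for an arrow key pressed at a line edge.

    key is "left" (caret on the leftmost character) or "right" (caret on
    the rightmost character). edge names the side of the new line on which
    the caret lands.
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # LTR: left wraps to the line above, right to the line below.
    # RTL: the vertical direction is reversed.
    moves_up = (key == "left") != paragraph_is_rtl
    new_line_index = line_index - 1 if moves_up else line_index + 1
    # The caret always lands on the edge opposite to the key pressed.
    edge = "rightmost" if key == "left" else "leftmost"
    return new_line_index, edge
```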
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
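<br />
As an illustration of the mapping described for platforms that use a modifier over the arrow keys, the following sketch translates the key combination into the logical "Home" or "End" operation. The function and the key name strings are assumptions made for this example.<br />

```python
def command_arrow_to_operation(key, paragraph_is_rtl):
    """Map command-left/command-right to the logical "Home" or "End"."""
    if key == "command-left":
        # The left edge is the logical start of an LTR line
        # and the logical end of an RTL line.
        return "End" if paragraph_is_rtl else "Home"
    if key == "command-right":
        return "Home" if paragraph_is_rtl else "End"
    raise ValueError("key must be 'command-left' or 'command-right'")
```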
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus automatic increase defined by the "I" rules defined in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 63 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
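<br />
The upper bound given in the list above can be written as a small helper. This is only a sketch of the arithmetic; the function name and parameters are hypothetical.<br />

```python
def max_highlight_blocks(bidi_levels, single_line):
    """Upper bound on highlight blocks on a boundary line of a selection.

    bidi_levels is the number of BiDi levels assumed to be in use.
    single_line is True when the selection starts and ends on one line.
    """
    if bidi_levels < 1:
        raise ValueError("at least one BiDi level is always in use")
    # A first or last line may hold one block per level; a selection that
    # fits on one line may hold twice that number, less one.
    return 2 * bidi_levels - 1 if single_line else bidi_levels
```

With the 63 levels quoted above, a single-line selection is bounded by 125 blocks, and with the three levels reachable without control characters the bound is five.<br />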
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
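<br />
To show how a selection that is continuous in the logical buffer becomes several highlight blocks on screen, here is a minimal sketch. It assumes the editor already has, for one line, a list giving the logical index of the character at each visual position (as produced by the UBA reordering); the function name and this representation are assumptions made for the example.<br />

```python
def highlight_runs(visual_order, sel_start, sel_end):
    """Return visual (start, end) runs covering a logical selection.

    visual_order[i] is the logical index of the character displayed at
    visual position i. The selection covers logical indices in the
    half-open range [sel_start, sel_end).
    """
    runs = []
    run_start = None
    for pos, logical_index in enumerate(visual_order):
        selected = sel_start <= logical_index < sel_end
        if selected and run_start is None:
            run_start = pos                  # a new highlight block begins
        elif not selected and run_start is not None:
            runs.append((run_start, pos))    # the current block ends
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(visual_order)))
    return runs
```

For example, if logical characters 3 to 5 form an RTL run, the visual order of a nine character line is [0, 1, 2, 5, 4, 3, 6, 7, 8]. Selecting logical characters 1 through 3 then produces two separate highlight blocks.<br />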
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. - press the mouse button on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as Shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret following performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=55Standard2010-08-16T11:38:56Z<p>Shachar: Correct number of levels defined by UBA</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if implementors did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any other mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as a RTL embedded sequence. Also, when exporting pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When the left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line below the current line.<br />
<br />
The situation for the right arrow on the rightmost character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 63 (zero through 61, plus automatic increase defined by the "I" rules defined in section 3.3.5), and if a maximum must be defined, this is the number that SHOULD be used as the maximum for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 125 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 63 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
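<br />
The arithmetic in the list above can be written out as a minimal sketch. The function names are hypothetical, and the default of 63 levels is simply the figure quoted in the list; an implementation that settles on a different level count would pass it explicitly.<br />

```python
UBA_LEVELS = 63  # level count quoted in the list above (an input, not a constant of this sketch)


def max_blocks_edge_line(levels: int = UBA_LEVELS) -> int:
    """Upper bound for the first or the last line of a multi-line selection."""
    return levels


def max_blocks_single_line(levels: int = UBA_LEVELS) -> int:
    """Upper bound when the whole selection sits on a single line."""
    return 2 * levels - 1


if __name__ == "__main__":
    print(max_blocks_edge_line(), max_blocks_single_line(), max_blocks_single_line(3))
```

With three levels, the minimum an implementation may assume, the single-line bound evaluates to five, matching the last item of the list.<br />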
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. - mouse press on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
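<br />
The two conditions can be checked mechanically. The following is only a sketch under stated assumptions: the text is a single line, every character already carries its resolved embedding level, paragraph boundaries are ignored, and "the last character in the selection range" is read as the selected character logically next to the end marker. All function names are hypothetical, and an empty selection is assumed to pass the first condition trivially.<br />

```python
def visual_order(levels):
    """Logical indices in left-to-right display order (reversal as in UBA rule L2)."""
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])  # reverse one run at this level
                i = j
            else:
                i += 1
    return order


def candidates_at(visual_pos, levels, order):
    """Logical positions whose caret can be drawn in visual gap visual_pos (0..n)."""
    found = []
    if visual_pos > 0:                       # character left of the gap
        k = order[visual_pos - 1]
        found.append(k + 1 if levels[k] % 2 == 0 else k)
    if visual_pos < len(order):              # character right of the gap
        k = order[visual_pos]
        found.append(k if levels[k] % 2 == 0 else k + 1)
    return found


def resolve_end(start, visual_pos, levels):
    """Logical end positions that satisfy both conditions listed above."""
    order = visual_order(levels)
    slot = {k: v for v, k in enumerate(order)}   # logical index -> visual slot
    result = []
    for end in candidates_at(visual_pos, levels, order):
        selected = set(range(min(start, end), max(start, end)))
        if selected:
            last = end - 1 if end > start else end   # selected character next to the marker
            adjacent = slot[last] in (visual_pos - 1, visual_pos)
        else:
            adjacent = True                          # assumption: empty selection passes
        left_on = visual_pos > 0 and order[visual_pos - 1] in selected
        right_on = visual_pos < len(order) and order[visual_pos] in selected
        if adjacent and not (left_on and right_on) and end not in result:
            result.append(end)
    return result


if __name__ == "__main__":
    levels = [0, 0, 0, 1, 1, 1]          # "abc" followed by three RTL characters
    print(resolve_end(0, 3, levels))     # logical start at the beginning of the buffer
    print(resolve_end(4, 3, levels))     # logical start inside the RTL run
```

In this example the visual gap between the LTR and the RTL run maps to logical position 3 when the selection starts at position 0, and to logical position 6 when it starts at position 4, so the highlight always ends at the marker.<br />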
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret left by the operation is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=54Standard2010-08-15T13:35:57Z<p>Shachar: One logical one visual</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
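<br />
A small sketch may make the ambiguity concrete. It assumes a single line whose characters already carry resolved embedding levels, takes the display order as a given input, and ignores the paragraph embedding level at the buffer edges. The function name and the sample data are hypothetical.<br />

```python
def visual_positions(logical_pos, levels, order):
    """Visual gaps (0..n) at which a caret at logical_pos may be drawn.

    levels: resolved embedding level of each character, in logical order.
    order:  logical indices listed in left-to-right display order.
    """
    slot = {k: v for v, k in enumerate(order)}   # logical index -> visual slot
    found = set()
    if logical_pos > 0:                  # read as "after the previous character"
        k = logical_pos - 1
        found.add(slot[k] + 1 if levels[k] % 2 == 0 else slot[k])
    if logical_pos < len(levels):        # read as "before the next character"
        k = logical_pos
        found.add(slot[k] if levels[k] % 2 == 0 else slot[k] + 1)
    return sorted(found)


if __name__ == "__main__":
    levels = [0, 0, 0, 1, 1, 1]      # three LTR characters, then three RTL characters
    order = [0, 1, 2, 5, 4, 3]       # the RTL run is displayed reversed
    print(visual_positions(1, levels, order))   # inside the LTR run
    print(visual_positions(3, levels, order))   # on the direction boundary
```

Inside a run both readings agree on one visual gap. On the direction boundary they disagree, which is exactly the case where a logical position has two visual interpretations.<br />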
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. All operations under this section do not change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has a pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point where the pointing device asked to locate the caret. Implementations <br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
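<br />
As a minimal sketch of the rounding described above, assume the caret locations of a line are available as a sorted list of x coordinates in pixels. The function name and the tie-breaking choice are assumptions of this sketch; the text above leaves the direction of rounding open.<br />

```python
from bisect import bisect_left


def nearest_caret_location(click_x, boundaries):
    """Index of the caret location closest to click_x.

    boundaries: sorted x coordinates of the caret locations on the clicked line.
    An exact tie resolves to the left location here, an arbitrary choice.
    """
    i = bisect_left(boundaries, click_x)
    if i == 0:
        return 0                        # click left of the first location
    if i == len(boundaries):
        return len(boundaries) - 1      # click right of the last location
    before, after = boundaries[i - 1], boundaries[i]
    return i if after - click_x < click_x - before else i - 1


if __name__ == "__main__":
    gaps = [0, 10, 20, 30]
    print(nearest_caret_location(14, gaps), nearest_caret_location(17, gaps))
```

The result is a visual position, consistent with this being a visual operation.<br />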
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
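<br />
The four line-boundary cases described above reduce to a small lookup. This is a sketch only; the function name and the return convention (a line offset plus the edge of the target line) are hypothetical.<br />

```python
def wrap_target(key, paragraph_rtl):
    """Where the caret goes when an arrow key is pressed at the edge of a line.

    key:           "left" or "right"
    paragraph_rtl: True for an RTL paragraph, False for an LTR paragraph
    Returns (line_delta, edge): line_delta is -1 for the line above and +1 for
    the line below; edge is the side of that line on which the caret lands.
    """
    if key == "left":       # caret was at the leftmost character
        return (+1 if paragraph_rtl else -1, "right")
    if key == "right":      # caret was at the rightmost character
        return (-1 if paragraph_rtl else +1, "left")
    raise ValueError(f"unexpected key: {key!r}")


if __name__ == "__main__":
    print(wrap_target("left", paragraph_rtl=False))
    print(wrap_target("left", paragraph_rtl=True))
```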
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
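<br />
The mapping described above for such platforms can be sketched as a single function. The name is hypothetical and the sketch covers only the key-to-operation mapping, not the caret movement itself.<br />

```python
def command_arrow_operation(arrow, paragraph_rtl):
    """Map command-left / command-right to the "home" or "end" behavior.

    arrow:         "left" or "right"
    paragraph_rtl: True for an RTL paragraph, False for an LTR paragraph
    """
    if arrow not in ("left", "right"):
        raise ValueError(f"unexpected arrow: {arrow!r}")
    # The line starts on the left in LTR paragraphs and on the right in RTL ones.
    toward_start = (arrow == "left") != paragraph_rtl
    return "home" if toward_start else "end"


if __name__ == "__main__":
    print(command_arrow_operation("left", paragraph_rtl=True))
```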
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above. If that is the case, the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 64 (zero through 61, plus automatic increase defined by the "I" rules defined in section 3.3.5), and this is the number that SHOULD be used for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 127 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 64 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
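<br />
To see how a logically continuous selection breaks into several highlight blocks, the blocks on one line can be counted from the display order. This is a sketch: the display order is taken as a given input rather than computed by the UBA, and the function name is hypothetical.<br />

```python
def highlight_blocks(order, start, end):
    """Count visually continuous highlight blocks on one line.

    order:      logical indices of the line's characters in left-to-right display order
    start, end: the selection, covering logical indices start <= k < end
    """
    blocks = 0
    previous_on = False
    for k in order:
        on = start <= k < end
        if on and not previous_on:
            blocks += 1          # a new highlighted run begins at this character
        previous_on = on
    return blocks


if __name__ == "__main__":
    order = [0, 1, 2, 5, 4, 3]   # three LTR characters, then an RTL run shown reversed
    print(highlight_blocks(order, 2, 4), highlight_blocks(order, 0, 3))
```

A selection that crosses the direction boundary in this example is drawn as two separate blocks, while one that stays inside the LTR run is drawn as a single block.<br />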
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. - mouse press on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
===Logical Beginning and Visual End Points (Or Vice Versa)===<br />
For the sake of this discussion, we'll assume that the start point is a logical position, and the end point is a visual position which is not an unambiguous position. The exact same considerations apply in the opposite case.<br />
<br />
The end marker MUST be set to be on the logical position that satisfies the following two conditions:<br />
* The last character in the selection range is adjacent to the visual position of the marker.<br />
* The marker does not have highlighted characters on both its sides.<br />
<br />
All visual positions have only one logical position that satisfies both these requirements. That position is the logical position to use for the end position.<br />
<br />
===Both Beginning and End Points Visual===<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret left by the operation is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=53Standard2010-08-15T13:09:46Z<p>Shachar: Clarify unambiguous positions role</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section changes the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select
the caret location that is nearest to the point indicated by the pointing device. Implementations
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
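<br />
The mapping described above can be sketched as a small lookup. This is only an illustration, not part of the standard: the function name <code>command_arrow_to_logical</code> and the string values are hypothetical, and the paragraph direction is assumed to be given as the paragraph embedding level (0 for LTR, 1 for RTL).<br />

```python
def command_arrow_to_logical(arrow, paragraph_level):
    """Map command-left/command-right to the logical "home"/"end" operation.

    arrow           -- "left" or "right" (hypothetical encoding of the key)
    paragraph_level -- paragraph embedding level: 0 for LTR, 1 for RTL
    """
    if arrow not in ("left", "right"):
        raise ValueError("arrow must be 'left' or 'right'")
    is_rtl = paragraph_level % 2 == 1
    if arrow == "left":
        # In an LTR paragraph the line begins on the left; in RTL it ends there.
        return "end" if is_rtl else "home"
    return "home" if is_rtl else "end"
```

Whichever key was pressed, the result is then handled exactly like "Home" or "End", so the caret that follows is a logical caret.<br />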
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real-life text is almost guaranteed to exceed any such assumption. If an upper bound must be used, then the following rules should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 64 (zero through 61, plus automatic increase defined by the "I" rules defined in section 3.3.5), and this is the number that SHOULD be used for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 127 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 64 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
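<br />
The bounds in the list above can be written out as a short calculation. The following is a sketch under stated assumptions: the function name <code>max_highlight_blocks</code> and its parameters are hypothetical, lines of the selection are numbered from zero, and the level count defaults to the 64 quoted above.<br />

```python
UBA_LEVEL_COUNT = 64  # number of BiDi levels quoted in the list above
MIN_LEVEL_COUNT = 3   # levels reachable without any BiDi control characters


def max_highlight_blocks(line_index, line_count, levels=UBA_LEVEL_COUNT):
    """Upper bound on highlight blocks for one line of a linear selection.

    line_index -- zero-based index of the line within the selection
    line_count -- number of lines the selection spans
    levels     -- number of BiDi levels the implementation allows for
    """
    if levels < MIN_LEVEL_COUNT:
        raise ValueError("MUST NOT assume fewer than three BiDi levels")
    if not 0 <= line_index < line_count:
        raise ValueError("line_index must lie within the selection")
    if line_count == 1:
        # A single line holds both end points: twice the per-line bound, less one.
        return 2 * levels - 1
    if line_index in (0, line_count - 1):
        # First and last line: as many blocks as there are levels.
        return levels
    # Lines strictly between the first and last are highlighted in full.
    return 1
```

With three levels, a single-line selection yields the five highlight blocks mentioned above; with all 64 levels it yields 127.<br />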
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. pressing the mouse button on the start point, dragging the mouse to the end point, and releasing the button) or by a special movement modifier (such as shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End, or as unambiguous positions), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret following performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=52Standard2010-08-14T20:41:42Z<p>Shachar: Correct number of levels defined by UBA, and explain</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, they MUST carry out the standard so that the end result is the same as if the UBA had been implemented. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use a mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
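<br />
To make the notion of an unambiguous position concrete, the sketch below checks whether a logical position on a single line maps to exactly one visual caret location. It is a simplified illustration, not a definition: it assumes the resolved Bidi levels of the line are already available as a list, reorders them with the reversal rule of the UBA (rule L2), and ignores the effect of the paragraph embedding level at the two ends of the line. All names are hypothetical.<br />

```python
def visual_order(levels):
    """Return the logical indices of one line in visual (left to right) order."""
    order = list(range(len(levels)))
    odd_levels = [level for level in levels if level % 2 == 1]
    if not odd_levels:
        return order  # purely LTR line: visual order equals logical order
    # From the highest level down to the lowest odd level, reverse every
    # contiguous run of characters at that level or higher.
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def caret_locations(levels, pos):
    """Visual caret locations (0..len) that logical position pos can refer to."""
    visual_index = {logical: v for v, logical in enumerate(visual_order(levels))}
    locations = set()
    if pos > 0:
        # "After the previous character": its trailing edge.
        prev = pos - 1
        v = visual_index[prev]
        locations.add(v + 1 if levels[prev] % 2 == 0 else v)
    if pos < len(levels):
        # "Before the next character": its leading edge.
        v = visual_index[pos]
        locations.add(v if levels[pos] % 2 == 0 else v + 1)
    return locations


def is_unambiguous(levels, pos):
    """True if logical position pos has exactly one visual caret location."""
    return len(caret_locations(levels, pos)) == 1
```

For a line with levels <code>[0, 0, 1, 1, 0]</code>, the positions inside each run are unambiguous, while the two positions on the boundaries between the LTR and RTL runs each map to two visual locations.<br />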
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section changes the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select
the caret location that is nearest to the point indicated by the pointing device. Implementations
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
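<br />
The line-boundary rules for the left and right arrows can be summarised in a few lines. This is a sketch only: the function name <code>line_boundary_target</code> and its return convention are hypothetical, and the paragraph direction is assumed to be given as the paragraph embedding level (0 for LTR, 1 for RTL).<br />

```python
def line_boundary_target(arrow, paragraph_level):
    """Where the caret SHOULD land when an arrow is pressed at a line edge.

    Returns (line_delta, landing_edge): line_delta is -1 for the line above
    and +1 for the line below; landing_edge is the side of that line on
    which the caret is placed.
    """
    if arrow not in ("left", "right"):
        raise ValueError("arrow must be 'left' or 'right'")
    is_rtl = paragraph_level % 2 == 1
    # Left in an LTR paragraph and right in an RTL paragraph both move
    # towards the start of the paragraph, i.e. to the line above.
    towards_start = (arrow == "left") != is_rtl
    line_delta = -1 if towards_start else 1
    # The caret always lands on the opposite visual edge of the target line.
    landing_edge = "right" if arrow == "left" else "left"
    return line_delta, landing_edge
```

For example, a left arrow at the leftmost character of an RTL paragraph gives <code>(1, "right")</code>: the caret moves to the rightmost character of the line below.<br />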
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real-life text is almost guaranteed to exceed any such assumption. If an upper bound must be used, then the following rules should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 64 (zero through 61, plus automatic increase defined by the "I" rules defined in section 3.3.5), and this is the number that SHOULD be used for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 127 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 64 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. pressing the mouse button on the start point, dragging the mouse to the end point, and releasing the button) or by a special movement modifier (such as shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
For the sake of this discussion, any position that was set by a visual operation but can only be interpreted as having one logical position is treated as a logical point.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
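<br />
As a rough illustration of how a logically continuous selection turns into several visual highlight blocks, the sketch below assumes the editor already has, for one display line, the list of logical buffer indices in visual (left to right) order, as produced by the UBA reordering step. The name <code>highlight_runs</code> and the half-open range convention are assumptions made for this example, not part of the standard.<br />

```python
def highlight_runs(visual_to_logical, start, end):
    """Return the highlighted visual column ranges for a logical selection.

    visual_to_logical -- logical index of the character shown in each visual column
    start, end        -- logical selection, half-open: start <= index < end
    Result is a list of (first_column, last_column) pairs, inclusive.
    """
    runs = []
    run_start = None
    for column, logical in enumerate(visual_to_logical):
        selected = start <= logical < end
        if selected and run_start is None:
            run_start = column                      # a new highlight block begins
        elif not selected and run_start is not None:
            runs.append((run_start, column - 1))    # the current block ends
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(visual_to_logical) - 1))
    return runs


# Nine characters; logical indices 3..5 form an RTL run and are displayed reversed.
line = [0, 1, 2, 5, 4, 3, 6, 7, 8]
print(highlight_runs(line, 1, 4))
```

Here the logical range 1..3 is continuous in the buffer, yet it produces two separate highlight blocks on screen (columns 1 to 2 and column 5).<br />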
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry gives the section in which the operation is defined, as well as whether it is a logical or visual operation (i.e. whether the caret is a visual or logical caret after the operation is performed).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=51Standard2010-08-14T16:27:25Z<p>Shachar: Add the positions definition, including "unambiguous position"</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
A position set by a logical operation is called a "'''logical position'''". Likewise, a position set by a visual operation is called a "'''visual position'''". A logical position that has only one visual representation can also be said to be a visual position that has only one logical representation. Such a position is agnostic to whether the operation that set it was a visual operation or a logical operation. We call such an agnostic position an "'''unambiguous position'''".<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has a pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point where the pointing device asked to locate the caret. Implementations <br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
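<br />
A minimal sketch of the rounding described above, assuming the implementation can enumerate the horizontal pixel coordinates of every caret location on the clicked line. The function name and its inputs are hypothetical. Ties resolve to the first candidate only because that is how <code>min</code> behaves; the text above deliberately does not prefer either side.<br />

```python
def nearest_caret_location(caret_xs, click_x):
    """Return the index of the caret location closest to the clicked x coordinate.

    caret_xs -- x coordinates of the caret locations on the line, in visual order
    click_x  -- x coordinate reported by the pointing device
    """
    if not caret_xs:
        raise ValueError("the line has no caret locations")
    # min() keeps the first of several equally close candidates.
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - click_x))
```

The result is a visual position: the caller still has to decide which logical position it corresponds to when more than one is possible.<br />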
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
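<br />
The line-boundary rules for the left and right arrows can be condensed into one small function. This is a sketch with hypothetical names; it only encodes the four cases spelled out above and says nothing about what happens when there is no line above or below.<br />

```python
def wrap_target(key, paragraph_is_rtl):
    """Where the caret goes when an arrow key is pressed at the matching line edge.

    key              -- "left" (caret at the leftmost character) or
                        "right" (caret at the rightmost character)
    paragraph_is_rtl -- True for an RTL paragraph, False for an LTR paragraph
    Returns (line_delta, edge): line_delta is -1 for the line above and +1 for
    the line below; edge is the side of that line the caret lands on.
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # Right in an LTR paragraph and left in an RTL paragraph both advance to the line below.
    moves_down = (key == "right") != paragraph_is_rtl
    line_delta = 1 if moves_down else -1
    # The caret always lands on the edge opposite to the key direction.
    edge = "right" if key == "left" else "left"
    return line_delta, edge
```

For example, <code>wrap_target("left", False)</code> gives <code>(-1, "right")</code>: in an LTR paragraph the caret moves to the rightmost character of the line above.<br />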
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
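<br />
For platforms that use a modifier with the arrow keys, the mapping described above might look like the following sketch. The function name and the string values are illustrative only and do not refer to any real platform API.<br />

```python
def command_arrow_to_operation(arrow, paragraph_is_rtl):
    """Map command-left / command-right to the logical "Home" or "End" operation.

    arrow            -- "left" or "right"
    paragraph_is_rtl -- True for an RTL paragraph, False for an LTR paragraph
    """
    if arrow not in ("left", "right"):
        raise ValueError("arrow must be 'left' or 'right'")
    if paragraph_is_rtl:
        # In an RTL paragraph the line begins at the right.
        return "End" if arrow == "left" else "Home"
    # In an LTR paragraph the line begins at the left.
    return "Home" if arrow == "left" else "End"
```

Whichever key triggers it, the resulting operation is carried out exactly as "Home" or "End" and leaves a logical caret.<br />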
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above. If that is the case, the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last lines of the selection might each contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 64, and this is the number that SHOULD be used for each of these two lines.<br />
* If the selection only spans one line, that single line might have as many as twice that number of highlight blocks less one (i.e. a maximum of 127 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 64 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is a logical operation, selecting a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. mouse press on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as Shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
For the sake of this discussion, any position that was set by a visual operation but can only be interpreted as having one logical position is treated as a logical point.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry gives the section in which the operation is defined, as well as whether it is a logical or visual operation (i.e. whether the caret is a visual or logical caret after the operation is performed).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=50Standard2010-08-14T16:21:43Z<p>Shachar: Select Begin and End Points</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they did implement the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
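<br />
The rounding described above can be illustrated with a minimal sketch. It assumes that the implementation already knows the x coordinates of all caret locations on the clicked line; the function name <code>nearest_caret_location</code> and its input format are hypothetical and are not part of this standard.<br />

```python
from bisect import bisect_left


def nearest_caret_location(boundaries, x):
    """Return the index of the caret location closest to pointer position x.

    boundaries is a sorted list of x coordinates (in pixels) of the caret
    locations on the clicked line, including both line edges. This input
    format is an assumption made for illustration only.
    """
    if not boundaries:
        raise ValueError("a line has at least one caret location")
    i = bisect_left(boundaries, x)
    if i == 0:
        return 0                      # pointer is left of the line
    if i == len(boundaries):
        return len(boundaries) - 1    # pointer is right of the line
    left, right = boundaries[i - 1], boundaries[i]
    # Ties are resolved arbitrarily: the standard says implementations
    # SHOULD NOT care whether the rounding is to the left or to the right.
    return i - 1 if x - left <= right - x else i
```

The returned index identifies a visual position only, which matches the classification of this operation as visual.<br />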
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When the left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line below the current line.<br />
<br />
The situation for the right arrow on the rightmost character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''.<br />
<br />
This is a visual operation.<br />
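<br />
The four line-boundary cases above reduce to a single rule, sketched below. The function name <code>line_boundary_target</code> and its return format are hypothetical, chosen only for illustration.<br />

```python
def line_boundary_target(key, paragraph_is_rtl):
    """Return (line_delta, edge) for an arrow key pressed at a line boundary.

    key is "left" (caret on the leftmost character) or "right" (caret on
    the rightmost character). line_delta is -1 for the line above and +1
    for the line below. Names and return format are illustrative only.
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    # In an LTR paragraph the left arrow leads to the line above; in an
    # RTL paragraph it leads to the line below. The right arrow is the
    # mirror image of this.
    moves_up = (key == "left") != paragraph_is_rtl
    line_delta = -1 if moves_up else 1
    # The caret lands on the edge opposite to the one it left.
    edge = "rightmost" if key == "left" else "leftmost"
    return line_delta, edge
```

For example, <code>line_boundary_target("left", paragraph_is_rtl=True)</code> yields the rightmost character of the line below.<br />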
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
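<br />
The direction-dependent mapping of command-left and command-right can be sketched as follows. The function name <code>command_arrow_to_operation</code> is hypothetical, and the sketch only restates the mapping given above.<br />

```python
def command_arrow_to_operation(arrow, paragraph_is_rtl):
    """Map command-left / command-right to the logical "home" or "end".

    arrow is "left" or "right". The name and signature are assumptions
    made for illustration.
    """
    if arrow not in ("left", "right"):
        raise ValueError("arrow must be 'left' or 'right'")
    # The line starts on the left in an LTR paragraph and on the right
    # in an RTL paragraph.
    points_to_line_start = (arrow == "left") != paragraph_is_rtl
    return "home" if points_to_line_start else "end"
```

Whichever key is pressed, the resulting caret is a logical caret, because "home" and "end" are defined in terms of the logical buffer.<br />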
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If so, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be visually non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 64, and this is the number that SHOULD be used for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice that number of highlight blocks, less one (i.e. - a maximum of 127 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 64 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
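<br />
The arithmetic behind these bounds can be written down as a small sketch. The function name <code>max_highlight_blocks</code> is hypothetical, and the default of 64 levels is simply the figure quoted in the list above.<br />

```python
def max_highlight_blocks(levels=64, single_line=True):
    """Upper bound on highlight blocks per line, following the list above.

    levels is the number of BiDi levels assumed to be in use. The default
    of 64 is the figure quoted in this document. The name and signature
    are illustrative only.
    """
    if levels < 1:
        raise ValueError("at least one BiDi level is always in use")
    if single_line:
        # A selection contained in one line: twice the number of levels,
        # less one.
        return 2 * levels - 1
    # The first or last line of a multi-line selection: one block per level.
    return levels
```

With the default of 64 levels this gives 127 blocks for a single-line selection, and with the minimum of three levels it gives the five blocks mentioned above.<br />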
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Selection Beginning and End Points==<br />
While the selection itself is logical, covering a logically continuous region of text, the determination of the beginning and end points for the selection may or may not be logical, depending on the way in which they are selected. The selection is usually performed either by a mouse drag (i.e. - press the mouse button on the start point, drag the mouse to the end point, and release the button) or a special movement modifier (such as shift + arrow key). The selection beginning and end positions are, individually, visual if the operation that set them there was visual and logical if the operation that set them there was logical. Again, regardless of how the beginning and end points are chosen, whether visual or logical, the selection itself is continuous in the logical buffer.<br />
<br />
For the sake of this discussion, any position that was selected using a visual operation, but can only be interpreted as having one logical position, is treated as a logical point.<br />
<br />
===Logical Beginning and End Points===<br />
If both beginning and end points are chosen using a logical operation (such as Home/End), then everything is uniquely defined. The selection covers the entire range between the beginning point and the end point.<br />
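<br />
As a rough illustration of how a logically continuous selection can produce several visual highlight blocks, the sketch below computes the highlighted visual runs from two logical end points. It assumes that the visual-to-logical mapping of the line has already been produced by the UBA; the function name <code>highlight_runs</code> and the mapping format are hypothetical.<br />

```python
def highlight_runs(visual_to_logical, start, end):
    """Return the visual runs highlighted by a logical selection.

    visual_to_logical[v] is the logical index of the character shown at
    visual position v (assumed to come from the UBA reordering). start
    and end are logical caret positions; the characters with logical
    index lo <= i < hi are selected. Each run is a half-open pair of
    visual positions (first, one_past_last).
    """
    lo, hi = sorted((start, end))
    runs = []
    run_start = None
    for v, logical in enumerate(visual_to_logical):
        selected = lo <= logical < hi
        if selected and run_start is None:
            run_start = v                 # a new highlight block begins
        elif not selected and run_start is not None:
            runs.append((run_start, v))   # the current block ends
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(visual_to_logical)))
    return runs
```

For instance, if the logical text "abcDEFghi" (upper case standing for RTL characters) is displayed as "abcFEDghi", the mapping is <code>[0, 1, 2, 5, 4, 3, 6, 7, 8]</code>, and selecting logical positions 1 to 4 highlights two separate visual blocks.<br />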
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret after performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=49Standard2010-08-14T10:54:36Z<p>Shachar: /* Linear Selection */</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to outline guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the caret is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
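<br />
The parity rule from the "Bidi Embedding Levels" and "Paragraph Embedding Level" definitions above can be sketched as follows. Both function names are hypothetical and serve only to illustrate the definitions.<br />

```python
def level_direction(level):
    """Direction implied by a Bidi embedding level: even is LTR, odd is RTL."""
    if level < 0:
        raise ValueError("embedding levels are non-negative")
    return "LTR" if level % 2 == 0 else "RTL"


def paragraph_embedding_level(base_is_rtl):
    """Paragraph embedding level: 0 for an LTR base direction, 1 for RTL."""
    return 1 if base_is_rtl else 0
```

For example, <code>level_direction(2)</code> is "LTR", which corresponds to LTR text embedded within level 1 RTL text.<br />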
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When the left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the leftmost character of the line SHOULD move the caret to the rightmost character of the line below the current line.<br />
<br />
The situation for the right arrow on the rightmost character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''.<br />
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If so, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be visually non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 64, and this is the number that SHOULD be used for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice that number of highlight blocks, less one (i.e. - a maximum of 127 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that all 64 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
<br />
==Select Begin and End Points==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret after performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=48Standard2010-08-14T10:54:02Z<p>Shachar: detail the text used</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to outline guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
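<br />
The parity rule and the paragraph embedding level defined above can be sketched in a few lines of Python. The function names are illustrative assumptions; they are not part of this standard or of the UBA.<br />

```python
def direction_of_level(level: int) -> str:
    """Return the direction implied by a Bidi embedding level."""
    if level < 0:
        raise ValueError("embedding levels are non-negative")
    # Even levels always correspond to LTR text, odd levels to RTL text.
    return "LTR" if level % 2 == 0 else "RTL"


def paragraph_embedding_level(base_direction: str) -> int:
    """Map a paragraph's base direction to its paragraph embedding level."""
    # One-to-one correspondence: LTR <-> 0, RTL <-> 1.
    return {"LTR": 0, "RTL": 1}[base_direction]
```

Under these assumptions, a level 2 run is LTR text embedded in RTL text, and the two functions are consistent with each other for both base directions.<br />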
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well<br />
defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
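<br />
The rounding described above can be sketched as a nearest-neighbour search. This is a minimal sketch, assuming the implementation already knows the horizontal pixel position of every caret location on the clicked line; the function name and the <code>caret_xs</code> parameter are hypothetical.<br />

```python
from typing import Sequence


def nearest_caret_location(click_x: float, caret_xs: Sequence[float]) -> int:
    """Return the index of the caret location closest to click_x.

    caret_xs holds the horizontal pixel position of every caret location
    on the clicked line, in visual (left to right) order.
    """
    if not caret_xs:
        raise ValueError("a line has at least one caret location")
    # min() returns the first minimum, so an exact tie resolves to the left
    # candidate. That is an arbitrary choice of this sketch: the standard
    # does not prefer either side.
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - click_x))
```

The result is an index into the visual caret locations of the line, which matches the statement that this is a visual operation.<br />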
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
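<br />
The line-boundary rules for the left and right arrows can be summarized in a small decision function. This is a sketch only: the function name, the string constants and the top-to-bottom line numbering are assumptions, and the handling of the first and last line of the buffer is deliberately left out.<br />

```python
from typing import Tuple


def arrow_at_line_edge(key: str, paragraph_direction: str, line: int) -> Tuple[int, str]:
    """Return (target line, target edge) for an arrow pressed at a line edge.

    key is "left" or "right"; paragraph_direction is "LTR" or "RTL".
    Lines are numbered from 0 at the top. The caller is assumed to have
    checked that the caret is on the leftmost character (for "left") or
    the rightmost character (for "right") of the current line.
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    if paragraph_direction not in ("LTR", "RTL"):
        raise ValueError("paragraph_direction must be 'LTR' or 'RTL'")
    # Left in an LTR paragraph and right in an RTL paragraph both move
    # against the reading direction, so they go to the line above.
    moves_up = (key == "left") == (paragraph_direction == "LTR")
    target_line = line - 1 if moves_up else line + 1
    # The caret re-enters the new line from the opposite visual edge.
    target_edge = "rightmost" if key == "left" else "leftmost"
    return target_line, target_edge
```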
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
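<br />
The mapping from the modifier-plus-arrow combinations to the "Home" and "End" behavior can be written as a lookup table. The function name and the string constants below are hypothetical and serve only to illustrate the rule.<br />

```python
def command_arrow_to_logical_key(arrow: str, paragraph_direction: str) -> str:
    """Translate command-left/right into "Home" or "End" for a paragraph."""
    mapping = {
        ("left", "LTR"): "Home",   # start of an LTR line is on the left
        ("right", "LTR"): "End",
        ("left", "RTL"): "End",    # start of an RTL line is on the right
        ("right", "RTL"): "Home",
    }
    return mapping[(arrow, paragraph_direction)]
```

Once the combination has been translated, the caret is placed by the logical "Home" or "End" rule, so the result is a logical caret on every platform.<br />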
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 64, and this is the number that SHOULD be used for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 127 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that the entire 64 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
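<br />
The upper bound in the list above can be expressed as a short calculation. This is a sketch under the figures quoted in this section; the constant and function names are assumptions made for illustration.<br />

```python
BIDI_LEVEL_COUNT = 64  # number of BiDi levels as quoted in the text above


def max_highlight_blocks(levels: int = BIDI_LEVEL_COUNT, single_line: bool = False) -> int:
    """Upper bound on highlight blocks for the first or last selected line.

    With single_line=True the selection starts and ends on the same line,
    so that line can hold twice the number of blocks, less one. Lines
    strictly between the first and last are always one continuous block.
    """
    if levels < 1:
        raise ValueError("at least one BiDi level is always in use")
    return 2 * levels - 1 if single_line else levels
```

With the default of 64 levels this gives 64 blocks for each of the first and last lines and 127 for a single-line selection; with three levels a single line can hold five blocks.<br />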
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
The text without selection is "English עברית 1234 ומספרים more English עוד עברית 1234 מספרים and that's it."<br />
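<br />
Counting the highlight blocks that a logical selection produces on one line can be sketched as follows. The sketch assumes the implementation has already run the UBA and holds the resulting visual order of the line; the function name and the <code>visual_order</code> representation are hypothetical.<br />

```python
from typing import Sequence


def count_highlight_blocks(visual_order: Sequence[int], sel_start: int, sel_end: int) -> int:
    """Count visually contiguous highlight blocks on a single line.

    visual_order[v] is the logical index of the character displayed at
    visual position v (left to right). The selection covers the logical
    indices sel_start <= i < sel_end.
    """
    blocks = 0
    previous_selected = False
    for logical_index in visual_order:
        selected = sel_start <= logical_index < sel_end
        # A new block starts wherever a selected character follows an
        # unselected one in visual order.
        if selected and not previous_selected:
            blocks += 1
        previous_selected = selected
    return blocks
```

For a line holding three LTR characters followed by three RTL characters, the visual order is <code>[0, 1, 2, 5, 4, 3]</code>. Selecting logical positions 2 and 3 highlights two separate blocks, even though the selection is contiguous in the logical buffer.<br />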
<br />
==Select Begin and End Points==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. - whether the caret after performing it is a visual or logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=47Standard2010-08-14T10:52:06Z<p>Shachar: Add example of five highlight zones in one line</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way; only that, should it move, it should move in the way defined here.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, they MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well<br />
defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 64, and this is the number that SHOULD be used for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice the number of highlight blocks less one (i.e. - a maximum of 127 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that the entire 64 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
<br />
The following image shows text with no BiDi control characters, three BiDi levels and five highlight sequences. The selection starts at the left numbers block, between the "12" and the "34", and ends at the right numbers block, between the "12" and the "34":<br />
[[Image:Multi-highlight.png|Demonstration of five highlight sequences in text with no BiDi control characters]]<br />
<br />
==Select Begin and End Points==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. whether the caret that results from performing it is a visual or a logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=46Standard2010-08-14T10:39:54Z<p>Shachar: Linear Selection</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
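
The ambiguity described above can be made concrete with a small sketch. It assumes the embedding levels have already been resolved by the UBA, and it only applies the reordering step (rule L2) to a single line. The function names `visual_order` and `caret_candidates` are hypothetical and are not defined by this standard. Visual positions are numbered 0..n from the left edge of the line.

```python
def visual_order(levels):
    """Return logical indices in left-to-right visual order (UBA rule L2)."""
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    # Reverse every run of characters at `level` or higher, highest level first.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def caret_candidates(levels, pos):
    """Visual positions that logical caret position `pos` may refer to."""
    order = visual_order(levels)
    slot = {logical: visual for visual, logical in enumerate(order)}
    candidates = set()
    if pos > 0:
        # "After the previous character": its trailing edge.
        k = pos - 1
        candidates.add(slot[k] + (1 if levels[k] % 2 == 0 else 0))
    if pos < len(levels):
        # "Before the next character": its leading edge.
        k = pos
        candidates.add(slot[k] + (0 if levels[k] % 2 == 0 else 1))
    return sorted(candidates)
```

For a line with levels `[0, 0, 1, 1, 1, 0]` (two LTR characters, three RTL characters, one LTR character), logical position 2 maps to visual positions 2 and 5, and so does logical position 5. The same pair of visual positions is therefore shared by two logical positions, which is exactly the two-way ambiguity this section describes.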
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well<br />
defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has a pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point where the pointing device asked to locate the caret. Implementations <br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
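
The rounding rule above can be sketched as follows. The function name `nearest_caret_location` and the representation of caret locations as a list of x coordinates are assumptions made for this illustration.

```python
def nearest_caret_location(click_x, caret_xs):
    """Return the index of the caret location closest to `click_x`.

    `caret_xs` holds the x coordinate of every caret location on the
    clicked line, in visual (left-to-right) order. Ties resolve to the
    first match, which is acceptable because the rounding direction
    is not significant.
    """
    if not caret_xs:
        raise ValueError("the line has no caret locations")
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - click_x))
```

The result is an index into the visual caret locations, consistent with this being a visual operation; no logical buffer position is computed here.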
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
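
The four line-boundary cases described above can be summarized in a short sketch. The names `Key` and `line_boundary_move` are hypothetical; the function only encodes which line and which edge the caret SHOULD land on.

```python
from enum import Enum


class Key(Enum):
    LEFT = "left"
    RIGHT = "right"


def line_boundary_move(key, paragraph_is_rtl):
    """Return (line_delta, target_edge) for an arrow pressed at the line edge.

    line_delta is -1 for the line above and +1 for the line below.
    target_edge is the visual edge of that line where the caret lands.
    """
    if key is Key.LEFT:
        # Leaving the leftmost edge: land on the rightmost edge.
        return (+1 if paragraph_is_rtl else -1, "rightmost")
    # Leaving the rightmost edge: land on the leftmost edge.
    return (-1 if paragraph_is_rtl else +1, "leftmost")
```

For example, `line_boundary_move(Key.LEFT, paragraph_is_rtl=False)` yields `(-1, "rightmost")`, the LTR case where the caret moves to the right end of the line above.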
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
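
As a sketch of the mapping just described, the following hypothetical function translates a command-arrow press into the equivalent "Home" or "End" operation. The function name and the string values are assumptions for illustration only.

```python
def command_arrow_to_operation(arrow, paragraph_is_rtl):
    """Map command-left / command-right to the logical 'home' or 'end'."""
    if arrow not in ("left", "right"):
        raise ValueError("arrow must be 'left' or 'right'")
    # The left arrow points toward the start of the line only in LTR paragraphs.
    toward_start = (arrow == "left") != paragraph_is_rtl
    return "home" if toward_start else "end"
```

Once the key is mapped, the caret placement follows the logical definitions of "Home" and "End" given above.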
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above. If that is the case, the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Linear Selection=<br />
Selection is a process of highlighting part of the text. Selection is not, in itself, an operation. Rather, selection is a prerequisite step for other operations, such as deleting a section of text, or copying text into the clipboard.<br />
<br />
This standard only deals with the most basic form of selection there is, which is the linear selection. A linear selection is defined by a start and an end position, selecting all text in between the start and end position. Certain editors also support other forms of selections (most notable are the multi-linear and the rectangular selections). These are not covered by this standard.<br />
<br />
'''Linear Selection''' is a contiguous subsection of the logical buffer. The selection area MUST be continuous in the logical buffer. Of course, due to mixed directional runs, this means that the selection's highlight might be non-continuous. Editors SHOULD NOT constrain the number of visually continuous sequences a selection highlight might take, as real life is almost guaranteed to surprise you. If an upper bound must be used, then the following formula should be used:<br />
* The first and last line of the selection, each, might contain as many visual highlight blocks as there are BiDi levels. The UBA limits the number of allowed BiDi levels to 64, and this is the number that SHOULD be used for each line (first and last).<br />
* If the selection only spans one line, that single line might have as many as twice that number of highlight blocks, less one (i.e. a maximum of 127 highlight blocks).<br />
* If the selection spans more than two lines, all lines except the first and last will be completely and continuously highlighted.<br />
* Implementations SHOULD assume that the entire 64 levels might be used. In any case, implementations MUST NOT assume that fewer than three BiDi levels are used, as that is the number of BiDi levels that can be reached with no BiDi control characters at all. This allows for up to five highlight blocks in a single line.<br />
<br />
==Select Begin and End Points==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e. whether the caret that results from performing it is a visual or a logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=45Standard2010-08-14T09:13:35Z<p>Shachar: Create an (empty) "selection" section</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Different implementations differ in parameters, such as whether it is allowed for the caret<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry BiDi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
what is the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way. Only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well<br />
defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has a pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point where the pointing device asked to locate the caret. Implementations <br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When a left arrow is pressed while the caret is already at the left most character of the line, the caret SHOULD move to the right most side of the line above or below, depending on the paragraph direction. If the paragraph is an LTR paragraph, pressing the left arrow key while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line above. Likewise, if the paragraph is an RTL paragraph, pressing the left arrow while the caret is on the left most character of the line SHOULD move the caret to the right most character of the line below the current line.<br />
<br />
The situation for right arrow on right most character is the reverse of the above. On LTR paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line below the current line. On RTL paragraphs, pressing the right arrow while the caret is on the right most character of the line SHOULD move the caret to the left most character of the line above the current line. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives'''''<br />
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. In that case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Selection=<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry gives the section in which the operation is defined and whether it is a logical or a visual operation (i.e., whether the caret left after performing it is a logical or a visual caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=44Standard2010-08-14T08:59:57Z<p>Shachar: Page Up and Down Keys</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document sets out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with Bidi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Implementations differ in parameters such as whether the caret is allowed<br />
to reside outside the buffer space, how they behave when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the Bidi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry Bidi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way; only that, if it does move, it move in the way defined here.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
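<br />
As an illustration of the plain-text export case described above, the following minimal sketch wraps a run of text in an RLE/PDF pair. The function name <code>embed_rtl</code> is hypothetical and not part of this standard; the code points are those assigned by Unicode to RIGHT-TO-LEFT EMBEDDING and POP DIRECTIONAL FORMATTING.<br />

```python
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def embed_rtl(text: str) -> str:
    """Wrap text in an RLE/PDF pair so it is treated as an RTL embedded sequence.

    Hypothetical helper for illustration only: a word processor MAY use any
    other internal mechanism, as long as the end result is equivalent.
    """
    return f"{RLE}{text}{PDF}"
```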
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
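<br />
The parity rule for embedding levels and the correspondence between base direction and paragraph embedding level, both given in the definitions above, can be sketched as follows. The function names are assumptions made for this illustration, not terms of the standard.<br />

```python
def direction_of_level(level: int) -> str:
    """Even levels correspond to LTR text, odd levels to RTL text."""
    if level < 0:
        raise ValueError("embedding levels are non-negative")
    return "LTR" if level % 2 == 0 else "RTL"


def paragraph_embedding_level(base_direction: str) -> int:
    """0 if the main language of the paragraph is LTR, 1 if it is RTL."""
    levels = {"LTR": 0, "RTL": 1}
    if base_direction not in levels:
        raise ValueError("base direction must be 'LTR' or 'RTL'")
    return levels[base_direction]
```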
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
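<br />
One possible way to perform the rounding described in this section is sketched below. It assumes the implementation already knows the horizontal pixel coordinate of every caret location on the clicked line, sorted from left to right. The function name and the list representation are assumptions of this sketch. An exact tie is resolved to the left here only because some choice has to be made; the standard does not care about the direction.<br />

```python
from bisect import bisect_left


def nearest_caret_location(boundaries: list[float], x: float) -> int:
    """Return the index of the caret location whose x coordinate is nearest to x.

    boundaries: sorted x coordinates of the caret locations on one line
    (hypothetical representation chosen for this sketch).
    """
    if not boundaries:
        raise ValueError("a line has at least one caret location")
    i = bisect_left(boundaries, x)
    if i == 0:
        return 0  # click at or left of the first location
    if i == len(boundaries):
        return len(boundaries) - 1  # click right of the last location
    left, right = boundaries[i - 1], boundaries[i]
    # Pick the closer neighbour; an exact tie goes to the left one.
    return i - 1 if x - left <= right - x else i
```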
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When the left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost character of the line above or below, depending on the paragraph direction. In an LTR paragraph, the caret SHOULD move to the rightmost character of the line above. In an RTL paragraph, it SHOULD move to the rightmost character of the line below.<br />
<br />
The right arrow on the rightmost character is the mirror case. In an LTR paragraph, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below. In an RTL paragraph, it SHOULD move the caret to the leftmost character of the line above. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives.'''''<br />
<br />
This is a visual operation.<br />
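<br />
The four line-boundary cases in this section reduce to a small decision rule, sketched below. The function name and the return convention (a line offset of -1 for the line above and +1 for the line below, plus the edge on which the caret lands) are assumptions of this illustration.<br />

```python
def wrap_at_line_edge(key: str, paragraph_direction: str) -> tuple[int, str]:
    """Decide where an arrow key pressed at the edge of a line takes the caret.

    Returns (line_delta, landing_edge): line_delta is -1 for the line above
    and +1 for the line below; landing_edge is the edge of that line on
    which the caret lands.
    """
    if key not in ("left", "right"):
        raise ValueError("key must be 'left' or 'right'")
    if paragraph_direction not in ("LTR", "RTL"):
        raise ValueError("paragraph direction must be 'LTR' or 'RTL'")
    # Moving with the paragraph direction continues on the line below;
    # moving against it goes back to the line above.
    with_direction = (key == "right") == (paragraph_direction == "LTR")
    line_delta = 1 if with_direction else -1
    # The caret lands on the edge opposite to the one it left.
    landing_edge = "rightmost" if key == "left" else "leftmost"
    return line_delta, landing_edge
```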
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
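<br />
The mapping between the command-arrow combinations and the Home/End behavior described in this section can be sketched as below. The function name is hypothetical, and the mapping itself is still the subject of an open issue.<br />

```python
def command_arrow_to_line_operation(arrow: str, paragraph_direction: str) -> str:
    """Map command-left/command-right to the logical "Home" or "End" operation."""
    if arrow not in ("left", "right"):
        raise ValueError("arrow must be 'left' or 'right'")
    if paragraph_direction not in ("LTR", "RTL"):
        raise ValueError("paragraph direction must be 'LTR' or 'RTL'")
    if paragraph_direction == "LTR":
        return "Home" if arrow == "left" else "End"
    # In an RTL paragraph the mapping is mirrored.
    return "End" if arrow == "left" else "Home"
```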
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. In that case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above, and the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
The page up and page down operations SHOULD behave exactly like repeated up and down operations respectively.<br />
<br />
See the previous section to determine whether this is a visual or logical operation.<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry gives the section in which the operation is defined and whether it is a logical or a visual operation (i.e., whether the caret left after performing it is a logical or a visual caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=43Standard2010-08-14T08:55:56Z<p>Shachar: Left and Right Arrow Keys</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document sets out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with Bidi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Implementations differ in parameters such as whether the caret is allowed<br />
to reside outside the buffer space, how they behave when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the Bidi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry Bidi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way; only that, if it does move, it move in the way defined here.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories for editing bidirectional text. One type is a text editor, and another is a word processor. The main difference, for our purposes, between the two is that a text editor produces pure text output, to be displayed strictly according to the Unicode Bidi algorithm, while a word processor is allowed to format the text in more liberal fashion. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, implementors MUST carry out the standard so that the end result is as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is, typically, done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
Pressing the left key MUST move the caret one character to the left. Pressing the right key MUST move the caret one character to the right.<br />
<br />
When the left arrow is pressed while the caret is already at the leftmost character of the line, the caret SHOULD move to the rightmost character of the line above or below, depending on the paragraph direction. In an LTR paragraph, the caret SHOULD move to the rightmost character of the line above. In an RTL paragraph, it SHOULD move to the rightmost character of the line below.<br />
<br />
The right arrow on the rightmost character is the mirror case. In an LTR paragraph, pressing the right arrow while the caret is on the rightmost character of the line SHOULD move the caret to the leftmost character of the line below. In an RTL paragraph, it SHOULD move the caret to the leftmost character of the line above. '''''See [[Talk:Standard#Left.2FRight_arrows_on_line_boundary|open issues]] for discussion of alternatives.'''''<br />
<br />
This is a visual operation.<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except that the mapping of<br />
command-right/left to Home/End depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
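<br />
As an illustration of the modifier mapping described above, the following Python sketch resolves a command-arrow combination to the logical "Home" or "End" behavior. The function name and the key strings are assumptions made for the example; they do not correspond to any platform API.<br />
<br />
```python
def resolve_command_arrow(key, paragraph_is_rtl):
    """Map "command-left"/"command-right" to "Home" or "End".

    Hypothetical helper: in an LTR paragraph command-left means Home, in an
    RTL paragraph it means End, and command-right is the opposite.
    """
    if key not in ("command-left", "command-right"):
        raise ValueError("unsupported key: " + key)
    # The key points at the logical start of the line when its visual
    # direction matches the side on which the paragraph begins.
    points_at_line_start = (key == "command-left") != paragraph_is_rtl
    return "Home" if points_at_line_start else "End"
```
<br />
Once the combination is resolved, the caret is placed exactly as for the dedicated "Home" and "End" keys, so the result is a logical caret on every platform.<br />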
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. Implementations that do so MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above; in that case the operation is logical and MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
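<br />
The distinction between the normal case and the edge case can be sketched as follows. This Python fragment is an illustration under assumptions: the function name, the zero-based line index, the <code>jump_at_edge</code> flag (modelling implementations that choose to jump to the buffer boundary) and the returned tuples are all invented for the example.<br />
<br />
```python
def vertical_move(line_index, line_count, key, jump_at_edge=False):
    """Classify an up/down key press.

    Returns ("visual", target_line) for an ordinary one-line move,
    ("logical", "start") or ("logical", "end") when an implementation that
    opts in jumps to the buffer boundary, or None when the caret stays put.
    """
    if key not in ("up", "down"):
        raise ValueError("key must be 'up' or 'down'")
    target = line_index - 1 if key == "up" else line_index + 1
    if 0 <= target < line_count:
        # Normal case: the horizontal position on the target line is then
        # chosen as for a pointing device, so the caret is visual.
        return ("visual", target)
    if jump_at_edge:
        # Edge case: same logic as start/end of document, hence logical.
        return ("logical", "start" if key == "up" else "end")
    return None
```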
<br />
==Page Up and Down Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. For each operation, it gives the section in which the operation is defined and whether it is a logical or visual operation (i.e., whether the caret after the operation is a logical or a visual caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=42Standard2010-08-14T07:56:04Z<p>Shachar: Text editors vs. word processors</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document sets out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with Bidi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Implementations differ in parameters such as whether the caret is allowed<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the Bidi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry Bidi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way; only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
==Text Editing vs. Word Processing==<br />
There are two broad categories of tools for editing bidirectional text: text editors and word processors. For our purposes, the main difference between the two is that a text editor produces pure text output, to be displayed strictly according to the UBA, while a word processor is allowed to format the text more liberally. Also, a text editor is expected to output only and exactly what the user typed, while a word processor is allowed to shape the text using the UBA's control characters, or in any other way.<br />
<br />
It should be noted that, while the standard does not mandate that word processors use the UBA to format text, use of the UBA is assumed in the text of this standard. Implementors SHOULD use the UBA to order the text, but if they do not, they MUST carry out the standard so that the end result is the same as if they had implemented the UBA. For example, when we state that a word processor, when copying text from an RTL paragraph into an LTR paragraph, must embed that text within an RLE/PDF sequence, the word processor MAY use any mechanism other than an RLE/PDF pair, but still MUST make sure that the pasted text is treated as an RTL embedded sequence. Also, when exporting a pure text representation of the output, placing an RLE/PDF pair around the text is the only way to ensure it will be displayed correctly by text editors and other output means.<br />
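<br />
For the pure text export case, wrapping text in an RLE/PDF pair can be sketched in Python as below. The code points are those assigned by Unicode (U+202B RIGHT-TO-LEFT EMBEDDING and U+202C POP DIRECTIONAL FORMATTING); the function name is an assumption made for the example, and a word processor remains free to use another internal mechanism with the same visible result.<br />
<br />
```python
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def embed_as_rtl(text):
    """Wrap text so that the UBA treats it as one RTL embedded sequence.

    Hypothetical helper for exporting text that was copied from an RTL
    paragraph into an LTR paragraph.
    """
    return RLE + text + PDF
```
<br />
A text editor, by contrast, is expected to output only what the user typed, so it would not add these control characters on its own initiative.<br />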
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the caret is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
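<br />
The parity rule for embedding levels, and the correspondence between base direction and paragraph embedding level, can be written down directly. The Python sketch below is illustrative only; the function names and the "LTR"/"RTL" strings are assumptions made for the example.<br />
<br />
```python
def direction_of_level(level):
    """Even embedding levels are LTR, odd embedding levels are RTL."""
    if level < 0:
        raise ValueError("embedding levels are non-negative")
    return "RTL" if level % 2 else "LTR"


def paragraph_embedding_level(base_direction):
    """Return 0 for an LTR base direction and 1 for an RTL base direction."""
    levels = {"LTR": 0, "RTL": 1}
    return levels[base_direction]
```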
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
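<br />
A minimal Python sketch of the rounding rule follows. It assumes that the implementation already knows the horizontal coordinate of every caret location on the clicked line; the function name and the list representation are assumptions made for the example.<br />
<br />
```python
def nearest_caret_location(click_x, caret_xs):
    """Return the index of the caret location closest to click_x.

    caret_xs holds the x coordinate of each candidate caret location on the
    clicked line, in visual order. Only the distance matters, not whether
    the rounding goes to the left or to the right.
    """
    if not caret_xs:
        raise ValueError("the line has no caret locations")
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - click_x))
```
<br />
When the click is exactly halfway between two locations, this sketch happens to pick the one that comes first in the list; since the rounding direction is not significant, any other tie-break would serve equally well.<br />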
<br />
==Left and Right Arrow Keys==<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except that the mapping of<br />
command-right/left to Home/End depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. Implementations that do so MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above; in that case the operation is logical and MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. For each operation, it gives the section in which the operation is defined and whether it is a logical or visual operation (i.e., whether the caret after the operation is a logical or a visual caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=41Standard2010-08-14T07:39:46Z<p>Shachar: minor formatting fix</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document sets out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with Bidi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Implementations differ in parameters such as whether the caret is allowed<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the Bidi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry Bidi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way; only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the caret is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations<br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except that the mapping of<br />
command-right/left to Home/End depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. Implementations that do so MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above; in that case the operation is logical and MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. For each operation, it gives the section in which the operation is defined and whether it is a logical or visual operation (i.e., whether the caret after the operation is a logical or a visual caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=40Standard2010-07-24T07:10:57Z<p>Shachar: Language cleanup and RFC 2119 adaptations</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document sets out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with Bidi aspects of text editing. Every attempt has been made to make sure that it does not impose<br />
one editor model on implementors. Implementations differ in parameters such as whether the caret is allowed<br />
to reside outside the buffer space, behavior when moving past the buffer, and many others. It is not<br />
the intent of this standard to impose homogeneous behavior, but rather to resolve the Bidi aspects of text editing.<br />
Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences carry Bidi implications. For example, some implementations move the caret to the beginning<br />
of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define<br />
the correct way to perform this caret move. This is not to say that the standard mandates, or even recommends,<br />
that the caret move in such a way; only that, should it move, it should move in a certain way.<br />
<br />
==Notational Conventions==<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor is not to be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
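
The parity rule for embedding levels and the correspondence between paragraph embedding level and base direction can be sketched as follows. This is a minimal illustration only. The function names <code>direction_of_level</code> and <code>paragraph_embedding_level</code> are hypothetical and are not defined by this standard or by the UBA.

```python
def direction_of_level(level: int) -> str:
    """Return the direction implied by a Bidi embedding level.

    Even levels correspond to LTR text, odd levels to RTL text.
    """
    if level < 0:
        raise ValueError("embedding levels are non-negative")
    return "LTR" if level % 2 == 0 else "RTL"


def paragraph_embedding_level(base_direction: str) -> int:
    """Return the paragraph embedding level for a base direction.

    The mapping is one-to-one: LTR paragraphs are level 0, RTL are level 1.
    """
    if base_direction == "LTR":
        return 0
    if base_direction == "RTL":
        return 1
    raise ValueError("base direction must be 'LTR' or 'RTL'")
```

For instance, LTR text embedded in an RTL paragraph sits at level 2, which the first function reports as LTR.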
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next action is going to take place. Common implementations use<br />
two types of carets. One type, which in modern UIs is usually referred to as "insert mode", is represented as a vertical<br />
line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative<br />
of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We<br />
shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display does need some actual width in order to draw the caret. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two<br />
is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it<br />
often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions.<br />
Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the<br />
previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two<br />
positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the<br />
logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is<br />
affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and<br />
well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the<br />
logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a<br />
visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position<br />
or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and<br />
the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual<br />
operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply<br />
a logical caret. For example, the sentence "Pressing the END key MUST bring the caret to the last character of the line"<br />
means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on<br />
the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key SHOULD move the caret one character<br />
to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text<br />
area. Since a pointing device usually has pixel accuracy, some rounding will be performed. Implementations SHOULD select<br />
the caret location that is nearest to the point indicated by the pointing device. Implementations <br />
SHOULD NOT care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
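
The rounding rule above can be sketched as follows. This sketch assumes that the implementation has already computed the horizontal coordinates of all caret locations on the clicked line. The function name <code>nearest_caret_location</code> and its parameters are hypothetical and are not part of this standard.

```python
def nearest_caret_location(locations_x, pointer_x):
    """Return the index of the caret location nearest to pointer_x.

    locations_x: x-coordinates (in pixels) of the candidate caret
                 locations on the line, in visual left-to-right order.
    pointer_x:   x-coordinate reported by the pointing device.

    Only the distance matters. The direction of the rounding (left or
    right) is deliberately not taken into account.
    """
    if not locations_x:
        raise ValueError("a line has at least one caret location")
    return min(range(len(locations_x)),
               key=lambda i: abs(locations_x[i] - pointer_x))
```

The result is a visual position, which matches the classification of this operation as visual. When the pointer is exactly halfway between two locations, this sketch happens to pick the leftmost one. The standard does not require either choice.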
<br />
==Left and Right Arrow Keys==<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it<br />
at the end of the same line. The end of the line means positioning the caret right after the last character in the logical<br />
buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before<br />
the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys<br />
(command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between<br />
command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left<br />
translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left,<br />
command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
In either case, this is a logical operation.<br />
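
The mapping from command-left/right to Home/End described above can be sketched as follows. The function name <code>resolve_command_arrow</code> and the string values it uses are hypothetical, chosen only for illustration.

```python
def resolve_command_arrow(arrow: str, paragraph_level: int) -> str:
    """Map command-left/right to "Home" or "End".

    arrow:           "left" or "right".
    paragraph_level: paragraph embedding level (0 for LTR, 1 for RTL).
    """
    if arrow not in ("left", "right"):
        raise ValueError("arrow must be 'left' or 'right'")
    if paragraph_level not in (0, 1):
        raise ValueError("paragraph embedding level must be 0 or 1")

    paragraph_is_rtl = paragraph_level == 1
    if arrow == "left":
        # The left edge is the logical start of an LTR line,
        # and the logical end of an RTL line.
        return "End" if paragraph_is_rtl else "Home"
    return "Home" if paragraph_is_rtl else "End"
```

Once the key is resolved, the caret is placed by the same logical rule as on other platforms: before the first, or after the last, character of the logical buffer that is displayed on the current line.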
<br />
==Start/End of Paragraph/Document==<br />
Some implementations have a key combination for moving the caret to the beginning/end of the current paragraph, or the<br />
beginning/end of the entire document.<br />
<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys MUST move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, some implementations choose to move it to the beginning/end<br />
of the text buffer. If this is the case, implementations MUST use the same logic as in the "Start/End of Paragraph/Document"<br />
section above. In that case, the operation MUST NOT be visual.<br />
<br />
In the normal case, this is a visual operation.<br />
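
The distinction between the normal case and the edge case can be sketched as follows. This is only an illustration under simplifying assumptions: lines are numbered from 0, and the optional jump to the start or end of the buffer is controlled by a flag. The function name <code>vertical_move</code>, its parameters and its return values are hypothetical.

```python
def vertical_move(line: int, line_count: int, step: int, jump_at_edge: bool):
    """Classify the result of pressing "up" (step=-1) or "down" (step=+1).

    Returns a (kind, target) pair:
      ("visual", n)        move to line n, keeping the horizontal position
      ("logical", "start") jump to the beginning of the text buffer
      ("logical", "end")   jump to the end of the text buffer
      ("none", line)       the caret stays where it is
    """
    if step not in (-1, 1):
        raise ValueError("step must be -1 (up) or +1 (down)")

    target = line + step
    if 0 <= target < line_count:
        # Normal case: a visual operation.
        return ("visual", target)
    if jump_at_edge:
        # Edge case: a logical operation, as for start/end of document.
        return ("logical", "start" if step < 0 else "end")
    return ("none", line)
```

Whether an implementation sets the equivalent of <code>jump_at_edge</code> is its own choice. The standard only constrains how the jump is performed when it happens.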
<br />
==Page Up and Down Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e., whether the caret after the operation is a logical or a visual caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=39Standard2010-07-24T06:50:21Z<p>Shachar: Add RFC2119 notational conventions</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with the Bidi aspects of text editing. Every attempt has been made to ensure that it does not impose one editor model on implementors. Implementations differ in all sorts of parameters, such as whether the caret is allowed to reside outside the buffer space, how they behave when moving past the buffer, and many others. It is not the intent of this standard to impose homogeneous behavior, but rather to resolve the Bidi aspects of text editing. Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences do have Bidi implications. For example, some implementations move the caret to the beginning of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define what should happen then. This is not to say that the standard mandates, or even recommends, that the caret move in such a way, only that if it moves, it should move in a certain way.<br />
<br />
=Notational Conventions=<br />
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.<br />
<br />
=Terms and Definitions=<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location that is nearest to the point indicated by the pointing device. Implementations should not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it at the end of the same line. The end of the line means positioning the caret right after the last character in the logical buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier over the arrow keys (command-right and command-left). On such platforms, the meaning is the same as above, except the mapping between command-right/left to home/end depends on the paragraph direction. If the paragraph is left to right, command-left translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left, command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
This is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, the implementation may choose to move it to the beginning/end of the text buffer. If it so chooses, it must use the same logic as in the "Start/End of Paragraph/Document" section above. In that case, the operation is not visual.<br />
<br />
This is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each operation lists the section in which it is defined, as well as whether it is a logical or visual operation (i.e., whether the caret after the operation is a logical or a visual caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=38Standard2010-06-26T10:14:19Z<p>Shachar: Terminology fix</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with the Bidi aspects of text editing. Every attempt has been made to ensure that it does not impose one editor model on implementors. Implementations differ in all sorts of parameters, such as whether the caret is allowed to reside outside the buffer space, how they behave when moving past the buffer, and many others. It is not the intent of this standard to impose homogeneous behavior, but rather to resolve the Bidi aspects of text editing. Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences do have Bidi implications. For example, some implementations move the caret to the beginning of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define what should happen then. This is not to say that the standard mandates, or even recommends, that the caret move in such a way, only that if it moves, it should move in a certain way.<br />
<br />
=Terms and Definitions=<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will either be a visual caret or a logical caret. In other words, the operation will either leave a well defined logical caret position or a visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret following pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret following pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the logical buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location that is nearest to the point indicated by the pointing device. Implementations should not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it at the end of the same line. The end of the line means positioning the caret right after the last character in the logical buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier with the arrow keys (command-right and command-left). On such platforms, the meaning is the same as above, except that the mapping of command-right/left to Home/End depends on the paragraph direction. If the paragraph is left to right, command-left translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left, command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
This is a logical operation.<br />
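
The key mapping described above can be sketched as a small lookup. The function name and the key strings are hypothetical, chosen only for illustration; the paragraph embedding level is assumed to be 0 (LTR) or 1 (RTL).

```python
def resolve_line_key(key, paragraph_level):
    """Map a key press to the logical operation "home" or "end".

    key: one of "home", "end", "command-left", "command-right"
         (hypothetical names for the keys discussed above).
    paragraph_level: 0 for an LTR paragraph, 1 for an RTL paragraph.
    """
    if paragraph_level not in (0, 1):
        raise ValueError("the paragraph embedding level is 0 or 1")
    rtl = paragraph_level == 1
    if key in ("home", "end"):
        return key  # dedicated keys do not depend on direction
    if key == "command-left":
        return "end" if rtl else "home"
    if key == "command-right":
        return "home" if rtl else "end"
    raise ValueError("not a line-navigation key: %r" % (key,))
```

Whichever key was pressed, the resolved operation is logical: the caret ends up before the first or after the last character of the line in the logical buffer.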
<br />
==Start/End of Paragraph/Document==<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, the implementation may choose to move it to the beginning/end of the text buffer. If it so chooses, it must follow the behavior defined in the "Start/End of Paragraph/Document" section above. In that case, the operation is a logical operation rather than a visual one.<br />
<br />
This is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry gives the section in which the operation is defined, as well as whether it is a logical or visual operation (i.e. whether the caret after performing it is a logical or a visual caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=37Standard2010-06-26T09:57:23Z<p>Shachar: Remove instructions on how to implement the caret movement - out of scope</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose one editor model on implementors. Implementations differ in all sorts of parameters, such as whether the caret may reside outside the buffer space, the behavior when moving past the buffer, and many others. It is not the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing. Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences do have BiDi implications. For example, some implementations move the caret to the beginning of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define what should happen then. This is not to say that the standard mandates, or even recommends, that the caret move in such a way. Only that if it moves, it should move in a certain way.<br />
<br />
=Terms and Definitions=<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
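
The parity rule from the "Bidi Embedding Levels" entry, and the correspondence between the paragraph embedding level and the base direction, can be written down directly. The function names below are hypothetical; the sketch only restates the definitions and does not compute levels from text (that is the job of the UBA).

```python
def direction_of_level(level):
    """Even levels correspond to LTR text, odd levels to RTL text."""
    if level < 0:
        raise ValueError("embedding levels are non-negative")
    return "RTL" if level % 2 else "LTR"


def base_direction(paragraph_level):
    """Base direction of a paragraph from its embedding level (0 or 1)."""
    if paragraph_level not in (0, 1):
        raise ValueError("the paragraph embedding level is 0 or 1")
    return direction_of_level(paragraph_level)
```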
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
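
The ambiguity described above can be made concrete with a simplified model. The sketch assumes that the embedding level of each character on a line is already known (it does not run the UBA's level resolution), that every character is a single glyph, and that visual positions are counted in glyph boundaries from the left edge of the line. The function names are hypothetical.

```python
def visual_order(levels):
    """Logical indexes in left-to-right display order (UBA rule L2)."""
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    lowest_odd = min((l for l in levels if l % 2), default=highest + 1)
    # From the highest level down to the lowest odd level, reverse every
    # contiguous run of characters at that level or higher.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def visual_candidates(levels, p):
    """Visual positions that logical caret position p may refer to."""
    order = visual_order(levels)
    where = {logical: visual for visual, logical in enumerate(order)}
    found = set()
    if p > 0:
        # Interpreted as "after the previous character".
        v, rtl = where[p - 1], levels[p - 1] % 2 == 1
        found.add(v if rtl else v + 1)
    if p < len(levels):
        # Interpreted as "before the next character".
        v, rtl = where[p], levels[p] % 2 == 1
        found.add(v + 1 if rtl else v)
    return sorted(found)
```

For a line with levels <code>[0, 0, 1, 1, 1]</code> (two LTR characters followed by three RTL characters in an LTR paragraph), logical position 2 yields the two visual positions 2 and 5, depending on whether the caret is read as following the LTR character or preceding the RTL one. Conversely, visual position 2 is produced by both logical position 2 and logical position 5, which is the second kind of ambiguity mentioned above.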
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost every operation affects the caret. After each operation the caret is either a visual caret or a logical caret. In other words, the operation leaves either a well-defined logical caret position or a well-defined visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret after pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret after pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the text buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location that is nearest to the point indicated by the pointing device. Implementations need not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it at the end of the same line. The end of the line means positioning the caret right after the last character in the logical buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier with the arrow keys (command-right and command-left). On such platforms, the meaning is the same as above, except that the mapping of command-right/left to Home/End depends on the paragraph direction. If the paragraph is left to right, command-left translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left, command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
This is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
If the caret is already on the first/last line of the text buffer, the implementation may choose to move it to the beginning/end of the text buffer. If it so chooses, it must follow the behavior defined in the "Start/End of Paragraph/Document" section above. In that case, the operation is a logical operation rather than a visual one.<br />
<br />
This is a visual operation.<br />
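
How the kind of operation depends on the caret's line can be sketched for the "up" key ("down" is symmetric). The function name, the flag <code>jump_on_first_line</code>, and the use of <code>None</code> for "the caret is left as it was" are assumptions made for this illustration; the standard itself only states what must happen if the implementation chooses to move the caret.

```python
def press_up(line, jump_on_first_line):
    """Return (new_line, kind) after pressing "up" on the given line.

    line: zero-based index of the line holding the caret.
    jump_on_first_line: hypothetical implementation option; when True,
        "up" on the first line moves the caret to the start of the buffer.
    kind: "visual", "logical", or None when the caret does not move.
    """
    if line < 0:
        raise ValueError("line index is non-negative")
    if line > 0:
        # Ordinary case: the horizontal location is chosen as for a
        # pointing device, so the resulting caret is a visual caret.
        return line - 1, "visual"
    if jump_on_first_line:
        # Same behavior as moving to the start of the document,
        # which is a logical operation.
        return 0, "logical"
    return 0, None
```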
<br />
==Page Up and Down Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry gives the section in which the operation is defined, as well as whether it is a logical or visual operation (i.e. whether the caret after performing it is a logical or a visual caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=36Standard2010-06-26T09:54:27Z<p>Shachar: Home and End Keys, start/end paragraph</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose one editor model on implementors. Implementations differ in all sorts of parameters, such as whether the caret may reside outside the buffer space, the behavior when moving past the buffer, and many others. It is not the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing. Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences do have BiDi implications. For example, some implementations move the caret to the beginning of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define what should happen then. This is not to say that the standard mandates, or even recommends, that the caret move in such a way. Only that if it moves, it should move in a certain way.<br />
<br />
=Terms and Definitions=<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, display does need some actual width in order for the caret to display. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost every operation affects the caret. After each operation the caret is either a visual caret or a logical caret. In other words, the operation leaves either a well-defined logical caret position or a well-defined visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret after pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret after pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the text buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location that is nearest to the point indicated by the pointing device. Implementations need not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
<br />
==Home and End Keys==<br />
On most platforms, the "Home" key places the caret at the beginning of the line it is currently on, and "End" places it at the end of the same line. The end of the line means positioning the caret right after the last character in the logical buffer that is displayed on the current line. Likewise, the beginning of the line means positioning the caret right before the first character in the logical buffer that is displayed on the current line.<br />
<br />
On some platforms, most notably Apple Macintosh, the same functionality is achieved using a modifier with the arrow keys (command-right and command-left). On such platforms, the meaning is the same as above, except that the mapping of command-right/left to Home/End depends on the paragraph direction. If the paragraph is left to right, command-left translates to the behavior discussed above under "Home", and command-right to "End". If the paragraph is right to left, command-left translates to "End" and command-right to "Home". '''''See [[Talk:Standard#Home.2FEnd_on_Mac|open issues]]'''''<br />
<br />
This is a logical operation.<br />
<br />
==Start/End of Paragraph/Document==<br />
Moving to the beginning of the paragraph means positioning the caret right before the first character of the paragraph. Likewise, moving to the beginning of the document means positioning the caret right before the first character of the document.<br />
<br />
In a similar way, moving to the end of the paragraph/document means positioning the caret right after the last character of the paragraph/document.<br />
<br />
This is a logical operation.<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
Implementations may use the pixel location of the caret in the current line as the point by which the new caret location is calculated.<br />
<br />
If the caret is already on the first/last line of the text buffer, the caret should not move at all.<br />
<br />
This is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry gives the section in which the operation is defined, as well as whether it is a logical or visual operation (i.e. whether the caret after performing it is a logical or a visual caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=35Standard2010-06-26T09:29:09Z<p>Shachar: Reorder the sub-sections to reduce forward references</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose one editor model on implementors. Implementations differ in all sorts of parameters, such as whether the caret may reside outside the buffer space, the behavior when moving past the buffer, and many others. It is not the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing. Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences do have BiDi implications. For example, some implementations move the caret to the beginning of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define what should happen then. This is not to say that the standard mandates, or even recommends, that the caret move in such a way. Only that if it moves, it should move in a certain way.<br />
<br />
=Terms and Definitions=<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
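The parity rule in the "Bidi Embedding Levels" and "Paragraph Embedding Level" definitions above can be sketched in a few lines. This is only an illustration: the function names <code>direction_of_level</code> and <code>paragraph_embedding_level</code> are hypothetical and are not part of this standard or of the UBA.<br />

```python
def direction_of_level(level: int) -> str:
    """Map a Bidi embedding level to its direction: even is LTR, odd is RTL."""
    if level < 0:
        raise ValueError("embedding levels are non-negative")
    return "LTR" if level % 2 == 0 else "RTL"


def paragraph_embedding_level(base_direction: str) -> int:
    """Map a paragraph's base direction to its paragraph embedding level."""
    levels = {"LTR": 0, "RTL": 1}
    if base_direction not in levels:
        raise ValueError("base direction must be 'LTR' or 'RTL'")
    return levels[base_direction]


# An LTR paragraph containing RTL text, which in turn contains LTR text.
for level in (0, 1, 2):
    print(level, direction_of_level(level))
```

The two functions are inverses of each other for levels 0 and 1, which is the one-to-one correspondence mentioned in the note above.<br />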
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display needs some actual width in order to draw the caret. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is determined by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation, the caret will be either a visual caret or a logical caret. In other words, the operation will leave either a well-defined logical caret position or a well-defined visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret after pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret after pressing "LEFT" is a visual caret.<br />
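The following sketch illustrates how a single position in the logical buffer can correspond to two visual positions. It assumes that the embedding levels have already been resolved by the UBA, and it reorders a single line using UBA rule L2 (reverse every contiguous run at a given level or higher, from the highest level down to the lowest odd level). The function names and the convention of numbering visual caret locations 0..n from left to right are assumptions made for this illustration only.<br />

```python
from typing import List, Set


def visual_order(levels: List[int]) -> List[int]:
    """Return logical indices in left-to-right display order (UBA rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return order  # pure LTR text is displayed in logical order
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])  # reverse this run
                i = j
            else:
                i += 1
    return order


def visual_locations(levels: List[int], logical_pos: int) -> Set[int]:
    """Visual caret locations that logical_pos may be interpreted as."""
    order = visual_order(levels)
    visual_index = {logical: v for v, logical in enumerate(order)}
    candidates = set()
    if logical_pos > 0:  # interpreted as "after the previous character"
        prev = logical_pos - 1
        right_side = levels[prev] % 2 == 0  # LTR: trailing edge is on the right
        candidates.add(visual_index[prev] + (1 if right_side else 0))
    if logical_pos < len(levels):  # interpreted as "before the next character"
        nxt = logical_pos
        right_side = levels[nxt] % 2 == 1  # RTL: leading edge is on the right
        candidates.add(visual_index[nxt] + (1 if right_side else 0))
    return candidates


# Three LTR characters followed by three RTL characters in an LTR paragraph.
levels = [0, 0, 0, 1, 1, 1]
print(visual_order(levels))         # [0, 1, 2, 5, 4, 3]
print(visual_locations(levels, 3))  # {3, 6}
print(visual_locations(levels, 1))  # {1}
```

Logical position 3 lies on the boundary between the LTR run and the RTL run, so "after the previous character" and "before the next character" land at different places on screen. Logical position 1 lies inside a single run, and both interpretations agree.<br />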
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the text buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location nearest to the point indicated by the pointing device. Implementations should not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
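A minimal sketch of the rounding described above, assuming the implementation already knows the horizontal pixel coordinate of every caret location on the clicked line. The function name <code>nearest_caret_location</code> and the list-of-coordinates representation are assumptions for illustration, not requirements of this standard.<br />

```python
from typing import Sequence


def nearest_caret_location(caret_xs: Sequence[float], pointer_x: float) -> int:
    """Return the index of the caret location closest to pointer_x.

    caret_xs holds the pixel x coordinate of each caret location on the line.
    On an exact tie either neighbour is acceptable; this sketch happens to
    return the one that appears first in caret_xs.
    """
    if not caret_xs:
        raise ValueError("a line has at least one caret location")
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - pointer_x))


# Caret locations for a line of three glyphs, each 10 pixels wide.
locations = [0, 10, 20, 30]
print(nearest_caret_location(locations, 13))  # 1
print(nearest_caret_location(locations, 27))  # 3
```

Because the result is chosen purely by on-screen distance, it identifies a visual caret location. Mapping it back to the logical buffer is a separate step.<br />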
<br />
==Left and Right Arrow Keys==<br />
<br />
==Home and End Keys==<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
Implementations may use the pixel location of the caret in the current line as the point by which the new caret location is calculated.<br />
<br />
If the caret is already on the first/last line of the text buffer, the caret should not move at all.<br />
<br />
This is a visual operation.<br />
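The up and down behavior can be sketched on top of the same nearest-location rounding. The sketch assumes each line is represented by the pixel coordinates of its caret locations, and that the caret's current pixel position is used as the reference point, which this section permits but does not require. All names are hypothetical.<br />

```python
from typing import Sequence, Tuple


def nearest_caret_location(caret_xs: Sequence[float], pointer_x: float) -> int:
    """Return the index of the caret location closest to pointer_x."""
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - pointer_x))


def move_vertically(
    lines: Sequence[Sequence[float]], line: int, location: int, delta: int
) -> Tuple[int, int]:
    """Move the caret one line up (delta=-1) or down (delta=+1).

    Returns the new (line, caret location index). The caret does not move
    if it is already on the first or last line.
    """
    target = line + delta
    if target < 0 or target >= len(lines):
        return line, location  # already on the first/last line: stay put
    caret_x = lines[line][location]  # pixel position on the current line
    return target, nearest_caret_location(lines[target], caret_x)


# Two lines whose glyphs have different widths.
lines = [[0, 10, 20, 30], [0, 8, 16, 24, 32]]
print(move_vertically(lines, 0, 1, +1))  # (1, 1)
print(move_vertically(lines, 1, 4, -1))  # (0, 3)
print(move_vertically(lines, 0, 1, -1))  # (0, 1)
```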
<br />
==Page Up and Down Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. For each operation, the table gives the section in which it is defined and whether it is a logical or a visual operation (i.e., whether the caret is a logical or a visual caret after the operation is performed).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=34Standard2010-06-26T09:26:18Z<p>Shachar: Highlight a todo so we do not forget it</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose one editor model on implementors. Implementations differ in many parameters, such as whether the caret may reside outside the buffer space, the behavior when moving past the buffer, and many others. It is not the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing. Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences do have BiDi implications. For example, some implementations move the caret to the beginning of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define what should happen then. This is not to say that the standard mandates, or even recommends, that the caret move in such a way. Only that if it moves, it should move in a certain way.<br />
<br />
=Terms and Definitions=<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display needs some actual width in order to draw the caret. '''''See [[Talk:Standard#Caret|open issues]] for discussion'''''.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is determined by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation, the caret will be either a visual caret or a logical caret. In other words, the operation will leave either a well-defined logical caret position or a well-defined visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret after pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret after pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the text buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location nearest to the point indicated by the pointing device. Implementations should not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
Implementations may use the pixel location of the caret in the current line as the point by which the new caret location is calculated.<br />
<br />
If the caret is already on the first/last line of the text buffer, the caret should not move at all.<br />
<br />
This is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
<br />
==Home and End Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. For each operation, the table gives the section in which it is defined and whether it is a logical or a visual operation (i.e., whether the caret is a logical or a visual caret after the operation is performed).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=33Standard2010-06-26T09:23:44Z<p>Shachar: Language corrections</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose one editor model on implementors. Implementations differ in many parameters, such as whether the caret may reside outside the buffer space, the behavior when moving past the buffer, and many others. It is not the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing. Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences do have BiDi implications. For example, some implementations move the caret to the beginning of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define what should happen then. This is not to say that the standard mandates, or even recommends, that the caret move in such a way. Only that if it moves, it should move in a certain way.<br />
<br />
=Terms and Definitions=<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display needs some actual width in order to draw the caret. See [[Talk:Standard#Caret|open issues]] for discussion.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is determined by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation, the caret will be either a visual caret or a logical caret. In other words, the operation will leave either a well-defined logical caret position or a well-defined visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret after pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret after pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the text buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location nearest to the point indicated by the pointing device. Implementations should not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
Implementations may use the pixel location of the caret in the current line as the point by which the new caret location is calculated.<br />
<br />
If the caret is already on the first/last line of the text buffer, the caret should not move at all.<br />
<br />
This is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
<br />
==Home and End Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. For each operation, the table gives the section in which it is defined and whether it is a logical or a visual operation (i.e., whether the caret is a logical or a visual caret after the operation is performed).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=32Standard2010-06-26T09:21:56Z<p>Shachar: Standard scope</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document intends to trace the guidelines of a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
==Scope==<br />
This standard deals with BiDi aspects of text editing. Every attempt has been made to make sure that it does not impose one editor model on implementors. Editors differ in many parameters, such as whether the caret may reside outside the buffer space, the behavior when moving past the buffer, and many others. It is not the intent of this standard to impose homogeneous behavior, but rather to resolve the BiDi aspects of text editing. Wherever possible, such differences are simply not mentioned in this standard.<br />
<br />
Sometimes, these differences do have BiDi implications. For example, some implementations move the caret to the beginning of the line when an up arrow is pressed while the caret is on the first line of the text buffer. This standard does define what should happen then. This is not to say that the standard mandates, or even recommends, that the caret move in such a way. Only that if it moves, it should move in a certain way.<br />
<br />
=Terms and Definitions=<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is assigned also to the cursor. This level reflects the Bidi level which is expected to be assigned to the next character entered (there are cases when the actual level of the entered character will be different). The level of the cursor is manipulated by UI functions, like changing the keyboard language. It may also be affected by all functions which change the position of the cursor.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display does need some actual width in order to draw the caret. See [[Talk:Standard#Caret|open issues]] for discussion.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will be either a visual caret or a logical caret. In other words, the operation will leave either a well-defined logical caret position or a well-defined visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret after pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret after pressing "LEFT" is a visual caret.<br />
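The ambiguity between logical and visual positions can be made concrete with a small sketch. It assumes that the Bidi levels of a single line have already been resolved, reorders the characters with the reversal rule of the UBA (rule L2), and then lists the visual caret locations that one logical position may refer to: the trailing edge of the previous character and the leading edge of the next one. The function names are hypothetical, and the sketch ignores line breaks and the paragraph boundary itself.

```python
def visual_order(levels):
    """Return logical indices in visual (left-to-right) order.

    Applies the UBA reversal rule: from the highest level down to the
    lowest odd level, reverse every contiguous run of characters whose
    level is at least that level.
    """
    order = list(range(len(levels)))
    odd_levels = [lvl for lvl in levels if lvl % 2 == 1]
    if not odd_levels:
        return order  # pure LTR text, nothing to reverse
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def visual_candidates(levels, pos):
    """Return the visual caret locations a logical position may refer to.

    Visual location v lies to the left of the glyph at visual index v.
    """
    order = visual_order(levels)
    visual_index = {logical: v for v, logical in enumerate(order)}
    candidates = set()
    if pos > 0:
        # "After the previous character": right edge if LTR, left edge if RTL.
        prev = pos - 1
        v = visual_index[prev]
        candidates.add(v + 1 if levels[prev] % 2 == 0 else v)
    if pos < len(levels):
        # "Before the next character": left edge if LTR, right edge if RTL.
        v = visual_index[pos]
        candidates.add(v if levels[pos] % 2 == 0 else v + 1)
    return sorted(candidates)
```

With levels `[0, 0, 0, 1, 1, 1]` (three LTR characters followed by three RTL characters), logical position 3 yields two visual locations, 3 and 6. Logical position 6 also yields visual location 3, so a single visual location can correspond to two logical positions.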
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the text buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location that is nearest to the point indicated by the pointing device. Implementations should not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
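The rounding rule above can be sketched as a nearest-neighbour search. The sketch assumes that the implementation can supply the pixel x coordinate of every caret location on the clicked line; the name `nearest_caret_location` and the list-of-coordinates representation are hypothetical.

```python
def nearest_caret_location(caret_xs, click_x):
    """Return the index of the caret location closest to click_x.

    caret_xs holds the pixel x coordinate of each caret location on the
    line. On an exact tie, this sketch returns the first candidate; the
    standard does not care which side wins.
    """
    if not caret_xs:
        raise ValueError("a line has at least one caret location")
    return min(range(len(caret_xs)), key=lambda i: abs(caret_xs[i] - click_x))
```

A click outside the text area simply resolves to the first or last caret location of the line, since those are the nearest ones.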
<br />
==Left and Right Arrow Keys==<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
Implementations may use the pixel location of the caret in the current line as the point by which the new caret location is calculated.<br />
<br />
If the caret is already on the first/last line of the text buffer, the caret should not move at all.<br />
<br />
This is a visual operation.<br />
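A minimal sketch of the up and down movement, under the same assumptions as for pointing-device positioning: each line is represented by the pixel x coordinates of its caret locations, and the caret's current pixel position is used as the reference point. The name `move_vertically` and the data layout are hypothetical.

```python
def move_vertically(lines, line, location, delta):
    """Move the caret one line up (delta=-1) or down (delta=+1).

    lines[k] is the list of caret-location x coordinates on line k.
    Returns the new (line, location) pair. If the caret is already on
    the first or last line, it does not move at all.
    """
    target = line + delta
    if target < 0 or target >= len(lines):
        return line, location
    caret_x = lines[line][location]
    candidates = lines[target]
    # Pick the caret location on the target line nearest to the current x.
    new_location = min(
        range(len(candidates)), key=lambda i: abs(candidates[i] - caret_x)
    )
    return target, new_location
```

Because the result is chosen by pixel position rather than by buffer offset, the caret after this operation is a visual caret.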
<br />
==Page Up and Down Keys==<br />
<br />
==Home and End Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry lists the section in which the operation is defined, as well as whether it is a logical or visual operation (i.e., whether the caret after performing it is a visual or a logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=31Standard2010-06-26T08:59:46Z<p>Shachar: What to do if nowhere to move to</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document lays out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
=Terms and Definitions=<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret. This level reflects the Bidi level that is expected to be assigned to the next character entered (there are cases where the actual level of the entered character will be different). The level of the caret is manipulated by UI functions, such as changing the keyboard language. It may also be affected by any function that changes the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display does need some actual width in order to draw the caret. See [[Talk:Standard#Caret|open issues]] for discussion.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will be either a visual caret or a logical caret. In other words, the operation will leave either a well-defined logical caret position or a well-defined visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret after pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret after pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the text buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location that is nearest to the point indicated by the pointing device. Implementations should not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
Implementations may use the pixel location of the caret in the current line as the point by which the new caret location is calculated.<br />
<br />
If the caret is already on the first/last line of the text buffer, the caret should not move at all.<br />
<br />
This is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
<br />
==Home and End Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry lists the section in which the operation is defined, as well as whether it is a logical or visual operation (i.e., whether the caret after performing it is a visual or a logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=30Standard2010-06-26T08:58:23Z<p>Shachar: Up and Down Arrow Keys</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document lays out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
=Terms and Definitions=<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret. This level reflects the Bidi level that is expected to be assigned to the next character entered (there are cases where the actual level of the entered character will be different). The level of the caret is manipulated by UI functions, such as changing the keyboard language. It may also be affected by any function that changes the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display does need some actual width in order to draw the caret. See [[Talk:Standard#Caret|open issues]] for discussion.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will be either a visual caret or a logical caret. In other words, the operation will leave either a well-defined logical caret position or a well-defined visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret after pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret after pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the text buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location that is nearest to the point indicated by the pointing device. Implementations should not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
Pressing the "up" or "down" keys should move the caret one line up or down, respectively. The horizontal caret location is determined by the same algorithm as in "Positioning Using a Pointing Device" above.<br />
<br />
Implementations may use the pixel location of the caret in the current line as the point by which the new caret location is calculated.<br />
<br />
This is a visual operation.<br />
<br />
==Page Up and Down Keys==<br />
<br />
==Home and End Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry lists the section in which the operation is defined, as well as whether it is a logical or visual operation (i.e., whether the caret after performing it is a visual or a logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shacharhttps://bidiedit.lingnu.com/index.php?title=Standard&diff=29Standard2010-06-26T08:51:29Z<p>Shachar: Create movement sub sections</p>
<hr />
<div>'''Rules for Editing This Page'''<br />
<br />
* Only registered users can edit. You can register [[Special:UserLogin|here]].<br />
* If you have a change you want to perform, simply perform it.<br />
** If you suspect the change requires explanation, place it in the [[Talk:Standard|discussion]] page.<br />
* This page is for the actual standard. Discussions about the standard, open questions, etc., go in the [[Talk:Standard|discussion]] page.<br />
<br />
__TOC__<br />
=Introduction=<br />
<br />
This document lays out guidelines for a User Interface (in short: UI) for editing bidirectional (in short: Bidi) text. It is assumed that the user enters text in logical sequence, and that the Unicode Bidi Algorithm (in short: UBA) is used to reorder the text for presentation. We assume that the readers of this document have a working knowledge of the UBA. The UBA is described in Unicode Technical Report 9 (see http://www.unicode.org/unicode/reports/tr9).<br />
<br />
When designing these guidelines, the following objectives were set, in order of decreasing priority:<br />
<br />
# Prevent actions unexpected by the user, particularly when the action is destructive (erases one or more characters).<br />
# Make the interface efficient.<br />
# Keep the interface easy to implement.<br />
<br />
=Terms and Definitions=<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Abbreviations<br />
! Acronym<br />
! Meaning<br />
|-<br />
| Bidi || bidirectional<br />
|-<br />
| LTR || left-to-right<br />
|-<br />
| RTL || right-to-left<br />
|-<br />
| UBA || Unicode Bidi Algorithm<br />
|-<br />
| UI || User Interface<br />
|}<br />
<br />
{| cellspacing="0" border="1"<br />
|+ Definitions<br />
|-<br />
| Bidi Embedding Levels || The UBA assigns a level to each character in the logical buffer, including neutrals, which determines if it is part of LTR or RTL text, and eventually affects the presentation.<br />
<br />
Level 0 corresponds to base LTR text.<br />
<br />
Level 1 corresponds to base RTL text, or to RTL text embedded within level 0 LTR text.<br />
<br />
Level 2 corresponds to LTR text embedded within level 1 RTL text, itself possibly embedded within level 0 LTR text.<br />
<br />
And so on for higher levels. Even levels always correspond to LTR text, odd levels always correspond to RTL text.<br />
|-<br />
| Caret (aka Text Cursor)<br />
| Graphic representation of where actions like text entry or Delete are going to take effect. The caret is often displayed as a vertical bar.<br />
<br />
The text cursor should not be confused with the mouse cursor.<br />
<br />
Throughout this document, the term "caret" refers to the text cursor, and the term "cursor" refers to the mouse cursor.<br />
|-<br />
| Cursor Level || For the needs of the UI, a Bidi level is also assigned to the caret. This level reflects the Bidi level that is expected to be assigned to the next character entered (there are cases where the actual level of the entered character will be different). The level of the caret is manipulated by UI functions, such as changing the keyboard language. It may also be affected by any function that changes the position of the caret.<br />
|-<br />
| Keyboard Language || Language of the next character that will be entered from the keyboard.<br />
|-<br />
| Logical Buffer || Buffer containing the text data in logical sequence (as opposed to visual sequence).<br />
|-<br />
| Paragraph Embedding Level || Bidi level of text belonging to the main language used in a paragraph. This is 0 if the main language is LTR, 1 if the main language is RTL.<br />
Note: there is a one-to-one correspondence between the paragraph embedding level and the "Base Direction", which is the direction of the main language of a paragraph.<br />
|}<br />
<br />
=Caret=<br />
<br />
The caret's role is to represent, to the user, where the next operation is going to take place. Standard implementations use two types of carets. One type, which in modern UIs refers to insert mode, is represented as a vertical line drawn between characters. We shall refer to this form as a "'''line caret'''". The other, typically representative of overwrite mode, is displayed as an underline beneath the character, or as a block which highlights the character. We shall refer to it as "'''block caret'''".<br />
<br />
Some archaic implementations use a block caret for both insert and overwrite mode. Modern implementations, however, use the line caret almost exclusively. This document assumes a line caret mode unless explicitly stated otherwise.<br />
<br />
A line caret occupies zero space, and is always between two displayed characters (glyphs). We call this position the "'''caret location'''". Though the caret location is, logically, of zero width, the display does need some actual width in order to draw the caret. See [[Talk:Standard#Caret|open issues]] for discussion.<br />
<br />
==Logical vs. Visual Operations==<br />
<br />
Throughout this document, there are references to a "visual caret" and a "logical caret". The distinction between the two is an important one. As written above, the caret is a visual indication of where actions take place. For Bidi text, it often happens that a single position in the logical buffer can be interpreted to refer to two (or more) visual positions. Typically, this is a result of whether the caret should be interpreted to be before the next character, or after the previous one. Likewise, a single visual position might, under some circumstances, be interpreted to refer to two positions in the logical buffer.<br />
<br />
At the time of this writing, all editors implement a caret that acts as a visual aid to indicate the position in the logical buffer. The caret has a definite and well-defined position in the logical buffer, and its visual position is affected by a variety of considerations. We call such a caret a "'''logical caret'''".<br />
<br />
This document also defines a new type of caret, called a "'''visual caret'''". A visual caret has a definite and well-defined visual position. That visual position may, under some circumstances, translate to more than one position in the logical buffer.<br />
<br />
Throughout this document, almost any operation will affect the caret. After each operation the caret will be either a visual caret or a logical caret. In other words, the operation will leave either a well-defined logical caret position or a well-defined visual caret position. An operation that sets the logical caret position is called a "'''logical operation'''", and the caret after the operation is a logical caret. An operation that sets the visual caret position is called a "'''visual operation'''", and the caret after it is a visual caret.<br />
<br />
References like "before", "after", "first" and "last" rely on the order of characters in the logical buffer, and thus imply a logical caret. For example, the sentence "Pressing the END key must bring the caret to the last character of the line" means that the caret after pressing "END" is a logical caret. Likewise, references such as "left" and "right" rely on the visual arrangement of the characters on screen. The sentence "Pressing the LEFT key should move the caret one character to the left" means that the caret after pressing "LEFT" is a visual caret.<br />
<br />
=Caret Movement=<br />
This section covers operations explicitly designed to move the caret around. None of the operations in this section change the text buffer.<br />
<br />
==Positioning Using a Pointing Device==<br />
The caret can be positioned using a pointing device. This is typically done by clicking with a mouse in or near the text area. Since a pointing device usually has pixel accuracy, some rounding must be performed. Implementations should select the caret location that is nearest to the point indicated by the pointing device. Implementations should not care whether the rounding is to the left or to the right.<br />
<br />
This is a visual operation.<br />
<br />
==Left and Right Arrow Keys==<br />
<br />
==Up and Down Arrow Keys==<br />
<br />
==Page Up and Down Keys==<br />
<br />
==Home and End Keys==<br />
<br />
=Index of Operations=<br />
<br />
This section lists all of the operations defined in the document above. Each entry lists the section in which the operation is defined, as well as whether it is a logical or visual operation (i.e., whether the caret after performing it is a visual or a logical caret).<br />
<br />
{| cellspacing="0" border="1"<br />
|-<br />
! L/V<br />
! Operation<br />
! Section<br />
! Description<br />
|-<br />
| L || Filler || name || Just a filler operation, until we have an actual operation to place here<br />
|}</div>Shachar