Reach level 1 conformance with UTS #18

http://unicode.org/reports/tr18/#Basic_Unicode_Support

- [x] [RL1.1 Hex Notation](http://unicode.org/reports/tr18/#RL1.1)
  _To meet this requirement, an implementation shall supply a mechanism for specifying any Unicode code point (from U+0000 to U+10FFFF), using the hexadecimal code point representation._
- [ ] [RL1.2 Properties](http://unicode.org/reports/tr18/#RL1.2)
  _To meet this requirement, an implementation shall provide at least a minimal list of properties, consisting of the following: General_Category, Script and Script_Extensions, Alphabetic, Uppercase, Lowercase, White_Space, Noncharacter_Code_Point, Default_Ignorable_Code_Point, ANY, ASCII, ASSIGNED_
- [ ] [RL1.2a Compatibility Properties](http://unicode.org/reports/tr18/#RL1.2a)
  _To meet this requirement, an implementation shall provide the properties listed in Annex C: Compatibility Properties, with the property values as listed there. Such an implementation shall document whether it is using the Standard Recommendation or POSIX-compatible properties._
- [ ] [RL1.3 Subtraction and Intersection](http://unicode.org/reports/tr18/#RL1.3)
  _To meet this requirement, an implementation shall supply mechanisms for union, intersection and set-difference of sets of characters within regular expression character class expressions._
- [ ] [RL1.4 Simple Word Boundaries](http://unicode.org/reports/tr18/#RL1.4)
  _To meet this requirement, an implementation shall extend the word boundary mechanism so that:_
  - _The class of <word_character> includes all the Alphabetic values from the Unicode character database, from UnicodeData.txt, plus the decimals (General_Category=Decimal_Number, or equivalently Numeric_Type=Decimal), and the U+200C ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER (Join_Control=True). See also Annex C: Compatibility Properties._
  - _Nonspacing marks are never divided from their base characters, and otherwise ignored in locating boundaries._
- [ ] [RL1.5 Simple Loose Matches](http://unicode.org/reports/tr18/#RL1.5)
  _To meet this requirement, if an implementation provides for case-insensitive matching, then it shall provide at least the simple, default Unicode case-insensitive matching, and specify which properties are closed and which are not.
To meet this requirement, if an implementation provides for case conversions, then it shall provide at least the simple, default Unicode case folding._
- [ ] [RL1.6 Line Boundaries](http://unicode.org/reports/tr18/#RL1.6)
  _To meet this requirement, if an implementation provides for line-boundary testing, it shall recognize not only CRLF, LF, CR, but also NEL (U+0085), PARAGRAPH SEPARATOR (U+2029) and LINE SEPARATOR (U+2028)._
- [x] [RL1.7 Supplementary Code Points](http://unicode.org/reports/tr18/#RL1.7)
  _To meet this requirement, an implementation shall handle the full range of Unicode code points, including values from U+FFFF to U+10FFFF. In particular, where UTF-16 is used, a sequence consisting of a leading surrogate followed by a trailing surrogate shall be handled as a single code point in matching._
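
To make a few of the requirements above concrete, here is a minimal Python sketch of RL1.1, RL1.7 and RL1.6. It is illustrative only and is not this project's implementation: the `\x{...}` escape syntax and the helper names `parse_hex_escape`, `combine_surrogates` and `split_lines` are assumptions chosen for the example, and the line-terminator set is limited to the characters named in the RL1.6 text quoted above.

```python
import re

# RL1.1 (sketch): parse a hypothetical \x{H...} escape into a code point.
_HEX_ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]{1,6})\}")


def parse_hex_escape(escape):
    """Return the code point named by an escape such as \\x{1F600}."""
    match = _HEX_ESCAPE.fullmatch(escape)
    if match is None:
        raise ValueError(f"not a hex escape: {escape!r}")
    code_point = int(match.group(1), 16)
    if code_point > 0x10FFFF:
        raise ValueError(f"code point out of range: {escape!r}")
    return code_point


# RL1.7 (sketch): treat a UTF-16 surrogate pair as a single code point.
def combine_surrogates(lead, trail):
    """Combine a leading and a trailing surrogate into one code point."""
    if not (0xD800 <= lead <= 0xDBFF and 0xDC00 <= trail <= 0xDFFF):
        raise ValueError("expected a leading then a trailing surrogate")
    return 0x10000 + ((lead - 0xD800) << 10) + (trail - 0xDC00)


# RL1.6 (sketch): split on CRLF, LF, CR, NEL, LINE SEPARATOR and
# PARAGRAPH SEPARATOR. CRLF comes first so it counts as one terminator.
_LINE_BREAK = re.compile("\r\n|[\n\r\u0085\u2028\u2029]")


def split_lines(text):
    """Split text into lines on the terminators listed in RL1.6."""
    return _LINE_BREAK.split(text)
```

For example, `parse_hex_escape(r"\x{1F600}")` and `combine_surrogates(0xD83D, 0xDE00)` both yield the code point U+1F600, and `split_lines("a\r\nb\u2028c")` treats CRLF and LINE SEPARATOR as boundaries.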

Reach level 1 conformance with UTS #18 #12
