diacritical.txt


-*- coding: utf-8 -*-

This is the source of the test data used by the normalized Unicode
string comparison tests.


Whole word: Ṩůḇṽḝȑšḯờṋ
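
A minimal sketch of the kind of comparison this data is meant to exercise,
assuming Python and its standard unicodedata module. The helper name
normalized_equal is hypothetical and is not taken from the tests themselves.

```python
import unicodedata

def normalized_equal(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# The whole word in composed (NFC) form, written with escapes.
word_nfc = "\u1E68\u016F\u1E07\u1E7D\u1E1D\u0211\u0161\u1E2F\u1EDD\u1E4B"
# The same word in decomposed (NFD) form.
word_nfd = unicodedata.normalize("NFD", word_nfc)

print(word_nfc == word_nfd)                  # raw comparison: the strings differ
print(normalized_equal(word_nfc, word_nfd))  # normalized comparison: they match
```

The raw comparison fails because the two strings contain different code
point sequences, even though they render identically.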

Individual letters:

char    name                            NFC UCS-4      NFC UTF-8      NFD UCS-4      NFD UTF-8

Ṩ       S with dot above and below      \u1E68         \xe1\xb9\xa8   S\u0323\u0307  S\xcc\xa3\xcc\x87
ů       u with ring                     \u016F         \xc5\xaf       u\u030A        u\xcc\x8a
ḇ       b with macron below             \u1E07         \xe1\xb8\x87   b\u0331        b\xcc\xb1
ṽ       v with tilde                    \u1E7D         \xe1\xb9\xbd   v\u0303        v\xcc\x83
ḝ       e with breve and cedilla        \u1E1D         \xe1\xb8\x9d   e\u0327\u0306  e\xcc\xa7\xcc\x86
ȑ       r with double grave             \u0211         \xc8\x91       r\u030F        r\xcc\x8f
š       s with caron                    \u0161         \xc5\xa1       s\u030C        s\xcc\x8c
ḯ       i with diaeresis and acute      \u1E2F         \xe1\xb8\xaf   i\u0308\u0301  i\xcc\x88\xcc\x81
ờ       o with horn and grave           \u1EDD         \xe1\xbb\x9d   o\u031B\u0300  o\xcc\x9b\xcc\x80
ṋ       n with circumflex below         \u1E4B         \xe1\xb9\x8b   n\u032D        n\xcc\xad
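
The rows above can be cross-checked mechanically. The following is a sketch
only, assuming Python's standard unicodedata module. ROWS, check_rows and
utf8_escapes are hypothetical names introduced for illustration.

```python
import unicodedata

# (NFC form, NFD form) pairs transcribed from the table above.
ROWS = [
    ("\u1E68", "S\u0323\u0307"),
    ("\u016F", "u\u030A"),
    ("\u1E07", "b\u0331"),
    ("\u1E7D", "v\u0303"),
    ("\u1E1D", "e\u0327\u0306"),
    ("\u0211", "r\u030F"),
    ("\u0161", "s\u030C"),
    ("\u1E2F", "i\u0308\u0301"),
    ("\u1EDD", "o\u031B\u0300"),
    ("\u1E4B", "n\u032D"),
]

def check_rows(rows):
    """Return the rows whose NFC and NFD columns do not round-trip."""
    bad = []
    for nfc, nfd in rows:
        decomposed_ok = unicodedata.normalize("NFD", nfc) == nfd
        composed_ok = unicodedata.normalize("NFC", nfd) == nfc
        if not (decomposed_ok and composed_ok):
            bad.append((nfc, nfd))
    return bad

def utf8_escapes(s):
    """Render the UTF-8 bytes of s, keeping ASCII as-is and escaping the rest."""
    return "".join(
        chr(b) if b < 0x80 else "\\x%02x" % b
        for b in s.encode("utf-8")
    )

print(check_rows(ROWS))  # an empty list means every row round-trips
for nfc, nfd in ROWS:
    print(utf8_escapes(nfc), utf8_escapes(nfd))
```

The utf8_escapes helper reproduces the notation of the two UTF-8 columns,
so its output can be compared against the table by eye.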

Combining diacriticals:

char    name                            UCS-4          UTF-8

 ̇       dot above                       \u0307         \xcc\x87
 ̣       dot below                       \u0323         \xcc\xa3
 ̊       ring                            \u030A         \xcc\x8a
 ̱       macron below                    \u0331         \xcc\xb1
 ̃       tilde                           \u0303         \xcc\x83
 ̆       breve                           \u0306         \xcc\x86
 ̧       cedilla                         \u0327         \xcc\xa7
 ̏       double grave                    \u030F         \xcc\x8f
 ̌       caron                           \u030C         \xcc\x8c
 ̈       diaeresis                       \u0308         \xcc\x88
 ́       acute                           \u0301         \xcc\x81
 ̀       grave                           \u0300         \xcc\x80
 ̛       horn                            \u031B         \xcc\x9b
 ̭       circumflex below                \u032D         \xcc\xad
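
The order of the marks in the NFD columns follows from each mark's canonical
combining class. The sketch below, assuming Python's standard unicodedata
module, lists the official name and class of each mark in the table above.
MARKS and describe are hypothetical names used only for illustration.

```python
import unicodedata

# Combining marks from the table above, as code point escapes.
MARKS = [
    "\u0307", "\u0323", "\u030A", "\u0331", "\u0303", "\u0306", "\u0327",
    "\u030F", "\u030C", "\u0308", "\u0301", "\u0300", "\u031B", "\u032D",
]

def describe(mark):
    """Return (escape, official name, canonical combining class) for a mark."""
    return (
        "\\u%04X" % ord(mark),
        unicodedata.name(mark),
        unicodedata.combining(mark),
    )

for mark in MARKS:
    print(describe(mark))

# Marks with different combining classes are reordered by normalization,
# so both input orders produce the same NFD sequence.
swapped = unicodedata.normalize("NFD", "S\u0307\u0323")
print(swapped == "S\u0323\u0307")
```

Normalization sorts adjacent marks by ascending combining class, which is
why the dot below precedes the dot above in the NFD form of the first letter.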