The Donuts Inc. (“Donuts”) Shared Registration System (“SRS”) supports the creation of internationalized domain names (“IDNs”) that contain Unicode-supported, non-ASCII character sets.
Most domain names on the Internet are registered using ASCII characters (that is, A-Z, 0-9, and the hyphen). However, words requiring diacritics (as in German, Spanish, and French) and words in languages that use non-Latin scripts (such as Arabic and Chinese) cannot be displayed using ASCII. Millions of Internet users struggle with this inconvenience when navigating the Internet using non-native languages. IDNs address this frustration by enabling domain names in non-ASCII characters.
This policy for IDN registrations defines what IDNs can be supported and provides guidance on functional limitations of IDNs within the system of ICANN guidelines and other standards such as the Internationalizing Domain Names in Applications (“IDNA”) specification and Internet Engineering Task Force (“IETF”) standards.
Donuts-operated registries (each, a “Registry”) comply with all of the RFC documents that comprise IDNA 2008 (RFC 5890, RFC 5891, RFC 5892, RFC 5893, and RFC 5894) as well as with the rules that define which Unicode code points are permitted in IDN registrations. Each Donuts-operated registry is also committed to following the IETF standards and to supporting and deploying the latest IDN functionality as soon as possible.
The Registry’s SRS enables the creation of IDNs that contain Unicode-supported, non-ASCII scripts. The Registry is devoted to promoting clarity and ease in the registration process for customers interested in pursuing IDN registrations. The sections that follow in this IDN Policy describe the guidelines the Registry has developed to handle critical aspects of IDN registrations, including character variants and their impact on a registration, prohibited domain names and restricted Unicode code points, and protected domain names. Supported character sets, the special cases of Chinese and German variants, and IDN language tags are also discussed. Finally, the complete IDN tables for the four character sets supported by the Registry are provided below as an additional resource.
Each IDN must be associated with a specific language using a language tag. You must select a two-character language tag during the registration process. The language tag you supply for an IDN application is one factor in determining whether the requested IDN is supported. A requested IDN must be composed only of code points that are present in the character set associated with the language tag. If an application for an IDN is submitted without a language tag, a validation error is returned; the language tag is not optional. In addition, not all character sets are supported. For more details, see Supported Character Sets.
All code points within an IDN must come from the same Unicode script. The Registry does not support commingling of code points from different Unicode scripts. This restriction prevents confusable code points from appearing in the same IDN. Unicode defines a set of scripts by assigning each code point a single script value.
See Supported Character Sets for more information on which Unicode scripts are supported for each TLD offered by the Registry.
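The single-script rule can be illustrated with a minimal Python sketch. Python's standard library does not expose the Unicode Script property, so this example approximates a code point's script by the first word of its Unicode character name; that approximation, the `COMMON` set, and the function names are assumptions made for illustration, not the Registry's implementation.

```python
import unicodedata

# ASCII digits and the hyphen belong to the "Common" script and may
# appear alongside any single script, so they are skipped here.
COMMON = set("0123456789-")

def scripts_in_label(label):
    """Return the approximate set of scripts used in a label.

    Approximation: the first word of the Unicode character name
    (for example "LATIN", "CYRILLIC", "CJK") stands in for the
    Script property, which the standard library does not provide.
    """
    scripts = set()
    for ch in label:
        if ch in COMMON:
            continue
        scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts

def is_single_script(label):
    """True when every non-common code point shares one script."""
    return len(scripts_in_label(label)) <= 1
```

For example, `is_single_script("paypal")` is true, while `is_single_script("p\u0430ypal")` is false because U+0430 is a Cyrillic letter that looks like the Latin "a". A production check would use the real Script property from the Unicode Character Database.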
Supported character sets for the TLDs offered by the Registry are determined on a per-TLD basis. When an IDN registration is requested, the language tag provided by the registrar is checked against the list of languages supported by the Registry and against the character-inclusion and character-variant IDN tables associated with that language. The Unicode code points that make up a registration are checked against these IDN tables to determine whether the registration is valid for the specified language. If a registration is not valid for one language, the necessary character set might still be available under a different language tag. The Registry is dedicated to working with applicants to reach a solution for each IDN registration whenever possible. Please refer to the IDN tables linked at the bottom of this page for complete information on character set support for specific languages.
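The validation flow described above can be sketched in Python as follows. The tables here are hypothetical and heavily abbreviated (the real tables are the ones published at IANA), and the names `IDN_TABLES`, `validate_label`, and `usable_tags` are assumptions introduced only for this illustration.

```python
import string

# Letters, digits, and hyphen shared by both illustrative tables.
BASE = set(string.ascii_lowercase + string.digits + "-")

# Hypothetical, abbreviated character-inclusion tables keyed by
# two-character language tag. Real tables contain many more entries.
IDN_TABLES = {
    "de": BASE | set("\u00e4\u00f6\u00fc\u00df"),                    # ä ö ü ß
    "es": BASE | set("\u00e1\u00e9\u00ed\u00f3\u00fa\u00fc\u00f1"),  # á é í ó ú ü ñ
}

def validate_label(label, language_tag):
    """Check a label against the table for its language tag.

    Returns a (valid, reason) tuple.
    """
    if not language_tag:
        return False, "language tag is required"
    table = IDN_TABLES.get(language_tag.lower())
    if table is None:
        return False, "unsupported language tag"
    bad = [ch for ch in label if ch not in table]
    if bad:
        listing = " ".join(f"U+{ord(ch):04X}" for ch in bad)
        return False, "code points not in table: " + listing
    return True, "ok"

def usable_tags(label):
    """List the language tags under which the label would validate."""
    return sorted(tag for tag in IDN_TABLES if validate_label(label, tag)[0])
```

With these sample tables, a label containing ñ is rejected under `de` but accepted under `es`, which mirrors the case where a different language tag makes the registration valid.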
Handling character variants is an essential aspect of the Registry’s IDN policy and infrastructure. The following sections describe in detail how Chinese and German character variants are handled by the Registry. Other languages supported by the Registry do not impose any registration limitations.
Some Chinese and Japanese characters have more than one visual representation (variants of the same character). The tables in the preceding section assume that when one variant of a Chinese or Japanese character is registered, the other variants are blocked in that language. According to the ICANN application, in order to mitigate issues that may arise due to Chinese or Japanese variant characters, the Registry will allow only a single Chinese label or Japanese label to be registered out of the set of all variant labels in the applicable language that are equivalent to it, according to the defined preferred and other variants.
The Registry will ensure an effective availability check for such domain registrations by checking any proposed Chinese label against a canonical form of the domain name. Canonical forms of existing domain names will be stored in the Registry’s database in order to facilitate lookups.
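A canonical-form availability check of this kind might look like the following Python sketch. The two-entry variant table, the `canonical_form` function, and the `VariantAwareRegistry` class are hypothetical names and data chosen for illustration; the actual preferred and variant mappings come from the published IDN tables.

```python
# Hypothetical variant table: each variant code point maps to the
# preferred code point used in the canonical form.
VARIANT_TO_PREFERRED = {
    "\u570b": "\u56fd",  # 國 -> 国
    "\u5b78": "\u5b66",  # 學 -> 学
}

def canonical_form(label):
    """Replace every variant code point with its preferred form."""
    return "".join(VARIANT_TO_PREFERRED.get(ch, ch) for ch in label)

class VariantAwareRegistry:
    """Stores registrations keyed by canonical form, so that all
    variant labels equivalent to a registered label are blocked."""

    def __init__(self):
        self._by_canonical = {}

    def is_available(self, label):
        return canonical_form(label) not in self._by_canonical

    def register(self, label):
        key = canonical_form(label)
        if key in self._by_canonical:
            raise ValueError("an equivalent variant label is already registered")
        self._by_canonical[key] = label
```

After registering 中国 (`"\u4e2d\u56fd"`), a check for 中國 (`"\u4e2d\u570b"`) reports the label as unavailable, because both reduce to the same canonical form.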
One character in the German character table and Latin script table, the Eszett character (ß), is handled inconsistently in specifications spanning IDNA 2003 and IDNA 2008. Under IDNA 2003, the Eszett character was converted into the string “ss”. Under IDNA 2008, however, it is preserved and encoded directly in Punycode. The Registry supports Punycode encoding of the Eszett character as per the IDNA 2008 specifications.
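The difference between the two treatments can be seen in a short Python sketch. Python's built-in `idna` codec implements the older IDNA 2003 rules, while the built-in `punycode` codec performs only the raw Punycode step; prefixing its output with `xn--` here is a simplification that skips the full IDNA 2008 validation a registry would perform.

```python
label = "stra\u00dfe"  # straße

# IDNA 2003: the built-in "idna" codec applies Nameprep, which maps
# the sharp s to "ss", so the result is a plain ASCII label.
idna2003 = label.encode("idna")

# IDNA 2008 style: the sharp s is kept and the label is
# Punycode-encoded, then given the ACE prefix "xn--".
# (Simplified: no IDNA 2008 validity checks are performed.)
idna2008 = b"xn--" + label.encode("punycode")

print(idna2003)  # b'strasse'
print(idna2008)  # b'xn--strae-oqa'
```

The two results are different domain labels, which is why the choice of IDNA version matters for names containing this character.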
ICANN’s Guidelines for the Implementation of Internationalized Domain Names lists certain characters that are not allowed in any IDN registration. For a complete listing of restricted Unicode code points for IDN, refer to Appendix B of RFC 5892, The Unicode Code Points and Internationalized Domain Names for Applications.
Scripts or characters added in Unicode versions newer than 3.2 (on which IDNA 2003 was based) may encounter interoperability issues due to the lack of software support, and the Registry does not currently plan to offer registration of labels containing such scripts or characters.
Two Unicode characters, Latin Sharp S and Greek Final Sigma, are handled under the newest version of the IDNA specification in a way that is not backward compatible with earlier versions. These characters were previously mapped to alternate characters. The latest version of the IDNA standard does not apply this mapping and allows registries to support these two characters.
The Registry does not prohibit any specific Unicode string or sequences of Unicode characters in the registration of IDN domain names. Some IDN domain names may be reserved as specified by ICANN. Availability checks for any reserved name will return a value of “Reserved.”
The Registry does not support registration of IDNs that are not covered by one of the specific IDN character set tables published at IANA. Additionally, the IDNA 2008 specification defines rules and algorithms that prohibit certain Unicode code points in IDN registrations. The Registry is in compliance with all of the documents that comprise the IDNA 2008 standard.
IDN tables in use by the Registry may be found at http://www.iana.org/domains/idn-tables.
Additional information can be found at https://donuts.zendesk.com/hc/en-us/articles/115000146586-IDN-TABLES.
October 17, 2017 by Andrea Jordan