Difference between revisions of "RFC8753"

From RFC-Wiki
(Created page with " Internet Engineering Task Force (IETF) J. Klensin Request for Comments: 8753 Updates: 5892...")
 
Line 1: Line 1:
 

 

 
 
  
 
Internet Engineering Task Force (IETF)                        J. Klensin
 
Internet Engineering Task Force (IETF)                        J. Klensin
Line 8: Line 6:
 
Category: Standards Track                                        Netnod
 
Category: Standards Track                                        Netnod
 
ISSN: 2070-1721                                              April 2020
 
ISSN: 2070-1721                                              April 2020
 
  
 
  Internationalized Domain Names for Applications (IDNA) Review for New
 
  Internationalized Domain Names for Applications (IDNA) Review for New
                            Unicode Versions
+
                        Unicode Versions
  
 
Abstract
 
Abstract
  
  The standards for Internationalized Domain Names in Applications
+
The standards for Internationalized Domain Names in Applications
  (IDNA) require a review of each new version of Unicode to determine
+
(IDNA) require a review of each new version of Unicode to determine
  whether incompatibilities with prior versions or other issues exist
+
whether incompatibilities with prior versions or other issues exist
  and, where appropriate, to allow the IETF to decide on the trade-offs
+
and, where appropriate, to allow the IETF to decide on the trade-offs
  between compatibility with prior IDNA versions and compatibility with
+
between compatibility with prior IDNA versions and compatibility with
  Unicode going forward.  That requirement, and its relationship to
+
Unicode going forward.  That requirement, and its relationship to
  tables maintained by IANA, has caused significant confusion in the
+
tables maintained by IANA, has caused significant confusion in the
  past.  This document makes adjustments to the review procedure based
+
past.  This document makes adjustments to the review procedure based
  on experience and updates IDNA, specifically RFC 5892, to reflect
+
on experience and updates IDNA, specifically RFC 5892, to reflect
  those changes and to clarify the various relationships involved.  It
+
those changes and to clarify the various relationships involved.  It
  also makes other minor adjustments to align that document with
+
also makes other minor adjustments to align that document with
  experience.
+
experience.
  
 
Status of This Memo
 
Status of This Memo
  
  This is an Internet Standards Track document.
+
This is an Internet Standards Track document.
  
  This document is a product of the Internet Engineering Task Force
+
This document is a product of the Internet Engineering Task Force
  (IETF).  It represents the consensus of the IETF community.  It has
+
(IETF).  It represents the consensus of the IETF community.  It has
  received public review and has been approved for publication by the
+
received public review and has been approved for publication by the
  Internet Engineering Steering Group (IESG).  Further information on
+
Internet Engineering Steering Group (IESG).  Further information on
  Internet Standards is available in Section 2 of RFC 7841.
+
Internet Standards is available in Section 2 of RFC 7841.
  
  Information about the current status of this document, any errata,
+
Information about the current status of this document, any errata,
  and how to provide feedback on it may be obtained at
+
and how to provide feedback on it may be obtained at
  https://www.rfc-editor.org/info/rfc8753.
+
https://www.rfc-editor.org/info/rfc8753.
  
 
Copyright Notice
 
Copyright Notice
  
  Copyright (c) 2020 IETF Trust and the persons identified as the
+
Copyright (c) 2020 IETF Trust and the persons identified as the
  document authors.  All rights reserved.
+
document authors.  All rights reserved.
  
  This document is subject to BCP 78 and the IETF Trust's Legal
+
This document is subject to BCP 78 and the IETF Trust's Legal
  Provisions Relating to IETF Documents
+
Provisions Relating to IETF Documents
  (https://trustee.ietf.org/license-info) in effect on the date of
+
(https://trustee.ietf.org/license-info) in effect on the date of
  publication of this document.  Please review these documents
+
publication of this document.  Please review these documents
  carefully, as they describe your rights and restrictions with respect
+
carefully, as they describe your rights and restrictions with respect
  to this document.  Code Components extracted from this document must
+
to this document.  Code Components extracted from this document must
  include Simplified BSD License text as described in Section 4.e of
+
include Simplified BSD License text as described in Section 4.e of
  the Trust Legal Provisions and are provided without warranty as
+
the Trust Legal Provisions and are provided without warranty as
  described in the Simplified BSD License.
+
described in the Simplified BSD License.
  
 
Table of Contents
 
Table of Contents
  
  1.  Introduction
+
1.  Introduction
  2.  Brief History of IDNA Versions, the Review Requirement, and RFC
+
2.  Brief History of IDNA Versions, the Review Requirement, and RFC
          5982
+
        5982
  3.  The Review Model
+
3.  The Review Model
    3.1.  Review Model Part I: Algorithmic Comparison
+
  3.1.  Review Model Part I: Algorithmic Comparison
    3.2.  Review Model Part II: New Code Point Analysis
+
  3.2.  Review Model Part II: New Code Point Analysis
  4.  IDNA Assumptions and Current Practice
+
4.  IDNA Assumptions and Current Practice
  5.  Derived Tables Published by IANA
+
5.  Derived Tables Published by IANA
  6.  Editorial Clarification to RFC 5892
+
6.  Editorial Clarification to RFC 5892
  7.  IANA Considerations
+
7.  IANA Considerations
  8.  Security Considerations
+
8.  Security Considerations
  9.  References
+
9.  References
    9.1.  Normative References
+
  9.1.  Normative References
    9.2.  Informative References
+
  9.2.  Informative References
  Appendix A.  Summary of Changes to RFC 5892
+
Appendix A.  Summary of Changes to RFC 5892
  Appendix B.  Background and Rationale for Expert Review Procedure
+
Appendix B.  Background and Rationale for Expert Review Procedure
          for New Code Point Analysis
+
        for New Code Point Analysis
  Acknowledgments
+
Acknowledgments
  Authors' Addresses
+
Authors' Addresses
  
1.  Introduction
+
== Introduction ==
  
  The standards for Internationalized Domain Names in Applications
+
The standards for Internationalized Domain Names in Applications
  (IDNA) require a review of each new version of Unicode to determine
+
(IDNA) require a review of each new version of Unicode to determine
  whether incompatibilities with prior versions or other issues exist
+
whether incompatibilities with prior versions or other issues exist
  and, where appropriate, to allow the IETF to decide on the trade-offs
+
and, where appropriate, to allow the IETF to decide on the trade-offs
  between compatibility with prior IDNA versions and compatibility with
+
between compatibility with prior IDNA versions and compatibility with
  Unicode [Unicode] going forward.  That requirement, and its
+
Unicode [Unicode] going forward.  That requirement, and its
  relationship to tables maintained by IANA, has caused significant
+
relationship to tables maintained by IANA, has caused significant
  confusion in the past (see Section 3 and Section 4 for additional
+
confusion in the past (see Section 3 and Section 4 for additional
  discussion of the question of appropriate decisions and the history
+
discussion of the question of appropriate decisions and the history
  of these reviews).  This document makes adjustments to the review
+
of these reviews).  This document makes adjustments to the review
  procedure based on nearly a decade of experience and updates IDNA,
+
procedure based on nearly a decade of experience and updates IDNA,
  specifically the document that specifies the relationship between
+
specifically the document that specifies the relationship between
  Unicode code points and IDNA derived properties [RFC5892], to reflect
+
Unicode code points and IDNA derived properties [RFC5892], to reflect
  those changes and to clarify the various relationships involved.
+
those changes and to clarify the various relationships involved.
  
  This specification does not change the requirement that registries at
+
This specification does not change the requirement that registries at
  all levels of the DNS tree take responsibility for the labels they
+
all levels of the DNS tree take responsibility for the labels they
  insert in the DNS, a level of responsibility that requires allowing
+
insert in the DNS, a level of responsibility that requires allowing
  only a subset of the code points and strings allowed by the IDNA
+
only a subset of the code points and strings allowed by the IDNA
  protocol itself.  That requirement is discussed in more detail in a
+
protocol itself.  That requirement is discussed in more detail in a
  companion document [RegRestr].
+
companion document [RegRestr].
  
  Terminology note: In this document, "IDNA" refers to the current
+
Terminology note: In this document, "IDNA" refers to the current
  version as described in RFC 5890 [RFC5890] and subsequent documents
+
version as described in RFC 5890 [RFC5890] and subsequent documents
  and sometimes known as "IDNA2008".  Distinctions between it and the
+
and sometimes known as "IDNA2008".  Distinctions between it and the
  earlier version are explicit only where they are necessary for
+
earlier version are explicit only where they are necessary for
  understanding the relationships involved, e.g., in Section 2.
+
understanding the relationships involved, e.g., in Section 2.
  
2.  Brief History of IDNA Versions, the Review Requirement, and RFC 5982
+
== Brief History of IDNA Versions, the Review Requirement, and RFC 5982 ==
  
  The original, now-obsolete, version of IDNA, commonly known as
+
The original, now-obsolete, version of IDNA, commonly known as
  "IDNA2003" [RFC3490] [RFC3491], was defined in terms of a profile of
+
"IDNA2003" [RFC3490] [RFC3491], was defined in terms of a profile of
  a collection of IETF-specific tables [RFC3454] that specified the
+
a collection of IETF-specific tables [RFC3454] that specified the
  usability of each Unicode code point with IDNA.  Because the tables
+
usability of each Unicode code point with IDNA.  Because the tables
  themselves were normative, they were intrinsically tied to a
+
themselves were normative, they were intrinsically tied to a
  particular version of Unicode.  As Unicode evolved, the IDNA2003
+
particular version of Unicode.  As Unicode evolved, the IDNA2003
  standard would have required the creation of a new profile for each
+
standard would have required the creation of a new profile for each
  new version of Unicode, or the tables would have fallen further and
+
new version of Unicode, or the tables would have fallen further and
  further behind.
+
further behind.
  
  When IDNA2003 was superseded by the current version, known as
+
When IDNA2003 was superseded by the current version, known as
  IDNA2008 [RFC5890], a different strategy, one that was property-based
+
IDNA2008 [RFC5890], a different strategy, one that was property-based
  rather than table-based, was adopted for a number of reasons, of
+
rather than table-based, was adopted for a number of reasons, of
  which the reliance on normative tables was not dominant [RFC4690].
+
which the reliance on normative tables was not dominant [RFC4690].
  In the IDNA2008 model, the use of normative tables was replaced by a
+
In the IDNA2008 model, the use of normative tables was replaced by a
  set of procedures and rules that operated on Unicode properties
+
set of procedures and rules that operated on Unicode properties
  [Unicode-properties] and a few internal definitions to determine the
+
[Unicode-properties] and a few internal definitions to determine the
  category and status, and hence an IDNA-specific "derived property",
+
category and status, and hence an IDNA-specific "derived property",
  for any given code point.  Those rules are, in principle, independent
+
for any given code point.  Those rules are, in principle, independent
  of Unicode versions.  They can be applied to any version of Unicode,
+
of Unicode versions.  They can be applied to any version of Unicode,
  at least from approximately version 5.0 forward, to yield an
+
at least from approximately version 5.0 forward, to yield an
  appropriate set of derived properties.  However, the working group
+
appropriate set of derived properties.  However, the working group
  that defined IDNA2008 recognized that not all of the Unicode
+
that defined IDNA2008 recognized that not all of the Unicode
  properties were completely stable and that, because the criteria for
+
properties were completely stable and that, because the criteria for
  new code points and property assignment used by the Unicode
+
new code points and property assignment used by the Unicode
  Consortium might not precisely align with the needs of IDNA, there
+
Consortium might not precisely align with the needs of IDNA, there
  were possibilities of incompatible changes to the derived property
+
were possibilities of incompatible changes to the derived property
  values.  More specifically, there could be changes that would make
+
values.  More specifically, there could be changes that would make
  previously disallowed labels valid, previously valid labels
+
previously disallowed labels valid, previously valid labels
  disallowed, or that would be disruptive to IDNA's defining rule
+
disallowed, or that would be disruptive to IDNA's defining rule
  structure.  Consequently, IDNA2008 provided for an expert review of
+
structure.  Consequently, IDNA2008 provided for an expert review of
  each new version of Unicode with the possibility of providing
+
each new version of Unicode with the possibility of providing
  exceptions to the rules for particular new code points, code points
+
exceptions to the rules for particular new code points, code points
  whose properties had changed, and newly discovered issues with the
+
whose properties had changed, and newly discovered issues with the
  IDNA2008 collection of rules.  When problems were identified, the
+
IDNA2008 collection of rules.  When problems were identified, the
  reviewer was expected to notify the IESG.  The assumption was that
+
reviewer was expected to notify the IESG.  The assumption was that
  the IETF would review the situation and modify IDNA2008 as needed,
+
the IETF would review the situation and modify IDNA2008 as needed,
  most likely by adding exceptions to preserve backward compatibility
+
most likely by adding exceptions to preserve backward compatibility
  (see Section 3.1).
+
(see Section 3.1).
  
  For the convenience of the community, IDNA2008 also provided that
+
For the convenience of the community, IDNA2008 also provided that
  IANA would maintain copies of calculated tables resulting from each
+
IANA would maintain copies of calculated tables resulting from each
  review, showing the derived properties for each code point.  Those
+
review, showing the derived properties for each code point.  Those
  tables were expected to be helpful, especially to those without the
+
tables were expected to be helpful, especially to those without the
  facilities to easily compute derived properties themselves.
+
facilities to easily compute derived properties themselves.
  Experience with the community and those tables has shown that they
+
Experience with the community and those tables has shown that they
  have been confused with the normative tables of IDNA2003: the
+
have been confused with the normative tables of IDNA2003: the
  IDNA2008 tables published by IANA have never been normative, and
+
IDNA2008 tables published by IANA have never been normative, and
  statements about IDNA2008 being out of date with regard to some
+
statements about IDNA2008 being out of date with regard to some
  Unicode version because the IANA tables have not been updated are
+
Unicode version because the IANA tables have not been updated are
  incorrect or meaningless.
+
incorrect or meaningless.
  
3.  The Review Model
+
== The Review Model ==
  
  While the text has sometimes been interpreted differently, IDNA2008
+
While the text has sometimes been interpreted differently, IDNA2008
  actually calls for two types of review when a new Unicode version is
+
actually calls for two types of review when a new Unicode version is
  introduced.  One is an algorithmic comparison of the set of derived
+
introduced.  One is an algorithmic comparison of the set of derived
  properties calculated from the new version of Unicode to the derived
+
properties calculated from the new version of Unicode to the derived
  properties calculated from the previous one to determine whether
+
properties calculated from the previous one to determine whether
  incompatible changes have occurred.  The other is a review of newly
+
incompatible changes have occurred.  The other is a review of newly
  assigned code points to determine whether any of them require special
+
assigned code points to determine whether any of them require special
  treatment (e.g., assignment of what IDNA2008 calls contextual rules)
+
treatment (e.g., assignment of what IDNA2008 calls contextual rules)
  and whether any of them violate any of the assumptions underlying the
+
and whether any of them violate any of the assumptions underlying the
  IDNA2008 derived property calculations.  Any of the cases of either
+
IDNA2008 derived property calculations.  Any of the cases of either
  review might require either per-code point exceptions or other
+
review might require either per-code point exceptions or other
  adjustments to the rules for deriving properties that are part of RFC
+
adjustments to the rules for deriving properties that are part of RFC
  5892.  The subsections below provide a revised specification for the
+
5892.  The subsections below provide a revised specification for the
  review procedure.
+
review procedure.
  
  Unless the IESG or the designated expert team concludes that there
+
Unless the IESG or the designated expert team concludes that there
  are special problems or unusual circumstances, these reviews will be
+
are special problems or unusual circumstances, these reviews will be
  performed only for major Unicode versions (those numbered NN.0, e.g.,
+
performed only for major Unicode versions (those numbered NN.0, e.g.,
  12.0) and not for minor updates (e.g., 12.1).
+
12.0) and not for minor updates (e.g., 12.1).
  
  As can be seen in the detailed descriptions in the following
+
As can be seen in the detailed descriptions in the following
  subsections, proper review will require a team of experts that has
+
subsections, proper review will require a team of experts that has
  both broad and specific skills in reviewing Unicode characters and
+
both broad and specific skills in reviewing Unicode characters and
  their properties in relation to both the written standards and
+
their properties in relation to both the written standards and
  operational needs.  The IESG will need to appoint experts who can
+
operational needs.  The IESG will need to appoint experts who can
  draw on the broader community to obtain the necessary skills for
+
draw on the broader community to obtain the necessary skills for
  particular situations.  See the IANA Considerations (Section 7) for
+
particular situations.  See the IANA Considerations (Section 7) for
  details.
+
details.
  
3.1.  Review Model Part I: Algorithmic Comparison
+
=== Review Model Part I: Algorithmic Comparison ===
  
  Section 5.1 of RFC 5892 is the description of the process for
+
Section 5.1 of RFC 5892 is the description of the process for
  creating the initial IANA tables.  It is noteworthy that, while it
+
creating the initial IANA tables.  It is noteworthy that, while it
  can be read as strongly implying new reviews and new tables for
+
can be read as strongly implying new reviews and new tables for
  versions of Unicode after 5.2, it does not explicitly specify those
+
versions of Unicode after 5.2, it does not explicitly specify those
  reviews or, e.g., the timetable for completing them.  It also
+
reviews or, e.g., the timetable for completing them.  It also
  indicates that incompatibilities are to be "flagged for the IESG" but
+
indicates that incompatibilities are to be "flagged for the IESG" but
  does not specify exactly what the IESG is to do about them and when.
+
does not specify exactly what the IESG is to do about them and when.
  For reasons related to the other type of review and discussed below,
+
For reasons related to the other type of review and discussed below,
  only one review was completed, documented [RFC6452], and a set of
+
only one review was completed, documented [RFC6452], and a set of
  corresponding new tables installed.  That review, which was for
+
corresponding new tables installed.  That review, which was for
  Unicode 6.0, found only three incompatibilities; the consensus was to
+
Unicode 6.0, found only three incompatibilities; the consensus was to
  ignore them (not create exceptions in IDNA2008) and to remain
+
ignore them (not create exceptions in IDNA2008) and to remain
  consistent with computations based on current (Unicode 6.0)
+
consistent with computations based on current (Unicode 6.0)
  properties rather than preserving backward compatibility within IDNA.
+
properties rather than preserving backward compatibility within IDNA.
  The 2018 review (for Unicode 11.0 and versions in between it and 6.0)
+
The 2018 review (for Unicode 11.0 and versions in between it and 6.0)
  [IDNA-Unicode12] also concluded that Unicode compatibility, rather
+
[IDNA-Unicode12] also concluded that Unicode compatibility, rather
  than IDNA backward compatibility, should be maintained.  That
+
than IDNA backward compatibility, should be maintained.  That
  decision was partially driven by the long period between reviews and
+
decision was partially driven by the long period between reviews and
  the concern that table calculations by others in the interim could
+
the concern that table calculations by others in the interim could
  result in unexpected incompatibilities if derived property
+
result in unexpected incompatibilities if derived property
  definitions were then changed.  See Section 4 for further discussion
+
definitions were then changed.  See Section 4 for further discussion
  of these preferences.
+
of these preferences.
  
3.2.  Review Model Part II: New Code Point Analysis
+
=== Review Model Part II: New Code Point Analysis ===
  
  The second type of review, which is not clearly explained in RFC
+
The second type of review, which is not clearly explained in RFC
  5892, is intended to identify cases in which newly added or recently
+
5892, is intended to identify cases in which newly added or recently
  discovered problematic code points violate the design assumptions of
+
discovered problematic code points violate the design assumptions of
  IDNA, to identify defects in those assumptions, or to identify
+
IDNA, to identify defects in those assumptions, or to identify
  inconsistencies (from an IDNA perspective) with Unicode commitments
+
inconsistencies (from an IDNA perspective) with Unicode commitments
  about assignment, properties, and stability of newly added code
+
about assignment, properties, and stability of newly added code
  points.  One example of this type of review was the discovery of new
+
points.  One example of this type of review was the discovery of new
  code points after Unicode 7.0 that were potentially visually
+
code points after Unicode 7.0 that were potentially visually
  equivalent, in the same script, to previously available code point
+
equivalent, in the same script, to previously available code point
  sequences [IAB-Unicode7-2015] [IDNA-Unicode7].
+
sequences [IAB-Unicode7-2015] [IDNA-Unicode7].
  
  Because multiple perspectives on Unicode and writing systems are
+
Because multiple perspectives on Unicode and writing systems are
  required, this review will not be successful unless it is done by a
+
required, this review will not be successful unless it is done by a
  team.  Finding one all-knowing expert is improbable, and a single
+
team.  Finding one all-knowing expert is improbable, and a single
  expert is unlikely to produce an adequate analysis.  Rather than any
+
expert is unlikely to produce an adequate analysis.  Rather than any
  single expert being the sole source of analysis, the designated
+
single expert being the sole source of analysis, the designated
  expert (DE) team needs to understand that there will always be gaps
+
expert (DE) team needs to understand that there will always be gaps
  in their knowledge, to know what they don't know, and to work to find
+
in their knowledge, to know what they don't know, and to work to find
  the expertise that each review requires.  It is also important that
+
the expertise that each review requires.  It is also important that
  the DE team maintains close contact with the Area Directors (ADs) and
+
the DE team maintains close contact with the Area Directors (ADs) and
  that the ADs remain aware of the team's changing needs, examining and
+
that the ADs remain aware of the team's changing needs, examining and
  adjusting the team's membership over time, with periodic
+
adjusting the team's membership over time, with periodic
  reexamination at least annually.  It should also be recognized that,
+
reexamination at least annually.  It should also be recognized that,
  if this review identifies a problem, that problem is likely to be
+
if this review identifies a problem, that problem is likely to be
  complex and/or involve multiple trade-offs.  Actions to deal with it
+
complex and/or involve multiple trade-offs.  Actions to deal with it
  are likely to be disruptive (although perhaps not to large
+
are likely to be disruptive (although perhaps not to large
  communities of users), or to leave security risks (opportunities for
+
communities of users), or to leave security risks (opportunities for
  attacks and inadvertent confusion as expected matches do not occur),
+
attacks and inadvertent confusion as expected matches do not occur),
  or to cause excessive reliance on registries understanding and taking
+
or to cause excessive reliance on registries understanding and taking
  responsibility for what they are registering [RFC5894] [RegRestr].
+
responsibility for what they are registering [RFC5894] [RegRestr].
  The latter, while a requirement of IDNA, has often not worked out
+
The latter, while a requirement of IDNA, has often not worked out
  well in the past.
+
well in the past.
  
  Because resolution of problems identified by this part of the review
+
Because resolution of problems identified by this part of the review
  may take some time even if that resolution is to add additional
+
may take some time even if that resolution is to add additional
  contextual rules or to disallow one or more code points, there will
+
contextual rules or to disallow one or more code points, there will
  be cases in which it will be appropriate to publish the results of
+
be cases in which it will be appropriate to publish the results of
  the algorithmic review and to provide IANA with corresponding tables,
+
the algorithmic review and to provide IANA with corresponding tables,
  with warnings about code points whose status is uncertain until there
+
with warnings about code points whose status is uncertain until there
  is IETF consensus about how to proceed.  The affected code points
+
is IETF consensus about how to proceed.  The affected code points
  should be considered unsafe and identified as "under review" in the
+
should be considered unsafe and identified as "under review" in the
  IANA tables until final derived properties are assigned.
+
IANA tables until final derived properties are assigned.
  
4.  IDNA Assumptions and Current Practice
+
== IDNA Assumptions and Current Practice ==
  
  At the time the IDNA2008 documents were written, the assumption was
+
At the time the IDNA2008 documents were written, the assumption was
  that, if new versions of Unicode introduced incompatible changes, the
+
that, if new versions of Unicode introduced incompatible changes, the
  Standard would be updated to preserve backward compatibility for
+
Standard would be updated to preserve backward compatibility for
  users of IDNA.  For most purposes, this would be done by adding to
+
users of IDNA.  For most purposes, this would be done by adding to
  the table of exceptions associated with Rule G [RFC5892a].
+
the table of exceptions associated with Rule G [RFC5892a].
  
  This has not been the practice in the reviews completed subsequent to
+
This has not been the practice in the reviews completed subsequent to
  Unicode 5.2, as discussed in Section 3.  Incompatibilities were
+
Unicode 5.2, as discussed in Section 3.  Incompatibilities were
  identified in Unicode 6.0 [RFC6452] and in the cumulative review
+
identified in Unicode 6.0 [RFC6452] and in the cumulative review
  leading to tables for Unicode 11.0 [IDNA-Unicode11].  In all of those
+
leading to tables for Unicode 11.0 [IDNA-Unicode11].  In all of those
  cases, the decision was made to maintain compatibility with Unicode
+
cases, the decision was made to maintain compatibility with Unicode
  properties rather than with prior versions of IDNA.
+
properties rather than with prior versions of IDNA.
  
  If an algorithmic review detects changes in Unicode after version
+
If an algorithmic review detects changes in Unicode after version
  12.0 that would break compatibility with derived properties
+
12.0 that would break compatibility with derived properties
  associated with prior versions of Unicode or changes that would
+
associated with prior versions of Unicode or changes that would
  preserve compatibility within IDNA at the cost of departing from
+
preserve compatibility within IDNA at the cost of departing from
  current Unicode specifications, those changes must be captured in
+
current Unicode specifications, those changes must be captured in
  documents expected to be published as Standards Track RFCs so that
+
documents expected to be published as Standards Track RFCs so that
  the IETF can review those changes and maintain a historical record.
+
the IETF can review those changes and maintain a historical record.
  
  The community has now made decisions and updated tables for Unicode
+
The community has now made decisions and updated tables for Unicode
  6.0 [RFC6452], done catch-up work between it and Unicode 11.0
+
6.0 [RFC6452], done catch-up work between it and Unicode 11.0
  [IDNA-Unicode11], and completed the review and tables for Unicode
+
[IDNA-Unicode11], and completed the review and tables for Unicode
  12.0 [IDNA-Unicode12].  The decisions made in those cases were driven
+
12.0 [IDNA-Unicode12].  The decisions made in those cases were driven
  by preserving consistency with Unicode and Unicode property changes
+
by preserving consistency with Unicode and Unicode property changes
  for reasons most clearly explained by the IAB [IAB-Unicode-2018].
+
for reasons most clearly explained by the IAB [IAB-Unicode-2018].
  These actions were not only at variance with the language in RFC 5892
+
These actions were not only at variance with the language in RFC 5892
  but were also inconsistent with commitments to the registry and user
+
but were also inconsistent with commitments to the registry and user
  communities to ensure that IDN labels that were once valid under
+
communities to ensure that IDN labels that were once valid under
  IDNA2008 would remain valid, and previously invalid labels would
+
IDNA2008 would remain valid, and previously invalid labels would
  remain invalid, except for those labels that were invalid because
+
remain invalid, except for those labels that were invalid because
  they contained unassigned code points.
+
they contained unassigned code points.
  
  This document restores and clarifies that original language and
+
This document restores and clarifies that original language and
  intent: absent extremely strong evidence on a per-code point basis
+
intent: absent extremely strong evidence on a per-code point basis
  that preserving the validity status of possible existing (or
+
that preserving the validity status of possible existing (or
  prohibited) labels would cause significant harm, Unicode changes that
+
prohibited) labels would cause significant harm, Unicode changes that
  would affect IDNA derived properties are to be reflected in IDNA
+
would affect IDNA derived properties are to be reflected in IDNA
  exceptions that preserve the status of those labels.  There is one
+
exceptions that preserve the status of those labels.  There is one
  partial exception to this principle.  If the new code point analysis
+
partial exception to this principle.  If the new code point analysis
  (see Section 3.2) concludes that some code points or collections of
+
(see Section 3.2) concludes that some code points or collections of
  code points should be further analyzed, those code points, and labels
+
code points should be further analyzed, those code points, and labels
  including them, should be considered unsafe and used only with
+
including them, should be considered unsafe and used only with
  extreme caution because the conclusions of the analysis may change
+
extreme caution because the conclusions of the analysis may change
  their derived property values and status.
+
their derived property values and status.
  
5.  Derived Tables Published by IANA
+
== Derived Tables Published by IANA ==
  
  As discussed above, RFC 5892 specified that derived property tables
+
As discussed above, RFC 5892 specified that derived property tables
  be provided via an IANA registry.  Perhaps because most IANA
+
be provided via an IANA registry.  Perhaps because most IANA
  registries are considered normative and authoritative, that registry
+
registries are considered normative and authoritative, that registry
  has been the source of considerable confusion, including the
+
has been the source of considerable confusion, including the
  incorrect assumption that the absence of published tables for
+
incorrect assumption that the absence of published tables for
  versions of Unicode later than 6.0 meant that IDNA could not be used
+
versions of Unicode later than 6.0 meant that IDNA could not be used
  with later versions.  That position was raised in multiple ways, not
+
with later versions.  That position was raised in multiple ways, not
  all of them consistent, especially in the ICANN context
+
all of them consistent, especially in the ICANN context
  [ICANN-LGR-SLA].
+
[ICANN-LGR-SLA].
  
  If the changes specified in this document are not successful in
+
If the changes specified in this document are not successful in
  significantly mitigating the confusion about the status of the tables
+
significantly mitigating the confusion about the status of the tables
  published by IANA, serious consideration should be given to
+
published by IANA, serious consideration should be given to
  eliminating those tables entirely.
+
eliminating those tables entirely.
  
6.  Editorial Clarification to RFC 5892
+
== Editorial Clarification to RFC 5892 ==
  
  This section updates RFC 5892 to provide fixes for known applicable
+
This section updates RFC 5892 to provide fixes for known applicable
  errata and omissions.  In particular, verified RFC Editor Erratum
+
errata and omissions.  In particular, verified RFC Editor Erratum
  3312 [Err3312] provides a clarification to Appendix A and A.1 in RFC
+
3312 [Err3312] provides a clarification to Appendix A and A.1 in RFC
  5892.  That clarification is incorporated below.
+
5892.  That clarification is incorporated below.
  
  1.  In Appendix A, add a new paragraph after the paragraph that
+
1.  In Appendix A, add a new paragraph after the paragraph that
      begins "The code point...".  The new paragraph should read:
+
    begins "The code point...".  The new paragraph should read:
  
      |  For the rule to be evaluated to True for the label, it MUST be
+
    |  For the rule to be evaluated to True for the label, it MUST be
      |  evaluated separately for every occurrence of the code point in
+
    |  evaluated separately for every occurrence of the code point in
      |  the label; each of those evaluations must result in True.
+
    |  the label; each of those evaluations must result in True.
  
  2.  In Appendix A.1, replace the "Rule Set" by
+
2.  In Appendix A.1, replace the "Rule Set" by
  
          Rule Set:
+
        Rule Set:
            False;
+
          False;
            If Canonical_Combining_Class(Before(cp)) .eq. Virama
+
          If Canonical_Combining_Class(Before(cp)) .eq. Virama
                  Then True;
+
                Then True;
            If cp .eq. \u200C And
+
          If cp .eq. \u200C And
                    RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*cp
+
                RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*cp
              (Joining_Type:T)*(Joining_Type:{R,D})) Then True;
+
            (Joining_Type:T)*(Joining_Type:{R,D})) Then True;
  
7.  IANA Considerations
+
== IANA Considerations ==
  
  For the algorithmic review described in Section 3.1, the IESG is to
+
For the algorithmic review described in Section 3.1, the IESG is to
  appoint a designated expert [RFC8126] with appropriate expertise to
+
appoint a designated expert [RFC8126] with appropriate expertise to
  conduct the review and to supply derived property tables to IANA.  As
+
conduct the review and to supply derived property tables to IANA.  As
  provided in Section 5.2 of the Guidelines for Writing IANA
+
provided in Section 5.2 of the Guidelines for Writing IANA
  Considerations [RFC8126], the designated expert is expected to
+
Considerations [RFC8126], the designated expert is expected to
  consult additional sources of expertise as needed.  For the code
+
consult additional sources of expertise as needed.  For the code
  point review, the expertise will be supplied by an IESG-designated
+
point review, the expertise will be supplied by an IESG-designated
  expert team as discussed in Section 3.2 and Appendix B.  In both
+
expert team as discussed in Section 3.2 and Appendix B.  In both
  cases, the experts should draw on the expertise of other members of
+
cases, the experts should draw on the expertise of other members of
  the community as needed.  In particular, and especially if there is
+
the community as needed.  In particular, and especially if there is
  no overlap of the people holding the various roles, coordination with
+
no overlap of the people holding the various roles, coordination with
  the IAB-appointed liaison to the Unicode Consortium will be essential
+
the IAB-appointed liaison to the Unicode Consortium will be essential
  to mitigate possible errors due to confusion.
+
to mitigate possible errors due to confusion.
  
  As discussed in Section 5, IANA has modified the IDNA tables
+
As discussed in Section 5, IANA has modified the IDNA tables
  collection [IANA-IDNA-Tables] by identifying them clearly as non-
+
collection [IANA-IDNA-Tables] by identifying them clearly as non-
  normative, so that a "current" or "correct" version of those tables
+
normative, so that a "current" or "correct" version of those tables
  is not implied, and by pointing to this document for an explanation.
+
is not implied, and by pointing to this document for an explanation.
  IANA has published tables supplied by the IETF for all Unicode
+
IANA has published tables supplied by the IETF for all Unicode
  versions through 11.0, retaining all older versions and making them
+
versions through 11.0, retaining all older versions and making them
  available.  Newer tables will be constructed as specified in this
+
available.  Newer tables will be constructed as specified in this
  document and then made available by IANA.  IANA has changed the title
+
document and then made available by IANA.  IANA has changed the title
  of that registry from "IDNA Parameters", which is misleading, to
+
of that registry from "IDNA Parameters", which is misleading, to
  "IDNA Rules and Derived Property Values".
+
"IDNA Rules and Derived Property Values".
  
  The "Note" in that registry says:
+
The "Note" in that registry says:
  
  |  IDNA does not require that applications and libraries, either for
+
|  IDNA does not require that applications and libraries, either for
  |  registration/storage or lookup, support any particular version of
+
|  registration/storage or lookup, support any particular version of
  |  Unicode.  Instead, they are required to use derived property
+
|  Unicode.  Instead, they are required to use derived property
  |  values based on calculations associated with whatever version of
+
|  values based on calculations associated with whatever version of
  |  Unicode they are using elsewhere in the application or library.
+
|  Unicode they are using elsewhere in the application or library.
  |  For the convenience of application and library developers and
+
|  For the convenience of application and library developers and
  |  others, the IETF has supplied, and IANA maintains, derived
+
|  others, the IETF has supplied, and IANA maintains, derived
  |  property tables for several version of Unicode as listed below.
+
|  property tables for several version of Unicode as listed below.
  |  It should be stressed that these are not normative in that, in
+
|  It should be stressed that these are not normative in that, in
  |  principle, an application can do its own calculations and these
+
|  principle, an application can do its own calculations and these
  |  tables can change as IETF understanding evolves.  By contrast, the
+
|  tables can change as IETF understanding evolves.  By contrast, the
  |  list of code points requiring contextual rules and the associated
+
|  list of code points requiring contextual rules and the associated
  |  rules are normative and should be treated as updates to the list
+
|  rules are normative and should be treated as updates to the list
  |  in RFC 5892.
+
|  in RFC 5892.
  
  As long as the intent is preserved, the text of that note may be
+
As long as the intent is preserved, the text of that note may be
  changed in the future at IANA's discretion.
+
changed in the future at IANA's discretion.
  
  IANA's attention is called to the introduction, in Section 3.2, of a
+
IANA's attention is called to the introduction, in Section 3.2, of a
  temporary "under review" category to the PVALID, DISALLOWED, etc.,
+
temporary "under review" category to the PVALID, DISALLOWED, etc.,
  entries in the tables.
+
entries in the tables.
  
8.  Security Considerations
+
== Security Considerations ==
  
  Applying the procedures described in this document and understanding
+
Applying the procedures described in this document and understanding
  of the clarifications it provides should reduce confusion about IDNA
+
of the clarifications it provides should reduce confusion about IDNA
  requirements.  Because past confusion has provided opportunities for
+
requirements.  Because past confusion has provided opportunities for
  bad behavior, the effect of these changes should improve Internet
+
bad behavior, the effect of these changes should improve Internet
  security to at least some small extent.
+
security to at least some small extent.
  
  Because of the preference to keep the derived property value stable
+
Because of the preference to keep the derived property value stable
  (as specified in RFC 5892 and discussed in Section 4), the algorithm
+
(as specified in RFC 5892 and discussed in Section 4), the algorithm
  used to calculate those derived properties does change as explained
+
used to calculate those derived properties does change as explained
  in Section 3.  If these changes are not taken into account, the
+
in Section 3.  If these changes are not taken into account, the
  derived property value will change, and the implications might have
+
derived property value will change, and the implications might have
  negative consequences, in some cases with security implications.  For
+
negative consequences, in some cases with security implications.  For
  example, changes in the calculated derived property value for a code
+
example, changes in the calculated derived property value for a code
  point from either DISALLOWED to PVALID or from PVALID to DISALLOWED
+
point from either DISALLOWED to PVALID or from PVALID to DISALLOWED
  can cause changes in label interpretation that would be visible and
+
can cause changes in label interpretation that would be visible and
  confusing to end users and might enable attacks.
+
confusing to end users and might enable attacks.
  
9.  References
+
== References ==
  
9.1.  Normative References
+
=== Normative References ===
  
  [IANA-IDNA-Tables]
+
[IANA-IDNA-Tables]
              IANA, "IDNA Rules and Derived Property Values",
+
          IANA, "IDNA Rules and Derived Property Values",
              <https://www.iana.org/assignments/idna-tables>.
+
          <https://www.iana.org/assignments/idna-tables>.
  
  [RFC5892]  Faltstrom, P., Ed., "The Unicode Code Points and
+
[RFC5892]  Faltstrom, P., Ed., "The Unicode Code Points and
              Internationalized Domain Names for Applications (IDNA)",
+
          Internationalized Domain Names for Applications (IDNA)",
              RFC 5892, DOI 10.17487/RFC5892, August 2010,
+
          RFC 5892, DOI 10.17487/RFC5892, August 2010,
              <https://www.rfc-editor.org/info/rfc5892>.
+
          <https://www.rfc-editor.org/info/rfc5892>.
  
  [RFC5892a] Faltstrom, P., Ed., "The Unicode Code Points and
+
[RFC5892a] Faltstrom, P., Ed., "The Unicode Code Points and
              Internationalized Domain Names for Applications (IDNA)",
+
          Internationalized Domain Names for Applications (IDNA)",
              Section 2.7, RFC 5892, August 2010,
+
          Section 2.7, RFC 5892, August 2010,
              <https://www.rfc-editor.org/rfc/rfc5892.txt>.
+
          <https://www.rfc-editor.org/rfc/rfc5892.txt>.
  
  [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
+
[RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
+
          Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017,
+
          RFC 8126, DOI 10.17487/RFC8126, June 2017,
              <https://www.rfc-editor.org/info/rfc8126>.
+
          <https://www.rfc-editor.org/info/rfc8126>.
  
  [Unicode]  The Unicode Consortium, "The Unicode Standard (Current
+
[Unicode]  The Unicode Consortium, "The Unicode Standard (Current
              Version)", <http://www.unicode.org/versions/latest/>.  The
+
          Version)", <http://www.unicode.org/versions/latest/>.  The
              link given will always access the current version of the
+
          link given will always access the current version of the
              Unicode Standard, independent of its version number or
+
          Unicode Standard, independent of its version number or
              date.
+
          date.
  
  [Unicode-properties]
+
[Unicode-properties]
              The Unicode Consortium, "The Unicode Standard Version
+
          The Unicode Consortium, "The Unicode Standard Version
              11.0", Section 3.5, 2018,
+
          11.0", Section 3.5, 2018,
              <https://www.unicode.org/versions/Unicode11.0.0/>.
+
          <https://www.unicode.org/versions/Unicode11.0.0/>.
  
9.2.  Informative References
+
=== Informative References ===
  
  [Err3312]  RFC Errata, Erratum ID 3312, RFC 5892,
+
[Err3312]  RFC Errata, Erratum ID 3312, RFC 5892,
              <https://www.rfc-editor.org/errata/eid3312>.
+
          <https://www.rfc-editor.org/errata/eid3312>.
  
  [IAB-Unicode-2018]
+
[IAB-Unicode-2018]
              Internet Architecture Board (IAB), "IAB Statement on
+
          Internet Architecture Board (IAB), "IAB Statement on
              Identifiers and Unicode", 15 March 2018,
+
          Identifiers and Unicode", 15 March 2018,
              <https://www.iab.org/documents/correspondence-reports-
+
          <https://www.iab.org/documents/correspondence-reports-
              documents/2018-2/iab-statement-on-identifiers-and-
+
          documents/2018-2/iab-statement-on-identifiers-and-
              unicode/>.
+
          unicode/>.
  
  [IAB-Unicode7-2015]
+
[IAB-Unicode7-2015]
              Internet Architecture Board (IAB), "IAB Statement on
+
          Internet Architecture Board (IAB), "IAB Statement on
              Identifiers and Unicode 7.0.0", 11 February 2015,
+
          Identifiers and Unicode 7.0.0", 11 February 2015,
              <https://www.iab.org/documents/correspondence-reports-
+
          <https://www.iab.org/documents/correspondence-reports-
              documents/2015-2/iab-statement-on-identifiers-and-unicode-
+
          documents/2015-2/iab-statement-on-identifiers-and-unicode-
              7-0-0/>.
+
          7-0-0/>.
  
  [ICANN-LGR-SLA]
+
[ICANN-LGR-SLA]
              Internet Corporation for Assigned Names and Numbers
+
          Internet Corporation for Assigned Names and Numbers
              (ICANN), "Proposed IANA SLAs for Publishing LGRs/IDN
+
          (ICANN), "Proposed IANA SLAs for Publishing LGRs/IDN
              Tables", 10 June 2019, <https://www.icann.org/public-
+
          Tables", 10 June 2019, <https://www.icann.org/public-
              comments/proposed-iana-sla-lgr-idn-tables-2019-06-10-en>.
+
          comments/proposed-iana-sla-lgr-idn-tables-2019-06-10-en>.
  
  [IDNA-Unicode7]
+
[IDNA-Unicode7]
              Klensin, J. and P. Faltstrom, "IDNA Update for Unicode 7.0
+
          Klensin, J. and P. Faltstrom, "IDNA Update for Unicode 7.0
              and Later Versions", Work in Progress, Internet-Draft,
+
          and Later Versions", Work in Progress, Internet-Draft,
              draft-klensin-idna-5892upd-unicode70-05, 8 October 2017,
+
          draft-klensin-idna-5892upd-unicode70-05, 8 October 2017,
              <https://tools.ietf.org/html/draft-klensin-idna-5892upd-
+
          <https://tools.ietf.org/html/draft-klensin-idna-5892upd-
              unicode70-05>.
+
          unicode70-05>.
  
  [IDNA-Unicode11]
+
[IDNA-Unicode11]
              Faltstrom, P., "IDNA2008 and Unicode 11.0.0", Work in
+
          Faltstrom, P., "IDNA2008 and Unicode 11.0.0", Work in
              Progress, Internet-Draft, draft-faltstrom-unicode11-08, 11
+
          Progress, Internet-Draft, draft-faltstrom-unicode11-08, 11
              March 2019, <https://tools.ietf.org/html/draft-faltstrom-
+
          March 2019, <https://tools.ietf.org/html/draft-faltstrom-
              unicode11-08>.
+
          unicode11-08>.
  
  [IDNA-Unicode12]
+
[IDNA-Unicode12]
              Faltstrom, P., "IDNA2008 and Unicode 12.0.0", Work in
+
          Faltstrom, P., "IDNA2008 and Unicode 12.0.0", Work in
              Progress, Internet-Draft, draft-faltstrom-unicode12-00, 11
+
          Progress, Internet-Draft, draft-faltstrom-unicode12-00, 11
              March 2019, <https://tools.ietf.org/html/draft-faltstrom-
+
          March 2019, <https://tools.ietf.org/html/draft-faltstrom-
              unicode12-00>.
+
          unicode12-00>.
  
  [RegRestr] Klensin, J. and A. Freytag, "Internationalized Domain
+
[RegRestr] Klensin, J. and A. Freytag, "Internationalized Domain
              Names in Applications (IDNA): Registry Restrictions and
+
          Names in Applications (IDNA): Registry Restrictions and
              Recommendations", Work in Progress, Internet-Draft, draft-
+
          Recommendations", Work in Progress, Internet-Draft, draft-
              klensin-idna-rfc5891bis-05, 29 August 2019,
+
          klensin-idna-rfc5891bis-05, 29 August 2019,
              <https://tools.ietf.org/html/draft-klensin-idna-
+
          <https://tools.ietf.org/html/draft-klensin-idna-
              rfc5891bis-05>.
+
          rfc5891bis-05>.
  
  [RFC1766]  Alvestrand, H., "Tags for the Identification of
+
[RFC1766]  Alvestrand, H., "Tags for the Identification of
              Languages", RFC 1766, DOI 10.17487/RFC1766, March 1995,
+
          Languages", RFC 1766, DOI 10.17487/RFC1766, March 1995,
              <https://www.rfc-editor.org/info/rfc1766>.
+
          <https://www.rfc-editor.org/info/rfc1766>.
  
  [RFC3282]  Alvestrand, H., "Content Language Headers", RFC 3282,
+
[RFC3282]  Alvestrand, H., "Content Language Headers", RFC 3282,
              DOI 10.17487/RFC3282, May 2002,
+
          DOI 10.17487/RFC3282, May 2002,
              <https://www.rfc-editor.org/info/rfc3282>.
+
          <https://www.rfc-editor.org/info/rfc3282>.
  
  [RFC3454]  Hoffman, P. and M. Blanchet, "Preparation of
+
[RFC3454]  Hoffman, P. and M. Blanchet, "Preparation of
              Internationalized Strings ("stringprep")", RFC 3454,
+
          Internationalized Strings ("stringprep")", RFC 3454,
              DOI 10.17487/RFC3454, December 2002,
+
          DOI 10.17487/RFC3454, December 2002,
              <https://www.rfc-editor.org/info/rfc3454>.
+
          <https://www.rfc-editor.org/info/rfc3454>.
  
  [RFC3490]  Faltstrom, P., Hoffman, P., and A. Costello,
+
[RFC3490]  Faltstrom, P., Hoffman, P., and A. Costello,
              "Internationalizing Domain Names in Applications (IDNA)",
+
          "Internationalizing Domain Names in Applications (IDNA)",
              RFC 3490, DOI 10.17487/RFC3490, March 2003,
+
          RFC 3490, DOI 10.17487/RFC3490, March 2003,
              <https://www.rfc-editor.org/info/rfc3490>.
+
          <https://www.rfc-editor.org/info/rfc3490>.
  
  [RFC3491]  Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
+
[RFC3491]  Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
              Profile for Internationalized Domain Names (IDN)",
+
          Profile for Internationalized Domain Names (IDN)",
              RFC 3491, DOI 10.17487/RFC3491, March 2003,
+
          RFC 3491, DOI 10.17487/RFC3491, March 2003,
              <https://www.rfc-editor.org/info/rfc3491>.
+
          <https://www.rfc-editor.org/info/rfc3491>.
  
  [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
+
[RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
+
          10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
              2003, <https://www.rfc-editor.org/info/rfc3629>.
+
          2003, <https://www.rfc-editor.org/info/rfc3629>.
  
  [RFC4690]  Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and
+
[RFC4690]  Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and
              Recommendations for Internationalized Domain Names
+
          Recommendations for Internationalized Domain Names
              (IDNs)", RFC 4690, DOI 10.17487/RFC4690, September 2006,
+
          (IDNs)", RFC 4690, DOI 10.17487/RFC4690, September 2006,
              <https://www.rfc-editor.org/info/rfc4690>.
+
          <https://www.rfc-editor.org/info/rfc4690>.
  
  [RFC5646]  Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying
+
[RFC5646]  Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying
              Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646,
+
          Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646,
              September 2009, <https://www.rfc-editor.org/info/rfc5646>.
+
          September 2009, <https://www.rfc-editor.org/info/rfc5646>.
  
  [RFC5890]  Klensin, J., "Internationalized Domain Names for
+
[RFC5890]  Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Definitions and Document Framework",
+
          Applications (IDNA): Definitions and Document Framework",
              RFC 5890, DOI 10.17487/RFC5890, August 2010,
+
          RFC 5890, DOI 10.17487/RFC5890, August 2010,
              <https://www.rfc-editor.org/info/rfc5890>.
+
          <https://www.rfc-editor.org/info/rfc5890>.
  
  [RFC5894]  Klensin, J., "Internationalized Domain Names for
+
[RFC5894]  Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Background, Explanation, and
+
          Applications (IDNA): Background, Explanation, and
              Rationale", RFC 5894, DOI 10.17487/RFC5894, August 2010,
+
          Rationale", RFC 5894, DOI 10.17487/RFC5894, August 2010,
              <https://www.rfc-editor.org/info/rfc5894>.
+
          <https://www.rfc-editor.org/info/rfc5894>.
  
  [RFC6452]  Faltstrom, P., Ed. and P. Hoffman, Ed., "The Unicode Code
+
[RFC6452]  Faltstrom, P., Ed. and P. Hoffman, Ed., "The Unicode Code
              Points and Internationalized Domain Names for Applications
+
          Points and Internationalized Domain Names for Applications
              (IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452,
+
          (IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452,
              November 2011, <https://www.rfc-editor.org/info/rfc6452>.
+
          November 2011, <https://www.rfc-editor.org/info/rfc6452>.
  
 
Appendix A.  Summary of Changes to RFC 5892
 
Appendix A.  Summary of Changes to RFC 5892
  
  Other than the editorial correction specified in Section 6, all of
+
Other than the editorial correction specified in Section 6, all of
  the changes in this document are concerned with the reviews for new
+
the changes in this document are concerned with the reviews for new
  versions of Unicode and with the IANA Considerations in Section 5 of
+
versions of Unicode and with the IANA Considerations in Section 5 of
  [RFC5892], particularly Section 5.1 of [RFC5892].  Whether the
+
[RFC5892], particularly Section 5.1 of [RFC5892].  Whether the
  changes are substantive or merely clarifications may be somewhat in
+
changes are substantive or merely clarifications may be somewhat in
  the eye of the beholder, so the list below should not be assumed to
+
the eye of the beholder, so the list below should not be assumed to
  be comprehensive.  At a very high level, this document clarifies that
+
be comprehensive.  At a very high level, this document clarifies that
  two types of review were intended and separates them for clarity.
+
two types of review were intended and separates them for clarity.
  This document also restores the original (but so far unobserved)
+
This document also restores the original (but so far unobserved)
  default for actions when code point derived properties change.  For
+
default for actions when code point derived properties change.  For
  this reason, this document effectively replaces Section 5.1 of
+
this reason, this document effectively replaces Section 5.1 of
  [RFC5892] and adds or changes some text so that the replacement makes
+
[RFC5892] and adds or changes some text so that the replacement makes
  better sense.
+
better sense.
  
  Changes or clarifications that may be considered important include:
+
Changes or clarifications that may be considered important include:
  
  *  Separated the new Unicode version review into two explicit parts
+
*  Separated the new Unicode version review into two explicit parts
      and provided for different review methods and, potentially,
+
  and provided for different review methods and, potentially,
      asynchronous outcomes.
+
  asynchronous outcomes.
  
  *  Specified a DE team, not a single designated expert, for the code
+
*  Specified a DE team, not a single designated expert, for the code
      point review.
+
  point review.
  
  *  Eliminated the de facto requirement for the (formerly single)
+
*  Eliminated the de facto requirement for the (formerly single)
      designated expert to be the same person as the IAB's liaison to
+
  designated expert to be the same person as the IAB's liaison to
      the Unicode Consortium, but called out the importance of
+
  the Unicode Consortium, but called out the importance of
      coordination.
+
  coordination.
  
  *  Created the "Status" field in the IANA tables to inform the
+
*  Created the "Status" field in the IANA tables to inform the
      community about specific potentially problematic code points.
+
  community about specific potentially problematic code points.
      This change creates the ability to add information about such code
+
  This change creates the ability to add information about such code
      points before IETF review is completed instead of having the
+
  points before IETF review is completed instead of having the
      review process hold up the use of the new Unicode version.
+
  review process hold up the use of the new Unicode version.
  
  *  In part because Unicode is now on a regular one-year cycle rather
+
*  In part because Unicode is now on a regular one-year cycle rather
      than producing major and minor versions as needed, to avoid
+
  than producing major and minor versions as needed, to avoid
      overloading the IETF's internationalization resources, and to
+
  overloading the IETF's internationalization resources, and to
      avoid generating and storing IANA tables for trivial changes
+
  avoid generating and storing IANA tables for trivial changes
      (e.g., the single new code point in Unicode 12.1), the review
+
  (e.g., the single new code point in Unicode 12.1), the review
      procedure is applied only to major versions of Unicode unless
+
  procedure is applied only to major versions of Unicode unless
      exceptional circumstances arise and are identified.
+
  exceptional circumstances arise and are identified.
  
 
Appendix B.  Background and Rationale for Expert Review Procedure for
 
Appendix B.  Background and Rationale for Expert Review Procedure for
            New Code Point Analysis
+
          New Code Point Analysis
  
  The expert review procedure for new code point analysis described in
+
The expert review procedure for new code point analysis described in
  Section 3.2 is somewhat unusual compared to the examples presented in
+
Section 3.2 is somewhat unusual compared to the examples presented in
  the Guidelines for Writing IANA Considerations [RFC8126].  This
+
the Guidelines for Writing IANA Considerations [RFC8126].  This
  appendix explains that choice and provides the background for it.
+
appendix explains that choice and provides the background for it.
  
  Development of specifications to support use of languages and writing
+
Development of specifications to support use of languages and writing
  systems other than English (and Latin script) -- so-called
+
systems other than English (and Latin script) -- so-called
  "internationalization" or "i18n" -- has always been problematic in
+
"internationalization" or "i18n" -- has always been problematic in
  the IETF, especially when requirements go beyond simple coding of
+
the IETF, especially when requirements go beyond simple coding of
  characters (e.g., RFC 3629 [RFC3629]) or simple identification of
+
characters (e.g., RFC 3629 [RFC3629]) or simple identification of
  languages (e.g., RFC 3282 [RFC3282] and the earlier RFC 1766
+
languages (e.g., RFC 3282 [RFC3282] and the earlier RFC 1766
  [RFC1766]).  A good deal of specialized knowledge is required,
+
[RFC1766]).  A good deal of specialized knowledge is required,
  knowledge that comes from multiple fields and that requires multiple
+
knowledge that comes from multiple fields and that requires multiple
  perspectives.  The work is not obviously more complex than routing,
+
perspectives.  The work is not obviously more complex than routing,
  especially if one assumes that routing work requires a solid
+
especially if one assumes that routing work requires a solid
  foundation in graph theory or network optimization, or than security
+
foundation in graph theory or network optimization, or than security
  and cryptography, but people working in those areas are drawn to the
+
and cryptography, but people working in those areas are drawn to the
  IETF and people from the fields that bear on internationalization
+
IETF and people from the fields that bear on internationalization
  typically are not.
+
typically are not.
  
  As a result, we have often thought we understood a problem, generated
+
As a result, we have often thought we understood a problem, generated
  a specification or set of specifications, but then have been
+
a specification or set of specifications, but then have been
  surprised by unanticipated (by the IETF) issues.  We then needed to
+
surprised by unanticipated (by the IETF) issues.  We then needed to
  tune and often revise our specification.  The language tag work that
+
tune and often revise our specification.  The language tag work that
  started with RFC 1766 is a good example of this: broader
+
started with RFC 1766 is a good example of this: broader
  considerations and requirements led to later work and a much more
+
considerations and requirements led to later work and a much more
  complex and finer-grained system [RFC5646].
+
complex and finer-grained system [RFC5646].
  
  Work on IDNs further increased the difficulties because many of the
+
Work on IDNs further increased the difficulties because many of the
  decisions that led to the current version of IDNA require
+
decisions that led to the current version of IDNA require
  understanding the DNS, its constraints, and, to at least some extent,
+
understanding the DNS, its constraints, and, to at least some extent,
  the commercial market of domain names, including various ICANN
+
the commercial market of domain names, including various ICANN
  efforts.
+
efforts.
  
  The net result of these factors is that it is extremely unlikely that
+
The net result of these factors is that it is extremely unlikely that
  the IESG will ever find a designated expert whose knowledge and
+
the IESG will ever find a designated expert whose knowledge and
  understanding will include everything that is required.
+
understanding will include everything that is required.
  
  Consequently, Section 7 and other discussions in this document
+
Consequently, Section 7 and other discussions in this document
  specify a DE team that is expected to have the broad perspective,
+
specify a DE team that is expected to have the broad perspective,
  expertise, and access to information and community in order to review
+
expertise, and access to information and community in order to review
  new Unicode versions and to make consensus recommendations that will
+
new Unicode versions and to make consensus recommendations that will
  serve the Internet well.  While we anticipate that the team will have
+
serve the Internet well.  While we anticipate that the team will have
  one or more leaders, the structure of the team differs from the
+
one or more leaders, the structure of the team differs from the
  suggestions given in Section 5.2 of the Guidelines for Writing IANA
+
suggestions given in Section 5.2 of the Guidelines for Writing IANA
  Considerations [RFC8126] since neither the team's formation nor its
+
Considerations [RFC8126] since neither the team's formation nor its
  consultation is left to the discretion of the designated expert, nor
+
consultation is left to the discretion of the designated expert, nor
  is the designated expert solely accountable to the community.  A team
+
is the designated expert solely accountable to the community.  A team
  that contains multiple perspectives is required, the team members are
+
that contains multiple perspectives is required, the team members are
  accountable as a group, and any nontrivial recommendations require
+
accountable as a group, and any nontrivial recommendations require
  team consensus.  This also differs from the common practice in the
+
team consensus.  This also differs from the common practice in the
  IETF of "review teams" from which a single member is selected to
+
IETF of "review teams" from which a single member is selected to
  perform a review: the principle for these reviews is team effort.
+
perform a review: the principle for these reviews is team effort.
  
 
Acknowledgments
 
Acknowledgments
  
  This document was inspired by extensive discussions within the I18N
+
This document was inspired by extensive discussions within the I18N
  Directorate of the IETF Applications and Real-Time (ART) area in the
+
Directorate of the IETF Applications and Real-Time (ART) area in the
  first quarter of 2019 about sorting out the reviews for Unicode 11.0
+
first quarter of 2019 about sorting out the reviews for Unicode 11.0
  and 12.0.  Careful reviews by Joel Halpern and text suggestions from
+
and 12.0.  Careful reviews by Joel Halpern and text suggestions from
  Barry Leiba resulted in some clarifications.
+
Barry Leiba resulted in some clarifications.
  
  Thanks to Christopher Wood for catching some editorial errors that
+
Thanks to Christopher Wood for catching some editorial errors that
  persisted until rather late in the document's life cycle and to
+
persisted until rather late in the document's life cycle and to
  Benjamin Kaduk for catching and raising a number of questions during
+
Benjamin Kaduk for catching and raising a number of questions during
  Last Call.  Some of the issues they raised have been reflected in the
+
Last Call.  Some of the issues they raised have been reflected in the
  document; others did not appear to be desirable modifications after
+
document; others did not appear to be desirable modifications after
  further discussion, but the questions were definitely worth raising
+
further discussion, but the questions were definitely worth raising
  and discussing.
+
and discussing.
  
 
Authors' Addresses
 
Authors' Addresses
  
  John C Klensin
+
John C Klensin
  1770 Massachusetts Ave, Ste 322
+
1770 Massachusetts Ave, Ste 322
  Cambridge, MA 02140
+
Cambridge, MA 02140
  United States of America
+
United States of America
 
 
  Phone: +1 617 245 1457
 
 
  
 +
Phone: +1 617 245 1457
 +
  
  Patrik Fältström
+
Patrik Fältström
  Netnod
+
Netnod
  Greta Garbos Väg 13
+
Greta Garbos Väg 13
  SE-169 40 Solna
+
SE-169 40 Solna
  Sweden
+
Sweden
  
  Phone: +46 70 6059051
+
Phone: +46 70 6059051
+

Revision as of 13:11, 27 September 2020



Internet Engineering Task Force (IETF) J. Klensin Request for Comments: 8753 Updates: 5892 P. Fältström Category: Standards Track Netnod ISSN: 2070-1721 April 2020

Internationalized Domain Names for Applications (IDNA) Review for New
                        Unicode Versions

Abstract

The standards for Internationalized Domain Names in Applications (IDNA) require a review of each new version of Unicode to determine whether incompatibilities with prior versions or other issues exist and, where appropriate, to allow the IETF to decide on the trade-offs between compatibility with prior IDNA versions and compatibility with Unicode going forward. That requirement, and its relationship to tables maintained by IANA, has caused significant confusion in the past. This document makes adjustments to the review procedure based on experience and updates IDNA, specifically RFC 5892, to reflect those changes and to clarify the various relationships involved. It also makes other minor adjustments to align that document with experience.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc8753.

Copyright Notice

Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction 2. Brief History of IDNA Versions, the Review Requirement, and RFC

       5982

3. The Review Model

 3.1.  Review Model Part I: Algorithmic Comparison
 3.2.  Review Model Part II: New Code Point Analysis

4. IDNA Assumptions and Current Practice 5. Derived Tables Published by IANA 6. Editorial Clarification to RFC 5892 7. IANA Considerations 8. Security Considerations 9. References

 9.1.  Normative References
 9.2.  Informative References

Appendix A. Summary of Changes to RFC 5892 Appendix B. Background and Rationale for Expert Review Procedure

       for New Code Point Analysis

Acknowledgments Authors' Addresses

Introduction

The standards for Internationalized Domain Names in Applications (IDNA) require a review of each new version of Unicode to determine whether incompatibilities with prior versions or other issues exist and, where appropriate, to allow the IETF to decide on the trade-offs between compatibility with prior IDNA versions and compatibility with Unicode [Unicode] going forward. That requirement, and its relationship to tables maintained by IANA, has caused significant confusion in the past (see Section 3 and Section 4 for additional discussion of the question of appropriate decisions and the history of these reviews). This document makes adjustments to the review procedure based on nearly a decade of experience and updates IDNA, specifically the document that specifies the relationship between Unicode code points and IDNA derived properties [RFC5892], to reflect those changes and to clarify the various relationships involved.

This specification does not change the requirement that registries at all levels of the DNS tree take responsibility for the labels they insert in the DNS, a level of responsibility that requires allowing only a subset of the code points and strings allowed by the IDNA protocol itself. That requirement is discussed in more detail in a companion document [RegRestr].

Terminology note: In this document, "IDNA" refers to the current version as described in RFC 5890 [RFC5890] and subsequent documents and sometimes known as "IDNA2008". Distinctions between it and the earlier version are explicit only where they are necessary for understanding the relationships involved, e.g., in Section 2.

Brief History of IDNA Versions, the Review Requirement, and RFC 5982

The original, now-obsolete, version of IDNA, commonly known as "IDNA2003" [RFC3490] [RFC3491], was defined in terms of a profile of a collection of IETF-specific tables [RFC3454] that specified the usability of each Unicode code point with IDNA. Because the tables themselves were normative, they were intrinsically tied to a particular version of Unicode. As Unicode evolved, the IDNA2003 standard would have required the creation of a new profile for each new version of Unicode, or the tables would have fallen further and further behind.

When IDNA2003 was superseded by the current version, known as IDNA2008 [RFC5890], a different strategy, one that was property-based rather than table-based, was adopted for a number of reasons, of which the reliance on normative tables was not dominant [RFC4690]. In the IDNA2008 model, the use of normative tables was replaced by a set of procedures and rules that operated on Unicode properties [Unicode-properties] and a few internal definitions to determine the category and status, and hence an IDNA-specific "derived property", for any given code point. Those rules are, in principle, independent of Unicode versions. They can be applied to any version of Unicode, at least from approximately version 5.0 forward, to yield an appropriate set of derived properties. However, the working group that defined IDNA2008 recognized that not all of the Unicode properties were completely stable and that, because the criteria for new code points and property assignment used by the Unicode Consortium might not precisely align with the needs of IDNA, there were possibilities of incompatible changes to the derived property values. More specifically, there could be changes that would make previously disallowed labels valid, previously valid labels disallowed, or that would be disruptive to IDNA's defining rule structure. Consequently, IDNA2008 provided for an expert review of each new version of Unicode with the possibility of providing exceptions to the rules for particular new code points, code points whose properties had changed, and newly discovered issues with the IDNA2008 collection of rules. When problems were identified, the reviewer was expected to notify the IESG. The assumption was that the IETF would review the situation and modify IDNA2008 as needed, most likely by adding exceptions to preserve backward compatibility (see Section 3.1).

For the convenience of the community, IDNA2008 also provided that IANA would maintain copies of calculated tables resulting from each review, showing the derived properties for each code point. Those tables were expected to be helpful, especially to those without the facilities to easily compute derived properties themselves. Experience with the community and those tables has shown that they have been confused with the normative tables of IDNA2003: the IDNA2008 tables published by IANA have never been normative, and statements about IDNA2008 being out of date with regard to some Unicode version because the IANA tables have not been updated are incorrect or meaningless.

The Review Model

While the text has sometimes been interpreted differently, IDNA2008 actually calls for two types of review when a new Unicode version is introduced. One is an algorithmic comparison of the set of derived properties calculated from the new version of Unicode to the derived properties calculated from the previous one to determine whether incompatible changes have occurred. The other is a review of newly assigned code points to determine whether any of them require special treatment (e.g., assignment of what IDNA2008 calls contextual rules) and whether any of them violate any of the assumptions underlying the IDNA2008 derived property calculations. Any of the cases of either review might require either per-code point exceptions or other adjustments to the rules for deriving properties that are part of RFC 5892. The subsections below provide a revised specification for the review procedure.

Unless the IESG or the designated expert team concludes that there are special problems or unusual circumstances, these reviews will be performed only for major Unicode versions (those numbered NN.0, e.g., 12.0) and not for minor updates (e.g., 12.1).

As can be seen in the detailed descriptions in the following subsections, proper review will require a team of experts that has both broad and specific skills in reviewing Unicode characters and their properties in relation to both the written standards and operational needs. The IESG will need to appoint experts who can draw on the broader community to obtain the necessary skills for particular situations. See the IANA Considerations (Section 7) for details.

Review Model Part I: Algorithmic Comparison

Section 5.1 of RFC 5892 is the description of the process for creating the initial IANA tables. It is noteworthy that, while it can be read as strongly implying new reviews and new tables for versions of Unicode after 5.2, it does not explicitly specify those reviews or, e.g., the timetable for completing them. It also indicates that incompatibilities are to be "flagged for the IESG" but does not specify exactly what the IESG is to do about them and when. For reasons related to the other type of review and discussed below, only one review was completed, documented [RFC6452], and a set of corresponding new tables installed. That review, which was for Unicode 6.0, found only three incompatibilities; the consensus was to ignore them (not create exceptions in IDNA2008) and to remain consistent with computations based on current (Unicode 6.0) properties rather than preserving backward compatibility within IDNA. The 2018 review (for Unicode 11.0 and versions in between it and 6.0) [IDNA-Unicode12] also concluded that Unicode compatibility, rather than IDNA backward compatibility, should be maintained. That decision was partially driven by the long period between reviews and the concern that table calculations by others in the interim could result in unexpected incompatibilities if derived property definitions were then changed. See Section 4 for further discussion of these preferences.

Review Model Part II: New Code Point Analysis

The second type of review, which is not clearly explained in RFC 5892, is intended to identify cases in which newly added or recently discovered problematic code points violate the design assumptions of IDNA, to identify defects in those assumptions, or to identify inconsistencies (from an IDNA perspective) with Unicode commitments about assignment, properties, and stability of newly added code points. One example of this type of review was the discovery of new code points after Unicode 7.0 that were potentially visually equivalent, in the same script, to previously available code point sequences [IAB-Unicode7-2015] [IDNA-Unicode7].

Because multiple perspectives on Unicode and writing systems are required, this review will not be successful unless it is done by a team. Finding one all-knowing expert is improbable, and a single expert is unlikely to produce an adequate analysis. Rather than any single expert being the sole source of analysis, the designated expert (DE) team needs to understand that there will always be gaps in their knowledge, to know what they don't know, and to work to find the expertise that each review requires. It is also important that the DE team maintains close contact with the Area Directors (ADs) and that the ADs remain aware of the team's changing needs, examining and adjusting the team's membership over time, with periodic reexamination at least annually. It should also be recognized that, if this review identifies a problem, that problem is likely to be complex and/or involve multiple trade-offs. Actions to deal with it are likely to be disruptive (although perhaps not to large communities of users), or to leave security risks (opportunities for attacks and inadvertent confusion as expected matches do not occur), or to cause excessive reliance on registries understanding and taking responsibility for what they are registering [RFC5894] [RegRestr]. The latter, while a requirement of IDNA, has often not worked out well in the past.

Because resolution of problems identified by this part of the review may take some time even if that resolution is to add additional contextual rules or to disallow one or more code points, there will be cases in which it will be appropriate to publish the results of the algorithmic review and to provide IANA with corresponding tables, with warnings about code points whose status is uncertain until there is IETF consensus about how to proceed. The affected code points should be considered unsafe and identified as "under review" in the IANA tables until final derived properties are assigned.

IDNA Assumptions and Current Practice

At the time the IDNA2008 documents were written, the assumption was that, if new versions of Unicode introduced incompatible changes, the Standard would be updated to preserve backward compatibility for users of IDNA. For most purposes, this would be done by adding to the table of exceptions associated with Rule G [RFC5892a].

This has not been the practice in the reviews completed subsequent to Unicode 5.2, as discussed in Section 3. Incompatibilities were identified in Unicode 6.0 [RFC6452] and in the cumulative review leading to tables for Unicode 11.0 [IDNA-Unicode11]. In all of those cases, the decision was made to maintain compatibility with Unicode properties rather than with prior versions of IDNA.

If an algorithmic review detects changes in Unicode after version 12.0 that would break compatibility with derived properties associated with prior versions of Unicode or changes that would preserve compatibility within IDNA at the cost of departing from current Unicode specifications, those changes must be captured in documents expected to be published as Standards Track RFCs so that the IETF can review those changes and maintain a historical record.

The community has now made decisions and updated tables for Unicode 6.0 [RFC6452], done catch-up work between it and Unicode 11.0 [IDNA-Unicode11], and completed the review and tables for Unicode 12.0 [IDNA-Unicode12]. The decisions made in those cases were driven by preserving consistency with Unicode and Unicode property changes for reasons most clearly explained by the IAB [IAB-Unicode-2018]. These actions were not only at variance with the language in RFC 5892 but were also inconsistent with commitments to the registry and user communities to ensure that IDN labels that were once valid under IDNA2008 would remain valid, and previously invalid labels would remain invalid, except for those labels that were invalid because they contained unassigned code points.

This document restores and clarifies that original language and intent: absent extremely strong evidence on a per-code point basis that preserving the validity status of possible existing (or prohibited) labels would cause significant harm, Unicode changes that would affect IDNA derived properties are to be reflected in IDNA exceptions that preserve the status of those labels. There is one partial exception to this principle. If the new code point analysis (see Section 3.2) concludes that some code points or collections of code points should be further analyzed, those code points, and labels including them, should be considered unsafe and used only with extreme caution because the conclusions of the analysis may change their derived property values and status.

Derived Tables Published by IANA

As discussed above, RFC 5892 specified that derived property tables be provided via an IANA registry. Perhaps because most IANA registries are considered normative and authoritative, that registry has been the source of considerable confusion, including the incorrect assumption that the absence of published tables for versions of Unicode later than 6.0 meant that IDNA could not be used with later versions. That position was raised in multiple ways, not all of them consistent, especially in the ICANN context [ICANN-LGR-SLA].

If the changes specified in this document are not successful in significantly mitigating the confusion about the status of the tables published by IANA, serious consideration should be given to eliminating those tables entirely.

Editorial Clarification to RFC 5892

This section updates RFC 5892 to provide fixes for known applicable errata and omissions. In particular, verified RFC Editor Erratum 3312 [Err3312] provides a clarification to Appendix A and A.1 in RFC 5892. That clarification is incorporated below.

1. In Appendix A, add a new paragraph after the paragraph that

   begins "The code point...".  The new paragraph should read:
   |  For the rule to be evaluated to True for the label, it MUST be
   |  evaluated separately for every occurrence of the code point in
   |  the label; each of those evaluations must result in True.

2. In Appendix A.1, replace the "Rule Set" by

       Rule Set:
         False;
         If Canonical_Combining_Class(Before(cp)) .eq. Virama
               Then True;
         If cp .eq. \u200C And
                RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*cp
           (Joining_Type:T)*(Joining_Type:{R,D})) Then True;

IANA Considerations

For the algorithmic review described in Section 3.1, the IESG is to appoint a designated expert [RFC8126] with appropriate expertise to conduct the review and to supply derived property tables to IANA. As provided in Section 5.2 of the Guidelines for Writing IANA Considerations [RFC8126], the designated expert is expected to consult additional sources of expertise as needed. For the code point review, the expertise will be supplied by an IESG-designated expert team as discussed in Section 3.2 and Appendix B. In both cases, the experts should draw on the expertise of other members of the community as needed. In particular, and especially if there is no overlap of the people holding the various roles, coordination with the IAB-appointed liaison to the Unicode Consortium will be essential to mitigate possible errors due to confusion.

As discussed in Section 5, IANA has modified the IDNA tables collection [IANA-IDNA-Tables] by identifying them clearly as non- normative, so that a "current" or "correct" version of those tables is not implied, and by pointing to this document for an explanation. IANA has published tables supplied by the IETF for all Unicode versions through 11.0, retaining all older versions and making them available. Newer tables will be constructed as specified in this document and then made available by IANA. IANA has changed the title of that registry from "IDNA Parameters", which is misleading, to "IDNA Rules and Derived Property Values".

The "Note" in that registry says:

| IDNA does not require that applications and libraries, either for | registration/storage or lookup, support any particular version of | Unicode. Instead, they are required to use derived property | values based on calculations associated with whatever version of | Unicode they are using elsewhere in the application or library. | For the convenience of application and library developers and | others, the IETF has supplied, and IANA maintains, derived | property tables for several version of Unicode as listed below. | It should be stressed that these are not normative in that, in | principle, an application can do its own calculations and these | tables can change as IETF understanding evolves. By contrast, the | list of code points requiring contextual rules and the associated | rules are normative and should be treated as updates to the list | in RFC 5892.

As long as the intent is preserved, the text of that note may be changed in the future at IANA's discretion.

IANA's attention is called to the introduction, in Section 3.2, of a temporary "under review" category to the PVALID, DISALLOWED, etc., entries in the tables.

Security Considerations

Applying the procedures described in this document and understanding of the clarifications it provides should reduce confusion about IDNA requirements. Because past confusion has provided opportunities for bad behavior, the effect of these changes should improve Internet security to at least some small extent.

Because of the preference to keep the derived property value stable (as specified in RFC 5892 and discussed in Section 4), the algorithm used to calculate those derived properties does change as explained in Section 3. If these changes are not taken into account, the derived property value will change, and the implications might have negative consequences, in some cases with security implications. For example, changes in the calculated derived property value for a code point from either DISALLOWED to PVALID or from PVALID to DISALLOWED can cause changes in label interpretation that would be visible and confusing to end users and might enable attacks.

References

Normative References

[IANA-IDNA-Tables]

          IANA, "IDNA Rules and Derived Property Values",
          <https://www.iana.org/assignments/idna-tables>.

[RFC5892] Faltstrom, P., Ed., "The Unicode Code Points and

          Internationalized Domain Names for Applications (IDNA)",
          RFC 5892, DOI 10.17487/RFC5892, August 2010,
          <https://www.rfc-editor.org/info/rfc5892>.

[RFC5892a] Faltstrom, P., Ed., "The Unicode Code Points and

          Internationalized Domain Names for Applications (IDNA)",
          Section 2.7, RFC 5892, August 2010,
          <https://www.rfc-editor.org/rfc/rfc5892.txt>.

[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for

          Writing an IANA Considerations Section in RFCs", BCP 26,
          RFC 8126, DOI 10.17487/RFC8126, June 2017,
          <https://www.rfc-editor.org/info/rfc8126>.

[Unicode] The Unicode Consortium, "The Unicode Standard (Current

          Version)", <http://www.unicode.org/versions/latest/>.  The
          link given will always access the current version of the
          Unicode Standard, independent of its version number or
          date.

[Unicode-properties]

          The Unicode Consortium, "The Unicode Standard Version
          11.0", Section 3.5, 2018,
          <https://www.unicode.org/versions/Unicode11.0.0/>.

Informative References

[Err3312] RFC Errata, Erratum ID 3312, RFC 5892,

          <https://www.rfc-editor.org/errata/eid3312>.

[IAB-Unicode-2018]

          Internet Architecture Board (IAB), "IAB Statement on
          Identifiers and Unicode", 15 March 2018,
          <https://www.iab.org/documents/correspondence-reports-
          documents/2018-2/iab-statement-on-identifiers-and-
          unicode/>.

[IAB-Unicode7-2015]

          Internet Architecture Board (IAB), "IAB Statement on
          Identifiers and Unicode 7.0.0", 11 February 2015,
          <https://www.iab.org/documents/correspondence-reports-
          documents/2015-2/iab-statement-on-identifiers-and-unicode-
          7-0-0/>.

[ICANN-LGR-SLA]

          Internet Corporation for Assigned Names and Numbers
          (ICANN), "Proposed IANA SLAs for Publishing LGRs/IDN
          Tables", 10 June 2019, <https://www.icann.org/public-
          comments/proposed-iana-sla-lgr-idn-tables-2019-06-10-en>.

[IDNA-Unicode7]

          Klensin, J. and P. Faltstrom, "IDNA Update for Unicode 7.0
          and Later Versions", Work in Progress, Internet-Draft,
          draft-klensin-idna-5892upd-unicode70-05, 8 October 2017,
          <https://tools.ietf.org/html/draft-klensin-idna-5892upd-
          unicode70-05>.

[IDNA-Unicode11]

          Faltstrom, P., "IDNA2008 and Unicode 11.0.0", Work in
          Progress, Internet-Draft, draft-faltstrom-unicode11-08, 11
          March 2019, <https://tools.ietf.org/html/draft-faltstrom-
          unicode11-08>.

[IDNA-Unicode12]

          Faltstrom, P., "IDNA2008 and Unicode 12.0.0", Work in
          Progress, Internet-Draft, draft-faltstrom-unicode12-00, 11
          March 2019, <https://tools.ietf.org/html/draft-faltstrom-
          unicode12-00>.

[RegRestr] Klensin, J. and A. Freytag, "Internationalized Domain

          Names in Applications (IDNA): Registry Restrictions and
          Recommendations", Work in Progress, Internet-Draft, draft-
          klensin-idna-rfc5891bis-05, 29 August 2019,
          <https://tools.ietf.org/html/draft-klensin-idna-
          rfc5891bis-05>.

[RFC1766] Alvestrand, H., "Tags for the Identification of

          Languages", RFC 1766, DOI 10.17487/RFC1766, March 1995,
          <https://www.rfc-editor.org/info/rfc1766>.

[RFC3282] Alvestrand, H., "Content Language Headers", RFC 3282,

          DOI 10.17487/RFC3282, May 2002,
          <https://www.rfc-editor.org/info/rfc3282>.

[RFC3454] Hoffman, P. and M. Blanchet, "Preparation of

          Internationalized Strings ("stringprep")", RFC 3454,
          DOI 10.17487/RFC3454, December 2002,
          <https://www.rfc-editor.org/info/rfc3454>.

[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,

          "Internationalizing Domain Names in Applications (IDNA)",
          RFC 3490, DOI 10.17487/RFC3490, March 2003,
          <https://www.rfc-editor.org/info/rfc3490>.

[RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep

          Profile for Internationalized Domain Names (IDN)",
          RFC 3491, DOI 10.17487/RFC3491, March 2003,
          <https://www.rfc-editor.org/info/rfc3491>.

[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO

          10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
          2003, <https://www.rfc-editor.org/info/rfc3629>.

[RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and

          Recommendations for Internationalized Domain Names
          (IDNs)", RFC 4690, DOI 10.17487/RFC4690, September 2006,
          <https://www.rfc-editor.org/info/rfc4690>.

[RFC5646] Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying

          Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646,
          September 2009, <https://www.rfc-editor.org/info/rfc5646>.

[RFC5890] Klensin, J., "Internationalized Domain Names for

          Applications (IDNA): Definitions and Document Framework",
          RFC 5890, DOI 10.17487/RFC5890, August 2010,
          <https://www.rfc-editor.org/info/rfc5890>.

[RFC5894] Klensin, J., "Internationalized Domain Names for

          Applications (IDNA): Background, Explanation, and
          Rationale", RFC 5894, DOI 10.17487/RFC5894, August 2010,
          <https://www.rfc-editor.org/info/rfc5894>.

[RFC6452] Faltstrom, P., Ed. and P. Hoffman, Ed., "The Unicode Code

          Points and Internationalized Domain Names for Applications
          (IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452,
          November 2011, <https://www.rfc-editor.org/info/rfc6452>.

Appendix A. Summary of Changes to RFC 5892

Other than the editorial correction specified in Section 6, all of the changes in this document are concerned with the reviews for new versions of Unicode and with the IANA Considerations in Section 5 of [RFC5892], particularly Section 5.1 of [RFC5892]. Whether the changes are substantive or merely clarifications may be somewhat in the eye of the beholder, so the list below should not be assumed to be comprehensive. At a very high level, this document clarifies that two types of review were intended and separates them for clarity. This document also restores the original (but so far unobserved) default for actions when code point derived properties change. For this reason, this document effectively replaces Section 5.1 of [RFC5892] and adds or changes some text so that the replacement makes better sense.

Changes or clarifications that may be considered important include:

  • Separated the new Unicode version review into two explicit parts
  and provided for different review methods and, potentially,
  asynchronous outcomes.
  • Specified a DE team, not a single designated expert, for the code
  point review.
  • Eliminated the de facto requirement for the (formerly single)
  designated expert to be the same person as the IAB's liaison to
  the Unicode Consortium, but called out the importance of
  coordination.
  • Created the "Status" field in the IANA tables to inform the
  community about specific potentially problematic code points.
  This change creates the ability to add information about such code
  points before IETF review is completed instead of having the
  review process hold up the use of the new Unicode version.
  • In part because Unicode is now on a regular one-year cycle rather
  than producing major and minor versions as needed, to avoid
  overloading the IETF's internationalization resources, and to
  avoid generating and storing IANA tables for trivial changes
  (e.g., the single new code point in Unicode 12.1), the review
  procedure is applied only to major versions of Unicode unless
  exceptional circumstances arise and are identified.

Appendix B. Background and Rationale for Expert Review Procedure for

         New Code Point Analysis

The expert review procedure for new code point analysis described in Section 3.2 is somewhat unusual compared to the examples presented in the Guidelines for Writing IANA Considerations [RFC8126]. This appendix explains that choice and provides the background for it.

Development of specifications to support use of languages and writing systems other than English (and Latin script) -- so-called "internationalization" or "i18n" -- has always been problematic in the IETF, especially when requirements go beyond simple coding of characters (e.g., RFC 3629 [RFC3629]) or simple identification of languages (e.g., RFC 3282 [RFC3282] and the earlier RFC 1766 [RFC1766]). A good deal of specialized knowledge is required, knowledge that comes from multiple fields and that requires multiple perspectives. The work is not obviously more complex than routing, especially if one assumes that routing work requires a solid foundation in graph theory or network optimization, or than security and cryptography, but people working in those areas are drawn to the IETF and people from the fields that bear on internationalization typically are not.

As a result, we have often thought we understood a problem, generated a specification or set of specifications, but then have been surprised by unanticipated (by the IETF) issues. We then needed to tune and often revise our specification. The language tag work that started with RFC 1766 is a good example of this: broader considerations and requirements led to later work and a much more complex and finer-grained system [RFC5646].

Work on IDNs further increased the difficulties because many of the decisions that led to the current version of IDNA require understanding the DNS, its constraints, and, to at least some extent, the commercial market of domain names, including various ICANN efforts.

The net result of these factors is that it is extremely unlikely that the IESG will ever find a designated expert whose knowledge and understanding will include everything that is required.

Consequently, Section 7 and other discussions in this document specify a DE team that is expected to have the broad perspective, expertise, and access to information and community in order to review new Unicode versions and to make consensus recommendations that will serve the Internet well. While we anticipate that the team will have one or more leaders, the structure of the team differs from the suggestions given in Section 5.2 of the Guidelines for Writing IANA Considerations [RFC8126] since neither the team's formation nor its consultation is left to the discretion of the designated expert, nor is the designated expert solely accountable to the community. A team that contains multiple perspectives is required, the team members are accountable as a group, and any nontrivial recommendations require team consensus. This also differs from the common practice in the IETF of "review teams" from which a single member is selected to perform a review: the principle for these reviews is team effort.

Acknowledgments

This document was inspired by extensive discussions within the I18N Directorate of the IETF Applications and Real-Time (ART) area in the first quarter of 2019 about sorting out the reviews for Unicode 11.0 and 12.0. Careful reviews by Joel Halpern and text suggestions from Barry Leiba resulted in some clarifications.

Thanks to Christopher Wood for catching some editorial errors that persisted until rather late in the document's life cycle and to Benjamin Kaduk for catching and raising a number of questions during Last Call. Some of the issues they raised have been reflected in the document; others did not appear to be desirable modifications after further discussion, but the questions were definitely worth raising and discussing.

Authors' Addresses

John C Klensin 1770 Massachusetts Ave, Ste 322 Cambridge, MA 02140 United States of America

Phone: +1 617 245 1457 Email: [email protected]

Patrik Fältström Netnod Greta Garbos Väg 13 SE-169 40 Solna Sweden

Phone: +46 70 6059051 Email: [email protected]