[NNTP] AD guidance on NNTP i18n issues

Charles Lindsey chl at clerew.man.ac.uk
Tue Mar 29 03:40:26 PST 2005


In <87u0mv5nl0.fsf at windlord.stanford.edu> Russ Allbery <rra at stanford.edu> writes:

>My interpretation of this guidance is:

> * Clive's changes recently posted to the list are good clarifications.
>   Beyond that, the current permissive language around article content can
>   stay, and we don't need to add stringprep or other canonicalization for
>   newsgroup arguments.

Agreed.

> * We need to add an Internationalization Considerations section.  That
>   section should spell out:

>   o The current internationalization problems from an NNTP perspective,
>     namely article data (including data taken from headers), newsgroup
>     names, and LIST NEWSGROUPS output.

I think you want to do that mainly by punting it to Usefor. Something like:

"It is anticipated that extensions or replacements to RFC 1036 will
introduce I18N features, notably in connection with newsgroup-names, the
information needed for the LIST NEWSGROUPS output, and so on. Insofar
as these extensions may make use of UTF-8, this present standard has
hopefully made suitable provision. However, it is not precluded that such
developments may require further extensions to this standard."

>   o The current growing use of MIME for the article format, but also the
>     substantial use of local character sets in article headers and (less
>     commonly) untagged article bodies.

I think you can safely point out that your standard is completely
MIME-proof, and there is no excuse at all for untagged article bodies. By
all means point out that there is some existing (but unauthorized) use of
local charsets in headers and LIST NEWSGROUPS output, and point to the
various "MAY"s that Clive mentioned in another thread, but please do so in
the context that this is to be addressed by the "extensions or
replacements to RFC 1036" that I mentioned above, without implying either
that such practices will be brought within the standards, or that they
will be totally outlawed.

>   o The mix of character sets used in newsgroup descriptions.

Yes, that is mentioned in my suggested wording, but it may need some
amplification.

>   o The current pure-ASCII convention for newsgroup names, which is
>     widespread but not entirely universal.

Actually, it is _almost_ universal (even in China).

>   o The need, long-term, for standardization of character sets, tagging,
>     and encoding for articles, for standardization on UTF-8 for newsgroup
>     descriptions, and for standardization on UTF-8 and canonicalization
>     of newsgroup names.  We should also clearly state that all of those
>     issues require work done in conjunction with the article format
>     standard.

Yes indeed. But be careful not to commit to any particular final solution.
I am sure we all believe that UTF-8 is the way to go, but we may not
succeed in carrying the rest of the world with us.

> * We should strongly discourage any use of newsgroup names that would
>   interfere with the long-term canonicalization goals, since that's the
>   area where a bad choice may be hard to reverse later.  Personally, I'm
>   leaning towards saying that newsgroup names SHOULD be US-ASCII for the
>   time being (read: until a standard is in place specifying how to handle
>   UTF-8 newsgroup names), just to put a significant warning in front of
>   people that experimenting with UTF-8 may cause them interoperability
>   problems later, while keeping the existing syntax that allows for
>   UTF-8 down the road.

And the initial Usefor standard will undoubtedly stick with US-ASCII for
newsgroup-names. So it would indeed be undesirable for the world to
anticipate what might be in its proposed experimental I18N extension. But,
by and large, the less that is said in this section, the less chance that
we will turn out to have shot ourselves in the foot later on.
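
To make the "SHOULD be US-ASCII for the time being" point concrete, here is
a minimal sketch (Python, purely illustrative, not taken from the draft or
from any server) of how an implementation might tell a pure US-ASCII
newsgroup name from one that is valid UTF-8 or from one in some local
charset. The function name and the three labels are hypothetical.

```python
def classify_newsgroup_name(raw: bytes) -> str:
    """Classify the raw octets of a newsgroup name.

    Returns one of three hypothetical labels:
      "ascii" - pure US-ASCII (the current near-universal convention)
      "utf8"  - not ASCII, but well-formed UTF-8
      "other" - neither; probably a local 8-bit charset
    """
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf8"
    except UnicodeDecodeError:
        return "other"


if __name__ == "__main__":
    for name in (b"comp.lang.c",
                 "fr.comp.\u00e9t\u00e9".encode("utf-8"),
                 b"fr.comp.\xe9t\xe9"):  # same name in ISO 8859-1
        print(name, classify_newsgroup_name(name))
```

A check of this kind only warns about names that might cause trouble later;
it says nothing about canonicalization, which is left to whatever Usefor
eventually specifies.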

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl at clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5


