ietf-nntp Message from Eric A. Hall: NNTP and 16-bit charsets

Stan O. Barber sob at verio.net
Sun May 6 00:50:10 PDT 2001


Russ Allbery wrote:

> Surely you would agree that designing a 16-bit or 32-bit character set
> with this property is not particularly hard?  There's a very obvious
> range of character numbers that you simply don't assign.
> 
> I don't know whether anyone has designed such a character set, but it's
> clearly possible.  I believe that makes the note above factually
> incorrect.

Regarding your comment above: according to one of my deep Unicode.org
sources, the original design specification for ISO/IEC 10646 prohibited
the use of any octets that could be interpreted as historical C0 controls.
This meant that massive portions of the space were unavailable, but it was
felt that such a tradeoff was necessary to preserve backwards
compatibility. This design element was abandoned when 10646 was unified
with the Unicode spec.
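
To get a rough feel for how much space such a rule costs, here is a minimal
sketch. It assumes "C0 controls" means octets 0x00 through 0x1F and that a
character is a fixed-width big-endian value; the names C0_OCTETS, is_c0_free
and count_c0_free are made up for illustration and do not come from any
specification.

```python
# Assumption: only the C0 range 0x00-0x1F is excluded (not DEL, not C1).
C0_OCTETS = range(0x00, 0x20)


def is_c0_free(value: int, width: int) -> bool:
    """True if none of the `width` big-endian octets of `value` is a C0 octet."""
    return all(octet not in C0_OCTETS for octet in value.to_bytes(width, "big"))


def count_c0_free(width: int) -> int:
    """Closed-form count: each octet has 256 - 32 = 224 permitted values."""
    return (256 - len(C0_OCTETS)) ** width


# Brute-force check over the whole 16-bit space.
usable_16 = sum(is_c0_free(v, 2) for v in range(0x10000))
print(usable_16, "of", 0x10000, "16-bit values contain no C0 octet")
```

Under those assumptions 50176 of the 65536 possible 16-bit values survive, so
roughly a quarter of the space goes unassigned, and the loss compounds with
each additional octet.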

It is also worth noting that the feared collisions are indeed occurring, in
some interesting ways. I am hearing from some TELNET people that the C1
lookalike bytes generated by UTF-8 (continuation bytes of the form 10xxxxxx)
are causing problems with terminal drivers. Although the host and client may
both understand the UTF-8 stream, the terminal driver on the server sees
these octets go by and acts on them.
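
The overlap is easy to demonstrate. The sketch below is only an illustration:
it takes the C1 range to be octets 0x80 through 0x9F, and the helper name
c1_lookalikes is hypothetical. UTF-8 continuation bytes span 0x80 through
0xBF, so only some of them land in the C1 range.

```python
# C1 control range as single octets; continuation bytes are 0x80-0xBF,
# so the lower half of them collides with C1.
C1_OCTETS = range(0x80, 0xA0)


def c1_lookalikes(text: str) -> list:
    """Return (offset, octet) pairs for UTF-8 bytes that fall in the C1 range."""
    encoded = text.encode("utf-8")
    return [(i, b) for i, b in enumerate(encoded) if b in C1_OCTETS]


# U+20AC encodes as E2 82 AC: the middle byte 0x82 is in the C1 range.
print(c1_lookalikes("\u20ac"))
# U+06DB encodes as DB 9B: 0x9B is the C1 code for CSI.
print(c1_lookalikes("\u06db"))
# U+00E9 encodes as C3 A9: 0xA9 is above the C1 range, so nothing is flagged.
print(c1_lookalikes("\u00e9"))
```

A driver that treats 0x9B as CSI would start parsing a control sequence in
the middle of an otherwise valid character.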

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/


