ietf-nntp wildmat routines and text

Martin J. Duerst duerst at w3.org
Sun Jul 30 20:33:50 PDT 2000


At 00/07/27 22:55 +0100, Clive D.W. Feather wrote:
>Martin J. Duerst said:
> >> If you're going to do that, can I suggest that you steal the syntax from
> >> C99:
> >>
> >>      \uxxxx           means U+0000xxxx in the ISO 10646 character set
> >>      \Uxxxxxxxx       means U+xxxxxxxx in the ISO 10646 character set
> >>
> >> (the xs are hexadecimal digits).
> >
> > I suggest changing that to e.g. \uxxxx and \Uxxxxxx, i.e. only
> > six digits for the second one. Both the Unicode Consortium and
> > ISO/IEC JTC1/SC2/WG2 have agreed not to encode any characters
> > beyond U+10FFFF, so defining eight digits is clear overkill.
>
>That depends on how trusting you are.
>
>The *space* is defined as 31 bits. The boundary you mention looks rather
>arbitrary to me (it's 2^20 + 2^16 - 1).

It's what you can express in UTF-16.
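
To make that concrete, here is a minimal Python sketch of where the
ceiling comes from. It assumes the standard UTF-16 surrogate ranges
(lead D800-DBFF, trail DC00-DFFF); the names utf16_max_code_point and
utf16_decode_pair are made up for this illustration and are not part of
any wildmat proposal.

```python
# Assumption: standard UTF-16 surrogate ranges.
HIGH_SURROGATES = range(0xD800, 0xDC00)  # 1024 lead values
LOW_SURROGATES = range(0xDC00, 0xE000)   # 1024 trail values


def utf16_decode_pair(high, low):
    """Combine one surrogate pair into a code point above the BMP."""
    if high not in HIGH_SURROGATES or low not in LOW_SURROGATES:
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def utf16_max_code_point():
    """Largest code point reachable with a single surrogate pair."""
    pairs = len(HIGH_SURROGATES) * len(LOW_SURROGATES)  # 1024 * 1024
    # Pairs address the code points that follow the 65536 BMP values.
    return 0x10000 + pairs - 1


print(hex(utf16_max_code_point()))
```

The largest pair, (DBFF, DFFF), decodes to the same value, so six
hexadecimal digits are enough to write any code point UTF-16 can reach.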

>And I thought the idea was for
>large private spaces to occupy the high areas.

There are two private planes in UTF-16, together about 130000
characters. Anyway, I don't think it's a good idea to use
private characters in newsgroups.
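
As a rough check of the "about 130000" figure, the following Python
sketch counts the private-use code points in planes 15 and 16. It
assumes that the last two code points of each plane (xxFFFE and xxFFFF)
are noncharacters and so do not count; the function name is
hypothetical.

```python
def private_plane_code_points(planes=(15, 16)):
    """Count private-use code points in the given supplementary planes."""
    total = 0
    for plane in planes:
        first = plane << 16      # e.g. 0xF0000 for plane 15
        last = first + 0xFFFD    # assumption: xxFFFE and xxFFFF excluded
        total += last - first + 1
    return total


print(private_plane_code_points())
```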


>I'd rather rely on the limit being 31 bits than a statement "we won't use
>all the space we've allocated". I've *seen* ISO committees break such
>commitments ("we felt the new functionality was too important").

Well, if we were invaded by the Martians, and they brought a few
million characters with them, the committees would indeed have to
reevaluate that decision.

Regards,    Martin.


