ietf-nntp wildmat routines and text

Clive D.W. Feather clive at demon.net
Wed Jul 26 10:51:14 PDT 2000


Andrew Gierth said:
> One additional comment on wildmat syntax: especially when handling
> UTF-8, it would be useful to support numeric \-escapes (e.g. \040
> or \x1234), even in []-sets.

If you're going to do that, can I suggest that you steal the syntax from
C99:

    \uxxxx           means U+0000xxxx in the ISO 10646 character set
    \Uxxxxxxxx       means U+xxxxxxxx in the ISO 10646 character set

(the xs are hexadecimal digits).

It is then clear:
- how long the escape sequence is (6 or 10 single-octet characters)
- whether the code number is the Unicode value or the local encoding
  (the former).

You might also want to forbid escapes where the value is less than 0x80, so
that there's no question of dealing with \u codes for [ or \.
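
To make the two points above concrete, here is a rough sketch in Python of
how such escapes could be expanded before matching. It is only an
illustration under stated assumptions, not code from any existing wildmat
implementation: the function name expand_escapes is hypothetical, other
backslash pairs are passed through untouched for the matcher to interpret,
and values above 0x10FFFF are rejected simply because Python's chr() cannot
represent them.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")

def expand_escapes(pattern):
    """Expand \\uxxxx and \\Uxxxxxxxx in a wildmat pattern (hypothetical helper).

    Any other backslash pair is copied through unchanged so that the
    matcher can still treat it as an ordinary wildmat escape.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        # Ordinary character, or a lone trailing backslash: copy as-is.
        if ch != "\\" or i + 1 >= len(pattern):
            out.append(ch)
            i += 1
            continue
        # Fixed lengths: \u takes exactly 4 hex digits, \U exactly 8.
        ndigits = {"u": 4, "U": 8}.get(pattern[i + 1])
        if ndigits is None:
            # Not a numeric escape (e.g. \[ or \\): leave both characters.
            out.append(pattern[i:i + 2])
            i += 2
            continue
        digits = pattern[i + 2:i + 2 + ndigits]
        if len(digits) != ndigits or not all(c in HEX_DIGITS for c in digits):
            raise ValueError("malformed escape at offset %d" % i)
        value = int(digits, 16)
        # Suggested restriction: no escapes for values below 0x80, so that
        # [ or \ can never be smuggled in as a \u code.
        if value < 0x80:
            raise ValueError("escape below 0x80 at offset %d" % i)
        # Assumption: limit imposed by Python's chr(), not by the proposal.
        if value > 0x10FFFF:
            raise ValueError("value out of range at offset %d" % i)
        out.append(chr(value))
        i += 2 + ndigits
    return "".join(out)
```

With this sketch, expand_escapes(r"[\u00e0-\u00ff]*") yields a set running
from U+00E0 to U+00FF followed by *, while expand_escapes(r"\u005b") raises
ValueError because 0x5B is below 0x80. Because the lengths are fixed, the
scanner never has to guess where an escape ends.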

-- 
Clive D.W. Feather  | Work:  <clive at demon.net>   | Tel:  +44 20 8371 1138
Internet Expert     | Home:  <clive at davros.org>  | Fax:  +44 20 8371 1037
Demon Internet      | WWW: http://www.davros.org | DFax: +44 20 8371 4037
Thus plc            |                            | Mobile: +44 7973 377646 


