ietf-nntp Re: [NNTP Draft] 5. The WILDMAT format

Russ Allbery rra at stanford.edu
Sat Feb 26 22:52:48 PST 2000


Andrew Gierth <andrew at erlenstar.demon.co.uk> writes:

>             The WILDMAT format[5] described here is based on the version
>             first developed by Rich Salz which was derived from the format
>             used in the UNIX "find" command to articulate file names.

[...]

This paragraph badly needs to be broken up into several separate
paragraphs for readability.

>             The third specifies a specific set of characters. The set is
>             specified as a list of characters, or as a range of
>             characters where the beginning and end of the range are
>             separated by a minus (or dash) character, or as any
>             combination of lists and ranges. The dash can also be
>             included in the set as a character if it is the beginning or
>             end of the set. This set is enclosed in square brackets. The
>             close square bracket (]) may be used in a set if it is the
>             first character in the set.

In the presence of UTF-8, ranges are very poorly specified.  Handling
character ranges in wildcard expressions in the presence of Unicode and
combining characters and the like is one of those incredibly complicated
subjects that one really doesn't want to get into.  I'm not sure what a
good solution to this would be; in practice, character ranges are rarely
used.  Perhaps it would be worth just saying that character ranges are
only guaranteed to be usable for ASCII characters.
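
As a rough illustration of that ASCII-only restriction, here is a minimal
Python sketch.  The name parse_set and its error messages are hypothetical,
not taken from the draft or from any existing wildmat implementation.  It
takes the text between the square brackets, which is assumed to have been
extracted already, and follows the quoted rules for lists, ranges, and a
dash at the beginning or end of the set.

```python
def parse_set(body):
    """Expand the text between '[' and ']' into the set of characters it names.

    Hypothetical helper: a range is accepted only when both endpoints are
    ASCII, which is the restriction suggested above, not a rule from the draft.
    """
    chars = set()
    i = 0
    n = len(body)
    while i < n:
        c = body[i]
        # A dash with a character on each side forms a range.  A dash at the
        # beginning or end of the set falls through and is taken literally.
        if i + 2 < n and body[i + 1] == "-":
            lo, hi = c, body[i + 2]
            if ord(lo) > 127 or ord(hi) > 127:
                raise ValueError("range endpoints must be ASCII: %r-%r" % (lo, hi))
            if lo > hi:
                raise ValueError("range is reversed: %r-%r" % (lo, hi))
            chars.update(chr(x) for x in range(ord(lo), ord(hi) + 1))
            i += 3
        else:
            chars.add(c)
            i += 1
    return chars
```

Under these assumptions, parse_set("a-c") yields a, b, and c, while
parse_set("a-") and parse_set("-a") both treat the dash as a literal member.
A set such as "à-é" is rejected: encoded as UTF-8 it is the byte sequence
C3 A0 2D C3 A9, so a matcher working on bytes would see a range from A0 to
C3 rather than a range between the two letters.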

-- 
Russ Allbery (rra at stanford.edu)         <URL:http://www.eyrie.org/~eagle/>


