ietf-nntp [a-z] in Wildmats

Charles Lindsey chl at clw.cs.man.ac.uk
Fri Jun 22 03:03:58 PDT 2001


In <20010621125802.L85722 at demon.net> "Clive D.W. Feather" <clive at demon.net> writes:

>Consider two groups:

>              /
>    alt.stranger           alt.stranger
>                                     ,

>where the first has an e-acute and the second an e-cedilla.

>On the screen both will appear as single characters. Therefore a user might
>expect that "alt.strang?r" would match both and "alt.strange*" would match
>neither. However, although Unicode has an e-acute character, it doesn't
>have an e-cedilla character. Therefore normalization means that the first
>is a single character in the name, while the second is represented by "e"
>followed by "cedilla". Thus:

Yes, there is a problem with '?', but I think that has to be lived with
(and people who use languages needing such characters will probably be
familiar with the problem anyway). But the case I had in mind was where
you put such characters in [...], and applying that to the case in
question will give you
	alt.strang[ée¸]r
which does not look much like what you typed, and will match
	alt.strangér
	alt.stranger
	alt.strangr
	         ¸
(sorry, it is hard to show g-cedilla)

Which is why I now favour omitting the [...] feature (but leave the '?').
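
A rough sketch of the mismatch described above, assuming Python's
fnmatch.fnmatchcase as a stand-in for a wildmat matcher (it is not the NNTP
wildmat algorithm, only a glob matcher that likewise works one code point at
a time). The strings are built from explicit code points and no normalizer
is run, so the e-cedilla is the decomposed "e" + combining cedilla that the
thread assumes. The names E_ACUTE, CEDILLA, PATTERN, CANDIDATES and
match_report are made up for this illustration.

```python
from fnmatch import fnmatchcase

E_ACUTE = "\u00e9"   # precomposed e-acute: one code point
CEDILLA = "\u0327"   # combining cedilla: a separate code point

# The user believes the set holds two letters (e-acute, e-cedilla).
# The matcher sees three code points: e-acute, "e", combining cedilla.
PATTERN = "alt.strang[" + E_ACUTE + "e" + CEDILLA + "]r"

# Hypothetical group names, spelled out code point by code point.
CANDIDATES = {
    "e-acute": "alt.strang" + E_ACUTE + "r",
    "plain e": "alt.stranger",
    "g + cedilla": "alt.strang" + CEDILLA + "r",
    "e + cedilla": "alt.strange" + CEDILLA + "r",
}


def match_report(pattern, candidates):
    """Return {label: bool} saying whether each name matches the pattern."""
    return {label: fnmatchcase(name, pattern)
            for label, name in candidates.items()}


RESULTS = match_report(PATTERN, CANDIDATES)
for label, matched in RESULTS.items():
    print(label, matched)
```

Under these assumptions the three names listed above all match, while the
decomposed e-cedilla name itself does not: the set consumes only the "e",
and the cedilla that follows is not "r". The same name also fails against
"alt.strang?r", since '?' likewise consumes a single code point.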

BTW, does anyone know what happens to filename globbing in Unix shells
that operate in a UTF-8 locale? Presumably they have exactly this same
problem. One of the things on my todo list is to switch on the UTF-8 locale
on my Solaris system someday, and see what it actually does.

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl at clw.cs.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5


