ietf-nntp [a-z] in Wildmats

Clive D.W. Feather clive at demon.net
Thu Jun 21 04:58:02 PDT 2001


Maurizio Codogno said:
> Given that a normalization phase has to be carried out anyway,
> so the problems with E ACUTE do not arise, for diacritical marks in 
> other languages we may just dictate that they are normalized in a 
> fixed order.

Yes, and we're already doing that in Usefor. That's not the problem.

Consider two groups:

              /
    alt.stranger           alt.stranger
                                     ,

where the first has an e-acute and the second an e-cedilla.

On the screen both will appear as single characters. Therefore a user might
expect that "alt.strang?r" would match both and "alt.strange*" would match
neither. However, although Unicode has an e-acute character, it doesn't
have an e-cedilla character. Therefore normalization means that the first
is a single character in the name, while the second is represented by "e"
followed by "cedilla". Thus:

    alt.strange*       matches the e-cedilla form but not the e-acute form
    alt.strang?r       matches the e-acute form but not the e-cedilla form

and that's the potential problem.
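
For anyone who wants to see the mismatch concretely, here is a rough sketch in Python. It rests on several assumptions. The two names are built by hand in the forms described above, not produced by a real normalizer, and what a normalizer actually emits depends on the Unicode version and normalization form in use. Python's fnmatch stands in for wildmat matching and is not an NNTP implementation. The names E_ACUTE_NAME, E_CEDILLA_NAME and results are made up for the illustration.

```python
from fnmatch import fnmatchcase

# Assumption: names written by hand in the forms the message describes,
# not taken from the output of any particular normalizer.
E_ACUTE_NAME = "alt.strang\u00e9r"     # single precomposed e-acute code point
E_CEDILLA_NAME = "alt.strange\u0327r"  # "e" followed by COMBINING CEDILLA

PATTERNS = ("alt.strange*", "alt.strang?r")

# fnmatchcase is used as a stand-in for wildmat: "?" matches exactly one
# code point and "*" matches any run of code points, with no case folding.
results = {
    (label, pattern): fnmatchcase(name, pattern)
    for label, name in (("e-acute", E_ACUTE_NAME), ("e-cedilla", E_CEDILLA_NAME))
    for pattern in PATTERNS
}

for (label, pattern), matched in results.items():
    verdict = "matches" if matched else "does not match"
    print(f"{pattern:<14} {verdict:<14} the {label} form")
```

Both names display as twelve characters, but the second is thirteen code points long, which is why "?" and a literal "e" treat the two differently.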

-- 
Clive D.W. Feather  | Work:  <clive at demon.net>   | Tel:  +44 20 8371 1138
Internet Expert     | Home:  <clive at davros.org>  | Fax:  +44 20 8371 1037
Demon Internet      | WWW: http://www.davros.org | DFax: +44 20 8371 4037
Thus plc            |                            | Mobile: +44 7973 377646 


