ietf-nntp wildmat routines and text

Clive D.W. Feather clive at demon.net
Fri Jul 28 00:39:50 PDT 2000


Russ Allbery said:
> My main question here is how much are we really gaining?  UTF-8 characters
> can be sent in NNTP commands to pretty much all existing servers without
> any difficulty, so adding backslash escapes doesn't result in any changes
> in expressiveness.  Is it mostly just a change to make debugging and
> manual command entry easier?

I think so.

>> Would it be less significant if the property was only removed for ASCII
>> alphanumerics, and retained elsewhere? I can see people writing \: for
>> safety, but not \a\l\t.
> 
> Depends on how simple and stupid the algorithm they're using is.  The
> "simplest thing that could possibly work" is to just backslash every
> character without caring if it's alphanumeric or not.  It complicates the
> meaning of backslash any way you cut it; the question is whether anyone
> was taking advantage of that meaning of backslash.

Well, I'm coming from the background of C programming, where it's well
known that backslash has two meanings:
(1) before an alphanumeric, introduces some special feature (e.g. "\n");
(2) before other characters, removes the special meaning of that character.
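
A minimal sketch of that two-way convention as it would apply to the
"backslash every character" approach discussed above. The function names are
hypothetical, and the rule "ASCII alphanumeric means special" is only the
proposal under discussion here, not settled wildmat behaviour.

```python
import string
from typing import List

# Assumption: only ASCII letters and digits would lose the
# "backslash makes it literal" property under the proposal.
ASCII_ALNUM = set(string.ascii_letters + string.digits)


def classify_escape(ch: str) -> str:
    """Say how a backslash before `ch` would be read under the C-style rule."""
    return "special" if ch in ASCII_ALNUM else "literal"


def escape_everything(text: str) -> str:
    """The 'simplest thing that could possibly work': backslash every character."""
    return "".join("\\" + ch for ch in text)


def escapes_changed_by_rule(escaped: str) -> List[str]:
    """List the escapes that would no longer just mean 'this literal character'."""
    changed = []
    i = 0
    while i < len(escaped):
        if escaped[i] == "\\" and i + 1 < len(escaped):
            if classify_escape(escaped[i + 1]) == "special":
                changed.append(escaped[i:i + 2])
            i += 2  # skip the backslash and the character it applies to
        else:
            i += 1
    return changed
```

Escaping "alt" this way gives \a\l\t, and all three escapes are affected by
the rule, whereas \: is not.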

> It also makes the
> wildmat parser mildly more complicated (because now it has to know how to
> do ISO 10646 to UTF-8 conversion

Though it's pretty trivial.
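
To give a sense of scale, here is a rough sketch of the conversion from an
ISO 10646 code point to UTF-8. The function name is hypothetical, and the
sketch assumes the four-octet limit later fixed by RFC 3629; the UTF-8
definition current at the time of this message (RFC 2279) also allowed
longer forms, which are not handled here.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode one ISO 10646 code point as UTF-8 (assumes the RFC 3629 range)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x80:
        # One octet: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # Two octets: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # Three octets: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # Four octets: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])
```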

> and has to handle more ill-formed
> wildmats; one of the rather nice things about wildmats is that outside of
> character classes, I think the only ill-formed wildmat is one ending in a
> backslash).

or a malformed UTF-8 sequence.
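
As an illustration of that extra failure case, a parser could reject a
wildmat whose octets are not well-formed UTF-8 before doing anything else.
This is only a sketch: the function name is hypothetical, and it leans on
Python's strict UTF-8 decoder rather than on any check the draft specifies.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if `data` decodes as strict UTF-8, False otherwise."""
    try:
        # The strict decoder rejects truncated sequences, stray
        # continuation octets, and overlong encodings.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```

For example, the two octets C3 A9 (e-acute) pass, while a lone C3 or the
overlong pair C0 AF would be rejected.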

> The nice thing about wildmats is that they're extremely simple
> (particularly if you ignore character classes, which almost no one uses).

Would a way to increase consensus be to say that [ in a wildmat introduces
implementation-defined behaviour, and leave it at that? I honestly don't
know. As I keep saying, I see my role here as mostly one of ensuring the
document says what we mean, rather than having any particular meaning.

>> So what do you think \ should mean in a class?
> I guess I don't really care all that much; it means more work for the
> parser, but I only have to write that once.  :)  I doubt that there are
> enough users of character classes out there already to notice the change.

I think we've had most of the possibilities bandied about. I'd like to know
what other people (preferably lots of them) think should be in the document.

-- 
Clive D.W. Feather  | Work:  <clive at demon.net>   | Tel:  +44 20 8371 1138
Internet Expert     | Home:  <clive at davros.org>  | Fax:  +44 20 8371 1037
Demon Internet      | WWW: http://www.davros.org | DFax: +44 20 8371 4037
Thus plc            |                            | Mobile: +44 7973 377646 


