ietf-nntp Comments on draft-15.pdf

Wed Jan 2 08:58:16 PST 2002

"Clive D.W. Feather" wrote:
> 
> Stan Barber said:
> >> P53-17       I find it odd that the commands DATE, HELP, NEWGROUPS and NEWNEWS
> >>      are described _after_ the CONCLUSION step, and even after the
> >>      Extensions.  I would recommend some section re-ordering here (but
> >>      I don't think I want a wholesale re-ordering as some others have
> >>      suggested).
> >
> > These commands are here because they are not usually used in a session.
> > So, they are defined after all the commands that are usually used in a
> > session.
> 
> That is extraordinarily arrogant of you.

Coming from you, I take that as a compliment. :-)

> 
> In the last 57,641 NNTP sessions made by romana.davros.org, the NEWGROUPS
> command was used at least once in every session, the NEWNEWS command at
> least twice, and the GROUP, LAST, NEXT, STAT, IHAVE, LIST ACTIVE.TIMES,
> and LIST DISTRIB.PATS commands (all part of your "usually used" section)
> *never*.

Thanks for the data point. 

> 
> This machine is not alone. I strongly suspect that the Demon servers see
> far more NEWNEWS commands than NEXT commands.

The Demon servers are not all the news servers in the world, last I checked. Of
course, that was a month ago and things may have changed. The stats I have
access to indicate that NEWNEWS is often used, but no more often than the
GROUP/LAST/NEXT/STAT group. Originally, the sentiment was to encourage use of
the GROUP/LAST/NEXT/STAT commands over NEWNEWS.

> 
> You don't know what the "usual" pattern of use is. Therefore ordering the
> document on that basis is wrong. The ordering should be logical:
> 
> - greeting step
> - mandatory commands
> - conclusion step        ) either way round
> - extensions             ) would be logical

To quote your statement earlier, it's arrogant of you to assume that I don't
know something that I actually do know. In any case, I am not looking to
reorganize the document. I am looking to try to get the document wrapped up
and submitted to IESG.

> 
> >> P61+8,14
> >>      The UTF-8 syntax in USEFOR is:
> [...]
> >>      The difference is that USEFOR has excluded more octets
> >>      that are not supposed to occur in UTF-8, including all those which
> >>      would belong to Unicode "surrogates". Do we want to make the two
> >>      drafts identical at this point, for the removal of all confusion?
> >
> > It might be more appropriate for one or the other group to publish a
> > UTF-8 definition as its own RFC and then have both groups refer to it.
> 
> UTF-8 is *defined* by the Unicode Organisation in cooperation with ISO.
> 
> What we're talking about here is a syntax notation. To write the syntax
> so as to include exactly the valid sequences and no others is almost
> impossible, especially when you look at character semantics. You *always*
> need to say that, despite the syntax, it is not permitted to use sequences
> forbidden by the formal definition.
> 
> So the question is how much effort to put in. For example:
> 
> * The minimal unambiguous syntax is:
> 
>     UTF-8-non-ascii = %xC0-FF *%x80-BF
> 
> * A trivial change is:
> 
>     UTF-8-non-ascii = %xC2-FD 1*5%x80-BF
> 
> which eliminates many invalid sequences.
> 
> * You can expose the basic structure of UTF-8 using the syntax we have
> at present:
> 
>   UTF-8-non-ascii = UTF8-2 / UTF8-3 / UTF8-4 / UTF8-5 / UTF8-6
>   UTF8-1 = %x80-BF
>   UTF8-2 = %xC0-DF UTF8-1
>   UTF8-3 = %xE0-EF 2UTF8-1
>   UTF8-4 = %xF0-F7 3UTF8-1
>   UTF8-5 = %xF8-FB 4UTF8-1
>   UTF8-6 = %xFC-FD 5UTF8-1
> 
> Again, C0 can be changed to C2.
> 
> * You can eliminate all the "wrong length" sequences:
> 
>   UTF-8-non-ascii = UTF8-2 / UTF8-3 / UTF8-4 / UTF8-5 / UTF8-6
>   UTF8-1 = %x80-BF
>   UTF8-2 = %xC2-DF UTF8-1
>   UTF8-3 = %xE0 %xA0-BF  UTF8-1 / %xE1-EF 2UTF8-1
>   UTF8-4 = %xF0 %x90-BF 2UTF8-1 / %xF1-F7 3UTF8-1
>   UTF8-5 = %xF8 %x88-BF 3UTF8-1 / %xF9-FB 4UTF8-1
>   UTF8-6 = %xFC %x84-BF 4UTF8-1 / %xFD 5UTF8-1
> 
> * You can eliminate "surrogates" by changing one line of that:
> 
>   UTF8-3 = %xE0 %xA0-BF UTF8-1 / %xE1-EC 2UTF8-1 /
>            %xED %x80-9F UTF8-1 / %xEE-EF 2UTF8-1
> 
> * You can eliminate all values outside Unicode's declared limit of
> U+10FFFF:
> 
>   UTF-8-non-ascii = UTF8-2 / UTF8-3 / UTF8-4
>   UTF8-1 = %x80-BF
>   UTF8-2 = %xC2-DF UTF8-1
>   UTF8-3 = %xE0 %xA0-BF UTF8-1 / %xE1-EC 2UTF8-1 /
>            %xED %x80-9F UTF8-1 / %xEE-EF 2UTF8-1
>   UTF8-4 = %xF0 %x90-BF 2UTF8-1 / %xF1-F3 3UTF8-1 /
>            %xF4 %x80-8F 2UTF8-1
> 
> The choice is ours !
> 
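
For illustration, the strictest grammar quoted above (no wrong-length forms,
no surrogates, nothing beyond U+10FFFF) and the "trivial change" variant can
be sketched as Python byte regular expressions. This is a minimal sketch, not
text from either draft: the names UTF8_1, STRICT, LOOSE, is_strict_non_ascii
and is_loose_non_ascii are invented here, and each pattern matches exactly one
UTF-8-non-ascii sequence.

```python
import re

# UTF8-1 in the quoted grammar: a single continuation octet.
UTF8_1 = rb"[\x80-\xBF]"

# One alternative per branch of the strictest quoted grammar.
_STRICT_ALTERNATIVES = [
    rb"[\xC2-\xDF]" + UTF8_1,                # UTF8-2
    rb"\xE0[\xA0-\xBF]" + UTF8_1,            # UTF8-3, no wrong-length forms
    rb"[\xE1-\xEC]" + UTF8_1 * 2,
    rb"\xED[\x80-\x9F]" + UTF8_1,            # UTF8-3, surrogates excluded
    rb"[\xEE-\xEF]" + UTF8_1 * 2,
    rb"\xF0[\x90-\xBF]" + UTF8_1 * 2,        # UTF8-4, no wrong-length forms
    rb"[\xF1-\xF3]" + UTF8_1 * 3,
    rb"\xF4[\x80-\x8F]" + UTF8_1 * 2,        # UTF8-4, capped at U+10FFFF
]
STRICT = re.compile(b"(?:" + b"|".join(_STRICT_ALTERNATIVES) + b")")

# The "trivial change" variant: %xC2-FD 1*5%x80-BF
LOOSE = re.compile(rb"[\xC2-\xFD][\x80-\xBF]{1,5}")


def is_strict_non_ascii(octets: bytes) -> bool:
    """True if octets is exactly one sequence allowed by the strict grammar."""
    return STRICT.fullmatch(octets) is not None


def is_loose_non_ascii(octets: bytes) -> bool:
    """True if octets is exactly one sequence allowed by the loose grammar."""
    return LOOSE.fullmatch(octets) is not None


if __name__ == "__main__":
    samples = {
        "e-acute (U+00E9)": "\u00e9".encode("utf-8"),
        "wrong-length NUL": b"\xe0\x80\x80",
        "surrogate U+D800": b"\xed\xa0\x80",
        "beyond U+10FFFF": b"\xf4\x90\x80\x80",
    }
    for label, octets in samples.items():
        print(label, is_loose_non_ascii(octets), is_strict_non_ascii(octets))
```

The loose pattern accepts the wrong-length, surrogate and out-of-range samples
that the strict pattern rejects, which is the trade-off the list above
describes. Neither pattern replaces the rule that sequences forbidden by the
formal definition remain forbidden whatever the syntax says.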

Do you believe that no more changes to the BNF concerning UTF-8 will be required
after this change? We have to stop changing the document at some point to get
the work closed out, and there comes a point where further changes are not
really doing significant good.