[ietf-nntp] Re: ietf-nntp Niggles

Mon Dec 8 11:54:47 PST 2003

Charles Lindsey said:
> P1.

In future, please use section numbers rather than page numbers, or use
both. It makes it easier to find in the source. Thanks.

>    UNIX is a registered trademark of the X/Open Company Ltd.
> 
> Is that still correct this week? And if it is, is it "Ltd" or "Inc"?

It's "The Open Group" (who were X/Open). See <http://lwn.net/Articles/33409/>.

> P17.
>    o  A message-id MUST NOT contain octets other than printable US-ASCII
>       characters.
> 
> Is it sufficiently clear that SP is "non-printable"?

Yes. The term is adequately defined in both sections 2 and 9.

>    This specification does not describe how the message-id of an article
>    is determined. If the server does not have any way to determine a
>    message-id from the article itself, it MUST synthesise one (this
>    specification does not require the article to be changed as a
>    result).
> 
> This is not immediately clear to those who have not followed our discussions.
> Some reference to Appendix B2 would be helpful here.

Added.

> P25.
>    An NNTP client may cache the results of this command, but MUST NOT
>                   ^^^
> 		  MAY

Done.

> Now that we have at last defined an extension that has a parameter, please can
> we have an example here. e.g.:
>       [S] HDR ALL

Done.

> P29.
>    o  new articles may be added with article numbers greater than the
>       reported high water mark (if an article that was the one with the
>       highest number has been removed, the next new article will not
>                                                             ^^^^
> 							    may
>       have the number one greater than the reported high water mark)

Added "and the high water mark adjusted accordingly" after "removed".

> P39.
> 
>    Example of an unsuccessful retrieval the headers of an article by
>                                        ^
> 				       of
>    number because no newsgroup was selected first:

Fixed.

> P42.
>    Example of an STAT of an article not on the server by message-id:

All changed to read "Example of STAT on an article ...".

> P43.
> 
>    If posting is permitted, the article MUST be in the format specified
>    in Section 3.4 and MUST be sent by the client to the server in the
>    manner specified in (Section 3.1) for multi-line responses (except
>    that there is no initial line containing a response code). Thus a
> 
> Either
>    "specified in Section 3.1 for"
> Or
>    "specified (Section 3.1) for"

Actually meant to be "specified (in Section 3.1) for"

> P76.
>    and then each TAB is replaced with a single space (note that this is
>    the same transformation as is performed by the OVER extension
>    (Section 8.5.1.2), and the same comment concerning NUL, CR, and LF
>    applies).
> 
> Doubly nested parentheses.

So what?

>    If the requested header is not present in the article or if it is
>    present but empty, a line for that article is included in the output
> 
> That is not my understanding of the case when the requested header is absent
> from the article, and it is not consistent with what is said on the previous
> page:
> 
>    If the information is available, it is returned as a multi-line
>    response following the 225 response code and contains one line for
>    each article where the relevant header line or metadata item exists
>    (note that unless the argument is a range including a dash, there
>    will be at most one line but it will still be in multi-line format).
> 
> If my interpretation is correct, then the example in the following section
> needs fixing also.

I see that Russ isn't sure either.

There are four situations to consider:
(1) Header present and has content.
(2) Header present but empty.
(3) Header not present.
(4) No article.

(1) is easy. (4) is agreed - omit the article from the response. The
question is over (2) and (3). I think (2) should be included in the
response and am agnostic over (3).

I will wait for consensus.
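
To make the four cases concrete, here is a rough Python sketch of the
per-article decision. It is illustrative only: the function name, the dict
representation of an article, and the "include_absent" switch (which stands
in for the undecided case 3) are all hypothetical, and the "number SP
content" output line is a simplification of the real response format.

```python
from typing import Optional


def hdr_line(number: int, article: Optional[dict], header: str,
             include_absent: bool) -> Optional[str]:
    """Return the response line for one article, or None to omit it.

    `article` is a hypothetical dict of header name -> content,
    or None when there is no article with that number.
    """
    if article is None:
        # (4) No article: omitted from the response (agreed).
        return None
    if header not in article:
        # (3) Header not present: still open, so the caller chooses.
        return f"{number} " if include_absent else None
    # (1) Header present with content, or (2) present but empty:
    # a line is included; in case (2) the content is simply empty.
    return f"{number} {article[header]}"
```

Whichever way consensus falls on (3), only the "include_absent" argument
changes; cases (1), (2) and (4) are unaffected.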

> P78.
> 8.6.2 LIST HEADERS

Answered by Russ.

> P83.
> 
>      A-NOTGT    = %x21-3D / %x3F-7E  ; exclude ">"
>                                                ^^^
> 					       SP and ">"
>
> (for the removal of all doubt - see earlier).

And have you looked 6 lines above? Also see earlier.
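
To spell the range out, here is a small Python sketch of the A-NOTGT
production exactly as quoted above. The function names are mine, not
anything in the draft, and it checks only the characters between the angle
brackets, not the rest of the message-id syntax.

```python
def is_a_notgt(octet: int) -> bool:
    """True if the octet is in %x21-3D / %x3F-7E."""
    # %x20 (SP) is below the range and %x3E (">") is skipped,
    # so both are excluded without any special-casing.
    return 0x21 <= octet <= 0x3D or 0x3F <= octet <= 0x7E


def body_is_a_notgt(body: bytes) -> bool:
    """True if a non-empty byte string consists only of A-NOTGT octets."""
    return len(body) > 0 and all(is_a_notgt(b) for b in body)
```

For example, body_is_a_notgt(b"abcd@example.com") is true, while any string
containing a space or ">" fails.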

> P94.
> 
>    It has been proposed that the response code range 6xx is used for
>                                                          ^^
> 							 be
>    multiline responses.

Fixed.

> P95.
> 
>    NNTP is most often used for transferring articles that conform to RFC
>    1036 [RFC1036] (such articles are called "Usenet articles" here). It
>    is also sometimes used for transferring email messages that conform
>    to RFC 2822 [RFC2822] (such articles are called "email articles"
>    here). In this situation, articles must conform both to this
>    specification and to that other one; this appendix describes some
>    relevant issues.
> 
> Maybe you should be speaking of "Netnews articles" rather than "Usenet
> articles". Opinions are often divided as to how "Usenet" is defined. Usefor
> has taken the view that "Netnews" is the protocol, and "Usenet" is the
> particularly large and well-known instantiation of that protocol.

Okay, I'm happy to be consistent with this.

One query - we have "global Usenet system" in the text. Should this
change to "global Netnews system" or not?

> Also, I believe it is customary to speak about "Usenet/Netnews _articles_",
> but "Email _messages_".

But they're "articles" in our document.

>    Every article handled by an NNTP server MUST have a unique
>    message-id. For the purposes of this specification, a message-id is
>    an arbitrary opaque string that is merely needs to meet certain
>                                              ^^^^^
> 					     needed
>    syntactic requirements and is just a way to refer to the article.

No, "that merely needs to meet".

>    This specification states that message-ids are the same if and only
>    if they consist of the same sequence of octets. Other specifications
>    may define two different sequences as being equal because they are
>    putting an interpretation on particular characters. RFC 2822
>    [RFC2822] has a concept of "quoted" and "escaped" characters. It
>    therefore considers the three messages-ids:
>                                  ^^^^^^^^^^^^
> 				 message-ids

Oops.

> P96.
> 
>       <abcd@example.com>
>       <"abcd"@example.com>
>       <"ab\cd"@example.com>
> 
>    as being identical. Therefore an NNTP implementation handing email
>    articles must ensure that only one of these three appears in the
>    protocol and the other two are converted to it as and when necessary,
>    such as when a client checks the results of a NEWNEWS command against
>    an internal database of message-ids.
> 
> That is a mess which one hopes some future revision of RFC 2822 will fix (by
> moving the 'bad' cases to its 'obsolete' syntax). So I would not encourage
> implementors to dally with that nonsense (better just to compare the bytes and
> if it breaks then the nonsense will be truly seen for what it is).

But it hasn't been fixed yet (and "future" could be a long way away).
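
For what it's worth, the conversion need not be elaborate. The following
Python sketch shows one possible way to reduce the quoted and escaped forms
to the unquoted one. It is an assumption-laden illustration, not text from
the draft: the function name is hypothetical, it only unquotes when every
character of the result is RFC 2822 "atext" or a dot, and it does not check
where the dots fall.

```python
import string

# RFC 2822 atext characters, plus "." as used in a dot-atom.
ATEXT_OR_DOT = set(string.ascii_letters + string.digits
                   + "!#$%&'*+-/=?^_`{|}~.")


def canonical_msgid(msgid: str) -> str:
    """Return the unquoted form of a message-id where that is possible."""
    if not msgid.startswith('<"'):
        return msgid                    # local part not quoted: unchanged
    local = []
    i = 2
    while i < len(msgid) and msgid[i] != '"':
        if msgid[i] == "\\" and i + 1 < len(msgid):
            i += 1                      # drop the backslash, keep next char
        local.append(msgid[i])
        i += 1
    if not local or not msgid.startswith('"@', i):
        return msgid                    # unexpected shape: unchanged
    if not all(c in ATEXT_OR_DOT for c in local):
        return msgid                    # quotes are needed: unchanged
    return "<" + "".join(local) + msgid[i + 1:]
```

With this, all three of the message-ids in the example map to the first one,
so a byte-for-byte comparison of the canonical forms gives the RFC 2822
answer.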

> More importantly, I think you should make it clear that this nonsense is NOT
> needed with Netnews, (whether you take the definition of Message-ID from RFC
> 1036 or from Usefor, which has taken great care to avoid it).

What is the current definition in Usefor? Which of those three IDs is
legal? I'm happy to add text when I know where we are.

>    A common approach, and one that SHOULD be used for email and Usenet
>    articles, is to extract the message-id from the contents of a header
>    with name "Message-ID". This may not be as simple as copying the
>    entire header contents; it may be necessary to strip off comments and
>    undo quoting, or to reduce "equivalent" message-ids to a canonical
>    form.
> 
> No, that is bad advice so far as Usenet is concerned. Neither RFC 1036 nor
> Usefor allows comments in the Message-ID-header, and neither allows quoting.
> Moreover, the performance penalty of doing this routinely in Usenet would be
> prohibitive, and no implementor is going to do it.
> 
> If some implementor chooses to use an NNTP server to serve Email messages,
> then he can build it in if he wants to, but not for Usenet.

This text doesn't apply only to Usenet. People writing an NNTP server for
RFC1036 articles can rely on anything in that; we say so at the very start
of Appendix B.

-- 
Clive D.W. Feather  | Work:  <clive at demon.net>   | Tel:    +44 20 8495 6138
Internet Expert     | Home:  <clive at davros.org>  | *** NOTE CHANGE ***
Demon Internet      | WWW: http://www.davros.org | Fax:    +44 870 051 9937
Thus plc            |                            | Mobile: +44 7973 377646