ietf-nntp Niggles

Thu Nov 6 10:04:14 PST 2003

I have just completed a read-through of the latest draft (well, the
pre-1.txt actually, but I have checked 20.txt where necessary). I have
unearthed various niggles which could usefully be fixed (but only a couple
that could be considered serious).

P1.
   UNIX is a registered trademark of the X/Open Company Ltd.

Is that still correct this week? And if it is, is it "Ltd" or "Inc"?

P17.

   o  A message-id MUST NOT contain octets other than printable US-ASCII
      characters.

Is it sufficiently clear that SP is "non-printable"? Some would argue that it
is printed, but it just so happens that the amount of ink deposited is zero.
It would do no harm to clarify.

   This specification does not describe how the message-id of an article
   is determined. If the server does not have any way to determine a
   message-id from the article itself, it MUST synthesise one (this
   specification does not require the article to be changed as a
   result).

This is not immediately clear to those who have not followed our discussions.
Some reference to Appendix B2 would be helpful here.

P25.

   An NNTP client may cache the results of this command, but MUST NOT
                  ^^^
		  MAY

P26.

5.3.3 Examples

   Example of a successful response:

      [C] LIST EXTENSIONS
      [S] 202 Extensions supported:
      [S] OVER
      [S] HDR
      [S] LISTGROUP
      [S] .

Now that we have at last defined an extension that has a parameter, please can
we have an example here. e.g.:

      [S] HDR ALL
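
To show why such an example matters to a client author, here is a minimal
Python sketch of reading the extension list. It assumes that each line is an
extension-label optionally followed by space-separated parameters, as in the
"HDR ALL" line suggested above; the function name parse_extensions is
hypothetical and not taken from the draft.

```python
def parse_extensions(lines):
    """Map each extension-label to its (possibly empty) list of parameters.

    Assumption: a line is "LABEL [param ...]" and "." terminates the list.
    """
    extensions = {}
    for line in lines:
        if line == ".":
            break
        if not line.strip():
            continue  # ignore blank lines rather than fail on them
        label, *params = line.split()
        extensions[label] = params
    return extensions


response = ["OVER", "HDR ALL", "LISTGROUP", "."]
extensions = parse_extensions(response)
print(extensions["HDR"])  # ['ALL']
```

A client that only ever sees parameterless examples might well compare whole
lines against "HDR" and so miss the extension altogether.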

P29.

   o  new articles may be added with article numbers greater than the
      reported high water mark (if an article that was the one with the
      highest number has been removed, the next new article will not
                                                            ^^^^
							    may
      have the number one greater than the reported high water mark)

Counter-example:

HWM was 999
That article was then cancelled/removed
The server did not bother to reduce the HWM to 998 (it is not so obliged)
Next article arrived and was given #1000.
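
The same counter-example as a small Python sketch. The class and method names
are hypothetical, and it assumes a server that numbers new articles
consecutively and never lowers its high water mark on removal.

```python
class Newsgroup:
    """Toy model of a group's article numbers; all names are hypothetical."""

    def __init__(self, numbers):
        self.numbers = set(numbers)
        self.high_water_mark = max(self.numbers)

    def remove_article(self, number):
        # The server is not obliged to lower the high water mark.
        self.numbers.discard(number)

    def add_article(self):
        # Assumption: a new article gets the next number above the HWM.
        self.high_water_mark += 1
        self.numbers.add(self.high_water_mark)
        return self.high_water_mark


group = Newsgroup(range(990, 1000))  # articles 990..999, so HWM is 999
reported = group.high_water_mark
group.remove_article(999)            # cancelled/removed
new_number = group.add_article()
print(reported, new_number)          # 999 1000
```

So the next new article does have the number one greater than the reported
high water mark, which is why "may not" is the safer wording.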

P39.

   Example of an unsuccessful retrieval the headers of an article by
                                       ^
				       of
   number because no newsgroup was selected first:

P42.

   Example of an STAT of an article not on the server by message-id:
              ^^
	      a

   Example of STAT of an article not in the server by number:
                                     ^^
				     on

P43.

   If posting is permitted, the article MUST be in the format specified
   in Section 3.4 and MUST be sent by the client to the server in the
   manner specified in (Section 3.1) for multi-line responses (except
   that there is no initial line containing a response code). Thus a

Either
   "specified in Section 3.1 for"
Or
   "specified (Section 3.1) for"

P62.

   An extension is either a private extension or else it is included in
   the IANA registry and is defined in an RFC. Such RFCs either must be
   on the standards track or must define an IESG-approved experimental
   protocol.

My reading of RFC 2026 is that the IESG does not approve experimental
protocols. There is provision for the IESG to "review" them if the RFC Editor
considers them to be a bit "iffy" and to insert a "disclaimer" or take other
action, but failure to do that does not actually imply "approval".

P76.

   and then each TAB is replaced with a single space (note that this is
   the same transformation as is performed by the OVER extension
   (Section 8.5.1.2), and the same comment concerning NUL, CR, and LF
   applies).

Doubly nested parentheses.

   If the requested header is not present in the article or if it is
   present but empty, a line for that article is included in the output

That is not my understanding of the case when the requested header is absent
from the article, and it is not consistent with what is said on the previous
page:

   If the information is available, it is returned as a multi-line
   response following the 225 response code and contains one line for
   each article where the relevant header line or metadata item exists
   (note that unless the argument is a range including a dash, there
   will be at most one line but it will still be in multi-line format).

If my interpretation is correct, then the example in the following section
needs fixing also.
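
The two readings can be put side by side in a short Python sketch. The
function name, the dictionary layout and the "number SP content" line format
are assumptions made for illustration only; the flag line_for_absent selects
between the P76 wording (True) and the wording on the previous page (False).

```python
def hdr_response(articles, header, line_for_absent):
    """Return the data lines of a hypothetical HDR response.

    articles maps article number -> dict of header name -> content.
    """
    lines = []
    for number in sorted(articles):
        headers = articles[number]
        if header in headers:
            # Present (even if empty): both readings produce a line.
            lines.append(f"{number} {headers[header]}")
        elif line_for_absent:
            # Absent: only the P76 wording produces a line.
            lines.append(f"{number} ")
    return lines


articles = {1: {"Subject": "a"}, 2: {}, 3: {"Subject": ""}}
print(hdr_response(articles, "Subject", line_for_absent=False))
print(hdr_response(articles, "Subject", line_for_absent=True))
```

Under the first call article 2 is simply omitted; under the second it appears
with empty content, and the client cannot tell it apart from article 3.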

P78.

8.6.2 LIST HEADERS

8.6.2.1 Usage

   Syntax
      LIST HEADERS

Why is this command not optional in the same way as LIST OVERVIEW.FMT?

P83.

     A-NOTGT    = %x21-3D / %x3F-7E  ; exclude ">"
                                               ^^^
					       SP and ">"

(for the removal of all doubt - see earlier).
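
For anyone who wants to check the ranges mechanically, a minimal Python
sketch (the name A_NOTGT is borrowed from the rule; the rest is purely
illustrative):

```python
# Octets permitted by A-NOTGT = %x21-3D / %x3F-7E
A_NOTGT = set(range(0x21, 0x3E)) | set(range(0x3F, 0x7F))

print(ord(" ") in A_NOTGT)  # False: SP (%x20) lies below the first range
print(ord(">") in A_NOTGT)  # False: ">" (%x3E) falls in the gap
print(len(A_NOTGT))         # 93
```

So the rule already excludes SP as well as ">"; it is only the comment that
fails to say so.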

P85.

   This specification requires IANA to keep a registry of
   extension-labels. The initial contents of this registry are specified
   in Section 8.1. As described in Section 8, names beginning with X are
   reserved for private use while all other names are to be associated
   with a specification in an RFC on the standards-track or defining an
   IESG-approved experimental protocol.

See earlier remarks re IESG-approved experimental protocols.

P94.

   It has been proposed that the response code range 6xx is used for
                                                         ^^
							 be
   multiline responses. While existing commands and extensions do not
   use this, it would at least limit the problem clients would face in
   dealing with an unknown response.

(yes, English _does_ have a subjunctive mood).

P95.

   NNTP is most often used for transferring articles that conform to RFC
   1036 [RFC1036] (such articles are called "Usenet articles" here). It
   is also sometimes used for transferring email messages that conform
   to RFC 2822 [RFC2822] (such articles are called "email articles"
   here). In this situation, articles must conform both to this
   specification and to that other one; this appendix describes some
   relevant issues.

Maybe you should be speaking of "Netnews articles" rather than "Usenet
articles". Opinions are often divided as to how "Usenet" is defined. Usefor
has taken the view that "Netnews" is the protocol, and "Usenet" is the
particularly large and well-known instantiation of that protocol.

Also, I believe it is customary to speak about "Usenet/Netnews _articles_",
but "Email _messages_".

   Every article handled by an NNTP server MUST have a unique
   message-id. For the purposes of this specification, a message-id is
   an arbitrary opaque string that is merely needs to meet certain
                                             ^^^^^
					     needed
   syntactic requirements and is just a way to refer to the article.

   This specification states that message-ids are the same if and only
   if they consist of the same sequence of octets. Other specifications
   may define two different sequences as being equal because they are
   putting an interpretation on particular characters. RFC 2822
   [RFC2822] has a concept of "quoted" and "escaped" characters. It
   therefore considers the three messages-ids:
                                 ^^^^^^^^^^^^
				 message-ids

P96.

      <abcd@example.com>
      <"abcd"@example.com>
      <"ab\cd"@example.com>

   as being identical. Therefore an NNTP implementation handing email
   articles must ensure that only one of these three appears in the
   protocol and the other two are converted to it as and when necessary,
   such as when a client checks the results of a NEWNEWS command against
   an internal database of message-ids.

That is a mess which one hopes some future revision of RFC 2822 will fix (by
moving the 'bad' cases to its 'obsolete' syntax). So I would not encourage
implementors to dally with that nonsense (better just to compare the bytes and
if it breaks then the nonsense will be truly seen for what it is).

More importantly, I think you should make it clear that this nonsense is NOT
needed with Netnews (whether you take the definition of Message-ID from RFC
1036 or from Usefor, which has taken great care to avoid it).
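
To show what is at stake, here is a rough Python sketch contrasting the two
notions of equality. The function canonicalise_2822 is hypothetical and much
simplified: it merely strips the quotes and backslash escapes from the
left-hand side, which is enough for the three examples quoted above but is
not a full RFC 2822 parser (it does not check whether the unquoted form is
itself legal).

```python
def canonicalise_2822(msgid: str) -> str:
    """Simplified unquoting of the left-hand side of a message-id."""
    inner = msgid[1:-1]                       # drop "<" and ">"
    local, _, domain = inner.rpartition("@")
    if len(local) >= 2 and local.startswith('"') and local.endswith('"'):
        quoted = local[1:-1]
        chars = []
        i = 0
        while i < len(quoted):
            if quoted[i] == "\\" and i + 1 < len(quoted):
                i += 1                        # skip the backslash itself
            chars.append(quoted[i])
            i += 1
        local = "".join(chars)
    return f"<{local}@{domain}>"


ids = ["<abcd@example.com>", '<"abcd"@example.com>', '<"ab\\cd"@example.com>']

# Octet comparison, as this specification defines it: three distinct ids.
print(len({i.encode("ascii") for i in ids}))        # 3

# RFC 2822-style equivalence: all three collapse to one.
print(len({canonicalise_2822(i) for i in ids}))     # 1
```

A server that just compares the bytes needs none of the above, and for
Netnews the quoted forms should never arise in the first place.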

   A common approach, and one that SHOULD be used for email and Usenet
   articles, is to extract the message-id from the contents of a header
   with name "Message-ID". This may not be as simple as copying the
   entire header contents; it may be necessary to strip off comments and
   undo quoting, or to reduce "equivalent" message-ids to a canonical
   form.

No, that is bad advice so far as Usenet is concerned. Neither RFC 1036 nor
Usefor allows comments in the Message-ID-header, and neither allows quoting.
Moreover, the performance penalty of doing this routinely in Usenet would be
prohibitive, and no implementor is going to do it.

If some implementor chooses to use an NNTP server to serve Email messages,
then he can build it in if he wants to, but not for Usenet.

Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl at clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5