[NNTP] Article Numbers Becoming Invalid (RFC 3977)

Julien ÉLIE julien at trigofacile.com
Sat Jan 2 15:29:58 PST 2010


Hi Russ,

>> If it is really what a news server is supposed to do, then RFC 3977 needs
>> to be amended.  Only one additional word is needed:  "the current article
>> number MUST be set to the first *valid* article in the group".
>> Because the first article is currently defined as the reported low water
>> mark -- which is also what INN implements.
>
> Hm, while that wording change may be useful for clarity, I think that's
> what RFC 3977 already says.  See:
>
>   The successful selection response will return the article numbers of
>   the first and last articles in the group at the moment of selection
>   (these numbers are referred to as the "reported low water mark" and
>   the "reported high water mark") and an estimate of the number of
>   articles in the group currently available.
>
> To me, the "at the moment of selection" pretty clearly implies that GROUP
> should be checking and those numbers should really be valid.
>
> Also, in general it doesn't seem right that one should have to say that it
> must be set to the first valid article in the group.  I think that should
> be implied by saying that it's set to the first article.  A number that
> doesn't correspond to an article isn't really an article.
>
> I think the implication of RFC 3977 is that the server can handwave the
> article estimate, but the low and high water marks should really be
> correct.

That is a brilliant analysis of the situation, clear and precise.  Thanks!



> (In practice, this will slow down INN a little bit.)

Anyway, I believe that checking cannot always be done in practice.  CNFS
buffers are self-expiring, so the server would need to update low water
marks continuously.  LIST ACTIVE should also give up-to-date low water
marks (as well as accurate high water marks).

Besides, as Curt Welch once said in news.software.nntp:

  http://groups.google.fr/group/news.software.nntp/browse_frm/thread/35fc307f6e9e39ca
  news:20091022170403.918$Nb at newsreader.com

    The reality is that the low and high water mark are, for practical reasons,
    only approximations because they can be, and sometimes actually are,
    changing faster than you issue two NNTP commands. [...]

    In today's distributed servers, it's actually quite a lot of work to get
    the low water mark synchronized in near real time to the articles being
    expired on the spool servers.  Some systems just don't attempt to do it in
    real time and instead, update the low water marks as a periodic batch job
    (that might run even less than once a day).  It takes a lot of IO to
    accurately track the low water mark with the expiration of the articles in
    a distributed systems and many have simply stopped trying in order to
    reduce that load.


The other mail you sent summarizes the problem, which in fact turns out to
be a non-problem, again in a brilliant way:

    It's probably worth noting the as-is principle here, though (that being
    the general principle in standardization that standards dictate behavior,
    and an implementation that behaves in a way indistinguishable from a
    conforming implementation is a conforming implementation).  Given that the
    client cannot meaningfully distinguish between a server reporting a
    correct low-water mark and then that article disappearing and a server not
    checking whether the article still exists, and the client has to be able
    to cope either way, an argument could be made that checking on group entry
    is just a quality of implementation issue.
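
To illustrate why the client has to cope either way, here is a minimal
Python sketch of a client that does not trust the reported low water mark.
The function name and the `stat` callable are hypothetical: `stat(n)` stands
in for sending "STAT n" and returning the numeric response code (223 when
the article exists, 423 when there is no article with that number).

```python
from typing import Callable, Optional


def first_available(low: int, high: int,
                    stat: Callable[[int], int]) -> Optional[int]:
    """Return the lowest article number in [low, high] that still exists.

    `stat` is a hypothetical callable that sends "STAT n" and returns the
    numeric response code.  The client behaves the same whether the server
    never checked the low water mark or the article expired just after
    GROUP answered.
    """
    for number in range(low, high + 1):
        if stat(number) == 223:  # 223: article exists
            return number
        # Any other code (typically 423) means the number is not usable.
    return None  # nothing left between the reported marks
```

A real client would more likely walk the group with NEXT than probe every
number, but the point is the same: a 423 on the reported low water mark is
something it must handle anyway.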




>> The NNTP reference implementation, and also INN, can use 420 and 423 for
>> what is now called the third form.  It appears that the concept of
>> "invalid" is not the same as it used to be.
>
> In retrospect, it looks like we unfortunately introduced two separate
> concepts of invalid and used the same terminology for both: the invalid
> article number that results from entering an empty group, and an article
> that's gone missing.  We probably should have reserved 420 for the former
> case and continued to use 423 for the latter case.

OK.  Then the only remaining issue I see is in Section 6.1.3.2 of RFC 3977
regarding the LAST command:

   If the currently selected newsgroup is valid, the current article
   number MUST be set to the previous article in that newsgroup (that
   is, the highest existing article number less than the current article
   number).  If successful, a response indicating the new current
   article number and the message-id of that article MUST be returned.

   [...]

   If the current article number is already the first article of the
   newsgroup, a 422 response MUST be returned.  If the current article
   number is invalid, a 420 response MUST be returned.


I think the last sentence should be "If the currently selected newsgroup
is empty, a 420 response MUST be returned."

Otherwise, LAST is supposed to send 420 after ARTICLE has sent 420
because of that ambiguous notion of "invalid" (which is now 420+423).


Incidentally, if the currently selected newsgroup is empty and an article
arrives in it, we cannot retrieve it without reselecting the newsgroup.
ARTICLE will answer 420 (no valid pointer).  So will LAST.
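
The behaviour described above can be sketched with a toy model of a server
session.  This is only an illustration of the proposed reading of LAST, not
INN's implementation; the `Session` class and the `store` dictionary (group
name mapped to a set of article numbers) are assumptions made for the
example.

```python
class Session:
    """Toy model of the per-connection state used by GROUP and LAST."""

    def __init__(self, store):
        self.store = store    # shared dict: group name -> set of article numbers
        self.group = None     # currently selected newsgroup
        self.current = None   # current article number; None means "invalid"

    def select(self, group):
        """GROUP: the pointer is only computed at the moment of selection."""
        if group not in self.store:
            return 411        # no such newsgroup
        self.group = group
        articles = self.store[group]
        self.current = min(articles) if articles else None
        return 211

    def last(self):
        """LAST: 420 only when the group was empty when it was selected."""
        if self.group is None:
            return 412        # no newsgroup selected
        if self.current is None:
            return 420        # current article number is invalid
        previous = [n for n in self.store[self.group] if n < self.current]
        if not previous:
            return 422        # no previous article in this group
        self.current = max(previous)
        return 223
```

With an empty group selected, last() keeps answering 420 even after an
article number is added to the store; only a new select() moves the pointer
onto that article.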

-- 
Julien ÉLIE

« Commencez à creuser des trous pour planter les piquets.
  De beaux trous normands ! » (Astérix) 


