ietf-nntp Backfill

Richard Clayton richard at turnpike.com
Sun Dec 15 09:05:31 PST 1996


I'm looking at this from the point of view of an author of an "offline"
client using sensible assumptions about how news servers work.

By an "offline" client I mean one which connects to the server and drags
all articles of interest across to the client machine. It then
disconnects from the server (and in the usual case, stops paying the
local telephone company for being connected to the Internet).

In article <K$FX2JA30AsyEwSv at on-the-train.demon.co.uk>, "Clive D.W.
Feather" <Clive at on-the-train.demon.co.uk> writes

>[Q1] When a group becomes the current group, is the set of articles made
>available to the client fixed at that moment ?

Clients will assume so... they will only issue requests for articles
within this range, because to do otherwise is to expect a failure, and
that's a waste of bandwidth.
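
As a minimal sketch of that assumption (in Python, purely for illustration): the client remembers the range reported in the "211 count first last group" reply to GROUP and only asks for numbers inside it. The function names, the group name and the numbers are hypothetical, chosen to match the examples in this thread.

```python
def parse_group_reply(line):
    """Parse a '211 count first last group' reply into (first, last)."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError("unexpected GROUP reply: " + line)
    return int(parts[2]), int(parts[3])


def in_announced_range(number, first, last):
    """True if an article number lies within the range the server announced."""
    return first <= number <= last


# Illustrative reply only; group name and numbers are made up.
first, last = parse_group_reply("211 104 1234 1356 misc.test")
print(in_announced_range(1250, first, last))
print(in_announced_range(1357, first, last))
```

A client written this way never asks for 1357 until a fresh GROUP reports a higher maximum.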

>An answer of "yes" to [Q1] leads to:
>
>[Q2] When will new articles become available and expired/cancelled ones
>become unavailable ? Is it:
>  - not until a new NNTP session
>  - next time the group is made the current group after another group
>    has been the current group
>  - next time the group is made the current group, whether or not it
>    was the current group
>?

Clients will expect that articles will disappear from under their feet,
since they are aware of cancellation and expiry. It would be a strange
client indeed which treated the [min] and [max] values as special and
assumed that they were sacrosanct. It would also be a strange client which
assumed that a server will delay cancelling or expiring articles merely
because they happen to be doing some reading.

Clients are also aware that articles appear. However they expect that
new articles will be "new". Viz if they ask after article 1250 (which is
between [min] and [max]) and the server says it does not exist then they
will record this fact. Clients don't want to have to ask again just in
case the server has suddenly made the article available. That's a
serious waste of bandwidth (which someone has to pay for).

A client which was issuing NEXTs without keeping close track of which
article number had been reached should not be surprised to find that it
was given articles which had not existed when the GROUP command was
issued. If GROUP was reissued then the new maximum would become
apparent.

>[Q3] Can an article appear with a number higher than the previous
>maximum, such as 1357 ? Obviously the new maximum would be higher.

one expects this (otherwise Usenet would rapidly stop)

>[Q4] Can an article appear with a number less than the previous minimum,
>such as 1111 ? Obviously the new minimum would be lower.

this is annoying, but not impossible, to code for.

Once the GROUP was issued and this was discovered, requests would have
to be issued for [new min]..[old min]. It requires the client to hold
the [old min] value, which is a bit tedious.
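
A rough sketch of the extra work, in Python and with a hypothetical function name: given the stored [old min] and the [new min] from the latest GROUP, work out which numbers to ask for. The old minimum itself is left out on the assumption that it was dealt with on the earlier connection.

```python
def backfill_below_min(old_min, new_min):
    """Article numbers that have appeared below the previously seen minimum."""
    if new_min >= old_min:
        # Nothing has been wound back; there is nothing extra to request.
        return range(0)
    # Everything from the new minimum up to, but not including, the old one.
    return range(new_min, old_min)


# Illustrative values: the minimum was 1234 and is now reported as 1111.
extra = backfill_below_min(1234, 1111)
print(len(extra), extra[0], extra[-1])
```

The point is less the arithmetic than the fact that [old min] now has to be kept between connections.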

>[Q5] Can an article appear with a number between the previous limits,
>such as 1250 ?

this is annoying, but not impossible, to code for.

It requires the client to issue requests for all values between [min]
and [max] which were not previously available. It requires the client to
hold as "state" the entire list of article numbers which were previously
sent from the server. In combination with [min] changing in Q4 it would
require the client to hold an indefinitely growing amount of state.
This, it seems to me, is a nonsense!
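
To make the cost concrete, here is a small Python sketch (hypothetical names, made-up numbers) of what a client would have to do on every connection: keep every article number it has ever been sent, and re-request each number in [min]..[max] that is not in that set.

```python
def holes_to_retry(seen, current_min, current_max):
    """Numbers inside the current range that the client has never been sent."""
    return [n for n in range(current_min, current_max + 1) if n not in seen]


# 'seen' is the state the client must carry from one connection to the next.
seen = {1234, 1235, 1237, 1240}
print(holes_to_retry(seen, 1234, 1240))
```

Every hole costs a request each time, and the set of seen numbers only shrinks if the client can be sure that numbers below [min] never come back.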

If (as is the way in which almost every server is actually coded)
articles only appear above the old [max] value (ie neither [Q4] nor [Q5]
situations apply) then the only state that a client needs to hold is the
old [max] value, since all articles up to that value will have been
collected on a previous connection. This is nice and simple!
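
By way of contrast, a sketch of the simple case in Python (again with hypothetical names and numbers): the only thing carried over from the last connection is [old max], and the articles to fetch follow directly from it and the new GROUP reply.

```python
def new_articles(old_max, current_min, current_max):
    """Article numbers to request, assuming articles only appear above old_max."""
    # Start just past what was already collected, but never below the
    # server's current minimum (older articles may have expired meanwhile).
    start = max(old_max + 1, current_min)
    return range(start, current_max + 1)


# Last time the maximum was 1356; the server now reports 1240..1360.
print(list(new_articles(1356, 1240, 1360)))
```

Some of the numbers returned may still be missing on the server (cancelled or expired), which the fetching code has to tolerate anyway.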

>[Q6] Can the lowest numbered article disappear ? Obviously the minimum
>would increase if necessary.

one expects this (also if the group was totally empty then the [min]
value article would not be available)

>[Q7] Can the highest numbered article disappear even though a lower
>numbered article (such as 1266) remains ? Obviously the maximum would
>decrease if necessary.

one expects this.

But I don't see why the maximum should have to decrease!

Clients should expect that articles from the middle of the range will
not be available; I don't see why they would tell the fetching routines
"this is the end of the range and so you have to get especially upset if
it is missing".  One just does not write code like that (or if one does,
one deserves to be burnt)

>[Q8] Can any other article disappear even though a lower numbered
>article remains (e.g. 1266 disappearing with 1234 remaining) ?

of course

>[Q9] If an article disappears, can it ever reappear again ? The same
>article, that is, not a new article given the same number by the server,
>which we all assume is not permitted.

if it does then one is likely to miss it :( see Q5

>I believe that the answers to Q6, Q7, and Q8 are all "yes", based on the
>properties of cancels and expiry. Q9 might at first instance seem to be
>"no", but "yes" allows cancels to be cancelled and mistakenly expired
>articles to be restored.

does anyone do this ?

for clients to be able to cope then, as in Q5, the client has to issue
requests for all the "holes" in the range just in case the article has
been miraculously resurrected. Users may not wish to pay for the
bandwidth to do this.

>There are three common scenarios posited for NNTP clients and servers. I
>call these "monotonic", "backfilling", and "wind-back". They answer the
>first three questions:
>
>                   Q3  Q4  Q5
>Monotonic          Y   N   N
>Backfilling        Y   N   Y
>Wind-back          Y   Y   Y
>
>Servers appear to exist of all three kinds, but all clients I am aware
>of that use article numbers at all are monotonic.

As I indicated, if one assumes Y N N then one needs to keep very little
state information about the newsgroup on the server.

One may well keep a great deal of state locally (about the state of the
offline copy of the newsbase), but one only needs to record [old max]
because that will be sufficient to determine what is new on the server
when one next goes online to connect.

>A sensible client needs to allow for the user wanting to mark articles
>as unread.

In my world, this is something to do with state information recorded in
the offline newsbase, and nothing to do with the server connection.

> Therefore it must be able to cope with the backfilling
>strategy, by (for example) maintaining a list of ranges.

this may be true for an online reader, or a reader with an offline
newsbase which is used by a single person. It is not very much to do
with connections to servers.

> Given this,
>there seems little reason to require that servers be monotonic.

I disagree strongly... and I suspect most client authors will agree.

>On the other hand, there are advantages to the answer to Q4 being "no".
>If it is, then it is possible to forget everything about articles with
>numbers below the current minimum. This ensures that historical
>information with no current use can be eventually thrown away.

forgetting all information about articles below the current maximum is
also extremely attractive!

-- 
richard                      richard.clayton    @    T U R N P I K E .com
                                                     tel: +44 1306 732300
"Assembly of Japanese bicycle require great peace of mind" quoted in ZAMM


