ietf-nntp Issue: reinstatement

Chris Hall chris.hall at turnpike.com
Mon Dec 30 03:29:27 PST 1996


In article <199612292026.UAA21882 at lyra.csx.cam.ac.uk>, USENET news
manager <newsmaster at ucs.cam.ac.uk> writes
>Chris Hall wrote:
>>In article <851805333.28095.0 at office.demon.net>, "Clive D.W. Feather"
>><clive at demon.net> writes
>>>[clarification about intended meaning of backfilling]
>>>
>>>My definition of reinstatement: an article is removed (either because of a
>>>cancel or expiry), and then the server operator decides that this removal
>>>was an error and reinstates the article *with the same number*.
>>>
>>>In earlier discussion, I was under the impression that people wanted to
>>>allow this behaviour; the alternative is to state explicitly that, once an
>>>article has been removed, it will never reappear.
...snip...
>>>I don't know what the consensus mechanism is for this list; when someone
>>>tells me we have a consensus one way or the other on this issue, I'll
>>>adjust the wording as needed.

>>The choices appear to be:
>>
>>  1. ban reinstatement,
>>...
>>     This doesn't seem right.

>Agreed.

>>  2. reinstate with its original article number,
>>
>>     assuming there's some mechanism at the server to do that !  For
>>     NEWNEWS purposes the article would have to be reinstated with its
>>     original "timestamp".

>With INN, unless you'd run "makehistory" to rebuild the history database
>from the current spool contents in the meantime, the history entry (which
>includes the timestamps) will remain until the article would have expired
>(or possibly longer, for groups with very short expiry), so reinstating the
>article file plus "ctlinnd renumber <groupname>" should do it. [Just
>an example to show that it's not necessarily difficult to do.]

>>     For clients that remember the state of newsgroups (eg. those that
>>     use NEWNEWS), there is a problem here.  Clients that look at a
>>     group in the interval between removal and reinstatement may well
>>     never see the article !  The longer the delay between removal and
>>     reinstatement, the greater the number of people who will miss the
>>     article.

>Yes, but if reinstatement is used only to minimise the effects of 
>accidental or malicious loss, it will benefit those people who see the
>reinstated article, and the effect is no worse for others than if the 
>article had not been reinstated.

>>     Reinstatement is a subtle form of back-fill, against which faces
>>     have been set.

>Yes and no... They have properties in common, but backfilling requires 
>clients to locate newly arrived but lower-numbered articles or their 
>users will see a very patchy feed, missing a lot of articles. Reinstatement
>is a "best efforts" fixup for loss of previously-existing articles.

If the event is rare (however one defines that), then yes, this
reinstatement is better than nothing, and has no side effects.  If it is
not rare, then depending on the gap between removal and reinstatement,
more or fewer people will see a disrupted news feed.
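
To make the failure mode concrete, here is a toy sketch of a client that
remembers only the highest article number it has read in a group.  The
function name and the article numbers are made up for illustration, and
real clients may keep richer state than this.

```python
# Toy model: a client that only remembers the highest article number it has
# read in a group (a "high-water mark"). All names here are hypothetical.

def fetch_new(server_articles, high_water):
    """Return (numbers the client newly reads, updated high-water mark)."""
    new = sorted(n for n in server_articles if n > high_water)
    if new:
        high_water = new[-1]
    return new, high_water

# The server held articles 1..5; article 3 has been removed in error.
group = {1, 2, 4, 5}

# Client A last read up to article 2 and visits during the gap.
seen_a, mark_a = fetch_new(group, high_water=2)

# The operator reinstates article 3 with its original number.
group.add(3)

# Client A returns: 3 is below its mark, so it is never offered.
later_a, mark_a = fetch_new(group, mark_a)

# Client B also stopped at 2 but only visits after the reinstatement.
seen_b, mark_b = fetch_new(group, high_water=2)
```

Client A reads 4 and 5 and never sees 3; client B sees 3, 4 and 5.  The
longer the gap, the more clients end up in A's position.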

>>  3. reinstate with a new article number (and new "timestamp"),
>>
>>     which means that no client should miss the article, but some may
>>     see the article twice !
>>
>>     Duplicating articles is anathema.  Nevertheless, I expect clients
>>     cover themselves against it, and simply ignore duplicates.  But, if
>>     the delay between removal and reinstatement is long, then there is
>>     the risk that the client has expired its memory of the article-id,
>>     and will not be able to detect the duplication.

>Clients using article numbers wouldn't know if a reinstated article with
>a new number was a duplicate of one they'd shown the user previously (as
>when broken gateways spew duplicates with new message IDs).

If clients do not in general remember the message IDs of the articles
already shown to the user, then this is indeed a non-starter.

As far as the client is concerned a duplicate is something that has the
same message ID as a previous message -- in that sense I don't think you
can produce a duplicate with a new message ID !

> Clients (if 
>there are any?)

Since we develop it, I know of one -- which is proof against duplicates
(that is, messages with the same message ID) at least until the original
article is expired out of the stored news.
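
As a rough sketch of that kind of protection (not our actual code; the
class name, the retention period and the message ID below are all
assumptions for illustration), a client can keep a table of message IDs
that is pruned as articles expire:

```python
import time

class SeenIds:
    """Remember message IDs for a limited time (hypothetical client helper)."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self._seen = {}  # message ID -> time first seen

    def expire(self, now):
        """Forget IDs first seen longer ago than the retention period."""
        cutoff = now - self.retention
        self._seen = {mid: t for mid, t in self._seen.items() if t >= cutoff}

    def is_new(self, message_id, now=None):
        """Return True and record the ID if it is not currently remembered."""
        now = time.time() if now is None else now
        self.expire(now)
        if message_id in self._seen:
            return False
        self._seen[message_id] = now
        return True

DAY = 86400
seen = SeenIds(retention_seconds=7 * DAY)  # assumed one-week retention
first = seen.is_new("<a@example.invalid>", now=0)
dup_soon = seen.is_new("<a@example.invalid>", now=1 * DAY)
dup_late = seen.is_new("<a@example.invalid>", now=30 * DAY)
```

Here first is True and dup_soon is False, but dup_late is True: once the
original has expired out of the store, a late reinstatement under a new
number can no longer be recognised as a duplicate.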

If you're fetching news by NEWNEWS it is common practice to fiddle the
NEWNEWS time back a bit, to take account of small variations in clocks
and such.  This means that it's necessary to keep some record of message
IDs seen, to cope with some overlap.  But I grant that that does not
require a particularly comprehensive memory of recent message IDs.
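
A minimal sketch of that practice follows.  The ten-minute overlap, the
function names and the group name are assumptions chosen for
illustration; the command layout (YYMMDD HHMMSS, optionally followed by
GMT) is the RFC 977 NEWNEWS form.

```python
from datetime import datetime, timedelta, timezone

def newnews_command(group, last_fetch, overlap=timedelta(minutes=10)):
    """Build a NEWNEWS command with the time pushed back by `overlap`."""
    since = (last_fetch - overlap).astimezone(timezone.utc)
    return "NEWNEWS {} {} GMT".format(group, since.strftime("%y%m%d %H%M%S"))

def filter_unseen(message_ids, recently_seen):
    """Drop IDs already in `recently_seen`, and record the rest in it."""
    fresh = []
    for mid in message_ids:
        if mid not in recently_seen:
            recently_seen.add(mid)
            fresh.append(mid)
    return fresh

last_fetch = datetime(1996, 12, 30, 11, 29, 27, tzinfo=timezone.utc)
command = newnews_command("comp.test", last_fetch)

# IDs returned by the previous fetch, kept only to cover the overlap.
recent = {"<1@example.invalid>", "<2@example.invalid>"}
fresh = filter_unseen(["<2@example.invalid>", "<3@example.invalid>"], recent)
```

The set only has to cover the overlap window, which is why a short
memory is enough for this purpose but not for catching a reinstatement
that happens weeks later.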

> which saved message IDs for all articles seen recently
>(could be many megabytes) should be safe against duplicates, *except* that 
>with a history database (underlying lookup by message ID, and NEWNEWS) like 
>INN's, articles reinstated with a different article number could not be added
>to the history database in any straightforward and non-disruptive way, and 
>it would retain the original details including article file number (hence
>also number) - so the reinstated article would never be seen (and the 
>reinstated article would never be expired, since the history database is 
>used by expiry to derive the list of files to unlink).

Hmmm.  I don't know INN, but if it has fetched and stored the first
instance of the article, would it not simply ignore the second instance?

>>Looks like a choice between evils to me :-(.

>For anyone using INN, the choice is an easy one - reinstatement with the
>original article number is the only practical option!

>>If reinstating articles is a common requirement, then there is a serious
>>problem here.  The client either has to be able to tolerate a form of
>>back-fill, or it has to tolerate duplicate articles.

>Only if you consider it essential for such articles never to be missed by 
>clients, rather than viewing it as minimising the effects of losing the 
>articles. Unless you could reinstate a copy of the article as it appeared on
>your server (and if it's been deleted, that would be fiddly), other problems
>could arise (e.g. Xref: header not matching reality on your server, Path: 
>misleading, etc.) so it seems unlikely to be done very often, anyway. I've 
>only ever needed to do it when something was posted in a local group which 
>looked seriously inappropriate, when I mv'd it elsewhere until I'd checked 
>that it was OK, then mv'd it back.

It is possible for an article to be removed from a server, and later be
reinstated -- if not, this entire discussion would be redundant !  If
the only practical way to do that is to reinstate with its original
article number and time stamp, then in theory the system is broken --
since some people will miss an article they would have received but for
the spurious removal (mistaken, mischievous or malicious).

In the limit, news articles are not guaranteed to arrive everywhere, so
this is just another way for some articles to be lost.  Whether it's
worth looking for an alternative reinstatement mechanism depends really on
the perception of how often it's required.  Accepting that permanent
removal is not acceptable, the balance to be struck is between:

   frequency of having to reinstate *
      proportion of users who will miss articles *
         harm of missing articles

and:

   frequency of having to reinstate *
      proportion of users who will see duplicated articles *
         harm of duplicating articles

In absolute terms, if the frequency is negligible, then it doesn't much
matter which is chosen.  In relative terms, however, to make an informed
choice requires some feel for the weight of the other factors.
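
To illustrate the arithmetic only (every number below is invented; I
have no measurements for any of these factors), the two products can be
compared like this:

```python
def expected_harm(frequency, proportion_affected, harm_per_user):
    """Product of the three factors in the balance above."""
    return frequency * proportion_affected * harm_per_user

# Hypothetical inputs, chosen purely to show the calculation.
frequency = 2  # reinstatements per period, common to both options

same_number = expected_harm(frequency, proportion_affected=0.25,
                            harm_per_user=4.0)   # users miss the article
new_number = expected_harm(frequency, proportion_affected=0.125,
                           harm_per_user=1.0)    # users see it twice

# The frequency is a common factor, so it cancels in the comparison.
ratio = same_number / new_number
```

With these made-up inputs the ratio is 8 whatever frequency is used,
which is the point: the frequency sets the absolute size of the problem,
while the choice between the options turns on the other two factors.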

If news servers are vulnerable to mischievous or malicious removal of
articles, then reinstatement with the same article number and timestamp
is an incomplete solution -- a broken reed from the outset :-(.  Having
said all that, we're not in a position to redesign news -- so I suppose
if this is a problem, then the solution has to be strengthening the
servers against spurious article removal.

>                                John Line
-- 
Chris Hall                                       Chris.Hall at turnpike.com


