ietf-nntp OVER and HDR parameter comments

Tue Apr 1 09:53:02 PST 2003

Ken Murchison <ken at oceana.com> writes:

> My general observation of the current OVER text and the HDR parameter
> proposal is that implementation is driving the protocol.

One general point here to keep in mind is that implementation is going to
drive our protocol to a substantial degree because that's our charter.

    The first concern of this working group shall be for the
    interoperability of the various NNTP implementations, and therefore
    for clear and explicit specification of the protocol.  It is very
    important that we document the existing situation before taking up any
    new work.

The primary goal of this working group is to document the protocol as it
is currently implemented, at least in those areas where standardization
makes sense.  That doesn't mean that I think we should just blindly copy
everything that people implement, but it does mean that if multiple
implementations have done something, I think we should have a strong bias
towards including that in the standard unless there is some compelling
reason not to.

> For instance, the OVER text describes the command as outputting the
> contents of the overview database.  Clearly some implementations may be
> using a database, but others may not.  Cyrus for example uses a separate
> cache file for each mailbox/newsgroup.

That's a database.

It is, however, certainly true that one could implement a fully functional
NNTP server supporting OVER (if a fairly slow one) without any separate
overview database at all, just by generating the overview information from
the articles themselves each time it's requested.
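
As a rough illustration of generating overview information directly from
an article, here is a minimal Python sketch.  It assumes the conventional
overview layout (article number, Subject, From, Date, Message-ID,
References, byte count, line count, separated by tabs); the function name
overview_line and the UTF-8 byte count are assumptions made for the
example, not text from the draft.

```python
from email import message_from_string

# Headers carried in the overview line, in the conventional order
# (assumed here for illustration).
OVERVIEW_HEADERS = ("Subject", "From", "Date", "Message-ID", "References")


def overview_line(number, raw_article):
    """Build one tab-separated overview line by parsing the article itself.

    Hypothetical helper: no separate overview database is consulted.
    """
    msg = message_from_string(raw_article)
    fields = [str(number)]
    for name in OVERVIEW_HEADERS:
        value = str(msg.get(name, ""))
        # Unfold the header and replace tabs so that the field separator
        # stays unambiguous.
        fields.append(" ".join(value.split()))
    # Metadata: size of the article in bytes and number of body lines.
    fields.append(str(len(raw_article.encode("utf-8"))))
    fields.append(str(len(msg.get_payload().splitlines())))
    return "\t".join(fields)
```

Doing this for every article in a large range on every request is what
makes the approach slow, which is why servers usually keep the result
around in some form.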

> I'm not a big fan of implementation finding its way into protocols.  I'd
> like to see the current text sanitized so that the concept of a database
> doesn't creep in.  Something like "The OVER extension provides access to
> article overview information.  The overview information consists of a
> [fixed?] set of parsed message headers and article meta-data."

I have no inherent problems with wording changes along these lines to make
the descriptions more generic.  At this stage in the development of the
draft, it would probably be most useful if you could suggest specific
wording changes to the current draft, though.  (In other words, "change
the first sentence of the fourth paragraph of section X.Y to instead
read....")

> Similarly, it seems as if the HDR parameter proposal is a way of
> linking HDR back to the OVER implementation (or some other
> implementation decision).

No, it's an attempt to document a very widespread implementation choice.
Servers have found a need in practice to restrict the use of HDR to
particular headers that are already indexed.  I don't want to second-guess
that decision; there's a great deal of operational experience behind it.
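
To make the restriction concrete, here is a minimal Python sketch of a
server-side lookup that only answers HDR for headers it has already
indexed.  The class OverviewIndex, its methods, and the use of LookupError
to signal a refusal are all hypothetical; real servers differ in how they
store the index and how they report the refusal to the client.

```python
class OverviewIndex:
    """Toy per-group index holding only a fixed set of headers."""

    def __init__(self, indexed_headers):
        # Header names are compared case-insensitively.
        self.indexed = {name.lower() for name in indexed_headers}
        self.rows = {}  # article number -> {lowercased header: value}

    def add(self, number, headers):
        # Keep only the headers the server has chosen to index.
        self.rows[number] = {
            name.lower(): value
            for name, value in headers.items()
            if name.lower() in self.indexed
        }

    def hdr(self, name, low, high):
        """Return (number, value) pairs for one header over a range."""
        key = name.lower()
        if key not in self.indexed:
            # Refuse rather than open and parse every article in range.
            raise LookupError("header not indexed: " + name)
        return [
            (number, self.rows[number].get(key, ""))
            for number in sorted(self.rows)
            if low <= number <= high
        ]
```

The point of the refusal is that an indexed lookup never touches the
article itself, so a request spanning thousands of articles stays cheap.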

> I don't think that HDR should be restricted to a subset of headers in
> any way.  The server has the article, so parse it and find the header.

The server does not necessarily have the article.  It's increasingly
common to have a collection of reader servers with full feeds of overview
information and a local article cache, which don't retrieve the actual
article from an upstream server until a client requests it.  It's
potentially very slow to open the article and parse it.

> Whether the particular header is cached in some way shouldn't be an
> issue.  If the client really wants/needs a header, it's going to get it
> by grabbing the HEAD and then parsing it itself.

Yes, but clients expect that command to be slower, and that puts more work
on the client.  With HEAD, the client can only retrieve the headers of one
article per command.  HDR, on the other hand, can be applied to thousands
of articles with a single command, and can put an extremely
disproportionate load on the server.

This is a significant issue for Usenet servers.  It's considerably less of
an issue for mail servers because the reading patterns, quantity of
messages one is generally dealing with, and abuse patterns are different.

INN does implement HDR the way that you describe by default, but many (I
believe most) large news outsourcers don't.

-- 
Russ Allbery (rra at stanford.edu)             <http://www.eyrie.org/~eagle/>