[NNTP] Suggestions for NNTP extensions (CAPABILITIES)

Sun Sep 21 12:43:24 PDT 2008

Julien ÉLIE <julien at trigofacile.com> writes:

> In order to properly advertise a few (legacy) recognized commands in
> CAPABILITIES, some more extensions are needed, as you all know.  They
> should be standardized.  Here is what I suggest:
>
> 1/ Additions to LIST.
>
>    * LIST DISTRIBUTIONS (with a wildmat for the area?)
>    * LIST MODERATORS
>    * LIST MOTD
>    * LIST SUBSCRIPTIONS

I certainly agree with standardizing all of these based on the INN
implementation.  The implementations of these haven't changed in quite
some time and for whatever they're worth to the world, we may as well
write a specification for them.

>    * Define the meaning of some other status fields of LIST ACTIVE.  I
>      do not know how to advertise it.  Maybe we could have LIST
>      ACTIVE.STATUS where the return would be:
>        y Posting is permitted.
>        n Posting is not permitted.
>        =* Alias group.
>
>      because it might be different in implementations.  Or should we
>      have another capability name (like EXTENDED-STATUS) which imposes
>      the meaning of the defined status(?)
>
>      Perhaps EXTENDED-STATUS is a better idea than LIST ACTIVE.STATUS
>      because only humans will be able to understand the result of LIST
>      ACTIVE.STATUS.

y, n, and m are already standardized.  The flags that INN adds are:

    j   No posting allowed, incoming articles filed into junk
    x   No posting allowed, remote postings rejected
    =*  Group aliasing

On reflection, I think we want a combination of standardization and a
change in INN's behavior here.

To a reader, j groups really don't exist.  They will never contain any
articles, and are really just an internal hint to the serving agent to
move articles into a junk group.  nnrpd should ideally just suppress them
from LIST ACTIVE output.

To a reader, x groups are completely equivalent to n groups.  The only
difference is the behavior of incoming new messages from peers.  nnrpd
should probably change x to n when responding to LIST ACTIVE.

So I think INN should actually not return any flags other than the ones
already standardized and the alias flag.  Standardizing aliasing requires
deciding how it's supposed to work, but I think it's a useful capability.
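
As an illustration only, the reader-side filtering described above could
be sketched like this in Python (not INN's actual C code).  It assumes
the usual four-field active line (group, high, low, status); the function
name reader_view and the sample groups are hypothetical.

```python
def reader_view(active_lines):
    """Yield LIST ACTIVE lines as a reading client would see them."""
    for line in active_lines:
        group, high, low, status = line.split()
        if status == "j":
            # j groups never hold articles, so hide them from readers.
            continue
        if status == "x":
            # x differs from n only in how articles from peers are
            # handled, which a reader cannot observe.
            status = "n"
        # y, n, m, and =alias flags pass through unchanged.
        yield f"{group} {high} {low} {status}"


# Hypothetical sample data, not taken from a real active file.
sample = [
    "example.open 0000000010 0000000001 y",
    "example.junked 0000000000 0000000001 j",
    "example.closed 0000000005 0000000001 x",
    "example.old 0000000000 0000000001 =example.open",
]

for line in reader_view(sample):
    print(line)
```

With this mapping a client only ever sees y, n, m, and the alias flag,
which is the set the paragraph above proposes.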

>    * Say in the introduction that LIST EXTENSIONS is deprecated in
>    favour of CAPABILITIES.

I'm not sure it's worth bothering with this.  LIST EXTENSIONS existed only
in Internet-Drafts.

> 2/ PAT capability.
>
>    * The legacy syntax is kept:
>      PAT header range|message-ID pattern [pattern ...]

There are two main reasons why we didn't standardize PAT.  The first and
most serious is whitespace handling in patterns.  Given the way that NNTP
command parsing works, how would you search for a pattern containing two
spaces?  And what do the spaces between patterns really mean?
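
To make the whitespace problem concrete, here is a small Python sketch.
It assumes a server that tokenizes a command line on runs of whitespace;
parse_command is a hypothetical stand-in, not any server's real parser.

```python
def parse_command(line):
    """Split a command line into arguments on runs of whitespace."""
    return line.split()


one_space = parse_command("XPAT Subject 1-100 *foo bar*")
two_spaces = parse_command("XPAT Subject 1-100 *foo  bar*")

# Both commands tokenize identically, so the double space is lost.
print(one_space == two_spaces)   # True

# The server also cannot tell one pattern "*foo bar*" apart from the
# two separate patterns "*foo" and "bar*".
print(one_space[3:])             # ['*foo', 'bar*']
```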

The second, as you mention below, is the encoding problem, which is very
hairy and difficult to deal with.

IMAP has addressed the search problem at some length, and my impression
was that it wasn't at all simple to deal with.  I'm afraid that doing a
good job of it is going to require quite a lot of work.

It would be worthwhile to write up an informational (?) I-D that explains
what INN actually does with XPAT, without trying to change anything about
how it currently works, and that specifies an extension name of XPAT
rather than a standardized one.

>    * New metadata :body to specify that PAT will search in bodies.  It
>    would be advertised as PAT BODY in CAPABILITIES.  (Or by default in
>    PAT?)
>
>    * Maybe another metadata :text to search in the whole article
>    (headers+body)?  It would be advertised as PAT TEXT in CAPABILITIES.
>    (Or by default in PAT?)

Anything involving full body search should definitely be optional, since
the server may not hold the full article locally until a user asks to
read it, and it may not want to support expensive full search operations.

>    * Support for existing :bytes and :lines metadata, like what is currently
>      implemented in INN 2.5 (for XPAT):
>        XPAT :lines 112- *
>        221 Header or metadata information for :lines follows (from overview)
>        112 7
>        113 8
>        .
>        XPAT lines 112- *
>        221 Header information for lines follows (from articles)
>        112 Not a number!
>        113 1789
>        .

Seems like a reasonable thing to do.

>    * What about searching in folded headers?  May we assume they are in
>    the same format as in the overview database?  In that case, we cannot
>    search for tab characters...

This is more of the whitespace problem.

>    * Hmm...  Should PAT be more extensible?  Allowing one to find articles
>    matching several conditions?  (header Path: contains "server" AND
>    metadata :lines is LARGER than 20...)

Yeah, you can start adding almost arbitrary complexity when you get into
the more general search problem.

> 3/ Something to deal with large article numbers.  What can be done?
>   An extension?  But what kind of capability and use?
>
> Available references:
>    http://lists.eyrie.org/pipermail/ietf-nntp/2005-July/005720.html
>    http://lists.eyrie.org/pipermail/ietf-nntp/2005-July/005802.html
>    [...]

IIRC, Clive had an extension proposal for how to deal with this.

> 4/ Something to deal with compressed feeds.
>   Maybe we could define BATCH there (the legacy XBATCH command) along
>   with another command(?)

Compressed feeds in general require something that replaces the current
data conveyance method so that you can convey arbitrary binary data over
NNTP.  The problem with the existing NNTP protocol for binary data is that
the data MUST end in CRLF, which is considered part of the data.  We need
some way of avoiding that, or some set of commands that say that the first
CRLF of CRLF.CRLF isn't part of the data.
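
The framing issue can be sketched in Python as follows.  This is a
simplified model of the multi-line data block (dot-stuffing plus the
CRLF.CRLF terminator) for non-empty payloads; encode_block and
decode_block are hypothetical helper names, not part of any NNTP library.

```python
def encode_block(data: bytes) -> bytes:
    """Frame a non-empty payload as a dot-stuffed multi-line block."""
    if not data.endswith(b"\r\n"):
        # The CRLF before the terminating "." is part of the data, so a
        # payload without a trailing CRLF cannot be framed losslessly.
        raise ValueError("payload must end in CRLF")
    lines = data.split(b"\r\n")[:-1]
    # Double any leading dot so no data line looks like the terminator.
    stuffed = [b"." + ln if ln.startswith(b".") else ln for ln in lines]
    return b"\r\n".join(stuffed) + b"\r\n.\r\n"


def decode_block(wire: bytes) -> bytes:
    """Recover the payload from a framed block."""
    if not wire.endswith(b"\r\n.\r\n"):
        raise ValueError("missing terminator")
    body = wire[:-3]  # drop ".\r\n" but keep the CRLF that is data
    lines = body.split(b"\r\n")[:-1]
    # Undo the dot-stuffing applied by encode_block.
    unstuffed = [ln[1:] if ln.startswith(b".") else ln for ln in lines]
    return b"\r\n".join(unstuffed) + b"\r\n"


text = b"line one\r\n.hidden\r\n"
print(decode_block(encode_block(text)) == text)   # True

try:
    encode_block(b"\x1f\x8b\x08\x00")  # arbitrary bytes, no final CRLF
except ValueError as err:
    print("cannot frame:", err)
```

Text that ends in CRLF round-trips, while an arbitrary byte string does
not fit the framing at all, which is why a compressed feed needs either a
different conveyance method or a rule that the final CRLF is not data.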

XBATCH is worth documenting.  I don't know if it's worth standardizing as
BATCH, but I wouldn't mind at all.  Most of the work there will be
defining the batch format, and that will depend on how many different
transforms people feel like writing up (c7unbatch, cunbatch, gunbatch,
etc.).

> 5/ Something to deal with cancels.  Maybe the CANCEL capability?
>   A basic CANCEL <message-ID> command.

For those not familiar with INN's implementation, this is referring to
the existing INN MODE CANCEL hack, which is only supported on local
sockets and which lets the client just spew message IDs at the server and
have them treated as cancels.  The right way to do this is to introduce a
new CANCEL command that can be governed by normal authorization and
capability negotiation.

This is primarily used by other software on the same system, such as NoCeM
processors.

> 6/ Something to deal with headers feeds.  It would be great to have
>   a standard for that.
>   MODE HEADFEED is normally used.  I thought MODE commands were being
>   deprecated (MODE STREAM and MODE READER) but I do not see
>   how we could have a headers feed without a special mode...

You have to introduce new commands or at least options to existing
commands (and new commands are probably cleaner).  Does the Diablo
implementation do streaming for header feeds?  If so, we need a
header-only equivalent for TAKETHIS, maybe TAKEHEADER.

> 7/ I also see QUOTA there:
>    http://lists.eyrie.org/pipermail/ietf-nntp/2007-February/005989.html
>
>   Is it useful to go on with it?

I think we'd need someone who would actually use it to write it up.

> 8/ INN also implements a wider syntax for wildmats (with "[" and "]" for
>   instance).  Ranges like "-12" and "-" are also recognized in order to
>   be more symmetric with the already defined "12-".
>   Notwithstanding, I do not think it needs an extension for that...

The wildmat extensions would be nice to write up formally, but I'm not
sure it's worth a full extension.
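
For the range forms, a formal write-up would mostly need to pin down what
the open ends mean.  A minimal Python sketch, assuming that a missing
bound simply means "no limit on that side" (that reading is an assumption
here, and parse_range is a hypothetical name):

```python
def parse_range(text):
    """Parse "12", "12-", "-12", "5-9", or "-" into (low, high).

    None stands for an open bound on that side.
    """
    if "-" not in text:
        number = int(text)
        return (number, number)
    low, _, high = text.partition("-")
    return (int(low) if low else None, int(high) if high else None)


for spec in ("12-", "-12", "-", "5-9", "7"):
    print(spec, parse_range(spec))
```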

> Could someone *briefly* explain to me the process to follow in order to
> write a draft for these extensions?  I admit I am a bit lost when I try
> to find information on rfc-editor.org when there is an available working
> group.

Since there is no active NNTP working group (and I doubt there's enough
interest to create one), any drafts would need to be individual
submissions.  This isn't really a problem; the IETF publishes
standards-track RFCs from individual submissions.  However, there will be
more burden on IETF Last Call and on expert review to be sure that the
drafts are reasonable, in the absence of a working group process.  I think
the IETF ADs realize that the resources for NNTP are limited, though.

To write an I-D, I recommend using xml2rfc:

    http://xml.resource.org/

It makes the whole process much easier.  I thought there was a repository
somewhere of the XML versions of RFCs written with xml2rfc, but I haven't
been able to find it.

-- 
Russ Allbery (rra at stanford.edu)             <http://www.eyrie.org/~eagle/>