[NNTP] NNTP URI draft

Russ Allbery rra at stanford.edu
Tue Mar 8 14:24:53 PST 2005


Charles Lindsey <chl at clerew.man.ac.uk> writes:
> Russ Allbery <rra at stanford.edu> writes:

>> So are you taking over as editor of this draft, or is Pete planning on
>> publishing a new version if you get consensus on other mailing lists?
>> (I have no idea why usefor would be involved; this has nothing to do
>> with usefor.)

> Pete has agreed to incorporate my texts if I can persuade him that the
> Netnews community have been consulted and are happy. I put it before
> Usefor because they are a likely bunch of people who might want to
> comment. Officially, it should be discussed on the uri at w3.org list, but
> I am happy for it to be tossed around here for a bit if you are willing.

Officially, we were asked by the IESG to review the document here, and I
was just way behind on my e-mail.  If something has changed since then, I
haven't heard about it (although I might not have).

Okay, that sounds good.

>> It's not at all clear to me that we want to support commas and "!",
>> though.  Maybe the news URL should just use wildmat-item, appropriately
>> escaped?

> I don't see what harm the full wildmat would do.

It just seems weird to me, but in the absence of anything firmer than "it
seems weird," I guess that's okay.  I find the <news:comp.lang.*> form
very useful and I've seen it used in practice on real web pages (although
it's been a few years), so it would be nice to support it.

> I would imagine that all any system implementing this form of the URI
> would do would be to issue a LIST ACTIVE command with that wildmat, and
> then say to the user "here are all the groups you asked about - which
> one would you like to read/subscribe/whatever?".

Maybe.  It could just check a local newsgroup list, or do any number of
other things.  But yes, that's what lynx does.
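
As a rough illustration of the "check a local newsgroup list" option, here
is a minimal Python sketch.  It assumes wildmat semantics along the lines of
the NNTP draft: comma-separated patterns, "*" and "?" as the only wildcards,
"!" to negate a pattern, and the rightmost matching pattern deciding the
result.  The function names and the sample group list are hypothetical and
do not come from any real client.

```python
import re


def _pattern_to_regex(pattern):
    """Translate one wildmat pattern ("*" and "?" only) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any run of characters, possibly empty
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildmat_match(wildmat, name):
    """Return True if name matches the wildmat.

    Assumption: the rightmost pattern that matches decides the outcome,
    and a leading "!" marks that pattern as an exclusion.
    """
    matched = False
    for item in wildmat.split(","):
        negated = item.startswith("!")
        pattern = item[1:] if negated else item
        if _pattern_to_regex(pattern).fullmatch(name):
            matched = not negated
    return matched


def select_groups(wildmat, groups):
    """Filter a locally known list of newsgroup names by a wildmat."""
    return [g for g in groups if wildmat_match(wildmat, g)]


# Hypothetical local newsgroup list.
local_groups = ["comp.lang.perl.misc", "comp.lang.c", "comp.os.linux.misc"]
print(select_groups("comp.lang.*", local_groups))
print(select_groups("comp.*,!comp.lang.*", local_groups))
```

A client that asks the server instead would pass the same wildmat to LIST
ACTIVE and skip the local filtering entirely.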

> Hmmmm! I think a low-profile on UTF-8 would be advisable given previous
> furores on that issue. How about:

>    .... It is not
>    precluded that future extensions for internationalized <newsgroup-name>s
>    may permit octets outside of the given ranges, in which case they too MUST
>    be %-encoded (except perhaps when used in an IRI [RFC 3987]).

Works for me.
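
For readers less familiar with %-encoding, a small Python sketch of what
that text implies for a URI producer.  The set of characters left unencoded
here (letters, digits, ".", "+", "-", "_") is an assumption made for the
example; the ranges the draft actually permits are not quoted in this
message.  The helper names are hypothetical.

```python
from urllib.parse import quote

# Assumption for this sketch: besides letters and digits, only these
# punctuation characters may appear unencoded in a <newsgroup-name>.
UNENCODED_PUNCTUATION = ".+-_"


def encode_newsgroup_name(name):
    """%-encode every octet of the UTF-8 form that is outside the safe set."""
    return quote(name, safe=UNENCODED_PUNCTUATION)


def news_uri_for_group(name):
    """Build a news URI of the simple news:<newsgroup-name> form."""
    return "news:" + encode_newsgroup_name(name)


print(news_uri_for_group("comp.lang.c++"))
print(news_uri_for_group("fr.test.caf\u00e9"))
```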

>>> 2.1  The newsURI contains an <article>

>>>    A <message-id> corresponds to the <msg-id> of [RFC 2822] and to the
>>>    Message-ID of section 2.1.5 of [RFC 1036], but without the enclosing
>>>    "<" and ">". It MUST be the message identifier of an actual Netnews
>>>    article

>> Bad use of MUST.  Referring to a non-existent article is not a
>> violation of the standard.  As you've worded it right now, a client
>> would need to ensure through some other means that the article it's
>> asking for actually exists before using a news URI referencing it.

> Hmmmm! It is really a requirement to be observed by users

Then it's even more absurd.  :)

> I was trying to avoid arguments about the syntactic form of a message
> identifier, and this seemed a simple way of saying "it MUST be a valid
> message-id" without needing to say what "valid" actually meant. Would
> you settle for "SHOULD"?

No, because again, using a nonexistent message ID is not a protocol
violation any more than sending an NNTP command "ARTICLE <id>" where <id>
doesn't correspond to an existing article is a protocol violation.  I
think you're confusing protocol violations with normal errors.

Think of it this way: it's the difference, in C, between calling open() on
a file that you don't have access to (error) and calling open() without a
file argument (protocol violation).
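
The same distinction can be sketched in Python as an analogue of the C
example (this is not the C API).  A missing file stands in for the "normal
error" case, since a permission failure is harder to reproduce portably.
The function name classify_open is hypothetical.

```python
def classify_open(*args):
    """Call open() and report which kind of failure, if any, occurred."""
    try:
        handle = open(*args)
    except OSError:
        # The call was well-formed; the request just could not be satisfied
        # (no such file, no permission, ...).  This is a normal error.
        return "normal error"
    except TypeError:
        # The call itself was malformed, e.g. no file argument at all.
        # This is the analogue of a protocol violation.
        return "misuse of the interface"
    handle.close()
    return "ok"


print(classify_open("/nonexistent-directory/no-such-file"))
print(classify_open())
```

Asking for a nonexistent article belongs in the first category, not the
second.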

I think it's reasonable to just drop that sentence.  If we want to include
the information there, something like:

    It is used to retrieve the article with that message identifier.

that doesn't use protocol language for something that isn't a protocol
constraint.  Although really this is already covered in the paragraph
discussed below, so I think you can just drop that bit.

>>>    The resource retrieved by this URI is the Netnews article with the
>>>    given <message-id>.  In a properly working Netnews system, the same
>>>    article will be obtained whatever server is accessed for the
>>>    purpose (assuming the server in question carried that article in
>>>    the first place and that it has not expired).

>> There's got to be a better way of phrasing this.

> It was really aimed at URI-savvy people who might not understand fully
> the nature of Usenet. I am open to suggestions for rewording.

How about:

    The resource retrieved by this URI is the Netnews article with the
    given <message-id>.  Message identifiers are required to be globally
    unique, so the same article will be obtained whatever server is
    accessed for that purpose (provided the server in question has that
    article available).

I think explicitly stating the globally unique part makes it a bit clearer
what's going on.
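
To make the server independence concrete, a minimal Python sketch that turns
the article form of a news URI into the NNTP command a client would send.
It assumes the simple news:<message-id> form with no server part, and it
tells articles from groups by the presence of "@", as RFC 1738 does.  The
function name and the example message identifier are hypothetical.

```python
from urllib.parse import unquote


def news_uri_to_article_command(uri):
    """Map news:<message-id> to the corresponding ARTICLE command."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "news":
        raise ValueError("not a news URI: %r" % uri)
    if rest.startswith("//"):
        raise ValueError("server form is not handled in this sketch")
    if "@" not in rest:
        raise ValueError("no message identifier in %r" % uri)
    # The URI omits the enclosing "<" and ">"; NNTP expects them.
    return "ARTICLE <" + unquote(rest) + ">"


# The command does not depend on which server it is sent to.
print(news_uri_to_article_command("news:abc123@example.com"))
```

Whether a given server can satisfy the command is a separate matter: it may
simply not have the article.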

>>> 2.2  The newsURI contains a <group>

>>>    According to [RFC 1036], the <newsgroup-name> will in practice be a
>>>    period-delimited hierarchical name, such as "comp.lang.perl.modules".

>> I don't see any need to refer to 1036 here or, really, anywhere else in
>> this document.

> I think mention of RFC 1036 (or, one day, of USEFOR) is essential
> somewhere (after all, the nntpext draft references 1036).

Yeah, I thought about this some more and changed my mind, since after all
1036 does define the format of the resource that one gets back.  Although
it's not at all clear to me that this is the right reference for newsgroup
names in particular, since in practice the news URL can be used with any
NNTP-supported newsgroup name (which is a richer set than RFC 1036).

This one isn't as much of an issue as the later NNTP URI reference,
though.

> What is said here is similar to what was said above regarding
> message-ids. You can write any <newsgroup-name> you like, but it ain't
> going to work unless it is a real one. The wording there actually comes
> from RFC 1738, which was exceedingly vague on the whole issue. How
> about:

>    The <newsgroup-name> SHOULD be that of an existing newsgroup,

No, bad use of protocol language.  The original is much better.

>> ....  Since this is a URI scheme for NNTP, it should be sufficient to
>> just refer to the NNTP draft, which already defines such things as
>> message IDs and newsgroup names.

> Actually, it isn't just a URI scheme for NNTP. It might be used to
> access a local server directly, or it might be used to retrieve
> groups/articles from an IMAP server. Which, come to think of it, is a
> good reason NOT to use wildmats.

Yeah, although in practice the chances of someone doing that are pretty
remote.

> It was in RFC 1738, but has anybody actually encountered it in the wild?

Yes.

> It is quite unlike anything in any other URI scheme, whereas the 3rd and
> 4th forms are typical of many schemes with the meaning of "show me
> everything you have", or "show me the default".

That's great.  I still don't see any point in changing this; it's been
around for years and was in the original standard.

> I am suggesting that we either drop the "*" bit entirely (my preference)
> or turn it into something useful (like a wildmat).

It already is a wildmat, if we adopt wildmats, which is fine with me.  It
needs to stay regardless of whether we adopt wildmats or not, though; it's
simply not okay to just willy-nilly deprecate stuff that's being used when
there isn't any inherent flaw in the specification.

> [In any case, the danger with all of those forms is that they may
> institute a download of the complete active file. Try that on supernews,
> and you will sit there for 5 minutes waiting for it all to appear :-( .]

Quality of implementation issue for your client; i.e., not our problem.

> Yes, that would have to come in the rewording if we adopt that
> alternative. But, having realized the possible use of IMAP and other
> servers with this URI, I am getting rather doubtful. Is the facility
> likely to be useful enough to be worth the trouble? As I said above, did
> the "*" ever actually happen in the wild?

Well, like I said, I find it useful and have certainly seen it used (in
fact, the <news:comp.lang.perl.*> form is used a lot more than the
<news:*> form, although both are used).  I don't have a strong opinion
about adding the general wildmat form, just a feeling that it would be
nice.

>> It's not clear to me whether we should allow the trailing slash to be
>> optional.  What did HTTP do here?  I know that browsers support leaving
>> it off, but is that fixed internal to the browser, or actually allowed
>> in the protocol?

> As Clive has just shown, HTTP did all sorts of amazing things, the net
> effect of which was that you got the same effect with or without the
> "/".

Clive didn't actually answer my question, which is about what the URI
syntax says, not about what user agents do.  NNTP user agents can also
autocorrect the URI if need be; what I want to copy is what the HTTP URI
syntax specifies.  If that says the trailing slash is optional, great,
that's all I want.

>>> It would be readily implemented, but it is quite certain that nowhere
>>> is it implemented currently.

>> Whoops.  :)

> Indeed. Which is why I ask whether it is really useful enough to bring
> it in.

"Whoops" here means "whoops, you wrote an authoritative-sounding statement
about what NNTP software implements that turns out to have been wrong."  I
thought I already pointed that out.

>>> 3.  The nntp URI scheme

>>>    The nntp URI scheme is used to refer to individual Netnews articles,
>>>    as defined in [RFC 1036].

>> Again, refer to NNTP not RFC 1036.  (Even more so here, since the whole
>> concept of an article number is purely an NNTP construction.)

> Yes, but this is the introductory remarks introducing this URI, and
> getting hold of Netnews articles is its whole purpose.

No, it's not.  That's the whole point of the news URI.  The whole point of
the nntp URI is to access an NNTP server, which is not limited to serving
Netnews articles.  Please refer to the NNTP standard for the nntp URI, not
the Usenet article standard; the NNTP standard will cover this point in
some detail and there's no need to include the whole fairly complex issue
again here.
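
For reference, a minimal Python sketch of taking an nntp URI apart, assuming
the RFC 1738 form nntp://<host>:<port>/<newsgroup-name>/<article-number>
with a default port of 119.  Treating the article number as optional is an
assumption of this sketch, not something settled in this thread, and the
function name and host are hypothetical.

```python
from urllib.parse import urlsplit, unquote

DEFAULT_NNTP_PORT = 119


def parse_nntp_uri(uri):
    """Return (host, port, newsgroup, article_number or None)."""
    parts = urlsplit(uri)
    if parts.scheme != "nntp" or not parts.hostname:
        raise ValueError("not an nntp URI with a host: %r" % uri)
    segments = parts.path.strip("/").split("/")
    if not segments[0] or len(segments) > 2:
        raise ValueError("expected /<newsgroup-name>[/<article-number>]")
    group = unquote(segments[0])
    # The article number is server-local, unlike a message identifier.
    number = int(segments[1]) if len(segments) == 2 else None
    return (parts.hostname, parts.port or DEFAULT_NNTP_PORT, group, number)


print(parse_nntp_uri("nntp://news.example.com/comp.lang.perl.misc/1234"))
```

The host is part of the identifier here, which is the practical difference
from the news URI: an article number only means something on that server.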

> So I don't think mention of 1036 can be omitted.

I do, here.  I think you're right about the news URI.

-- 
Russ Allbery (rra at stanford.edu)             <http://www.eyrie.org/~eagle/>


