[NNTP] Re: New NNTP drafts approaching IETF Last Call

Charles Lindsey chl at clerew.man.ac.uk
Wed Mar 16 05:18:47 PST 2005


In <Pine.LNX.4.63.0503150724250.21278 at shiva1.cac.washington.edu> Mark Crispin <mrc at CAC.Washington.EDU> writes:

>On Tue, 15 Mar 2005, Clive D.W. Feather wrote:
>>>  2) have some wording such as what section 4.3.1 in IMAP has, e.g.
>>> 	Article texts MAY contain 8-bit or multi-octet characters,
>>> 	but SHOULD do so when the [CHARSET] is identified via
>>> 	[MIME-IMB] and/or [MIME-HDRS].
>> This is a Usefor issue, not an NNTP issue.

>IMAP didn't get away with that argument, and NNTP should not be allowed to 
>get away with it either.

>I strongly suggest that you read (and understand) RFC 2130.

I fail to see the relevance of RFC 2130 here. We are talking about
multiline responses, which means that we are talking essentially about
article text, whether from client to server or from server to client.

Whereas headers are more or less structured, bodies are not. They may not
even represent data expressed in characters (and RFC 2130 applies only to
data expressed in characters).

But even with character data, RFC 2130 provides for a mapping from
characters to integers (CCS), a mapping from integers to sets of octets
(CES) and a mapping from octets to whatever the wire will accommodate
(TES).
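
As a rough illustration of the three mappings, here is a minimal Python
sketch. The choices of Unicode as the CCS, UTF-8 as the CES, and Base64 or
Quoted-Printable as the TES are assumptions made for the example; RFC 2130
does not mandate any of them, and the sample string is arbitrary.

```python
import base64
import quopri

text = "café"  # arbitrary sample string, chosen for this sketch

# CCS: each character maps to an integer (here, a Unicode code point).
code_points = [ord(ch) for ch in text]      # [99, 97, 102, 233]

# CES: the integers map to octets (here, UTF-8).
octets = text.encode("utf-8")               # b'caf\xc3\xa9'

# TES: the octets map to whatever the wire will accept.
tes_base64 = base64.b64encode(octets)       # b'Y2Fmw6k='
tes_qp = quopri.encodestring(octets)        # ASCII-only Quoted-Printable form

# Both TES outputs are pure ASCII and decode back to the same octets.
assert base64.b64decode(tes_base64) == octets
assert quopri.decodestring(tes_qp) == octets
```

Each layer is independent of the others: the same octets could be sent
under a different TES without touching the CCS or CES.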

Examples of TESs are Quoted-Printable and Base64 (which have the
interesting property that they map the octets back into ASCII, which is
then sent over the wire as-is). But that is not a necessary property of
TESs - they may, for example, map into a stream of octets from which NUL
and naked CR and LF are carefully excluded. yEnc is an example of such a
TES and is fairly widely used on Usenet (though we could all expound at
great length on what is wrong with it). People who "just send 8 bits" are
in fact using some CCS and CES of their choice, where the CES just happens
to avoid NUL and naked CR and LF, and so the TES is just "straight
through". Those people are not aware that this is what they are doing, and life
would be much simpler if they used some notation to indicate what CCS, CES
and TES they were in fact using (by adhering to the MIME standards, for
example). But that is nothing to do with NNTP.

NNTP is a protocol designed for wires that will accommodate streams of
octets excluding NUL and naked CR and LF. What it receives and passes on
as articles (in the form of multiline responses with dot stuffing) is the
output of the TES, which has already been applied by the clients that
submit the articles.
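
Dot stuffing itself can be sketched in a few lines. This is a simplified
illustration, not a complete implementation of the draft: the function
names are hypothetical, and the input is assumed to use CRLF line endings
throughout. The point to notice is that the octets are treated as opaque;
nothing here inspects or cares about a character set.

```python
def dot_stuff(article: bytes) -> bytes:
    """Wrap CRLF-terminated lines as a dot-stuffed multiline block."""
    lines = article.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final CRLF
    out = []
    for line in lines:
        if line.startswith(b"."):
            line = b"." + line  # double a leading dot
        out.append(line + b"\r\n")
    out.append(b".\r\n")  # terminating line: a lone dot
    return b"".join(out)


def dot_unstuff(block: bytes) -> bytes:
    """Reverse dot_stuff; raise ValueError if the terminator is missing."""
    lines = block.split(b"\r\n")
    # A well-formed block ends with b".\r\n", so the last two pieces
    # after splitting are b"." and b"".
    if lines[-2:] != [b".", b""]:
        raise ValueError("missing terminating '.' line")
    out = []
    for line in lines[:-2]:
        if line.startswith(b"."):
            line = line[1:]  # remove the stuffed dot
        out.append(line + b"\r\n")
    return b"".join(out)


# The 0xE9 octet passes through untouched, whatever it was meant to encode.
article = b"Subject: test\r\n\r\ncaf\xe9\r\n.signature\r\n"
block = dot_stuff(article)
assert dot_unstuff(block) == article
```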

Thus it is a complete nonsense to speak of the "character set" within the
multiline response, because it is simply not a stream of characters at all
(well, in some special cases it might be).

Thus the question of which CCSs, CESs and TESs are proper for news articles
is a matter for the standards that define the format of such articles.
Ditto for the standards that define any other formats that one might
choose to transport over NNTP.

NNTP is of a similar status to SMTP, as defined in RFC 2821 and its
extensions. RFC 2821 makes it clear that its wire can only accommodate
US-ASCII, and so the TES used must map into US-ASCII, and it points you to
the MIME standards as a way of ensuring that (but it does not require the
use of MIME if the body is unstructured). Likewise RFC 1652 describes
8BITMIME, which permits a less restrictive TES to be used (the same as
NNTP, as it happens).
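
The difference between the two SMTP cases can be sketched as a choice of
TES made by the submitting client. The function choose_tes and its
parameter are hypothetical, the returned labels are borrowed from MIME's
Content-Transfer-Encoding values, and the sketch deliberately ignores
line-length limits and the option of Quoted-Printable.

```python
def choose_tes(octets: bytes, server_offers_8bitmime: bool) -> str:
    """Pick a TES label for a body (illustrative only)."""
    # "Line safe": no NUL, and CR/LF appear only as CRLF pairs.
    remainder = octets.replace(b"\r\n", b"")
    line_safe = not any(c in remainder for c in (b"\x00", b"\r", b"\n"))
    seven_bit = all(b < 0x80 for b in octets)

    if line_safe and seven_bit:
        return "7bit"    # fits the plain RFC 2821 wire as it stands
    if line_safe and server_offers_8bitmime:
        return "8bit"    # RFC 1652: straight through, as with NNTP
    return "base64"      # otherwise map the octets back into US-ASCII


body = "café\r\n".encode("utf-8")
print(choose_tes(body, server_offers_8bitmime=True))    # 8bit
print(choose_tes(body, server_offers_8bitmime=False))   # base64
```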

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl at clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5


