8bit & i18n (was Re: ietf-nntp My notes ...)

Chris Newman Chris.Newman at INNOSOFT.COM
Wed Dec 18 16:42:38 PST 1996


On Wed, 18 Dec 1996, Brian Hernacki wrote:

> Chris Lewis wrote:
> > The de facto standards (INN + Cnews/Reference NNTP implementation) transport
> > 8-bit article bodies unmolested.  I see no reason to continue to enforce
> > this _unnecessary_ archaism which has already been purposefully abandoned.
> > 
> > We don't need to deal with charsets - NNTP is a transport mechanism, and
> > isn't involved in display issues.  The only place where it matters is in
> > the headers - where we may simply wish to take a similar bailout as,
> > say, Posix C did, and insist that the headers are, say, UTF8 (where can
> > I find a listing of this?) or Latin-1.  Indeed, we may well be able
> > to get away with insisting that the keywords are the current ASCII
> > encodings, and most of/all of the keyword values are 8-bit.
> 
> OK...my bad. I thought you were proposing adding a lot of charset and
> other i18n support stuff to NNTP. I'm all for at least clarifying the use
> of 8-bit vs 7-bit. I think this does fall under documenting current
> de facto standards and would go a long way toward i18n support.

We have a number of choices on this front:

1) Declare the protocol 7-bit and ignore i18n issues.  This doesn't
reflect reality and probably won't pass IETF scrutiny.

2) Declare the protocol 7-bit and use MIME for i18n.  This would probably
work fine.

3) Declare the protocol 8-bit and ignore i18n issues.  This would reflect
reality -- a non-interoperable one.  I'd certainly object and I suspect
many others would.

4) Declare the protocol 8-bit and use MIME for i18n, possibly allowing
8-bit MIME.  Disallow unlabelled localized charsets because they don't
interoperate. This is the best choice, IMHO.
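
To illustrate why unlabelled 8-bit text does not interoperate, here is a
minimal Python sketch.  The helper name decode_body is hypothetical and not
part of any NNTP implementation; it simply refuses 8-bit data that arrives
without a charset label, roughly in the spirit of choice 4.

```python
def decode_body(raw, charset=None):
    """Decode article bytes; reject 8-bit data that has no charset label.

    Hypothetical helper for illustration only.
    """
    if charset is None:
        # Without a label, only pure ASCII can be interpreted safely.
        try:
            return raw.decode("ascii")
        except UnicodeDecodeError:
            raise ValueError("8-bit body without a charset label") from None
    return raw.decode(charset)


raw = b"caf\xe9"  # the same four bytes on the wire

as_latin1 = decode_body(raw, "iso-8859-1")  # 'café'
as_koi8r = decode_body(raw, "koi8-r")       # a different string entirely
```

The bytes are identical in both calls; only the label decides what the
reader sees, which is why a receiver cannot guess correctly when no label
is present.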

On the issue of 8-bit headers -- this is definitely a bad idea.  The
current installed base is completely non-interoperable, as clients all
over the world use different 8-bit charsets.  In addition, it would make
gatewaying between email and news a nightmare.  We're stuck with MIME
header encodings for i18n headers thanks to the installed base.  Also,
Latin-1 is obviously a lose -- it's not i18n and it creates more problems
than it solves.
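
As a rough illustration of the MIME header encodings mentioned above
(the encoded-words of RFC 2047), the following Python sketch uses the
standard email.header module.  The function names and the sample subject
are assumptions made for the example.  The point is that the header stays
7-bit on the wire and carries its own charset label.

```python
from email.header import Header, decode_header, make_header


def encode_header_value(text, charset):
    """Return a 7-bit encoded-word form of text, labelled with charset."""
    return Header(text, charset).encode()


def decode_header_value(value):
    """Decode encoded-words back into a Unicode string."""
    return str(make_header(decode_header(value)))


# Sample subject: German "Gruesse" written with u-umlaut and sharp s.
subject = "Gr\u00fc\u00dfe"

wire_utf8 = encode_header_value(subject, "utf-8")
wire_latin1 = encode_header_value(subject, "iso-8859-1")
```

Both wire forms are pure ASCII and decode to the same subject, so a mail
gateway or a client using a different local charset can still recover the
original text.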

See RFC 2044 for a description of UTF-8.  UTF-8 would be a good choice for
new protocol elements which need to be i18n, such as pretty names.
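
A small Python sketch of what UTF-8 would mean for such an element.  The
helper encode_pretty_name and the sample names are hypothetical; the
sketch only shows that ASCII text is unchanged under UTF-8 while every
byte of a non-ASCII character has the high bit set.

```python
def encode_pretty_name(name):
    """Encode a display name as UTF-8 bytes (hypothetical helper)."""
    return name.encode("utf-8")


ascii_name = encode_pretty_name("Chris Newman")
accented = encode_pretty_name("Jos\u00e9")        # e with acute accent
kanji = encode_pretty_name("\u65e5\u672c\u8a9e")  # three CJK characters

# ASCII input is byte-for-byte identical after encoding.
ascii_unchanged = ascii_name == b"Chris Newman"

# Bytes of multi-byte sequences never collide with ASCII bytes.
high_bit_only = all(b >= 0x80 for b in kanji)
```

This is why existing ASCII keywords would keep working unchanged while the
values gain a single, unambiguous encoding.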



