8bit & i18n (was Re: ietf-nntp My notes ...)
Chris Newman
Chris.Newman at INNOSOFT.COM
Wed Dec 18 16:42:38 PST 1996
On Wed, 18 Dec 1996, Brian Hernacki wrote:
> Chris Lewis wrote:
> > The de facto standards (INN + Cnews/Reference NNTP implementation) transport
> > 8-bit article bodies unmolested. I see no reason to continue to enforce
> > this _unnecessary_ archaism which has already been purposefully abandoned.
> >
> > We don't need to deal with charsets - NNTP is a transport mechanism, and
> > isn't involved in display issues. The only place where it matters is in
> > the headers - where we may simply wish to take a similar bailout as,
> > say, Posix C did, and insist that the headers are, say, UTF8 (where can
> > I find a listing of this?) or Latin-1. Indeed, we may well be able
> > to get away with insisting that the keywords are the current ASCII
> > encodings, and most of/all of the keyword values are 8-bit.
>
> OK...my bad. I thought you were proposing adding a lot of charset and
> other i18n support stuff to NNTP. I'm all for at least clarifying the use
> of 8-bit vs 7-bit. I think this does fall under documenting current
> de facto standards and would go a long way to i18n support.
We have a number of choices on this front:
1) Declare the protocol 7-bit and ignore i18n issues. This doesn't
reflect reality and probably won't pass IETF scrutiny.
2) Declare the protocol 7-bit and use MIME for i18n. This would probably
work fine.
3) Declare the protocol 8-bit and ignore i18n issues. This would reflect
reality -- a non-interoperable one. I'd certainly object and I suspect
many others would.
4) Declare the protocol 8-bit and use MIME for i18n, possibly allowing
8-bit MIME. Disallow unlabelled localized charsets because they don't
interoperate. This is the best choice, IMHO.
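To make the difference between choices 3 and 4 concrete, here is a minimal Python sketch. It assumes a body sent as text/plain with a charset label and an 8bit transfer encoding. The function names are hypothetical and the header handling is deliberately simplistic (it is not a real MIME parser); the point is only that the same 8-bit bytes are readable with a label and ambiguous without one.

```python
def build_labelled_body(text: str, charset: str) -> bytes:
    """Return MIME headers plus body, with the body sent as raw 8-bit bytes."""
    body = text.encode(charset)
    # Only claim 8bit when some byte actually has the high bit set.
    cte = "7bit" if all(b < 0x80 for b in body) else "8bit"
    headers = (
        "MIME-Version: 1.0\r\n"
        f"Content-Type: text/plain; charset={charset}\r\n"
        f"Content-Transfer-Encoding: {cte}\r\n"
        "\r\n"
    ).encode("ascii")
    return headers + body


def read_labelled_body(article: bytes) -> str:
    """Decode the body using the charset named in Content-Type."""
    head, _, body = article.partition(b"\r\n\r\n")
    charset = "us-ascii"  # default when no label is present
    for line in head.decode("ascii").split("\r\n"):
        name, _, value = line.partition(":")
        if name.lower() == "content-type" and "charset=" in value:
            charset = value.split("charset=", 1)[1].strip()
    return body.decode(charset)


text = "привет"
article = build_labelled_body(text, "koi8-r")

# With the label, the reader recovers the original text.
print(read_labelled_body(article))

# Without the label, a reader that guesses Latin-1 gets different text
# from exactly the same bytes.
print(text.encode("koi8-r").decode("latin-1"))
```

The second print is the unlabelled case from choice 3: nothing in the bytes themselves says which 8-bit charset the sender used.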
On the issue of 8-bit headers -- this is definitely a bad idea. The
current installed base is completely non-interoperable, as clients all
over the world use different 8-bit charsets. In addition, it would make
gatewaying between email and news a nightmare. We're stuck with MIME
header encodings for i18n headers thanks to the installed base. Also,
Latin-1 is obviously a lose -- it's not i18n and it creates more problems
than it solves.
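For readers unfamiliar with the MIME header encodings mentioned above, a small Python sketch using the standard library's email.header module shows the idea: the non-ASCII header value travels as 7-bit ASCII "encoded words" that carry their own charset label. The helper names and the sample subject are hypothetical, chosen only for illustration.

```python
from email.header import Header, decode_header


def encode_header_value(text: str, charset: str = "iso-8859-1") -> str:
    """Encode a header value as MIME encoded-words (7-bit ASCII on the wire)."""
    return Header(text, charset=charset).encode()


def decode_header_value(value: str) -> str:
    """Decode encoded-words back to text using the charset each one names."""
    parts = []
    for raw, charset in decode_header(value):
        if isinstance(raw, bytes):
            parts.append(raw.decode(charset or "ascii"))
        else:
            parts.append(raw)
    return "".join(parts)


subject = "Grüße aus Köln"
wire = encode_header_value(subject)
print(wire)                       # ASCII-only, with the charset named inline
print(decode_header_value(wire))  # original text recovered
```

Because the charset travels with the value, a mail or news gateway can pass the header through untouched and any client can still decode it.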
See RFC 2044 for a description of UTF-8. UTF-8 would be a good
choice for new protocol elements which need to be i18n, such as pretty
names.
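As a quick illustration of how UTF-8 behaves, the Python sketch below breaks a string into its per-character byte sequences. The sample "pretty name" and the helper name are made up for the example. ASCII characters stay as single bytes, and every byte of a non-ASCII character has the high bit set, so existing ASCII keywords are unaffected.

```python
def utf8_breakdown(text: str) -> list[tuple[str, bytes]]:
    """Pair each character with its UTF-8 byte sequence."""
    return [(ch, ch.encode("utf-8")) for ch in text]


pretty_name = "Café 日本語"  # hypothetical pretty name
for ch, encoded in utf8_breakdown(pretty_name):
    print(repr(ch), len(encoded), encoded.hex())

# Round trip: decoding the bytes gives back the original string.
assert pretty_name.encode("utf-8").decode("utf-8") == pretty_name
```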