ietf-nntp draft-ietf-nntpext-base-17

Tue Mar 25 06:45:46 PST 2003

In <20030324124116.GX45272 at finch-staff-1.thus.net> "Clive D.W. Feather" <clive at demon.net> writes:

>>> Are you saying that, if I receive 0xC1 0x96, I MUST NOT convert it to 0x56?
>> From RFC 2279bis:
>>    Implementations of the decoding algorithm above MUST protect against
>>    decoding invalid sequences.  For instance, a naive implementation may
>>    decode the overlong UTF-8 sequence C0 80 into the character U+0000,
>>    or the surrogate pair ED A1 8C ED BE B4 into U+233B4. Decoding
>>    invalid sequences may have security consequences or cause other
>>    problems.  See Security Considerations (Section 10) below.

>It says "protect against". I see nothing in 2279bis that forbids such
>decoding, but section 10 requires you to be careful.

In my view, "protect against" means "make sure it never happens".
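
To illustrate the difference between a naive decoder and one that makes sure it never happens, here is a minimal sketch. The function name decode_utf8 and its strict flag are hypothetical, chosen only for this example; nothing here is taken from 2279bis or the draft. It returns code points as integers so that the naive results can be inspected directly.

```python
def decode_utf8(data: bytes, strict: bool = True) -> list:
    """Decode UTF-8 octets into a list of code points (hypothetical helper).

    With strict=True, overlong forms, surrogates and values above
    U+10FFFF raise ValueError. With strict=False the decoder is "naive"
    and returns whatever the bit pattern yields.
    """
    out = []
    i, n = 0, len(data)
    while i < n:
        b0 = data[i]
        # length of the sequence, payload bits of the lead octet, and
        # the smallest code point that legitimately needs this length
        if b0 < 0x80:
            length, cp, minimum = 1, b0, 0x0
        elif 0xC0 <= b0 <= 0xDF:
            length, cp, minimum = 2, b0 & 0x1F, 0x80
        elif 0xE0 <= b0 <= 0xEF:
            length, cp, minimum = 3, b0 & 0x0F, 0x800
        elif 0xF0 <= b0 <= 0xF7:
            length, cp, minimum = 4, b0 & 0x07, 0x10000
        else:
            raise ValueError(f"invalid lead octet 0x{b0:02X} at offset {i}")
        if i + length > n:
            raise ValueError(f"truncated sequence at offset {i}")
        for b in data[i + 1:i + length]:
            if b & 0xC0 != 0x80:
                raise ValueError(f"invalid continuation octet at offset {i}")
            cp = (cp << 6) | (b & 0x3F)
        if strict:
            if cp < minimum:
                raise ValueError(f"overlong sequence at offset {i}")
            if 0xD800 <= cp <= 0xDFFF:
                raise ValueError(f"encoded surrogate at offset {i}")
            if cp > 0x10FFFF:
                raise ValueError(f"code point out of range at offset {i}")
        out.append(cp)
        i += length
    return out


# Naive decoding quietly turns C1 96 into 0x56 and C0 80 into U+0000.
print([hex(c) for c in decode_utf8(b"\xC1\x96", strict=False)])
print([hex(c) for c in decode_utf8(b"\xC0\x80", strict=False)])

# Strict decoding refuses both, and the encoded surrogates as well.
for bad in (b"\xC1\x96", b"\xC0\x80", b"\xED\xA1\x8C\xED\xBE\xB4"):
    try:
        decode_utf8(bad)
    except ValueError as err:
        print(bad.hex(), "rejected:", err)
```

In practice one would probably lean on an existing strict decoder rather than hand-roll one; the point of the sketch is only that the checks sit inside the decoder, so an invalid sequence can never come out as an ordinary character.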

>I've added:

>  In the first case, the implementation MUST ensure that any replacement
>  cannot be used to bypass validity or security checks.  For example,
>  the illegal sequence %xC0.A0 is an over-long encoding for US-ASCII
>  space (%x20).  If it is replaced by the latter in a command line,
>  this needs to happen before the command line is parsed into individual
>  arguments.  If the replacement came after parsing, it would be
>  possible to generate an argument with an embedded space, which is
>  forbidden.  Use of the "replacement character" does not have this
>  problem, since it is permitted wherever non-US-ASCII characters are.
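
The ordering issue in the quoted text can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, GROUP is used merely as an example of a command taking a single argument, and arguments are assumed to be separated by ASCII whitespace.

```python
OVERLONG_SPACE = b"\xC0\xA0"    # illegal over-long encoding of %x20
REPLACEMENT = b"\xEF\xBF\xBD"   # UTF-8 encoding of U+FFFD


def replace_then_parse(line: bytes) -> list:
    """Substitute %x20 first, then split the line into arguments."""
    return line.replace(OVERLONG_SPACE, b" ").split()


def parse_then_replace(line: bytes) -> list:
    """Split first, then substitute %x20 inside each argument (unsafe)."""
    return [arg.replace(OVERLONG_SPACE, b" ") for arg in line.split()]


def replace_with_fffd_then_parse(line: bytes) -> list:
    """Substitute the replacement character instead of a space."""
    return line.replace(OVERLONG_SPACE, REPLACEMENT).split()


line = b"GROUP foo\xC0\xA0bar"

# Replacement before parsing: the parser sees three tokens, so a check
# on the number of arguments still has a chance to reject the command.
print(replace_then_parse(line))

# Replacement after parsing: one argument with an embedded space.
print(parse_then_replace(line))

# Replacement character: no new separator appears, so order is harmless.
print(replace_with_fffd_then_parse(line))
```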

I think it would be better to stay silent on the matter. Just remove the
existing sentence.

Or you could ask Yergeau what he intended it to mean.

>> RFC 2279 said the same, and I wrote similar wording into Usefor. There
>> are also serious suggestions that if you see something that is not valid
>> UTF-8 you MAY try to treat it as whatever Chinese charset you think it
>> might be.

>Not in *my* specification.

It has been extensively discussed on the Usefor list, mainly, I think, as
an interim measure that might allow Chinese users to migrate to UTF-8 in
an orderly manner (otherwise they would need a flag day). That is always
supposing that the move to UTF-8 happens at all, of course.

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl at clw.cs.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5