I recently rediscovered this strange behaviour in Python’s Unicode handling.—Evan

►

-1; there’s no standard for UTF-8 BOMs—adding it to the codecs module was probably a mistake to begin with.—M.-A.

►

There is a standard for UTF-8 signatures, however.—Stephen

►

With the UTF-8-SIG codec, it would apply to all operation modes of the codec, whether stream-based or from strings.—Martin
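For readers who have not met the codec, here is a minimal sketch of how `utf-8` and `utf-8-sig` differ in current Python 3, both when decoding a byte string and when reading from a stream. The sample text `"hello"` and the variable names are illustrative assumptions, not something taken from the thread.

```python
import codecs
import io

# A UTF-8 byte string that starts with the signature (BOM) EF BB BF.
raw = codecs.BOM_UTF8 + "hello".encode("utf-8")

# The plain utf-8 codec keeps the signature as a leading U+FEFF character.
plain = raw.decode("utf-8")

# The utf-8-sig codec drops a leading signature when decoding...
stripped = raw.decode("utf-8-sig")

# ...and writes one when encoding.
with_sig = "hello".encode("utf-8-sig")

# The same stripping happens in stream mode, via the codec's StreamReader.
reader = codecs.getreader("utf-8-sig")(io.BytesIO(raw))
streamed = reader.read()

print(repr(plain), repr(stripped), with_sig, repr(streamed))
```

Input without a signature decodes unchanged under `utf-8-sig`, so the codec can be used for reading files whose origin is unknown.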

►

I’d suggest using the same mode of operation as we have in the UTF-16 codec:—M.-A.
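As background for the comparison, this sketch shows how the UTF-16 codecs treat the byte order mark in current Python 3. It illustrates existing behaviour only and is not a reconstruction of what was proposed in the thread; the sample text `"hi"` is an arbitrary assumption.

```python
import codecs

# The generic utf-16 codec writes a BOM in the platform's byte order...
encoded = "hi".encode("utf-16")
has_bom = encoded[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# ...and consumes it again on decoding.
decoded = encoded.decode("utf-16")

# The endian-specific codecs neither write a BOM...
le_bytes = "hi".encode("utf-16-le")

# ...nor strip one: a leading BOM comes back as the character U+FEFF.
kept = (codecs.BOM_UTF16_LE + le_bytes).decode("utf-16-le")

print(has_bom, repr(decoded), le_bytes, repr(kept))
```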

►

I’ve actually been confused about this point for quite some time now, but never had a chance to bring it up.—Nicholas

►

See above.

Thanks,—M.-A.