I recently rediscovered this strange behaviour in Python’s Unicode handling.—Evan

Shouldn’t the decoder be able to do a partial match and quit early? After all, "ab" is encoded in UTF-8 as <61> <62>, but the BOM is <ef> <bb> <bf>, so the very first byte already rules out a BOM. With this kind of partial matching, the issue would be avoided except in the rare case where the input so far is a proper prefix of the BOM.—Evan
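
To make the suggestion concrete, here is a minimal sketch of the partial match I have in mind. The helper name `bom_status` and its three return values are made up for illustration and are not part of Python's codecs API; only `codecs.BOM_UTF8` is a real constant. It is not meant to describe how any particular Python version implements its decoder.

```python
import codecs


def bom_status(prefix: bytes) -> str:
    """Classify the bytes seen so far against the UTF-8 BOM.

    Hypothetical helper, for illustration only.
    Returns "bom", "no-bom", or "undecided".
    """
    bom = codecs.BOM_UTF8  # the three bytes ef bb bf

    # Enough bytes have arrived and they are exactly the BOM.
    if prefix[:len(bom)] == bom:
        return "bom"

    # Fewer than three bytes so far, and all of them match the start
    # of the BOM: more input is needed before deciding.
    if len(prefix) < len(bom) and bom.startswith(prefix):
        return "undecided"

    # Some byte already differs from the BOM, so decoding can proceed
    # immediately without waiting for more data.
    return "no-bom"
```

With this check, `bom_status(b"ab")` can answer "no-bom" right away, and the decoder would only have to wait when the input so far is empty, <ef>, or <ef> <bb>.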