I recently rediscovered this strange behaviour in Python’s Unicode handling.—Evan

I would personally like to see a "utf-8-bom" codec (perhaps better named "utf-8-sig"), which strips the BOM on reading (if present) and generates it on writing.—Martin

+1. —M.-A.
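
For illustration, here is a minimal sketch of the behaviour proposed above. It assumes the codec is available under the name "utf-8-sig", as it is in Python's standard library today; the quote itself only suggests the name. The variable names are made up for the example.

```python
import codecs

text = "hello"

# Writing: the "utf-8-sig" codec prepends the UTF-8 BOM (bytes EF BB BF).
encoded = text.encode("utf-8-sig")
starts_with_bom = encoded.startswith(codecs.BOM_UTF8)

# Reading: a leading BOM is stripped if present...
with_bom = encoded.decode("utf-8-sig")
# ...and input that has no BOM decodes unchanged.
without_bom = b"hello".decode("utf-8-sig")

# For contrast, the plain "utf-8" codec keeps the BOM as a U+FEFF character.
plain = encoded.decode("utf-8")

print(starts_with_bom, repr(with_bom), repr(without_bom), repr(plain))
```

Decoding with plain "utf-8" leaves a stray U+FEFF at the start of the string, which is presumably the kind of surprise the first quote refers to.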