I recently rediscovered this strange behaviour in Python’s Unicode handling.—Evan

It’s not a default for the UTF-16 encoding byte order. It’s a default for the UTF-16 encoding byte order _when UTF-16 is a communications medium_. Given that the generic network byte order is bigendian, I think it would be insane to specify littleendian as Unicode’s default.

With Unicode same as network, you specify UTF-16 strings internally as an array of uint16_t, and when you put them on the wire (including saving them to a file that might be put on the wire as octet-stream) you apply htons(3) to it. On reading, you apply ntohs(3) to it. The source code is portable, the file is portable. How can you beat that?—Stephen