I recently rediscovered this strange behaviour in Python’s Unicode handling.—Evan

Well, either one is possible; however, the Unicode standard suggests, but does not require, silently removing them:

  It is good practice, however, to recognize it as a noncharacter and to take appropriate action, such as removing it from the text. Note that Unicode conformance freely allows the removal of these characters.

I would prefer that str.decode() silently ignore them, since I believe in "be strict in what you emit, but liberal in what you accept." I think this applies only to str.decode(), though. Any other attempt to create a noncharacter, such as unichr(0xffff), should raise an exception, because the programmer is clearly making a mistake.—Evan
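
As a rough illustration of the "liberal in what you accept" behaviour described above, here is a minimal sketch of a decoder wrapper that drops noncharacters after decoding. It is written for Python 3, where bytes.decode() plays the role of str.decode(); the names is_noncharacter and decode_dropping_noncharacters are hypothetical helpers made up for this example, not part of Python or any library.

```python
def is_noncharacter(code_point):
    """Return True for the 66 code points Unicode reserves as noncharacters."""
    # U+FDD0..U+FDEF, plus the last two code points of every plane
    # (U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ..., U+10FFFE, U+10FFFF).
    return 0xFDD0 <= code_point <= 0xFDEF or (code_point & 0xFFFE) == 0xFFFE


def decode_dropping_noncharacters(data, encoding="utf-8"):
    """Decode bytes, then silently remove any noncharacters (hypothetical helper)."""
    text = data.decode(encoding)
    return "".join(ch for ch in text if not is_noncharacter(ord(ch)))


# Input containing U+FFFF and U+FDD0 between ordinary letters.
raw = "a\uffffb\ufdd0c".encode("utf-8")
cleaned = decode_dropping_noncharacters(raw)
print(cleaned)
```

This only covers the decoding side. Making something like unichr(0xffff) raise an exception would be a change to the interpreter itself, which a wrapper like this cannot show.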