Running transform seems to translate HTML entities in the source into unicode literals. For example:
<p>&copy;    2014</p>
becomes
<p>©    2014</p>
This is causing issues for me, and I'm guessing it's just a side effect of the lxml settings rather than intentional. My understanding is that "©" has better email client compatibility when written as "&copy;". If anything, I'd prefer an option to go the other way: escape any Unicode literals in the source.
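To illustrate what "going the other way" could look like, here is a minimal sketch that post-processes the output string and replaces every non-ASCII character with an HTML entity. It assumes the result of transform is an ordinary Python str; the helper name escape_non_ascii is hypothetical and not part of any library, and it uses only the standard library's html.entities table.

```python
from html.entities import codepoint2name


def escape_non_ascii(html: str) -> str:
    """Replace each non-ASCII character with an HTML entity.

    Hypothetical helper for illustration: uses a named entity when the
    standard table has one (e.g. &copy;), otherwise a numeric reference.
    ASCII characters, including existing markup, are left untouched.
    """
    parts = []
    for ch in html:
        codepoint = ord(ch)
        if codepoint < 128:
            parts.append(ch)
        elif codepoint in codepoint2name:
            parts.append(f"&{codepoint2name[codepoint]};")
        else:
            parts.append(f"&#{codepoint};")
    return "".join(parts)


# Stand-in for the string that transform returned.
transformed = "<p>©    2014</p>"
print(escape_non_ascii(transformed))  # <p>&copy;    2014</p>
```

If numeric references are acceptable, html.encode("ascii", "xmlcharrefreplace").decode("ascii") does the same job in one line and yields &#169; instead of &copy;.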