Dealing with Unicode and ASCII using Python

Posted by Brad Montgomery 2 years, 1 month ago (3 comments)

Dealing with character encodings is (sometimes) hard. It's especially confusing for those who've never done it before. Converting text from Unicode to ASCII can be tricky.

A lot of times, I'll import some data from a text file, and I just want to convert everything to ASCII and ignore anything that's not ASCII (like MS Word's smart quotes). Luckily, this is fairly easy when the data is a byte string (a Python 2 str):

mystring = mystring.decode('ascii', 'ignore')
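In Python 3, str has no decode method, so the one-liner above only applies to bytes. Here's a rough sketch of the same "drop everything that isn't ASCII" idea that accepts either type. The function name to_ascii and the UTF-8 default for the source encoding are my own assumptions for illustration, not something from a library:

```python
def to_ascii(data, source_encoding="utf-8"):
    """Return an ASCII-only str, silently dropping anything that is not ASCII.

    `to_ascii` is a hypothetical helper; `source_encoding` is an assumed
    default and should match however the file was actually written.
    """
    if isinstance(data, bytes):
        # Raw bytes: decode first, dropping any bytes that can't be decoded.
        data = data.decode(source_encoding, "ignore")
    # Encode to ASCII, dropping non-ASCII characters, then go back to str.
    return data.encode("ascii", "ignore").decode("ascii")


# Smart quotes (U+201C / U+201D) are simply removed.
print(to_ascii("\u201cHello\u201d, world"))
```

Keep in mind that 'ignore' throws characters away without telling you. If you'd rather see where something was dropped, the 'replace' error handler substitutes a placeholder character instead.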

There are tons of great Python resources (and code!) for all your character encoding needs. In no particular order, here are a few I've found:

There are probably more, but most of these have helped me get the job done.
