Conversion Of Unicode
Solution 1:
When I use the sys.getdefaultencoding() I get the output as "Cp1252"
Two comments on that: (1) it's "cp1252", not "Cp1252". Don't type from memory. (2) Whoever caused sys.getdefaultencoding() to produce "cp1252" should be told politely that that's not a very good idea.
As for the rest, let me guess. You have a unicode object that contains some text in the Tamil language. You try, erroneously, to decode it. Decode means to convert from a str object to a unicode object. Unfortunately you don't have a str object, and even more unfortunately you get bounced by one of the very few awkish/perlish warts in Python 2: it tries to make a str object by encoding your unicode string using the system default encoding. If that's 'ascii' or 'cp1252', encoding will fail. That's why you get a Unicode*En*codeError instead of a Unicode*De*codeError.
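To see the failing step in isolation, here is a minimal sketch. The sample string is my own stand-in (the word "Tamil" in Tamil script), not the asker's testString, and try_encode is a hypothetical helper. It explicitly performs the encode that Python 2 does behind your back when you call decode on a unicode object.

```python
# Hypothetical sample text: "Tamil" written in Tamil script, using escapes
# so the encoding of the source file does not matter.
test_string = u"\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"


def try_encode(text, encoding):
    """Return the name of the exception raised by encoding, or None on success."""
    try:
        text.encode(encoding)
    except UnicodeError as exc:
        return type(exc).__name__
    return None


# 'ascii' and 'cp1252' have no Tamil characters, so both encodes fail;
# 'utf-8' can represent every Unicode code point, so it succeeds.
results = {enc: try_encode(test_string, enc) for enc in ("ascii", "cp1252", "utf-8")}
print(results)
```

The two failures are encode errors, not decode errors, which matches the traceback described above.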
Short answer: do text = testString.encode("utf-8"), if that's what you really want to do. Otherwise please explain what you want to do, and show us the result of print repr(testString).
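A rough sketch of the short answer, again with a stand-in string rather than the asker's real data. It encodes the text to UTF-8 bytes, shows the repr of both objects so the types are unambiguous, and decodes back to confirm nothing was lost.

```python
# Hypothetical sample text standing in for testString.
test_string = u"\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

# Encode: text (unicode) -> bytes (str in Python 2).
encoded = test_string.encode("utf-8")

# repr shows exactly what each object is, which is why it is worth posting.
print(repr(test_string))
print(repr(encoded))

# Decode: bytes -> text. This is the direction decode is meant for.
decoded = encoded.decode("utf-8")
print(decoded == test_string)
```

Each of the five Tamil code points takes three bytes in UTF-8, so encoded is 15 bytes long.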
Solution 2:
Add this as the first line of your code:
# -*- coding: utf-8 -*-
Later in your code:
text = unicode(testString, "UTF-8")
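One caveat worth a sketch: unicode(testString, "UTF-8") only works when testString is a byte string. If it is already a unicode object, Python 2 raises "TypeError: decoding Unicode is not supported". The coding line also only tells the interpreter how to read literals in the source file; it does not change strings that arrive at run time. A defensive helper (to_text is my own hypothetical name) might look like this:

```python
def to_text(value, encoding="utf-8"):
    """Decode byte strings to text; pass text through unchanged.

    `bytes` is an alias for `str` in Python 2, so the same check
    covers both major versions.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical inputs: the same Tamil word as text and as UTF-8 bytes.
as_text = u"\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"
as_bytes = as_text.encode("utf-8")

print(to_text(as_bytes) == as_text)
print(to_text(as_text) == as_text)
```

Both calls return the same text, regardless of which form came in.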
Solution 3:
You need to know which character encoding testString uses. If it is not UTF-8, decode('utf8') will most likely raise an error.
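A small illustration, assuming for the sake of example that the bytes were actually produced with cp1252 (the sample word is made up): decoding them as UTF-8 fails, while decoding with the encoding that was really used works.

```python
# Hypothetical bytes: "café" encoded with cp1252, not UTF-8.
raw = u"caf\u00e9".encode("cp1252")

try:
    raw.decode("utf-8")
    outcome = "decoded"
except UnicodeDecodeError:
    # 0xE9 starts a multi-byte UTF-8 sequence that never completes.
    outcome = "UnicodeDecodeError"

# Decoding with the encoding that was actually used succeeds.
text = raw.decode("cp1252")

print(outcome)
print(text == u"caf\u00e9")
```

Note that a wrong guess does not always fail loudly. Pure ASCII bytes decode cleanly under many codecs, so the error only shows up once non-ASCII characters are involved.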