Unicode to String Python 2

Question

I am trying to convert a plain string into one containing a special character so that I can use it in my logic in Python 2.

word = 'Tb\u03b1'
word = unicode('Tb\u03b1')

if word.encode('utf-8') == u'Tb\u03b1'.encode('utf-8'):
    print 'They are equals'

print word.encode('utf-8')
print type(word.encode('utf-8'))
print u'Tb\u03b1'.encode('utf-8')
print type(u'Tb\u03b1'.encode('utf-8'))

I get this output:

Tb\u03b1
<type 'str'>
Tbα
<type 'str'>

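My question is: when I apply the unicode function to word, shouldn't lines 1 and 3 of the output be the same? I want the result from line 3, because my logic depends on that special character.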

Answer 1

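The problem is that \u has no special meaning in a non-unicode literal, so it stays a literal backslash followed by u in your string. To interpret the \u escape and produce the corresponding Unicode character, use the "unicode_escape" encoding: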

>>> as_str = "\u03b1"
>>> as_unicode = as_str.decode(encoding="unicode_escape")
>>> print as_unicode
α
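
Applied to the word from the question, that might look like the sketch below. The names raw, matches and has_alpha are only illustrative, and the comparison is done on unicode objects rather than on encoded bytes. Note that "unicode_escape" treats any non-ASCII bytes in the input as Latin-1, so this assumes the byte string is plain ASCII apart from the escapes.

```python
# -*- coding: utf-8 -*-
# In Python 2 the str literal 'Tb\u03b1' holds a real backslash, i.e. it is
# the same as b'Tb\\u03b1'. The doubled backslash is written out here so the
# snippet behaves the same on Python 2.7 and Python 3.
raw = b'Tb\\u03b1'

# Interpret the \uXXXX escape and get a unicode object back.
word = raw.decode('unicode_escape')

# Compare unicode with unicode; no need to encode both sides first.
matches = (word == u'Tb\u03b1')

# Logic that depends on the special character can test for it directly.
has_alpha = u'\u03b1' in word
```

Call word.encode('utf-8') only at the point where bytes are actually needed, for example when writing to a file or a terminal.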

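But if you can find a way to avoid ending up in this situation at all, you will be better off. Better yet, switch to Python 3, where this sort of thing makes more sense.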
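
One possible way to avoid it, assuming you control where the string is written in the source, is to use a unicode literal from the start so there is nothing to un-escape later. This is only a sketch; the variable names are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Byte string with a literal backslash: what 'Tb\u03b1' is in Python 2.
escaped = b'Tb\\u03b1'

# Unicode literal: the \u03b1 escape is interpreted by the parser itself.
literal = u'Tb\u03b1'

# The escaped form is 8 bytes (T, b, \, u, 0, 3, b, 1);
# the unicode literal is 3 characters (T, b, alpha).
escaped_length = len(escaped)
literal_length = len(literal)

# Encoding the unicode literal gives the UTF-8 bytes for "Tbα".
utf8_bytes = literal.encode('utf-8')
```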

python

unicode

utf-8

python-2.x