Python -unicode- translate table doesn't remove chars

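I have a list of unicode elements from which I want to remove ')' and \n as well as whitespace. Essentially, I want to create a "clean" copy of the list.

My attempt is based on the SO solution "Remove specific characters from a string in python" and the Python 2.7 docs on strings.

I create my list as follows, with the bs4 imports removed to keep the example short.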

def isNotBlank(myString):
    if myString and myString.strip():
        return True
    return False

names = soup.find_all('span', class_="TextLarge")
bucket_list = []

for name in names:
    for item in name.contents:
        for value in item.split('('):
            if isNotBlank(value):
                bucket_list.append(value)

translation_table = dict.fromkeys(map(ord, ')(@\\n#$'), None)
[x.translate(translation_table) for x in bucket_list ]

So print(names) returns

[<span class="TextLarge">Mossfun (11) (Rtg:103)</span>, <span class="TextLarge">58.0</span>, <span class="TextLarge scratched">Atmospherical (8)
      (Rtg:99)</span>, <span class="TextLarge">56.5</span>, <span class="TextLarge scratched">Chloe In Paris (7)
      (Rtg:97)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Bound For Earth (5) (Rtg:92)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Fine Bubbles (4) (Rtg:91)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Brook Road (9) (Rtg:90)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Shamalia (10) (Rtg:89)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge scratched">Tawteen (6) (Rtg:88)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Ygritte (2) (Rtg:77)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Tahni Dancer (1) (Rtg:76)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">All Salsa (3) (Rtg:72)</span>, <span class="TextLarge">55.5</span>]

and bucket_list comes back as

[u'Mossfun ', u'11) ', u'Rtg:103)', u'58.0', u'Atmospherical ', u'8) \n      ', u'Rtg:99)', u'56.5', u'Chloe In Paris ', u'7) \n      ', u'Rtg:97)', u'55.5', u'Bound For Earth ', u'5) ', u'Rtg:92)', u'55.5', u'Fine Bubbles ', u'4) ', u'Rtg:91)', u'55.5', u'Brook Road ', u'9) ', u'Rtg:90)', u'55.5', u'Shamalia ', u'10) ', u'Rtg:89)', u'55.5', u'Tawteen ', u'6) ', u'Rtg:88)', u'55.5', u'Ygritte ', u'2) ', u'Rtg:77)', u'55.5', u'Tahni Dancer ', u'1) ', u'Rtg:76)', u'55.5', u'All Salsa ', u'3) ', u'Rtg:72)', u'55.5']

I am hoping for

[['Mossfun', 11, 103, 58.0], ['Atmospherical', 8, 99, 56.5]]

Currently the strings come out of translate with all of the unwanted characters still in them.

You are ignoring the return value here; the translation itself works just fine (although the newlines are not actually handled):

>>> bucket_list = [u'Mossfun ', u'11) ', u'Rtg:103)', u'58.0', u'Atmospherical ', u'8) \n      ', u'Rtg:99)', u'56.5', u'Chloe In Paris ', u'7) \n      ', u'Rtg:97)', u'55.5', u'Bound For Earth ', u'5) ', u'Rtg:92)', u'55.5', u'Fine Bubbles ', u'4) ', u'Rtg:91)', u'55.5', u'Brook Road ', u'9) ', u'Rtg:90)', u'55.5', u'Shamalia ', u'10) ', u'Rtg:89)', u'55.5', u'Tawteen ', u'6) ', u'Rtg:88)', u'55.5', u'Ygritte ', u'2) ', u'Rtg:77)', u'55.5', u'Tahni Dancer ', u'1) ', u'Rtg:76)', u'55.5', u'All Salsa ', u'3) ', u'Rtg:72)', u'55.5']
>>> translation_table = dict.fromkeys(map(ord, ')(@\\n#$'), None)
>>> [x.translate(translation_table) for x in bucket_list ]
['Mossfu ', '11 ', 'Rtg:103', '58.0', 'Atmospherical ', '8 \n      ', 'Rtg:99', '56.5', 'Chloe I Paris ', '7 \n      ', 'Rtg:97', '55.5', 'Boud For Earth ', '5 ', 'Rtg:92', '55.5', 'Fie Bubbles ', '4 ', 'Rtg:91', '55.5', 'Brook Road ', '9 ', 'Rtg:90', '55.5', 'Shamalia ', '10 ', 'Rtg:89', '55.5', 'Tawtee ', '6 ', 'Rtg:88', '55.5', 'Ygritte ', '2 ', 'Rtg:77', '55.5', 'Tahi Dacer ', '1 ', 'Rtg:76', '55.5', 'All Salsa ', '3 ', 'Rtg:72', '55.5']

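But the result is stored in a new list; the original strings are not changed in place, because they are immutable. Assign the result back to bucket_list, and use \n rather than \\n to fix the newline problem:
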
translation_table = dict.fromkeys(map(ord, ')(@\n#$'), None)
bucket_list = [x.translate(translation_table) for x in bucket_list ]
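
A minimal sketch of both points, using two entries from the list above. It shows that translate() returns a new string and leaves the original alone, and that '\\n' in an ordinary string literal is two characters (a backslash and the letter n), which is why the letter n disappeared from the names while the newlines stayed. The names broken_table and fixed_table exist only for this illustration.

```python
name = u'Mossfun '
sample = u'8) \n      '

# '\\n' is a backslash followed by the letter n, so this table deletes
# 'n' (and backslashes) but leaves newline characters alone.
broken_table = dict.fromkeys(map(ord, u')(@\\n#$'), None)

# '\n' is a single newline character, which is what we want to delete.
fixed_table = dict.fromkeys(map(ord, u')(@\n#$'), None)

# translate() returns a new string; the original is not modified.
result = name.translate(broken_table)
print(repr(name))    # original, still contains its 'n'
print(repr(result))  # copy, with the 'n' removed

print(repr(sample.translate(broken_table)))  # the newline survives
print(repr(sample.translate(fixed_table)))   # the newline is removed
```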

You probably want to throw in a str.strip() as well to get rid of the remaining whitespace; the result is then:

>>> [x.translate(translation_table).strip() for x in bucket_list ]
['Mossfun', '11', 'Rtg:103', '58.0', 'Atmospherical', '8', 'Rtg:99', '56.5', 'Chloe In Paris', '7', 'Rtg:97', '55.5', 'Bound For Earth', '5', 'Rtg:92', '55.5', 'Fine Bubbles', '4', 'Rtg:91', '55.5', 'Brook Road', '9', 'Rtg:90', '55.5', 'Shamalia', '10', 'Rtg:89', '55.5', 'Tawteen', '6', 'Rtg:88', '55.5', 'Ygritte', '2', 'Rtg:77', '55.5', 'Tahni Dancer', '1', 'Rtg:76', '55.5', 'All Salsa', '3', 'Rtg:72', '55.5']

In this specific case, str.strip() would take care of the newlines too.
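
To get from the flat, cleaned list to the nested structure the question hopes for, one possible sketch follows. It assumes that every runner contributes exactly four consecutive entries (name, number, rating, weight) and that the rating always carries a 'Rtg:' prefix; both assumptions are read off the sample data, not guaranteed by the page being scraped. The helper name to_rows is hypothetical.

```python
def to_rows(cleaned):
    """Group a flat list into [name, number, rating, weight] rows."""
    rows = []
    # Step through the list four items at a time; an incomplete
    # trailing group is skipped.
    for i in range(0, len(cleaned) - 3, 4):
        name, number, rating, weight = cleaned[i:i + 4]
        rows.append([
            name,
            int(number),
            int(rating.replace(u'Rtg:', u'')),  # drop the assumed 'Rtg:' prefix
            float(weight),
        ])
    return rows


bucket_list = [u'Mossfun ', u'11) ', u'Rtg:103)', u'58.0',
               u'Atmospherical ', u'8) \n      ', u'Rtg:99)', u'56.5']
translation_table = dict.fromkeys(map(ord, u')(@\n#$'), None)
cleaned = [x.translate(translation_table).strip() for x in bucket_list]

print(to_rows(cleaned))
```

If the page ever yields a span that does not follow the four-entry pattern, int() or float() will raise a ValueError, which is arguably better than silently producing misaligned rows.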