Nested loop and list comprehension for removing digits from a list of strings

I have a list called "sentences" (3000 strings) that looks like this:

sentences[0:5]

['So there is no way for me to plug it in here in the US unless I go by a converter.',
 'Good case, Excellent value.',
 'Great for the jawbone.',
 'Tied to charger for conversations lasting more than 45 minutes.MAJOR PROBLEMS!!',
 'The mic is great.']

I want to remove the digits from every string in this list, for example the "45" in the fourth string above.

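When I use a nested loop, I do not get the desired result. Instead, each sentence is repeated as many times as there are digits in the "digits" list, as shown below: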

digits=[str(i) for i in range(0,10)]    
t=[]    
for i in sentences:    
    for j in digits:    
        a=i.replace(j,'')    
        t.append(a)    
print(t[0:5])

['So there is no way for me to plug it in here in the US unless I go by a converter.', 'So there is no way for me to plug it in here in the US unless I go by a converter.', 'So there is no way for me to plug it in here in the US unless I go by a converter.', 'So there is no way for me to plug it in here in the US unless I go by a converter.', 'So there is no way for me to plug it in here in the US unless I go by a converter.']

However, when I write a function and call it inside a list comprehension, it works perfectly, as shown below:

def full_remove(x,remove_list):    
    for i in remove_list:    
        x=x.replace(i,' ')    
    return x

digits=[str(x) for x in range(10)]    
digit_less=[full_remove(i,digits) for i in sentences]    
print(digit_less[0:5])

['So there is no way for me to plug it in here in the US unless I go by a converter.', 'Good case, Excellent value.', 'Great for the jawbone.', 'Tied to charger for conversations lasting more than    minutes.MAJOR PROBLEMS!!', 'The mic is great.']

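As I understand it, calling the function inside a list comprehension follows the same logic as the nested loop, so why does the nested loop not work? Where am I going wrong?

Please explain.

Thanks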

A regex-based solution may be preferable here:

import re

digit_less = [re.sub(r'\s*\d+\s*', ' ', i).strip() for i in sentences]
print(digit_less)

This prints:

   ['So there is no way for me to plug it in here in the US unless I go by a converter.',
    'Good case, Excellent value.',
    'Great for the jawbone.',
    'Tied to charger for conversations lasting more than minutes.MAJOR PROBLEMS!!',
    'The mic is great.']

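This approach replaces every run of digits, together with any surrounding whitespace, with a single space. The call to strip() then removes any leading or trailing whitespace that the substitution may leave behind as a side effect.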
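
To make the role of strip() concrete, here is a small sketch on made-up inputs (the sample strings and the names samples, pattern, without_strip and with_strip are illustrative, not from the question). It also shows one caveat: a digit inside a token is replaced by a space, which may or may not be what you want.

```python
import re

# Hypothetical inputs chosen to exercise the edge cases.
samples = ['45 minutes of talk time', 'lasting more than 45 minutes.', 'Model A1B']

# Same pattern as above, compiled once for reuse.
pattern = re.compile(r'\s*\d+\s*')

without_strip = [pattern.sub(' ', s) for s in samples]
with_strip = [s.strip() for s in without_strip]

print(without_strip)  # [' minutes of talk time', 'lasting more than minutes.', 'Model A B']
print(with_strip)     # ['minutes of talk time', 'lasting more than minutes.', 'Model A B']
```

Only the first sample changes after strip(), because it is the only one where the digits sit at the start of the string.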

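The problem in your first case is the indentation of the append call (together with storing each result in a instead of assigning it back to i).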

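In the nested-loop approach:

- for each sentence
- loop over each digit and replace it
- append the sentence every time

This causes each sentence to be appended 10 times, because the inner loop runs over the 10 digits 0 through 9.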
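
A quick way to see this is to run the original loop on a tiny list. The two-sentence sample below is made up for illustration; the loop body is the one from the question, with the loop variables renamed to s and d.

```python
# Hypothetical two-sentence sample standing in for the real 3000-item list.
sentences = ['lasting more than 45 minutes.', 'The mic is great.']
digits = [str(d) for d in range(10)]

t = []
for s in sentences:
    for d in digits:
        a = s.replace(d, '')  # always starts again from the original s
        t.append(a)           # runs once per digit, so 10 times per sentence

print(len(t))  # 20: ten entries for each of the two sentences
print(t[4])    # only '4' removed: 'lasting more than 5 minutes.'
print(t[5])    # only '5' removed: 'lasting more than 4 minutes.'
```

Each appended copy has at most one digit removed, and the first ten entries all come from the first sentence, which is why t[0:5] in the question shows the same sentence five times.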

Instead, you can fix this by appending only once per sentence. Writing the loop like this should solve your problem:

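t = []
for i in sentences:
    for j in digits:
        i = i.replace(j, '')  # assign back to i so the removals accumulate
    t.append(i)  # append once, after every digit has been removed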

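Note the indentation of the append statement, and that the result of replace is assigned back to i.

The sentence is now appended to the list t only once, after all the digits have been removed, rather than every time the loop moves on to another digit.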

There are many ways to solve this, but for the question you asked, this is how to correct your mistake.
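
As one example of those other ways, here is a minimal sketch using str.translate, which is not part of the answers above. The sample list and the name remove_digits are illustrative, and the sketch only handles the ASCII digits 0-9.

```python
import string

# Hypothetical sample; the real list has 3000 entries.
sentences = ['lasting more than 45 minutes.', 'The mic is great.']

# Translation table that deletes the ASCII digits 0-9.
remove_digits = str.maketrans('', '', string.digits)

digit_less = [s.translate(remove_digits) for s in sentences]
print(digit_less)  # ['lasting more than  minutes.', 'The mic is great.']
```

Like the corrected loop, this deletes the digits without touching the surrounding spaces, so "than 45 minutes" ends up with two spaces in a row.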