Nested loop and list comprehension for removing digits from a list of strings
I have a list called "sentences" (containing 3000 strings) that looks like this:
sentences[0:5]
['So there is no way for me to plug it in here in the US unless I go by a converter.',
'Good case, Excellent value.',
'Great for the jawbone.',
'Tied to charger for conversations lasting more than 45 minutes.MAJOR PROBLEMS!!',
'The mic is great.']
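I want to remove the digits from every string in this list, for example the "45" in the fourth string above.
When I use a nested loop, it does not give the desired result. Instead, it repeats each sentence as many times as there are entries in the "digits" list, as shown below: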
digits=[str(i) for i in range(0,10)]
t=[]
for i in sentences:
    for j in digits:
        a=i.replace(j,'')
        t.append(a)
print(t[0:5])
['So there is no way for me to plug it in here in the US unless I go by a converter.', 'So there is no way for me to plug it in here in the US unless I go by a converter.', 'So there is no way for me to plug it in here in the US unless I go by a converter.', 'So there is no way for me to plug it in here in the US unless I go by a converter.', 'So there is no way for me to plug it in here in the US unless I go by a converter.']
However, when I create a function and then call it inside a list comprehension, it works perfectly, as shown below:
def full_remove(x,remove_list):
    for i in remove_list:
        x=x.replace(i,' ')
    return x
digits=[str(x) for x in range(10)]
digit_less=[full_remove(i,digits) for i in sentences]
print(digit_less[0:5])
['So there is no way for me to plug it in here in the US unless I go by a converter.', 'Good case, Excellent value.', 'Great for the jawbone.', 'Tied to charger for conversations lasting more than minutes.MAJOR PROBLEMS!!', 'The mic is great.']
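As I understand it, calling the function inside the list comprehension follows the same logic as the nested loop, so why doesn't the nested loop work? Where am I going wrong?
Please explain.
Thanks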
A regex-based solution might be preferable here:
import re
digit_less = [re.sub(r'\s*\d+\s*', ' ', i).strip() for i in sentences]
print(digit_less)
This prints:
['So there is no way for me to plug it in here in the US unless I go by a converter.',
'Good case, Excellent value.',
'Great for the jawbone.',
'Tied to charger for conversations lasting more than minutes.MAJOR PROBLEMS!!',
'The mic is great.']
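This approach replaces every run of digits, together with any surrounding whitespace, with a single space. The call to strip()
removes any leading/trailing whitespace that may be left over as a side effect.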
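To make the role of strip() concrete, here is a minimal sketch of the same regex applied to a few short strings. The helper name strip_digits and the sample strings are made up for illustration and are not part of the original question.

```python
import re

def strip_digits(text):
    # Replace each run of digits, plus any whitespace around it, with one space,
    # then trim the space that may be left at either end of the string.
    return re.sub(r'\s*\d+\s*', ' ', text).strip()

# Hypothetical sample strings: digits in the middle, at the start, and absent.
samples = [
    'lasting more than 45 minutes.',
    '45 minutes is too long',
    'The mic is great.',
]
cleaned = [strip_digits(s) for s in samples]
print(cleaned)
```

Without strip(), the second sample would keep a leading space where the "45" used to be. Note also that digits embedded in a word would be replaced by a space, which splits that word in two.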
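The problem in the first case is your indentation.
In the nested loop approach:
- for each sentence
- you loop over every digit and replace it
- and you append the sentence each time
This results in each sentence being appended 10 times, because the inner loop has 10 digits (0 through 9) to iterate over.
Instead, you can fix this by appending only once. If you write the loop like this, it should solve your problem: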
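t=[]
for i in sentences:
    for j in digits:
        # reassign i so that each removal builds on the previous one
        i=i.replace(j,'')
    # append once per sentence, after the inner loop has finished
    t.append(i)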
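Note the indentation of the append statement, and that the result of replace() is assigned back to i so that the removals accumulate.
This now appends the sentence to the list t only once, after all digits have been removed, rather than on every pass through the digit loop.
There are many other ways to solve this, but as far as your question goes, this is how to correct your mistake.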
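As a quick check of the explanation above, the following sketch runs both versions of the loop on a two-sentence list and compares the results. The variable names buggy and fixed and the shortened sample sentences are assumptions made for this illustration.

```python
sentences = ['lasting more than 45 minutes.', 'The mic is great.']
digits = [str(d) for d in range(10)]

# Version from the question: append sits inside the inner loop, and every
# replace starts again from the untouched sentence s.
buggy = []
for s in sentences:
    for d in digits:
        a = s.replace(d, '')
        buggy.append(a)

# Corrected version: reassign s so removals accumulate, append once per sentence.
fixed = []
for s in sentences:
    for d in digits:
        s = s.replace(d, '')
    fixed.append(s)

print(len(buggy))  # 20: one entry per (sentence, digit) pair
print(len(fixed))  # 2: one entry per sentence
print(fixed)
```

In the question's version, each appended string has at most one digit removed (for example, buggy[4] still contains the "5"), which is why the list comprehension with full_remove behaves differently: it reassigns x on every pass and returns only after the loop ends.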