Substituting text with a running number
This should be really simple and short, but I can't think of a good, short way to do it.
For example, I have this string:
'How many roads must a man walk down Before you call him a man? How
many seas must a white dove sail Before she sleeps in the sand? Yes,
and how many times must the cannon balls fly Before they're forever
banned?'
I want to substitute the word "how" with a running number, so that I get:
'[1] many roads must a man walk down Before you call him a man? [2]
many seas must a white dove sail Before she sleeps in the sand? Yes,
and [3] many times must the cannon balls fly Before they're forever
banned?'
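You can use re.sub with a replacement function. The function looks up in a dictionary how often the word has occurred so far and returns the corresponding number.
import collections
import re

counts = collections.defaultdict(int)

def subst_count(match):
    # Count each matched word case-insensitively.
    word = match.group().lower()
    counts[word] += 1
    return "[%d]" % counts[word]
Example: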
>>> text = "How many ...? How many ...? Yes, and how many ...?"
>>> re.sub(r"\bhow\b", subst_count, text, flags=re.I)
'[1] many ...? [2] many ...? Yes, and [3] many ...?'
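Note: this keeps a separate count for each word being replaced (in case you use a regex that matches more than one word), but it does not reset the counts between calls to re.sub.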
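To illustrate the note above: because counts lives at module level, a second call to re.sub would continue numbering where the first one stopped. One possible way around that, as a minimal sketch, is to build a fresh counter for every call. The names make_counter and number_words are hypothetical helpers introduced here for illustration; they are not part of the answer above.

```python
import collections
import re

def make_counter():
    # Each call creates its own dictionary, so numbering starts over.
    counts = collections.defaultdict(int)

    def subst_count(match):
        word = match.group().lower()
        counts[word] += 1
        return "[%d]" % counts[word]

    return subst_count

def number_words(pattern, text):
    # Hypothetical wrapper: a fresh counter for each re.sub call.
    return re.sub(pattern, make_counter(), text, flags=re.I)

first = number_words(r"\bhow\b", "How many ...? How many ...?")
second = number_words(r"\bhow\b", "And how many ...?")
print(first)   # [1] many ...? [2] many ...?
print(second)  # And [1] many ...?
```

Whether shared or per-call counts are preferable depends on the use case; the sketch only shows how to get the per-call behaviour.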
Test = 'How many roads must a man walk down Before you call him a man? How many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before theyre forever banned?'
i = 1
# Replace one occurrence per pass. The match is case-sensitive,
# so the lowercase "how" is left untouched.
while "How" in Test:
    new = "[" + str(i) + "]"
    Test = Test.replace("How", new, 1)
    i = i + 1
print(Test)
Output
[1] many roads must a man walk down Before you call him a man? [2] many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before theyre forever banned?
Here is another approach using re.sub with a replacement function. Instead of using a global object to keep track of the count, this code uses a function attribute.
import re

def count_replace():
    def replace(m):
        # The counter is stored as an attribute on the function itself.
        replace.count += 1
        return '[%d]' % replace.count
    replace.count = 0
    return replace

src = '''How many roads must a man walk down Before you call him a man? How many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before they're forever banned?'''
pat = re.compile('how', re.I)
print(pat.sub(count_replace(), src))
Output
[1] many roads must a man walk down Before you call him a man? [2]
many seas must a white dove sail Before she sleeps in the sand? Yes,
and [3] many times must the cannon balls fly Before they're forever
banned?
If you need to replace only whole words and not partial words, you need a smarter regex, such as r"\bhow\b".
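As a small illustration of the difference, here is a sketch using a made-up sample sentence (not from the question) in which "how" also appears inside longer words. The helper number_matches is hypothetical and simply numbers every match of the given pattern.

```python
import re
from itertools import count

sample = "How to show this? However you like, that is how."

def number_matches(pattern, text):
    # Hypothetical helper: replaces each match with [1], [2], ...
    counter = count(1)
    return re.sub(pattern, lambda m: "[%d]" % next(counter), text, flags=re.I)

loose = number_matches(r"how", sample)
strict = number_matches(r"\bhow\b", sample)
print(loose)   # [1] to s[2] this? [3]ever you like, that is [4].
print(strict)  # [1] to show this? However you like, that is [2].
```

With the plain pattern, "show" and "However" are also rewritten; the \b word boundaries restrict the match to the standalone word.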
You can make use of itertools.count and pass a function as the replacement argument, for example:
import re
from itertools import count
text = '''How many roads must a man walk down Before you call him a man? How many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before they're forever banned?'''
result = re.sub(r'(?i)\bhow\b', lambda m, c=count(1): '[{}]'.format(next(c)), text)
# [1] many roads must a man walk down Before you call him a man? [2] many seas must a white dove sail Before she sleeps in the sand? Yes, and [3] many times must the cannon balls fly Before they're forever banned?
Just for fun, I wanted to see whether I could solve this with recursion, and this is what I came up with:
def count_replace(s, to_replace, leng=0, count=1, replaced=None):
    # Use None as the default so the list is not shared between calls.
    if replaced is None:
        replaced = []
    if s.find(' ') == -1:
        # Last word: nothing left to split, so join the pieces.
        replaced.append(s)
        return ' '.join(replaced)
    else:
        if s[0:s.find(' ')].lower() == to_replace.lower():
            replaced.append('[%d]' % count)
            count += 1
            leng = len(to_replace)
        else:
            replaced.append(s[0:s.find(' ')])
            leng = s.find(' ')
        return count_replace(s[leng + 1:], to_replace, leng, count, replaced)
Needless to say, I wouldn't recommend it, since it is absurdly inefficient and overcomplicated, but I thought I'd share it anyway!