Substitute text with a running number

Question

This should be very simple and short, but I can't think of a good, short way to do it.
For example, I have a string:

'How many roads must a man walk down Before you call him a man? How many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before they're forever banned?'

I want to substitute the word "how" with a running number, so that I get:

'[1] many roads must a man walk down Before you call him a man? [2] many seas must a white dove sail Before she sleeps in the sand? Yes, and [3] many times must the cannon balls fly Before they're forever banned?'

Answer 1

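You can use re.sub with a replacement function. The function looks up in a dictionary how often the word has already appeared and returns the corresponding number.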

import collections
import re

counts = collections.defaultdict(int)
def subst_count(match):
    # Lowercase the match so "How" and "how" share one counter.
    word = match.group().lower()
    counts[word] += 1
    return "[%d]" % counts[word]

Example:

>>> text = "How many ...? How many ...? Yes, and how many ...?"
>>> re.sub(r"\bhow\b", subst_count, text, flags=re.I)
'[1] many ...? [2] many ...? Yes, and [3] many ...?'

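Note: This keeps a separate count for each word being replaced (in case you use a regex that matches more than one word), but it does not reset the counts between calls to re.sub.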
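To illustrate both points in the note, here is a minimal sketch. The factory `make_counter` is a hypothetical helper that is not part of the answer: it builds a fresh dictionary on every call, so each `re.sub` call starts numbering from 1 again. The label format `word[n]` is also an assumption, chosen only so that the separate counts are visible in the output.

```python
import collections
import re


def make_counter():
    # Hypothetical helper: each call returns a replacement function
    # with its own private counts dictionary.
    counts = collections.defaultdict(int)

    def subst_count(match):
        word = match.group().lower()
        counts[word] += 1
        # Include the word in the label so the per-word counts are visible.
        return "%s[%d]" % (word, counts[word])

    return subst_count


text = "How many roads? Many seas. How many times?"
pattern = r"\b(?:how|many)\b"

# "how" and "many" are numbered independently of each other.
first = re.sub(pattern, make_counter(), text, flags=re.I)
# A new counter means the numbering starts over.
second = re.sub(pattern, make_counter(), text, flags=re.I)

print(first)   # how[1] many[1] roads? many[2] seas. how[2] many[3] times?
print(second)  # same as first, because the counts were reset
```

If you reuse one replacement function across calls instead, the numbering simply continues where the previous call stopped.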

Answer 2

Test = 'How many roads must a man walk down Before you call him a man? How many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before theyre forever banned?'

i = 1

while "How" in Test:
    # Replace only the first remaining "How" with the current number.
    Test = Test.replace("How", "[" + str(i) + "]", 1)
    i = i + 1

print(Test)

Output

[1] many roads must a man walk down Before you call him a man? [2] many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before theyre forever banned?

Note that the lowercase "how" is left untouched, because str.replace is case-sensitive.
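If you like the loop approach but also want the lowercase "how" numbered, one possible variant is to swap `str.replace` for a compiled regex. This is a sketch of that substitution, not the answerer's code; the shortened sample text is made up for brevity.

```python
import re

text = "How many roads? How many seas? Yes, and how many times?"

# re.I makes the match case-insensitive; \b keeps words such as
# "show" from matching.
pattern = re.compile(r"\bhow\b", re.I)

i = 1
while pattern.search(text):
    # count=1 replaces only the first remaining match on each pass.
    text = pattern.sub("[%d]" % i, text, count=1)
    i += 1

print(text)  # [1] many roads? [2] many seas? Yes, and [3] many times?
```

Each pass rescans the string from the start, so for long texts a single `re.sub` call with a replacement function is probably the better choice.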

Answer 3

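Here is another way to use re.sub with a replacement function. Instead of using a global object to keep track of the count, this code uses a function attribute.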

import re

def count_replace():
    def replace(m):
        replace.count += 1
        return '[%d]' % replace.count
    replace.count = 0
    return replace

src = '''How many roads must a man walk down Before you call him a man? How many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before they're forever banned?'''

pat = re.compile('how', re.I)

print(pat.sub(count_replace(), src))

Output

[1] many roads must a man walk down Before you call him a man? [2] many seas must a white dove sail Before she sleeps in the sand? Yes, and [3] many times must the cannon balls fly Before they're forever banned?

If you need to replace only whole words rather than parts of words, you need a smarter regular expression, such as r"\bhow\b".
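To see what the word-boundary version changes, here is a small comparison. The sample sentence is made up for illustration; `count_replace` is repeated so the snippet runs on its own.

```python
import re


def count_replace():
    # The counter lives on the function object itself.
    def replace(m):
        replace.count += 1
        return '[%d]' % replace.count
    replace.count = 0
    return replace


src = 'Show me how, somehow. How?'

loose = re.compile('how', re.I)        # also matches inside other words
strict = re.compile(r'\bhow\b', re.I)  # whole words only

loose_result = loose.sub(count_replace(), src)
strict_result = strict.sub(count_replace(), src)

print(loose_result)   # S[1] me [2], some[3]. [4]?
print(strict_result)  # Show me [1], somehow. [2]?
```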

Answer 4

You can make use of itertools.count and a function as the replacement argument, e.g.:

import re
from itertools import count

text = '''How many roads must a man walk down Before you call him a man? How many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before they're forever banned?'''
result = re.sub(r'(?i)\bhow\b', lambda m, c=count(1): '[{}]'.format(next(c)), text)
# [1] many roads must a man walk down Before you call him a man? [2] many seas must a white dove sail Before she sleeps in the sand? Yes, and [3] many times must the cannon balls fly Before they're forever banned?
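If you need this in more than one place, the one-liner could be wrapped in a function. The name `number_occurrences` and its `start` parameter are hypothetical, and the sketch assumes `word` begins and ends with a word character (otherwise `\b` will not behave as intended).

```python
import re
from itertools import count


def number_occurrences(text, word, start=1):
    # A fresh counter per call, so numbering always begins at `start`.
    counter = count(start)
    # re.escape guards against regex metacharacters in `word`.
    pattern = r'\b{}\b'.format(re.escape(word))
    return re.sub(pattern,
                  lambda m: '[{}]'.format(next(counter)),
                  text, flags=re.I)


sample = "How many roads? How many seas? Yes, and how many times?"
print(number_occurrences(sample, "how"))
# [1] many roads? [2] many seas? Yes, and [3] many times?
print(number_occurrences(sample, "many", start=10))
# How [10] roads? How [11] seas? Yes, and how [12] times?
```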

Answer 5

Just for fun, I wanted to see whether I could solve this with recursion, and this is what I came up with:

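def count_replace(s, to_replace, leng=0, count=1, replaced=None):
    # Use None instead of a mutable default list, so that results
    # do not leak from one top-level call into the next.
    if replaced is None:
        replaced = []
    if s.find(' ') == -1:
        # No space left: append the last word as-is and finish.
        replaced.append(s)
        return ' '.join(replaced)
    else:
        if s[0:s.find(' ')].lower() == to_replace.lower():
            replaced.append('[%d]' % count)
            count += 1
            leng = len(to_replace)
        else:
            replaced.append(s[0:s.find(' ')])
            leng = s.find(' ')
        # Recurse on the rest of the string, past the space.
        return count_replace(s[leng + 1:], to_replace, leng, count, replaced)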

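Needless to say, I wouldn't recommend it, since it is ridiculously inefficient and also overly complicated, but I thought I'd share it anyway!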
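For comparison, the same word-by-word idea can be written without recursion. This is only a sketch with a hypothetical name, `count_replace_iter`. Like the recursive version, it splits on single spaces, so a token such as "how?" with attached punctuation is not matched; unlike it, the last word is checked too.

```python
def count_replace_iter(s, to_replace):
    # Hypothetical iterative counterpart of the recursive function.
    target = to_replace.lower()
    out = []
    n = 0
    for word in s.split(' '):
        if word.lower() == target:
            n += 1
            out.append('[%d]' % n)
        else:
            out.append(word)
    return ' '.join(out)


print(count_replace_iter("How many? How few? and how much", "how"))
# [1] many? [2] few? and [3] much
```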

Tags: python, string, substring, python-2.7
