How to replace every third word in a string with the # length equivalent

Question

Input:

string = "My dear adventurer, do you understand the nature of the given discussion?"

Expected output:

string = 'My dear ##########, do you ########## the nature ## the given ##########?'

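How can I replace every third word in a string with a run of # characters of the same length as the word, without counting special characters such as apostrophes ('), quotation marks ("), full stops (.), commas (,), exclamation marks (!), question marks (?), colons (:) and semicolons (;)?

I tried converting the string into a list of elements, but I found it difficult to filter out the special characters and replace the words with their # equivalent. Is there a better way?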

Answer 1

With the help of some regex. The explanation is in the comments.

import re


imp = "My dear adventurer, do you understand the nature of the given discussion?"
every_nth = 3  # in case you want to change this later

out_list = []

# split the input at spaces, enumerate the parts for looping
for idx, word in enumerate(imp.split(' ')):

    # only do the special logic for multiples of n (0-indexed, thus +1)
    if (idx + 1) % every_nth == 0:
        # find how many special chars there are in the current segment
        len_special_chars = len(re.findall(r'[.,!?:;\'"]', word))  
                                            # ^ add more special chars here if needed
        
        # subtract the number of special chars from the length of segment
        str_len = len(word) - len_special_chars
        
        # repeat '#' for every non-special char, then re-append the special
        # chars (assumes they all sit at the end of the segment, e.g. "word,")
        suffix = word[-len_special_chars:] if len_special_chars > 0 else ''
        out_list.append('#' * str_len + suffix)
    else:
        # if the index is not a multiple of n, just add the word
        out_list.append(word)
        

print(' '.join(out_list))

Answer 2

There are more efficient ways to solve this, but I hope this is the simplest!

My approach is:

Split the sentence into a list of the words

Using that, make a list of every third word.

Remove unwanted characters from this

Replace third words in original string with # times the length of the word.

Here is the code (explained in the comments):

# original line
line = "My dear adventurer, do you understand the nature of the given discussion?"

# printing original line
print(f'\n\nOriginal Line:\n"{line}"\n')

# printing a header to indicate that the next few prints show the state after each step
print('\n\nStages of parsing:')

# splitting by spaces, into list
wordList = line.split(' ')

# printing wordlist
print(wordList)

# making list of every third word
thirdWordList = [wordList[i-1] for i in range(1,len(wordList)+1) if i%3==0]

# printing third-word list
print(thirdWordList)

# characters that you don't want hashed
unwantedCharacters = ['.','/','|','?','!','_','"',',','-','@','\n','\\',':',';','(',')','<','>','{','}','[',']','%','*','&','+']

# replacing these characters by empty strings in the list of third-words
for unwantedchar in unwantedCharacters:
    for i in range(0,len(thirdWordList)):
        thirdWordList[i] = thirdWordList[i].replace(unwantedchar,'')

# printing third word list, now without punctuation 
print(thirdWordList)

# replacing with #
# (note: str.replace hashes every occurrence of the word, even inside other words)
for word in thirdWordList:
    line = line.replace(word,len(word)*'#')

# Voila! Printing the result:
print(f'\n\nFinal Output:\n"{line}"\n\n')

Hope this helps!
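Because `str.replace` works on substrings, the approach above can also hash text inside words that are not in a third position. A minimal sketch of one way around that is to replace by position in the split list instead. The function name `hash_every_nth` and the `keep` parameter are hypothetical, chosen only for this illustration:

```python
def hash_every_nth(line, n=3, keep=".,!?:;'\""):
    """Hash every n-th space-separated word, leaving characters in `keep` untouched."""
    words = line.split(' ')
    # visit only the positions n-1, 2n-1, ... (0-indexed), i.e. every n-th word
    for i in range(n - 1, len(words), n):
        words[i] = ''.join(c if c in keep else '#' for c in words[i])
    return ' '.join(words)


line = "My dear adventurer, do you understand the nature of the given discussion?"
print(hash_every_nth(line))
```

With the input "a b the other", only the third word is changed, whereas the substring-based replacement would also alter "other".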

Answer 3

The following works and does not use regex:

special_chars = {'.','/','|','?','!','_','"',',','-','@','\n','\\'}

def format_word(w, fill):
    if w[-1] in special_chars:
        return fill*(len(w) - 1) + w[-1]
    else:
        return fill*len(w)


def obscure(string, every=3, fill='#'):
    return ' '.join(
        (format_word(w, fill) if (i+1) % every == 0 else w)
        for (i, w) in enumerate(string.split())
    )

Here is some example usage:

In [15]: obscure(string)
Out[15]: 'My dear ##########, do you ########## the nature ## the given ##########?'

In [16]: obscure(string, 4)
Out[16]: 'My dear adventurer, ## you understand the ###### of the given ##########?'

In [17]: obscure(string, 3, '?')
Out[17]: 'My dear ??????????, do you ?????????? the nature ?? the given ???????????'
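`format_word` above preserves a single trailing special character. If a word may carry punctuation on both sides (for example a quoted word followed by a comma), one possible variation is sketched below. The name `format_word_both_sides` and the `SPECIAL` string are assumptions made for this example and are not part of the answer's code:

```python
SPECIAL = '.,!?:;\'"'  # assumed set of characters to leave untouched


def format_word_both_sides(w, fill='#', special=SPECIAL):
    """Replace the core of `w` with `fill`, keeping leading/trailing special chars."""
    lead = len(w) - len(w.lstrip(special))   # count of leading special chars
    trail = len(w) - len(w.rstrip(special))  # count of trailing special chars
    if lead == len(w):
        # the word consists only of special characters (or is empty)
        return w
    core_len = len(w) - lead - trail
    return w[:lead] + fill * core_len + w[len(w) - trail:]


print(format_word_both_sides('"hello",'))
```

Note that special characters in the middle of a word, such as the apostrophe in "don't", are still replaced by the fill character in this sketch, since only the two ends are inspected.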

Answer 4

I solved it:

s = "My dear adventurer, do you understand the nature of the given discussion?"

def replace_alphabet_with_char(word: str, replacement: str) -> str:
    new_word = []
    alphabet = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    for c in word:
        if c in alphabet:
            new_word.append(replacement)
        else:
            new_word.append(c)
    return "".join(new_word)

every_nth_word = 3
s_split = s.split(' ')
result = " ".join([replace_alphabet_with_char(s_split[i], '#') if i % every_nth_word == every_nth_word - 1 else s_split[i] for i in range(len(s_split))])
print(result)

Output: My dear ##########, do you ########## the nature ## the given ##########?

Answer 5

A mix of regex and string manipulation:

import re
string = "My dear adventurer, do you understand the nature of the given discussion?"

new_string = []
for i, s in enumerate(string.split()):
    if (i+1) % 3 == 0:
        s = re.sub(r'[^\.:,;\'"!\?]', '#', s)
    new_string.append(s)

new_string = ' '.join(new_string)
print(new_string)
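The same idea can also be written without splitting and re-joining, by letting `re.sub` walk over the words with a callback. This is only a sketch: the function name `obscure_regex` and its parameters are hypothetical. It treats every run of non-whitespace as a word, so the original spacing is kept as-is.

```python
import re
from itertools import count


def obscure_regex(text, every=3, fill='#', keep='.,!?:;\'"'):
    """Replace non-`keep` characters of every `every`-th word with `fill`."""
    counter = count(1)  # 1-based word counter shared by all callback calls
    # character class matching anything that is NOT one of the kept characters
    not_kept = '[^' + re.escape(keep) + ']'

    def repl(match):
        word = match.group(0)
        if next(counter) % every == 0:
            # a function replacement avoids backslash handling in `fill`
            return re.sub(not_kept, lambda m: fill, word)
        return word

    # \S+ matches each run of non-whitespace characters, i.e. each word
    return re.sub(r'\S+', repl, text)


string = "My dear adventurer, do you understand the nature of the given discussion?"
print(obscure_regex(string))
```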


python

string

replace

list

filter