Search a master string for strings without punctuation, then take slices with punctuation from it, without libraries: is it possible?
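I have this assignment to do (no libraries allowed) and I underestimated the problem:
Suppose we have a list of strings: str_list = ["my head's", "free", "at last", "into alarm", "in another moment", "neck"]
What we know for sure is that every string is contained in master_string, appears in order, and has no punctuation. (All of this is guaranteed by checks I did earlier.)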
Then we have the string: master_string = "'Come, my head's free at last!' said Alice in a tone of delight, which changed into alarm in another moment, when she found that her shoulders were nowhere to be found: all she could see, when she looked down, was an immense length of neck, which seemed to rise like a stalk out of a sea of green leaves that lay far below her."
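What I basically have to do here is check for sequences of at least k strings from str_list that appear together in master_string (k = 2 in this example). However, I underestimated the fact that each string in str_list can contain more than one word, so master_string.split() won't get me anywhere: it would mean asking something like if "my head's" == "my", which is of course false.
I was thinking of doing something like concatenating the strings one at a time and searching master_string.strip(".,:;!?"), but if I find the matching sequences I absolutely need to take them directly from master_string, because I need the punctuation in the result variable. That basically means taking slices directly from master_string, but how is that possible? Is it even possible, or do I have to change my approach? This is driving me completely crazy, especially because no libraries are allowed.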
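A note on the idea above: `str.strip` only removes characters from the two ends of a string, so punctuation inside master_string stays where it is. Slicing straight out of master_string is possible with `str.find`, which returns the start index of a match. A minimal sketch, where the shortened `master_string` and the names `phrase`, `start` and `end` are illustrative assumptions and not part of the assignment:

```python
# Shortened sample text, used only for illustration.
master_string = "Come, my head's free at last! said Alice"
punctuation = ".,:;!?"

# strip() only trims the ends, so the inner "," and "!" survive.
stripped = master_string.strip(punctuation)

# find() returns the index of the first match, or -1 if there is none.
phrase = "at last"
start = master_string.find(phrase)
end = start + len(phrase)

# Extend the slice over punctuation that directly follows the match.
while end < len(master_string) and master_string[end] in punctuation:
    end += 1

print(master_string[start:end])  # at last!
```

Because the slice is taken from master_string itself, the punctuation comes along for free.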
In case you are wondering what the expected result is here:
["my head's free at last!", "into alarm in another moment,"]
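(because both satisfy the condition of at least k strings from str_list), and "neck" would be saved in a discard_list because it does not satisfy that condition (I can't throw it away with .pop(), because I need to do other things with the discarded values).
I have two different versions: number 1 gives you "neck" :(, and number 2 doesn't give you much. This is number 1: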
master_string = "Come, my head’s free at last!’ said Alice in a tone of delight, which changed into alarm in another moment, when she found that her shoulders were nowhere to be found: all she could see, when she looked down, was an immense length of neck, which seemed to rise like a stalk out of a sea of green leaves that lay far below her."
str_list = ["my head's", "free", "at last", "into alarm", "in another moment", "neck"]
new_str = ''
for word in str_list:
    if word in master_string:
        new_str += word + ' '
print(new_str)
This is number 2:
master_string = "Come, my head’s free at last!’ said Alice in a tone of delight, which changed into alarm in another moment, when she found that her shoulders were nowhere to be found: all she could see, when she looked down, was an immense length of neck, which seemed to rise like a stalk out of a sea of green leaves that lay far below her."
str_list = ["my head's", "free", "at last", "into alarm", "in another moment", "neck"]
new_str = ''
for word in str_list:
    if word in master_string:
        new_word = word.split(' ')
        if len(new_word) == 2:
            new_str += word + ' '
print(new_str)
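Here is my solution:
- Try to extend every string, based on master_string and a limited set of punctuation characters (e.g. my head’s -> my head’s free at last!; free -> free at last!).
- Keep only the substrings that were extended at least k times.
- Remove redundant substrings (e.g. free at last! is already contained in my head’s free at last!).
Here is the code: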
str_list = ["my head’s", "free", "at last", "into alarm", "in another moment", "neck"]
master_string = "‘Come, my head’s free at last!’ said Alice in a tone of delight, which changed into alarm in another moment, when she found that her shoulders were nowhere to be found: all she could see, when she looked down, was an immense length of neck, which seemed to rise like a stalk out of a sea of green leaves that lay far below her."
punctuation_characters = ".,:;!?" # list of punctuation characters
k = 1
def extend_string(current_str, successors_num = 0) :
    # check if the next token is a punctuation mark
    for punctuation_mark in punctuation_characters :
        if current_str + punctuation_mark in master_string :
            return extend_string(current_str + punctuation_mark, successors_num)
    # check if the next token is a proper successor
    for successor in str_list :
        if current_str + " " + successor in master_string :
            return extend_string(current_str + " " + successor, successors_num+1)
    # cannot extend the string anymore
    return current_str, successors_num

extended_strings = []
for s in str_list :
    extended_string, successors_num = extend_string(s)
    if successors_num >= k :
        extended_strings.append(extended_string)

extended_strings.sort(key=len) # sorting by ascending length
result_list = []
for es in extended_strings :
    # drop shorter results that are already contained in the current one
    result_list = list(filter(lambda s2 : s2 not in es, result_list))
    result_list.append(es)
print(result_list) # result: ['my head’s free at last!', 'into alarm in another moment,']
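The question also asks for real slices of master_string and for a discard_list, which the code above does not produce. One possible variant, sketched here under assumptions: it locates each string with `str.find`, groups neighbours that are separated only by spaces or punctuation, and slices the group out of master_string. The function name `group_slices` and its parameters are hypothetical, and here k counts strings per group (k = 2, as in the question), not extensions. It does not check word boundaries, so a string such as "free" would also match inside a longer word.

```python
def group_slices(str_list, master_string, k=2, punctuation=".,:;!?"):
    """Return (results, discarded); results are slices of master_string."""
    # Locate every string in order and remember its (start, end) indices.
    spans = []
    pos = 0
    for s in str_list:
        start = master_string.find(s, pos)
        if start == -1:
            raise ValueError(f"{s!r} not found after index {pos}")
        pos = start + len(s)
        spans.append((start, pos))

    allowed = punctuation + " "
    results, discarded = [], []
    i = 0
    while i < len(spans):
        # Grow the run while only spaces/punctuation separate neighbours.
        j = i
        while j + 1 < len(spans):
            gap = master_string[spans[j][1]:spans[j + 1][0]]
            if not gap or any(c not in allowed for c in gap):
                break
            j += 1
        if j - i + 1 >= k:
            start, end = spans[i][0], spans[j][1]
            # Pull in punctuation that directly follows the run.
            while end < len(master_string) and master_string[end] in punctuation:
                end += 1
            results.append(master_string[start:end])
        else:
            discarded.extend(str_list[i:j + 1])
        i = j + 1
    return results, discarded


str_list = ["my head's", "free", "at last", "into alarm",
            "in another moment", "neck"]
master_string = (
    "'Come, my head's free at last!' said Alice in a tone of delight, "
    "which changed into alarm in another moment, when she found that her "
    "shoulders were nowhere to be found: all she could see, when she "
    "looked down, was an immense length of neck, which seemed to rise "
    "like a stalk out of a sea of green leaves that lay far below her."
)

results, discarded = group_slices(str_list, master_string, k=2)
print(results)    # ["my head's free at last!", 'into alarm in another moment,']
print(discarded)  # ['neck']
```

Working with indices instead of rebuilt strings means every result is guaranteed to be an actual slice of master_string, and anything that fails the k condition stays available in `discarded`.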