Gathering words with a set length of letters
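Is there a way in Python to group the words that have a given number of letters?
I started working on this function:
def lenght_words(a, b, text):
    returnlist = []
In the returned list I want the words whose length satisfies:
a <= length <= b
So I was thinking of:
- splitting the text into lines, so the function works on each line of the text separately
- removing the punctuation from each line
- if a line contains words of the right length, the function must put them in the returned list with a space between each word (for example 'cat dog'); otherwise the function puts ''
I know there is the splitlines() method, but I don't understand how to use it (even after reading about it).
Here is an example of how the function should work:
function(6,7,'All in the golden afternoon\nFull leisurely we glide;\nFor both our oars, with little skill,\nBy little arms are plied.')
The function should separate the lines:
All in the golden afternoon
Full leisurely we glide;
For both our oars, with little skill,
By little arms are plied.
--> remove the punctuation and return:
['golden','','little','little']
I know I have to append the words to the returned list, but I don't know how to proceed.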
You can write a list comprehension like this:
[token for token in s.split(" ") if a <= len(token) <= b]
It returns all the words in the variable s (str) whose length in characters is between a (int) and b (int), inclusive. An example of how to use it:
s = 'All in the golden afternoon\nFull leisurely we glide;'
s += '\nFor both our oars, with little skill,\nBy little arms are plied.'
a = 6
b = 7
result = [token for token in s.split(" ") if a <= len(token) <= b]
The result is:
['golden', 'little', 'little', 'plied.']
To remove the punctuation, just add
import string
s = "".join([char for char in s if char not in string.punctuation])
above the last line (the one that builds result). The result is then:
['golden', 'little', 'little']
Hope this works for you!
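One detail worth checking in the snippet above is what `split(" ")` does with newlines. The following is a minimal sketch, not part of the original answer, comparing it with `split()` and showing `str.translate` as an alternative way to strip punctuation. The variable names `by_space`, `by_whitespace`, and `cleaned` are made up for illustration.

```python
import string

s = 'All in the golden afternoon\nFull leisurely we glide;'

# split(" ") breaks only on the space character, so the newline
# stays inside a token ('afternoon\nFull').
by_space = s.split(" ")
print(by_space)

# split() with no argument breaks on any run of whitespace,
# including newlines, so 'afternoon' and 'Full' become separate words.
by_whitespace = s.split()
print(by_whitespace)

# A translation table whose third argument lists characters to delete
# removes all ASCII punctuation in one pass.
table = str.maketrans("", "", string.punctuation)
cleaned = s.translate(table).split()
print(cleaned)
```

With `split(" ")`, a word that sits next to a line break is glued to its neighbour and measured as one long token, which can change which words fall inside the length range.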
Edit:
If you want to search the lines separately, I would suggest a solution like this:
import string

def split_by_line_and_find_words_with_length(a, b, s):
    # store result
    result = []
    # separate string lines
    lines = s.splitlines()
    for line in lines:
        # remove punctuation
        l = "".join([char for char in line if char not in string.punctuation])
        # find words with length between a and b
        find = [token for token in l.split(" ") if a <= len(token) <= b]
        # add empty string to result if no match
        if find == []:
            find.append("")
        # add any findings to result
        result += find
    return result
For your example string and preferred word lengths, this returns ['golden', '', 'little', 'little'].
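Since the question mentions not being sure how `splitlines()` works, here is a small sketch of its behaviour on its own. The sample `text` and the variable names are assumptions chosen for illustration.

```python
text = 'All in the golden afternoon\nFull leisurely we glide;\n'

# splitlines() returns one string per line, without the line endings,
# and does not produce a trailing empty string for the final newline.
lines = text.splitlines()
print(lines)

# For comparison, split('\n') keeps an empty string after the last newline.
naive = text.split('\n')
print(naive)

# keepends=True keeps the line ending attached to each line.
with_ends = text.splitlines(keepends=True)
print(with_ends)
```

Looping over `text.splitlines()` is therefore enough to process each line of the poem separately.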
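You were on the right track when you thought about ranges. Here is how I would write the function.
- Create a function with three parameters: start and stop for the range, and sentence for the target sentence.
- Inside the function, create a list named word_list.
- Iterate over each line of the sentence by splitting it with .splitlines().
- Filter out all punctuation from each line you iterate over.
- Then iterate over each word in the current line with a list comprehension, testing whether each word is within the given range, and assign the result to a list named tmp:
  tmp = [word for word in line.split() if start <= len(word) <= stop]
- If the length of tmp is greater than 1:
  - join the words in tmp with spaces and add the joined string to word_list.
- Otherwise, if tmp has exactly one element:
  - simply add it to word_list.
- Otherwise, if tmp is empty:
  - add an empty string to word_list.
- Return word_list.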
Using the steps above, here is how I would write your function:
# import the string module so we can use its punctuation attribute.
import string

# create a function with the parameters `start`, `stop` and `sentence`
# `start` and `stop` are for the range, and `sentence` is the
# target sentence to iterate over.
def group_words_by_length(start: int, stop: int, sentence: str) -> list:
    # create a list to hold words that
    # are in the given `start`-`stop` range
    word_list = []
    # iterate over each line in the sentence
    # using the string method `.splitlines()`
    # which splits the string at every new line
    for line in sentence.splitlines():
        # filter out punctuation from
        # every line.
        line = ''.join([char for char in line if char not in string.punctuation])
        # iterate over every word in each line
        # via list comprehension. Inside the list comprehension
        # we only add a word if it is in the given range.
        tmp = [word for word in line.split() if start <= len(word) <= stop]
        # if we found more than one valid word
        # in the current line...
        if len(tmp) > 1:
            # join each word in the
            # list by a space, and add
            # the joined string to the `word_list`.
            tmp = ' '.join(tmp)
            word_list.append(tmp)
        # if we found only
        # one valid word...
        elif len(tmp) == 1:
            # simply add the word
            # to the `word_list`.
            word_list.extend(tmp)
        # otherwise...
        else:
            # add an empty string to the
            # `word_list`.
            word_list.append("")
    # return the `word_list`
    return word_list

# testing of the function with
# your test string.
print(group_words_by_length(6, 7, 'All in the golden afternoon\nFull leisurely we glide;\nFor both our oars, with little skill,\nBy little arms are plied.'))
Output:
['golden', '', 'little', 'little']
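As a possible simplification, the three branches can be collapsed, because `' '.join(...)` already gives an empty string for an empty list and the word itself for a single-element list. The sketch below is a variant under that assumption, not the answerer's code. The names `words_in_range_per_line` and `_PUNCT_TABLE` are hypothetical.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def words_in_range_per_line(start, stop, text):
    """Return one string per line holding the words with start <= len <= stop."""
    result = []
    for line in text.splitlines():
        # Strip punctuation, then split on any whitespace.
        cleaned = line.translate(_PUNCT_TABLE)
        matches = [w for w in cleaned.split() if start <= len(w) <= stop]
        # join() yields '' for no matches, the word alone for one match,
        # and 'cat dog'-style output for several matches.
        result.append(" ".join(matches))
    return result

poem = ('All in the golden afternoon\nFull leisurely we glide;\n'
        'For both our oars, with little skill,\nBy little arms are plied.')
print(words_in_range_per_line(6, 7, poem))
# prints ['golden', '', 'little', 'little']
```

Each input line contributes exactly one element to the result, which matches the 'cat dog' format described in the question.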