如何找到包含所有元音的同一数组的两个元素
How to find two elements of the same array that contain all vowels
我想迭代给定的数组,例如:
["goat", "action", "tear", "impromptu", "tired", "europe"]
我想查看所有可能的配对。
所需的输出是一个新数组,其中包含所有对,组合起来包含所有元音。此外,这些对应连接为输出数组的一个元素:
["action europe", "tear impromptu"]
我尝试了以下代码,但收到一条错误消息:
No implicit conversion of nil into string.
def all_vowel_pairs(words)
pairs = []
(0..words.length).each do |i| # iterate through words
(0..words.length).each do |j| # for every word, iterate through words again
pot_pair = words[i].to_s + words[j] # build string from pair
if check_for_vowels(pot_pair) # throw string to helper-method.
pairs << words[i] + " " + words[j] # if gets back true, concatenade and push to output array "pairs"
end
end
end
pairs
end
# helper-method to check for if a string has all vowels in it
def check_for_vowels(string)
vowels = "aeiou"
founds = []
string.each_char do |char|
if vowels.include?(char) && !founds.include?(char)
founds << char
end
end
if founds.length == 5
return true
end
false
end
你说的是对,所以我认为它是两个元素的组合。我已经使用 #combination
method. Then I #select
-ed 将数组中的每两个元素组合在一起,只有那些在它们连接后包含所有元音的对。最后,我确保加入这些对:
["goat", "action", "tear", "impromptu", "tired", "europe"]
.combination(2)
.select { |c| c.join('') =~ /\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b/ }
.map{ |w| w.join(' ') }
#=> ["action europe", "tear impromptu"]
正则表达式来自“”。
与 Viktor 的开始类似,我会使用一个简单的测试来查看单词中存在哪些元音,并比较它们是否匹配 "aeiou"
在去除重复项并对其进行排序后:
def ttm1(ary)
ary.combination(2).select { |a|
a.join.scan(/[aeiou]/).uniq.sort.join == 'aeiou'
}.map { |a| a.join(' ') }
end
ttm1(words) # => ["action europe", "tear impromptu"]
分解它以便您了解发生了什么。
["goat", "action", "tear", "impromptu", "tired", "europe"] # => ["goat", "action", "tear", "impromptu", "tired", "europe"]
.combination(2)
.select { |a| a # => ["goat", "action"], ["goat", "tear"], ["goat", "impromptu"], ["goat", "tired"], ["goat", "europe"], ["action", "tear"], ["action", "impromptu"], ["action", "tired"], ["action", "europe"], ["tear", "impromptu"], ["tear", "tired"], ["tear", "europe"], ["impromptu", "tired"], ["impromptu", "europe"], ["tired", "europe"]
.join # => "goataction", "goattear", "goatimpromptu", "goattired", "goateurope", "actiontear", "actionimpromptu", "actiontired", "actioneurope", "tearimpromptu", "teartired", "teareurope", "impromptutired", "impromptueurope", "tiredeurope"
.scan(/[aeiou]/) # => ["o", "a", "a", "i", "o"], ["o", "a", "e", "a"], ["o", "a", "i", "o", "u"], ["o", "a", "i", "e"], ["o", "a", "e", "u", "o", "e"], ["a", "i", "o", "e", "a"], ["a", "i", "o", "i", "o", "u"], ["a", "i", "o", "i", "e"], ["a", "i", "o", "e", "u", "o", "e"], ["e", "a", "i", "o", "u"], ["e", "a", "i", "e"], ["e", "a", "e", "u", "o", "e"], ["i", "o", "u", "i", "e"], ["i", "o", "u", "e", "u", "o", "e"], ["i", "e", "e", "u", "o", "e"]
.uniq # => ["o", "a", "i"], ["o", "a", "e"], ["o", "a", "i", "u"], ["o", "a", "i", "e"], ["o", "a", "e", "u"], ["a", "i", "o", "e"], ["a", "i", "o", "u"], ["a", "i", "o", "e"], ["a", "i", "o", "e", "u"], ["e", "a", "i", "o", "u"], ["e", "a", "i"], ["e", "a", "u", "o"], ["i", "o", "u", "e"], ["i", "o", "u", "e"], ["i", "e", "u", "o"]
.sort # => ["a", "i", "o"], ["a", "e", "o"], ["a", "i", "o", "u"], ["a", "e", "i", "o"], ["a", "e", "o", "u"], ["a", "e", "i", "o"], ["a", "i", "o", "u"], ["a", "e", "i", "o"], ["a", "e", "i", "o", "u"], ["a", "e", "i", "o", "u"], ["a", "e", "i"], ["a", "e", "o", "u"], ["e", "i", "o", "u"], ["e", "i", "o", "u"], ["e", "i", "o", "u"]
.join == 'aeiou' # => false, false, false, false, false, false, false, false, true, true, false, false, false, false, false
} # => [["action", "europe"], ["tear", "impromptu"]]
看代码是不是所有的元音字母都存在,绕了一大圈。每次检查时,它都必须通过许多方法才能确定是否找到了所有元音;换句话说,它不能短路和失败,直到最后都不好。
此代码将:
def ttm2(ary)
ary.combination(2).select { |a|
str = a.join
str[/a/] && str[/e/] && str[/i/] && str[/o/] && str[/u/]
}.map { |a| a.join(' ') }
end
ttm2(words) # => ["action europe", "tear impromptu"]
但我不喜欢以这种方式使用正则表达式引擎,因为它比直接查找慢,这会导致:
def ttm3(ary)
ary.combination(2).select { |a|
str = a.join
str['a'] && str['e'] && str['i'] && str['o'] && str['u']
}.map { |a| a.join(' ') }
end
这是基准:
require 'fruity'
words = ["goat", "action", "tear", "impromptu", "tired", "europe"]
def viktor(ary)
ary.combination(2)
.select { |c| c.join('') =~ /\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b/ }
.map{ |w| w.join(' ') }
end
viktor(words) # => ["action europe", "tear impromptu"]
def ttm1(ary)
ary.combination(2).select { |a|
a.join.scan(/[aeiou]/).uniq.sort.join == 'aeiou'
}.map { |a| a.join(' ') }
end
ttm1(words) # => ["action europe", "tear impromptu"]
def ttm2(ary)
ary.combination(2).select { |a|
str = a.join
str[/a/] && str[/e/] && str[/i/] && str[/o/] && str[/u/]
}.map { |a| a.join(' ') }
end
ttm2(words) # => ["action europe", "tear impromptu"]
def ttm3(ary)
ary.combination(2).select { |a|
str = a.join
str['a'] && str['e'] && str['i'] && str['o'] && str['u']
}.map { |a| a.join(' ') }
end
ttm3(words) # => ["action europe", "tear impromptu"]
compare {
_viktor { viktor(words) }
_ttm1 { ttm1(words) }
_ttm2 { ttm2(words) }
_ttm3 { ttm3(words) }
}
结果:
# >> Running each test 256 times. Test will take about 1 second.
# >> _ttm3 is similar to _viktor
# >> _viktor is similar to _ttm2
# >> _ttm2 is faster than _ttm1 by 2x ± 0.1
现在,因为这看起来很像家庭作业,所以了解学校了解 Stack Overflow 并寻找寻求帮助的学生很重要,因此您可能不想重复使用此代码,尤其不是逐字的。
您的代码包含两个错误,其中之一导致了错误消息。
(0..words.length)
从 0 到 6 循环。 words[6]
但是不存在(数组是从零开始的),所以你得到 nil。用 (0..words.length-1)
替换(两次)应该可以解决这个问题。
您将两次获得每个正确的结果,一次是 "action europe"
,一次是 "europe action"
。这是由于循环太多,每个组合都要循环两次造成的。将第二个循环从 (0..words.length-1)
替换为 (i..words.length-1)
.
这种繁琐的索引簿记很无聊,而且经常导致错误。这就是为什么 Ruby 程序员通常更喜欢更简单的方法(如 combination
和其他答案一样),完全避免使用索引。
以下代码旨在提供一种在单词数量较大时构造所需数组的有效方法。请注意,与其他答案不同,它没有使用 Array#combination.
方法
说明部分的第一部分(下文)概述了该算法所采用的方法。然后填写详细信息。
代码
require 'set'
VOWELS = ["a", "e", "i", "o", "u"]
VOWELS_SET = VOWELS.to_set
def all_vowel_pairs(words)
h = words.each_with_object({}) {|w,h| (h[(w.chars & VOWELS).to_set] ||= []) << w}
h.each_with_object([]) do |(k,v),a|
vowels_needed = VOWELS_SET-k
h.each do |kk,vv|
next unless kk.superset?(vowels_needed)
v.each {|w1| vv.each {|w2| a << "%s %s" % [w1, w2] if w1 < w2}}
end
end
end
例子
words = ["goat", "action", "tear", "impromptu", "tired", "europe", "hear"]
all_vowel_pairs(words)
#=> ["action europe", "hear impromptu", "impromptu tear"]
说明
对于给定的示例,步骤如下。
VOWELS_SET = VOWELS.to_set
#=> #<Set: {"a", "e", "i", "o", "u"}>
h = words.each_with_object({}) {|w,h| (h[(w.chars & VOWELS).to_set] ||= []) << w}
#=> {#<Set: {"o", "a"}>=>["goat"],
# #<Set: {"a", "i", "o"}>=>["action"],
# #<Set: {"e", "a"}>=>["tear", "hear"],
# #<Set: {"i", "o", "u"}>=>["impromptu"],
# #<Set: {"i", "e"}>=>["tired"],
# #<Set: {"e", "u", "o"}>=>["europe"]}
可见h
的键是五个元音的子集。这些值是 words
(单词)的元素数组,其中包含键给出的元音,没有其他元音。因此,这些值共同构成 words
的分区。当字数很大时,人们会期望 h
有 31 个键 (2**5 - 1
)。
我们现在遍历 h
的键值对。对于每个,使用键 k
和值 v
,确定缺失的元音集 (vowels_needed
),然后我们循环遍历 [=] 的那些键值对 [kk, vv]
30=] 其中 kk
是 vowels_needed
的超集。然后将 v
和 vv
元素的所有组合添加到返回的数组中(经过调整以避免重复计算每对单词)。
继续,
enum = h.each_with_object([])
#=> #<Enumerator: {#<Set: {"o", "a"}>=>["goat"],
# #<Set: {"a", "i", "o"}>=>["action"],
# ...
# #<Set: {"e", "u", "o"}>=>["europe"]}:
# each_with_object([])>
第一个值由enum
生成并传递给块,块变量赋值:
(k,v), a = enum.next
#=> [[#<Set: {"o", "a"}>, ["goat"]], []]
参见 Enumerator#next。
各个变量的赋值方式为array decomposition:
k #=> #<Set: {"o", "a"}>
v #=> ["goat"]
a #=> []
现在执行块计算。
vowels_needed = VOWELS_SET-k
#=> #<Set: {"e", "i", "u"}>
h.each do |kk,vv|
next unless kk.superset?(vowels_needed)
v.each {|w1| vv.each {|w2| a << "%s %s" % [w1, w2] if w1 < w2}}
end
单词“goat”(v
)有元音“o”和“a”,因此只能匹配包含元音“e”、“i”和“u”的单词(也可能是“o”and/or“a”)。表达式
next unless kk.superset?(vowels_needed)
跳过 h
(kk
) 中那些不是 vowels_needed
超集的键。参见 Set#superset?。
None words
中的单词包含“e”、“i”和“u”,因此数组 a
不变。
下一个元素现在由 enum
生成,传递给块并为块变量赋值:
(k,v), a = enum.next
#=> [[#<Set: {"a", "i", "o"}>, ["action"]], []]
k #=> #<Set: {"a", "i", "o"}>
v #=> ["action"]
a #=> []
区块计算开始:
vowels_needed = VOWELS_SET-k
#=> #<Set: {"e", "u"}>
我们看到 h
只有一个键值对,其键是 vowels_needed
的超集:
kk = %w|e u o|.to_set
#=> #<Set: {"e", "u", "o"}>
vv = ["europe"]
因此我们执行:
v.each {|w1| vv.each {|w2| a << "%s %s" % [w1, w2] if w1 < w2}}
将一个元素添加到 a
:
a #=> ["action europe"]
子句if w1 < w2
是为了确保在后面的计算中"europe action"
不会添加到a
。
如果v
(包含'a'、'i'和'u'的单词)和vv
(包含'e'、[=150的单词=] 和 'o') 改为:
v #=> ["action", "notification"]
vv #=> ["europe", "route"]
我们会在 a
中添加 "action europe"
、"action route"
和 "notification route"
。 (”europe notification”
稍后会添加,当 k #=> #<Set: {"e", "u", "o"}
时。)
基准
我将我的方法与其他建议使用@theTinMan 的 Fruity 基准代码进行了基准测试。唯一的区别在于要测试的单词数组和将我的方法添加到基准测试中,我将其命名为 cary
。对于要考虑的单词数组,我从计算机上的英语单词文件中选择了位于 运行dom 的 600 个单词:
words = IO.readlines('/usr/share/dict/words', chomp: true).sample(600)
words.first 10
#=> ["posadaship", "explosively", "expensilation", "conservatively", "plaiting",
# "unpillared", "intertwinement", "nonsolidified", "uraemic", "underspend"]
发现此数组包含 46,436 对包含所有五个元音的单词。
结果如下图
compare {
_viktor { viktor(words) }
_ttm1 { ttm1(words) }
_ttm2 { ttm2(words) }
_ttm3 { ttm3(words) }
_cary { cary(words) }
}
Running each test once. Test will take about 44 seconds.
_cary is faster than _ttm3 by 5x ± 0.1
_ttm3 is faster than _viktor by 50.0% ± 1.0%
_viktor is faster than _ttm2 by 30.000000000000004% ± 1.0%
_ttm2 is faster than _ttm1 by 2.4x ± 0.1
然后我将 cary
与 ttm3
进行了 1,000 个 运行domly 选择的单词的比较。这个数组被发现包含 125,068 对包含所有五个元音的单词。结果如下:
Running each test once. Test will take about 19 seconds.
_cary is faster than _ttm3 by 3x ± 1.0
为了感受基准的可变性,我 运行 最后比较了两次,每次都有一个新的 运行dom 选择 1,000 个单词。这给了我以下结果:
Running each test once. Test will take about 17 seconds.
_cary is faster than _ttm3 by 5x ± 1.0
Running each test once. Test will take about 18 seconds.
_cary is faster than _ttm3 by 4x ± 1.0
可见样本之间存在相当大的差异。
我想迭代给定的数组,例如:
["goat", "action", "tear", "impromptu", "tired", "europe"]
我想查看所有可能的配对。
所需的输出是一个新数组,其中包含所有对,组合起来包含所有元音。此外,这些对应连接为输出数组的一个元素:
["action europe", "tear impromptu"]
我尝试了以下代码,但收到一条错误消息:
No implicit conversion of nil into string.
def all_vowel_pairs(words)
pairs = []
(0..words.length).each do |i| # iterate through words
(0..words.length).each do |j| # for every word, iterate through words again
pot_pair = words[i].to_s + words[j] # build string from pair
if check_for_vowels(pot_pair) # throw string to helper-method.
pairs << words[i] + " " + words[j] # if gets back true, concatenade and push to output array "pairs"
end
end
end
pairs
end
# helper-method to check for if a string has all vowels in it
def check_for_vowels(string)
vowels = "aeiou"
founds = []
string.each_char do |char|
if vowels.include?(char) && !founds.include?(char)
founds << char
end
end
if founds.length == 5
return true
end
false
end
你说的是对,所以我认为它是两个元素的组合。我已经使用 #combination
method. Then I #select
-ed 将数组中的每两个元素组合在一起,只有那些在它们连接后包含所有元音的对。最后,我确保加入这些对:
["goat", "action", "tear", "impromptu", "tired", "europe"]
.combination(2)
.select { |c| c.join('') =~ /\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b/ }
.map{ |w| w.join(' ') }
#=> ["action europe", "tear impromptu"]
正则表达式来自“
与 Viktor 的开始类似,我会使用一个简单的测试来查看单词中存在哪些元音,并比较它们是否匹配 "aeiou"
在去除重复项并对其进行排序后:
def ttm1(ary)
ary.combination(2).select { |a|
a.join.scan(/[aeiou]/).uniq.sort.join == 'aeiou'
}.map { |a| a.join(' ') }
end
ttm1(words) # => ["action europe", "tear impromptu"]
分解它以便您了解发生了什么。
["goat", "action", "tear", "impromptu", "tired", "europe"] # => ["goat", "action", "tear", "impromptu", "tired", "europe"]
.combination(2)
.select { |a| a # => ["goat", "action"], ["goat", "tear"], ["goat", "impromptu"], ["goat", "tired"], ["goat", "europe"], ["action", "tear"], ["action", "impromptu"], ["action", "tired"], ["action", "europe"], ["tear", "impromptu"], ["tear", "tired"], ["tear", "europe"], ["impromptu", "tired"], ["impromptu", "europe"], ["tired", "europe"]
.join # => "goataction", "goattear", "goatimpromptu", "goattired", "goateurope", "actiontear", "actionimpromptu", "actiontired", "actioneurope", "tearimpromptu", "teartired", "teareurope", "impromptutired", "impromptueurope", "tiredeurope"
.scan(/[aeiou]/) # => ["o", "a", "a", "i", "o"], ["o", "a", "e", "a"], ["o", "a", "i", "o", "u"], ["o", "a", "i", "e"], ["o", "a", "e", "u", "o", "e"], ["a", "i", "o", "e", "a"], ["a", "i", "o", "i", "o", "u"], ["a", "i", "o", "i", "e"], ["a", "i", "o", "e", "u", "o", "e"], ["e", "a", "i", "o", "u"], ["e", "a", "i", "e"], ["e", "a", "e", "u", "o", "e"], ["i", "o", "u", "i", "e"], ["i", "o", "u", "e", "u", "o", "e"], ["i", "e", "e", "u", "o", "e"]
.uniq # => ["o", "a", "i"], ["o", "a", "e"], ["o", "a", "i", "u"], ["o", "a", "i", "e"], ["o", "a", "e", "u"], ["a", "i", "o", "e"], ["a", "i", "o", "u"], ["a", "i", "o", "e"], ["a", "i", "o", "e", "u"], ["e", "a", "i", "o", "u"], ["e", "a", "i"], ["e", "a", "u", "o"], ["i", "o", "u", "e"], ["i", "o", "u", "e"], ["i", "e", "u", "o"]
.sort # => ["a", "i", "o"], ["a", "e", "o"], ["a", "i", "o", "u"], ["a", "e", "i", "o"], ["a", "e", "o", "u"], ["a", "e", "i", "o"], ["a", "i", "o", "u"], ["a", "e", "i", "o"], ["a", "e", "i", "o", "u"], ["a", "e", "i", "o", "u"], ["a", "e", "i"], ["a", "e", "o", "u"], ["e", "i", "o", "u"], ["e", "i", "o", "u"], ["e", "i", "o", "u"]
.join == 'aeiou' # => false, false, false, false, false, false, false, false, true, true, false, false, false, false, false
} # => [["action", "europe"], ["tear", "impromptu"]]
看代码是不是所有的元音字母都存在,绕了一大圈。每次检查时,它都必须通过许多方法才能确定是否找到了所有元音;换句话说,它不能短路和失败,直到最后都不好。
此代码将:
def ttm2(ary)
ary.combination(2).select { |a|
str = a.join
str[/a/] && str[/e/] && str[/i/] && str[/o/] && str[/u/]
}.map { |a| a.join(' ') }
end
ttm2(words) # => ["action europe", "tear impromptu"]
但我不喜欢以这种方式使用正则表达式引擎,因为它比直接查找慢,这会导致:
def ttm3(ary)
ary.combination(2).select { |a|
str = a.join
str['a'] && str['e'] && str['i'] && str['o'] && str['u']
}.map { |a| a.join(' ') }
end
这是基准:
require 'fruity'
words = ["goat", "action", "tear", "impromptu", "tired", "europe"]
def viktor(ary)
ary.combination(2)
.select { |c| c.join('') =~ /\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b/ }
.map{ |w| w.join(' ') }
end
viktor(words) # => ["action europe", "tear impromptu"]
def ttm1(ary)
ary.combination(2).select { |a|
a.join.scan(/[aeiou]/).uniq.sort.join == 'aeiou'
}.map { |a| a.join(' ') }
end
ttm1(words) # => ["action europe", "tear impromptu"]
def ttm2(ary)
ary.combination(2).select { |a|
str = a.join
str[/a/] && str[/e/] && str[/i/] && str[/o/] && str[/u/]
}.map { |a| a.join(' ') }
end
ttm2(words) # => ["action europe", "tear impromptu"]
def ttm3(ary)
ary.combination(2).select { |a|
str = a.join
str['a'] && str['e'] && str['i'] && str['o'] && str['u']
}.map { |a| a.join(' ') }
end
ttm3(words) # => ["action europe", "tear impromptu"]
compare {
_viktor { viktor(words) }
_ttm1 { ttm1(words) }
_ttm2 { ttm2(words) }
_ttm3 { ttm3(words) }
}
结果:
# >> Running each test 256 times. Test will take about 1 second.
# >> _ttm3 is similar to _viktor
# >> _viktor is similar to _ttm2
# >> _ttm2 is faster than _ttm1 by 2x ± 0.1
现在,因为这看起来很像家庭作业,所以了解学校了解 Stack Overflow 并寻找寻求帮助的学生很重要,因此您可能不想重复使用此代码,尤其不是逐字的。
您的代码包含两个错误,其中之一导致了错误消息。
(0..words.length)
从 0 到 6 循环。 words[6]
但是不存在(数组是从零开始的),所以你得到 nil。用 (0..words.length-1)
替换(两次)应该可以解决这个问题。
您将两次获得每个正确的结果,一次是 "action europe"
,一次是 "europe action"
。这是由于循环太多,每个组合都要循环两次造成的。将第二个循环从 (0..words.length-1)
替换为 (i..words.length-1)
.
这种繁琐的索引簿记很无聊,而且经常导致错误。这就是为什么 Ruby 程序员通常更喜欢更简单的方法(如 combination
和其他答案一样),完全避免使用索引。
以下代码旨在提供一种在单词数量较大时构造所需数组的有效方法。请注意,与其他答案不同,它没有使用 Array#combination.
方法说明部分的第一部分(下文)概述了该算法所采用的方法。然后填写详细信息。
代码
require 'set'
VOWELS = ["a", "e", "i", "o", "u"]
VOWELS_SET = VOWELS.to_set
def all_vowel_pairs(words)
h = words.each_with_object({}) {|w,h| (h[(w.chars & VOWELS).to_set] ||= []) << w}
h.each_with_object([]) do |(k,v),a|
vowels_needed = VOWELS_SET-k
h.each do |kk,vv|
next unless kk.superset?(vowels_needed)
v.each {|w1| vv.each {|w2| a << "%s %s" % [w1, w2] if w1 < w2}}
end
end
end
例子
words = ["goat", "action", "tear", "impromptu", "tired", "europe", "hear"]
all_vowel_pairs(words)
#=> ["action europe", "hear impromptu", "impromptu tear"]
说明
对于给定的示例,步骤如下。
VOWELS_SET = VOWELS.to_set
#=> #<Set: {"a", "e", "i", "o", "u"}>
h = words.each_with_object({}) {|w,h| (h[(w.chars & VOWELS).to_set] ||= []) << w}
#=> {#<Set: {"o", "a"}>=>["goat"],
# #<Set: {"a", "i", "o"}>=>["action"],
# #<Set: {"e", "a"}>=>["tear", "hear"],
# #<Set: {"i", "o", "u"}>=>["impromptu"],
# #<Set: {"i", "e"}>=>["tired"],
# #<Set: {"e", "u", "o"}>=>["europe"]}
可见h
的键是五个元音的子集。这些值是 words
(单词)的元素数组,其中包含键给出的元音,没有其他元音。因此,这些值共同构成 words
的分区。当字数很大时,人们会期望 h
有 31 个键 (2**5 - 1
)。
我们现在遍历 h
的键值对。对于每个,使用键 k
和值 v
,确定缺失的元音集 (vowels_needed
),然后我们循环遍历 [=] 的那些键值对 [kk, vv]
30=] 其中 kk
是 vowels_needed
的超集。然后将 v
和 vv
元素的所有组合添加到返回的数组中(经过调整以避免重复计算每对单词)。
继续,
enum = h.each_with_object([])
#=> #<Enumerator: {#<Set: {"o", "a"}>=>["goat"],
# #<Set: {"a", "i", "o"}>=>["action"],
# ...
# #<Set: {"e", "u", "o"}>=>["europe"]}:
# each_with_object([])>
第一个值由enum
生成并传递给块,块变量赋值:
(k,v), a = enum.next
#=> [[#<Set: {"o", "a"}>, ["goat"]], []]
参见 Enumerator#next。
各个变量的赋值方式为array decomposition:
k #=> #<Set: {"o", "a"}>
v #=> ["goat"]
a #=> []
现在执行块计算。
vowels_needed = VOWELS_SET-k
#=> #<Set: {"e", "i", "u"}>
h.each do |kk,vv|
next unless kk.superset?(vowels_needed)
v.each {|w1| vv.each {|w2| a << "%s %s" % [w1, w2] if w1 < w2}}
end
单词“goat”(v
)有元音“o”和“a”,因此只能匹配包含元音“e”、“i”和“u”的单词(也可能是“o”and/or“a”)。表达式
next unless kk.superset?(vowels_needed)
跳过 h
(kk
) 中那些不是 vowels_needed
超集的键。参见 Set#superset?。
None words
中的单词包含“e”、“i”和“u”,因此数组 a
不变。
下一个元素现在由 enum
生成,传递给块并为块变量赋值:
(k,v), a = enum.next
#=> [[#<Set: {"a", "i", "o"}>, ["action"]], []]
k #=> #<Set: {"a", "i", "o"}>
v #=> ["action"]
a #=> []
区块计算开始:
vowels_needed = VOWELS_SET-k
#=> #<Set: {"e", "u"}>
我们看到 h
只有一个键值对,其键是 vowels_needed
的超集:
kk = %w|e u o|.to_set
#=> #<Set: {"e", "u", "o"}>
vv = ["europe"]
因此我们执行:
v.each {|w1| vv.each {|w2| a << "%s %s" % [w1, w2] if w1 < w2}}
将一个元素添加到 a
:
a #=> ["action europe"]
子句if w1 < w2
是为了确保在后面的计算中"europe action"
不会添加到a
。
如果v
(包含'a'、'i'和'u'的单词)和vv
(包含'e'、[=150的单词=] 和 'o') 改为:
v #=> ["action", "notification"]
vv #=> ["europe", "route"]
我们会在 a
中添加 "action europe"
、"action route"
和 "notification route"
。 (”europe notification”
稍后会添加,当 k #=> #<Set: {"e", "u", "o"}
时。)
基准
我将我的方法与其他建议使用@theTinMan 的 Fruity 基准代码进行了基准测试。唯一的区别在于要测试的单词数组和将我的方法添加到基准测试中,我将其命名为 cary
。对于要考虑的单词数组,我从计算机上的英语单词文件中选择了位于 运行dom 的 600 个单词:
words = IO.readlines('/usr/share/dict/words', chomp: true).sample(600)
words.first 10
#=> ["posadaship", "explosively", "expensilation", "conservatively", "plaiting",
# "unpillared", "intertwinement", "nonsolidified", "uraemic", "underspend"]
发现此数组包含 46,436 对包含所有五个元音的单词。
结果如下图
compare {
_viktor { viktor(words) }
_ttm1 { ttm1(words) }
_ttm2 { ttm2(words) }
_ttm3 { ttm3(words) }
_cary { cary(words) }
}
Running each test once. Test will take about 44 seconds.
_cary is faster than _ttm3 by 5x ± 0.1
_ttm3 is faster than _viktor by 50.0% ± 1.0%
_viktor is faster than _ttm2 by 30.000000000000004% ± 1.0%
_ttm2 is faster than _ttm1 by 2.4x ± 0.1
然后我将 cary
与 ttm3
进行了 1,000 个 运行domly 选择的单词的比较。这个数组被发现包含 125,068 对包含所有五个元音的单词。结果如下:
Running each test once. Test will take about 19 seconds.
_cary is faster than _ttm3 by 3x ± 1.0
为了感受基准的可变性,我 运行 最后比较了两次,每次都有一个新的 运行dom 选择 1,000 个单词。这给了我以下结果:
Running each test once. Test will take about 17 seconds.
_cary is faster than _ttm3 by 5x ± 1.0
Running each test once. Test will take about 18 seconds.
_cary is faster than _ttm3 by 4x ± 1.0
可见样本之间存在相当大的差异。