如何使用`gsub`用相同的字符串替换多个子字符串

Question

我想将不同的 charaters/substrings 更改为单个字符或 nil。我想将 "How to chop an onion?" 更改为 "how-chop-onion"。

string
.gsub(/'s/,'')
.gsub(/[?&]/,'')
.gsub('to|an|a|the','')
.split(' ')
.map { |s| s.downcase}
.join '-'

使用竖线字符 | 无效。我如何使用 gsub 执行此操作？

Answer 1

to|an|a|the 是模式，您将其用作字符串。这里：

str.gsub('to|an|a|the', '')   # passing string argument
#=> "How to chop an onion?"

str.gsub(/to|an|a|the/, '')   # passing pattern argument
#=> "How  chop  onion?"

Answer 2

▶ "How to chop an onion?".gsub(/'s|[?&]+|to|an|a|the/,'')
                         .downcase.split(/\s+/).join '-'
#⇒ "how-chop-onion"

Answer 3

首先列出您想要做的事情：

删除某些字词
删除某些标点符号
删除单词后删除多余的空格
转换为小写¹

现在考虑执行这些操作的顺序。转换成小写字母可以随时完成，但先做比较方便，在这种情况下正则表达式不需要区分大小写。某些单词之前的标点符号应该被删除，以便更容易识别单词而不是子字符串。删除多余的空格显然必须在删除单词后完成。因此我们希望顺序为：

转换为小写
删除某些标点符号
删除某些字词
删除单词后删除多余的空格

缩小外壳后，这可以用三个链式 gsubs 来完成：

str = "Please, don't any of you know how to chop an avacado?"

r1 = /[,?]/      # match a comma or question mark

r2 = /
     \b          # match a word break
     (?:         # start a non-capture group
     to|an|a|the # match one of these words (checking left to right)
     )           # end non-capture group
     \b          # match a word break
     /x          # extended/free-spacing regex definition mode

r3 = /\s\s/      # match two whitespace characters

str.downcase.gsub(r1,'').gsub(r2,'').gsub(r3,' ')
  #=> "please don't any of you know how chop avacado"

请注意，如果 r2 中没有单词 breaks (\b)，我们将得到：

"plese don't y of you know how chop vcdo"

此外，第一个 gsub 可以替换为：

tr(',?','')

或：

delete(',?')

这些gsub可以合二为一（我怎么写），如下：

r = /
    [,?]                # as in r1
    |                   # or
    \b(?:to|an|a|the)\b # as in r2
    |                   # or
    \s                  # match a whitespace char
    (?=\s)              # match a whitespace char in a postive lookahead
    /x

str.downcase.gsub(r,'')
  #=> "please don't any of you know how chop avacado"

"Lookarounds"（此处为正前瞻）通常称为 "zero-width"，这意味着虽然需要匹配，但它们不构成返回匹配的一部分。

^{1 您有没有想过 "lower case" 和 "upper case" 这两个词是从哪里来的？在印刷术的早期，排字工将金属活字印刷在两个盒子里，一个位于另一个上方。用于句子和专有名词开头的较高字母的字母大写；剩下的都是小写的}

如何使用`gsub`用相同的字符串替换多个子字符串

How to replace multiple substrings with same string using `gsub`

ruby

string

replace

gsub