How do I censor certain words and replace them with "CENSORED" in Ruby?
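I'm trying to censor certain words in an array of strings in Ruby, but I've been having a hard time. I've managed to censor some words, but when I try to censor all the banned phrases, either the punctuation is lost or words with punctuation attached aren't censored.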
test_tweets = [
"This president sucks!",
"I hate this Blank House!",
"I can't believe we're living under such bad leadership. We were so foolish",
"President Presidentname is a danger to society. I hate that he's so bad -- it sucks."
]
banned_phrases = ["sucks", "bad", "hate", "foolish", "danger to society"]
censored_tweets = []
index = 0
test_tweets.each do |tweet|
  censored_tweets[index] = [] if censored_tweets[index] == nil
  tweet.split(/\W+/).each do |word|
    banned_phrases.include?(word) ?
      censored_tweets[index].push("CENSORED") : censored_tweets[index].push(word)
  end
  censored_tweets[index] = censored_tweets[index].join(" ")
  index += 1
end
puts censored_tweets
This approach censors the banned words, but it strips the punctuation. Can anyone help? This has felt like an impossible task.
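A regular expression is a good fit for this. Regular expressions are very powerful but not always easy to write. This one, however, is easy to build: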
test_tweets = [
"This president sucks!",
"I hate this Blank House!",
"I can't believe we're living under such bad leadership. We were so foolish",
"President Presidentname is a danger to society. I hate that he's so bad -- it sucks."
]
banned_phrases = ["sucks", "bad", "hate", "foolish", "danger to society"]
regex_banned = Regexp.union(banned_phrases) # that's all there is to it
censored_tweets = test_tweets.map { |tweet| tweet.gsub(regex_banned, "CENSORED") }
puts censored_tweets
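One caveat with the plain Regexp.union pattern: it matches substrings, so "bad" would also be replaced inside a word like "badge", and it is case-sensitive, so "BAD" would slip through. If that matters for your data, a possible variant is sketched below. It assumes you want whole-word, case-insensitive matching; the constant names and the censor helper are made up for illustration and are not part of the original answer.

```ruby
BANNED_PHRASES = ["sucks", "bad", "hate", "foolish", "danger to society"]

# Regexp.union escapes each phrase and joins them with "|".
# Taking .source (instead of interpolating the Regexp itself) lets the
# outer /i flag apply, and \b on both sides limits matches to whole words.
BANNED_PATTERN = /\b(?:#{Regexp.union(BANNED_PHRASES).source})\b/i

# Hypothetical helper: replaces every banned phrase in text with "CENSORED",
# leaving the surrounding punctuation untouched.
def censor(text, pattern = BANNED_PATTERN)
  text.gsub(pattern, "CENSORED")
end

puts censor("This president sucks!")
puts censor("A badge is not BAD.")
```

Because gsub only rewrites the matched text, punctuation such as "!" or "--" stays exactly where it was in the original string.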