How to break long text into smaller lines by words in Ruby/Rails?

Question

How can I break long text into smaller lines by words? Ideally, I need a method like this:

def text_splitter(text, line_size = 5)
    # ...
end

text_splitter("a b c d e text longword") # => ["a b c", "d e ", "text ", "longword"]

Answer 1

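Rails comes with a word_wrap helper that splits long lines based on a given line width. It always splits at whitespace, so words longer than the line width are kept whole rather than cut.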

In the Rails console:

lines = helper.word_wrap("a b c d e text longword", line_width: 5)
#=> "a b c\nd e\ntext\nlongword"

puts lines

Output:

a b c
d e
text
longword

Note that it returns a string, not an array.
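If an array is needed, as in the question, the returned string can be split on newlines. A minimal sketch follows; it uses the string literal from the console output above in place of the helper.word_wrap call so that it runs without Rails, and the variable names are only illustrative:

```ruby
# Stand-in for the value returned by helper.word_wrap(..., line_width: 5);
# the literal is copied from the console session above.
wrapped = "a b c\nd e\ntext\nlongword"

# Split on the newline separators to get one array element per line.
lines = wrapped.split("\n")

p lines
# => ["a b c", "d e", "text", "longword"]
```

Unlike the array sketched in the question, the elements carry no trailing spaces.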

Answer 2

This can be done in pure Ruby as follows¹.

def text_splitter(text, line_size)
  text.gsub(/(?:.{1,#{line_size}}|\S+)\K(?:$|\s)/, "\n")
end

text = "Beware the Jabberwock, my son! The jaws that bite, the claws that catch!"

puts text_splitter(text, 30)
0        1         2         3
123456789012345678901234567890
Beware the Jabberwock, my son!
The jaws that bite, the claws
that catch!

puts text_splitter(text, 20)
0        1         2
12345678901234567890
Beware the
Jabberwock, my son!
The jaws that bite,
the claws that
catch!

puts text_splitter(text, 10)
0        1
1234567890
Beware the
Jabberwock,
my son!
The jaws
that bite,
the claws
that
catch!

puts text_splitter(text, 8)
0
12345678
Beware
the
Jabberwock,
my son!
The jaws
that
bite,
the
claws
that
catch!

The regular expression can be broken down as follows (for line_size = 10):

(?:         # begin non-capture group
  .{1,10}   # match 1-10 chars
  |         # or 
  \S+       # match >= 1 non-whitespace chars
)           # end non-capture group
\K          # reset start of match and discard all chars previously matched
(?:$|\s)    # match the end of a line or a whitespace char
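The regex packs as many characters as fit within line_size before a break and lets a word longer than line_size take a line of its own. The same greedy behavior can be sketched without a regex, returning an array as the question asks. The method name split_into_lines is hypothetical and not part of either answer. One difference from the regex version: runs of whitespace inside a line are collapsed to a single space.

```ruby
# Hypothetical helper (not from the answers above): greedy word packing.
# Returns an array of lines, each at most line_size characters long unless
# a single word is itself longer than line_size.
def split_into_lines(text, line_size)
  lines = []
  current = nil # the line being built, or nil before the first word

  text.split.each do |word|
    if current.nil?
      current = word
    elsif current.length + 1 + word.length <= line_size
      # The word fits on the current line together with a separating space.
      current = "#{current} #{word}"
    else
      # The word does not fit: close the current line and start a new one.
      lines << current
      current = word
    end
  end

  lines << current if current
  lines
end

p split_into_lines("a b c d e text longword", 5)
# => ["a b c", "d e", "text", "longword"]
```

For the Jabberwocky sample with a line size of 10, this sketch produces the same lines as the regex version shown earlier.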

¹ The sample text is from Lewis Carroll's poem "Jabberwocky".

ruby

ruby-on-rails