How to grep for a pattern in a file and store the content following it?
My file contains:
blablabla
Name : 'XYZ'
Age : '30'
Place : 'ABCD'
blablabla
How do I grep for "Name", "Age", and "Place" and store the name "XYZ", the age "30", and the place "ABCD" in a hash?
What should the '?' be in this code to get those values?
data = {}
name = /Name/
age = /Age/
place = /Place/
read_lines(file) { |l|
  case l
  when name
    data[:name] = ?
  when age
    data[:age] = ?
  when place
    data[:place]= ?
  end
}
You can use something like this.
data = {}
keys = {:name => "Name", :age => "Age", :place => "Place"}
File.open("test.txt", "r") do |f|
  f.each_line do |line|
    line.chomp!
    keys.each do |hash_key, string|
      if line[/#{string}/]
        data[hash_key] = line.strip.split(" : ")[-1].gsub("'", "")
        break
      end
    end
  end
end
Output
p data
# => {:name=>"XYZ", :age=>"30", :place=>"ABCD"}
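One thing to watch with the loop above: the pattern can match anywhere in the line, so a line such as "FirstName : 'ignored'" would also be picked up as :name. The following is a minimal sketch that anchors the label at the start of the line and pulls the value out with a capture group. The names KEYS and parse_file, the temporary file, and the extra "FirstName" line are assumptions made up for illustration; they are not part of the original question.

```ruby
require "tempfile"

# Hypothetical mapping from hash key to the label expected in the file.
KEYS = { name: "Name", age: "Age", place: "Place" }

# Reads the file line by line and returns a hash of the values found.
def parse_file(path)
  data = {}
  File.foreach(path) do |line|
    KEYS.each do |hash_key, label|
      # \A anchors the label at the start of the line; group 1 is the quoted value.
      m = line.match(/\A\s*#{Regexp.escape(label)}\s*:\s*'([^']*)'/)
      next unless m
      data[hash_key] = m[1]
      break
    end
  end
  data
end

# Sample input written to a temporary file so the sketch runs on its own.
Tempfile.create("sample") do |f|
  f.write("blablabla\nName : 'XYZ'\nAge : '30'\nPlace : 'ABCD'\nFirstName : 'ignored'\n")
  f.flush
  p parse_file(f.path)
end
```

With the anchored pattern, the "FirstName" line is skipped and :name keeps the value "XYZ".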
Strange code, but in this case:
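# strip and delete("'") drop the leading space and the quotes around the value
data[:name] = l.split(':')[1].strip.delete("'") if l.match(name)
when age
data[:age] = l.split(':')[1].strip.delete("'") if l.match(age)
when place
data[:place]= l.split(':')[1].strip.delete("'") if l.match(place)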
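As a rough sketch of another way to fill in the '?', the case/when shape from the question can be kept while each pattern carries a capture group; Regexp#=== sets $1, so the quoted value is available directly in the when branch. The sample lines are inlined here instead of being read from a file, which is an assumption made so the sketch runs on its own.

```ruby
# Sample input inlined so the sketch runs without a file on disk.
lines = [
  "blablabla",
  "Name : 'XYZ'",
  "Age : '30'",
  "Place : 'ABCD'",
  "blablabla"
]

data = {}
lines.each do |l|
  case l
  # Each pattern captures the text between the single quotes in group 1.
  when /Name\s*:\s*'([^']*)'/  then data[:name]  = $1
  when /Age\s*:\s*'([^']*)'/   then data[:age]   = $1
  when /Place\s*:\s*'([^']*)'/ then data[:place] = $1
  end
end

p data
```

Because the value comes from the capture group, no extra stripping of spaces or quotes is needed.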
Are you interested in refactoring?
One option is:
mapping =
  [
    { name: :name, pattern: /Name/ },
    { name: :age, pattern: /Age/ },
    { name: :place, pattern: /Place/ }
  ]
data = str.split(/\r?\n|\r/).map do |line|
  mapping.map{|pair|
    { pair[:name] => line.split(' : ')[1].gsub("'", "") } if line.match(pair[:pattern])
  }.compact.reduce({}, :merge)
end.reduce({}, :merge)
Let's assume we first read the file into a string:
str = File.read('fname')
That is:
str =<<_
blablabla
Name : 'XYZ'
Age : '30'
Place : 'ABCD'
blablabla
_
#=> "blablabla\nName : 'XYZ'\nAge : '30'\nPlace : 'ABCD'\nblablabla\n"
Then use the regular expression
r = /
    ^                     # match beginning of line
    Name\s*:\s*'(.*)'\n   # match 'Name', ':' possibly surrounded by spaces, any number
                          # of any character in capture group 1, end of line
    Age\s*:\s*'(.*)'\n    # match 'Age', ':' possibly surrounded by spaces, any number
                          # of any character in capture group 2, end of line
    Place\s*:\s*'(.*)'\n  # match 'Place', ':' possibly surrounded by spaces, any number
                          # of any character in capture group 3, end of line
    /x                    # free-spacing regex definition mode
with String#scan to form the hash:
[:name, :age, :place].zip(str.scan(r).first).to_h
#=> {:name=>"XYZ", :age=>"30", :place=>"ABCD"}
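Note that this regular expression only matches when the three lines are adjacent and in that order, and that str.scan(r).first is nil when nothing matches, so the zip call would raise. A minimal sketch of a guarded variant follows, using named capture groups with String#match. The method name extract and the choice to return an empty hash on no match are assumptions for illustration.

```ruby
str = "blablabla\nName : 'XYZ'\nAge : '30'\nPlace : 'ABCD'\nblablabla\n"

# Same pattern as above, but with named groups so the keys come from the regex.
r = /
  ^Name\s*:\s*'(?<name>.*)'\n
  Age\s*:\s*'(?<age>.*)'\n
  Place\s*:\s*'(?<place>.*)'\n
/x

# Returns a hash of symbol keys to captured values, or {} when nothing matches.
def extract(str, regex)
  m = str.match(regex)
  return {} unless m
  m.names.map(&:to_sym).zip(m.captures).to_h
end

result = extract(str, r)
no_match = extract("nothing useful here\n", r)

p result
p no_match
```

Here result holds the three values keyed by :name, :age, and :place, while no_match is an empty hash instead of an exception.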
I'd do it like this:
str = <<EOT
blablabla
Name : 'XYZ'
Age : '30'
Place : 'ABCD'
blablabla
EOT
str.scan(/(Name|Age|Place)\s+:\s'([^']+)/).to_h # => {"Name"=>"XYZ", "Age"=>"30", "Place"=>"ABCD"}
scan will create sub-arrays if it sees pattern groups in the regular expression. Those make it easy to turn the returned array of arrays into a hash.
If you need to fold the keys to lowercase, or convert them to symbols:
str.scan(/(Name|Age|Place)\s+:\s'([^']+)/)
  .map{ |k, v| [k.downcase, v] } # => [["name", "XYZ"], ["age", "30"], ["place", "ABCD"]]
  .to_h # => {"name"=>"XYZ", "age"=>"30", "place"=>"ABCD"}
Or:
str.scan(/(Name|Age|Place)\s+:\s'([^']+)/)
  .map{ |k, v| [k.downcase.to_sym, v] } # => [[:name, "XYZ"], [:age, "30"], [:place, "ABCD"]]
  .to_h # => {:name=>"XYZ", :age=>"30", :place=>"ABCD"}
Or some variation:
str.scan(/(Name|Age|Place)\s+:\s'([^']+)/)
  .each_with_object({}){ |(k,v), h| h[k.downcase.to_sym] = v}
# => {:name=>"XYZ", :age=>"30", :place=>"ABCD"}
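These will work if the example string is truly the complete file and the key/value pairs never occur again. If there could be more than one set, the resulting hash will not be correct, because the subsequent pairs will stomp on the first ones. If the file is as you said, then it'll work fine.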
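To illustrate the overwriting problem and one possible way around it, here is a minimal sketch that groups the scanned pairs into one hash per record. It assumes that every record starts with a "Name" line; the second record in the sample string is made-up data, not something from the original file.

```ruby
# Sample input with two records; the second one is invented for illustration.
str = <<~EOT
  blablabla
  Name : 'XYZ'
  Age : '30'
  Place : 'ABCD'
  blablabla
  Name : 'PQR'
  Age : '41'
  Place : 'EFGH'
EOT

pairs = str.scan(/(Name|Age|Place)\s+:\s'([^']+)/)

# A single to_h keeps only the last value seen for each key.
flattened = pairs.to_h

# Start a new group at every "Name" pair, then turn each group into its own hash.
records = pairs
  .slice_before { |key, _value| key == "Name" }
  .map { |group| group.map { |k, v| [k.downcase.to_sym, v] }.to_h }

p flattened
p records
```

In this sketch flattened ends up holding only the values of the second record, while records is an array with one hash per record.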