Splitting the Rows of a MySQL Query Using Ruby and Writing to a CSV File
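Using Ruby to automate a MySQL query against a remote database, I would like to split rows based on the value of the month query found below.
This is to generate a week-by-week (Wednesday through the following Tuesday) report for June 2014 for all clients, based on their start dates. Nothing else in the report changes, but the duplication of rows depends on that start date (explained in the case statement below).
Note the use of the mysql2, watir, and csv gems here.
Simplified code: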
#!/usr/local/bin/ruby
require "mysql2"
require "watir"
require "csv"
puts "Initializing Report"
Mysql2::Client.default_query_options.merge!(:as => :array)
mysql = Mysql2::Client.new(:host => "1.2.3.4", :username => "user", :pass => "password", :database => "db")
puts "Successfully accessed db"
month = mysql.query("SELECT DATE_FORMAT(db.table.start, '%m') FROM db.table WHERE db.start.group = 1;")
day = mysql.query("SELECT DATE_FORMAT(db.table.start, '%d') FROM db.table WHERE db.start.group = 1;")
report = mysql.query("SELECT db.table.client, SELECT DATE_FORMAT(db.table.start, '%m/%d/%Y'), SELECT DATE_FORMAT(db.table.end, '%m/%d/%Y') FROM db.table WHERE db.start.group = 1;")
case month
when 5
  # code splitting one row into four
when 6
  if day <= 4
    # code splitting one row into four using weekOf
  elsif day >= 11 and day <= 17
    # code splitting one row into three using weekOf
  elsif day >= 18 and day <= 24
    # code splitting one row into two using weekOf
  else
    # no splitting; only one row using weekOf
  end
end
CSV.open("Report.csv", "wb") do |csv|
  csv << ["Week of", "Client", "Start Date", "End Date"]
  weekOf.zip(report).each {|row| csv << row.flatten}
end
puts "Results can be found in Report.csv"
Current output (if I comment out the case statement, remove "Week of" from the CSV header, and write only the report query to the CSV):
Client, Start Date, End Date
companyrecordlabel, 05/20/2014, 07/09/2015
beeUrself, 05/27/2014, 02/01/2016
overflowStack, 06/04/2014, 12/11/2015
chapoChaps, 06/11/2014, 01/16/2016
Meds4U, 06/18/2014, NULL
.
.
.
I would like the following output:
Week of, Client, Start Date, End Date
06/04/2014, companyrecordlabel, 05/20/2014, 07/09/2015
06/11/2014, companyrecordlabel, 05/20/2014, 07/09/2015
06/18/2014, companyrecordlabel, 05/20/2014, 07/09/2015
06/25/2014, companyrecordlabel, 05/20/2014, 07/09/2015
06/04/2014, beeUrself, 05/27/2014, 02/01/2016
06/11/2014, beeUrself, 05/27/2014, 02/01/2016
06/18/2014, beeUrself, 05/27/2014, 02/01/2016
06/25/2014, beeUrself, 05/27/2014, 02/01/2016
06/04/2014, overflowStack, 06/04/2014, 12/11/2015
06/11/2014, overflowStack, 06/04/2014, 12/11/2015
06/18/2014, overflowStack, 06/04/2014, 12/11/2015
06/25/2014, overflowStack, 06/04/2014, 12/11/2015
06/11/2014, chapoChaps, 06/11/2014, 01/16/2016
06/18/2014, chapoChaps, 06/11/2014, 01/16/2016
06/25/2014, chapoChaps, 06/11/2014, 01/16/2016
06/18/2014, Meds4U, 06/18/2014, NULL
06/25/2014, Meds4U, 06/18/2014, NULL
.
.
.
For clarity: "Client" companyrecordlabel has four rows because its "Start Date" is in May, whereas "Client" Meds4U is split into only two rows because its "Start Date" is June 18.
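I built the FULL code for the answer below on a few assumptions:
- There are no rows where DATE_FORMAT(db.table.end, '%m') = 6
- You want all of the listed companies in the order they are stored in (i.e. by db.table.id)
- Query time is not a big concern for you
- You wanted, but could not or forgot to, include an array named weekOf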
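You also seem to use the keyword SELECT several times within a single query; one SELECT per statement is enough, with the columns separated by commas. Even for queries as small as the ones in your example, you may want to break them up rather than putting everything on one line: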
month = mysql.query("SELECT DATE_FORMAT(db.table.start, '%m')
                     FROM db.table
                     WHERE db.start.group = 1;")
instead of:
month = mysql.query("SELECT DATE_FORMAT(db.table.start, '%m') FROM db.table WHERE db.start.group = 1;")
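As a small aside on the multi-line layout above, a squiggly heredoc is another way to keep the SQL readable. This is only a sketch: the constant name MONTH_QUERY is hypothetical, the SQL text is copied from the question, and `<<~` assumes Ruby 2.3 or newer.

```ruby
# MONTH_QUERY is a hypothetical name; the SQL itself comes from the question.
# The squiggly heredoc (<<~) strips the common leading indentation.
MONTH_QUERY = <<~SQL
  SELECT DATE_FORMAT(db.table.start, '%m')
  FROM db.table
  WHERE db.start.group = 1;
SQL

# With a live connection this string would be passed as mysql.query(MONTH_QUERY).
puts MONTH_QUERY
```

Keeping the query in a named constant also makes it easier to reuse or log.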
Now for the code itself:
#!/usr/local/bin/ruby
require "mysql2"
require "watir"
require "csv"
puts "Initializing Report"
Mysql2::Client.default_query_options.merge!(:as => :array)
mysql = Mysql2::Client.new(:host => "1.2.3.4", :username => "user", :pass => "password", :database => "db")
puts "Successfully accessed db"
# One [month, day] pair per client; DATE_FORMAT returns both values as strings
date = mysql.query("SELECT DATE_FORMAT(db.table.start, '%m'),
                           DATE_FORMAT(db.table.start, '%d')
                    FROM db.table
                    WHERE db.start.group = 1;")
# The base query is repeated four times with UNION ALL; each repeat adds one more copy of a row
report = mysql.query("SELECT c, s, e FROM (SELECT * FROM (SELECT db.table.id,
                             db.table.client AS c,
                             DATE_FORMAT(db.table.start, '%m/%d/%Y') AS s,
                             DATE_FORMAT(db.table.end, '%m/%d/%Y') AS e
                      FROM db.table
                      WHERE db.start.group = 1
                      UNION ALL
                      SELECT db.table.id,
                             db.table.client AS c,
                             DATE_FORMAT(db.table.start, '%m/%d/%Y') AS s,
                             DATE_FORMAT(db.table.end, '%m/%d/%Y') AS e
                      FROM db.table
                      WHERE db.start.group = 1
                      HAVING ((DATE_FORMAT(db.table.start, '%m') = 5) OR (DATE_FORMAT(db.table.start, '%d') <= 4))
                      UNION ALL
                      SELECT db.table.id,
                             db.table.client AS c,
                             DATE_FORMAT(db.table.start, '%m/%d/%Y') AS s,
                             DATE_FORMAT(db.table.end, '%m/%d/%Y') AS e
                      FROM db.table
                      WHERE db.start.group = 1
                      HAVING ((DATE_FORMAT(db.table.start, '%m') = 5) OR (DATE_FORMAT(db.table.start, '%d') <= 11))
                      UNION ALL
                      SELECT db.table.id,
                             db.table.client AS c,
                             DATE_FORMAT(db.table.start, '%m/%d/%Y') AS s,
                             DATE_FORMAT(db.table.end, '%m/%d/%Y') AS e
                      FROM db.table
                      WHERE db.start.group = 1
                      HAVING ((DATE_FORMAT(db.table.start, '%m') = 5) OR (DATE_FORMAT(db.table.start, '%d') <= 18))) AS alias
                      ORDER BY db.table.id) AS alias2;")
# Build the "Week of" column; the number of entries per client has to match
# the number of copies the query above produces for that client
weekOf = []
date.each do |mon, day|
  # mon and day arrive as strings such as "05", so convert before comparing
  if mon.to_i == 5
    weekOf << "06/04/2014"
    weekOf << "06/11/2014"
    weekOf << "06/18/2014"
    weekOf << "06/25/2014"
  elsif mon.to_i == 6
    if (day.to_i <= 4)
      weekOf << "06/04/2014"
      weekOf << "06/11/2014"
      weekOf << "06/18/2014"
      weekOf << "06/25/2014"
    elsif ((day.to_i >= 11) && (day.to_i <= 17))
      weekOf << "06/11/2014"
      weekOf << "06/18/2014"
      weekOf << "06/25/2014"
    elsif ((day.to_i >= 18) && (day.to_i <= 24))
      weekOf << "06/18/2014"
      weekOf << "06/25/2014"
    else
      weekOf << "06/25/2014"
    end
  else
    puts "Error: #{mon} is before May"
  end
end
CSV.open("Report.csv", "wb") do |csv|
  csv << ["Week of", "Client", "Start Date", "End Date"]
  weekOf.zip(report).each {|row| csv << row.flatten}
end
puts "Results can be found in Report.csv"
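Explanation:
I assumed query time is not a big concern for you, since your example queries are fairly small and contain no JOINs. If you find your query growing beyond 10 or so INNER JOINs (on tables with hundreds of thousands of entries each, for example), this may no longer be the best solution for you.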
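This solution has two parts.
First, UNION ALL is used to duplicate rows in the database itself. This means repeating the entire query and adding a condition underneath it to specify when the duplication should happen.
This is where the HAVING clause comes in. When using UNION ALL this way, the extra condition has to go into HAVING rather than a second WHERE, because the latter causes a MySQL error.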
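To make the row-duplication idea concrete without a database, here is a rough in-memory model of it in plain Ruby. It is a sketch, not the query itself: the Row struct and the sample values are made up for illustration, each lambda stands in for one SELECT branch of the UNION ALL, and the thresholds mirror the HAVING conditions in the query above.

```ruby
# Hypothetical in-memory stand-in for the table rows (values chosen to match the question's sample).
Row = Struct.new(:id, :client, :month, :day)

rows = [
  Row.new(1, "companyrecordlabel", 5, 20),
  Row.new(2, "overflowStack", 6, 4),
  Row.new(3, "chapoChaps", 6, 11),
  Row.new(4, "Meds4U", 6, 18)
]

# One lambda per SELECT branch; the first branch has no extra condition.
branches = [
  ->(r) { true },
  ->(r) { r.month == 5 || r.day <= 4 },
  ->(r) { r.month == 5 || r.day <= 11 },
  ->(r) { r.month == 5 || r.day <= 18 }
]

# UNION ALL keeps duplicates: concatenate every branch's matches, then order by id.
combined = branches.flat_map { |cond| rows.select(&cond) }.sort_by(&:id)

# Number of copies each client ends up with.
counts = combined.group_by(&:client).transform_values(&:length)
counts.each { |client, n| puts "#{client}: #{n}" }
```

A row that satisfies more branches appears more often in the combined result, which is the effect the UNION ALL query relies on.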
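Also remember that every MySQL table created as the result of a subquery must have an alias: here, alias and alias2. I used not one but two nested queries so that I could ORDER BY db.table.id (following one of my assumptions) and then select only the columns needed for the next part.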
Lastly, I combined your two separate month and day queries into a single one, date: iterating over it yields a two-dimensional array, one [month, day] pair per row.
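The shape of that result can be sketched with a plain array standing in for the mysql2 result set. The sample values are made up, and the example assumes the values arrive as strings, since DATE_FORMAT produces text.

```ruby
# Stand-in for the rows mysql2 yields with :as => :array (hypothetical sample values).
rows = [["05", "20"], ["06", "04"], ["06", "18"]]

# Block parameters destructure each inner array into month and day.
parsed = rows.map do |mon, day|
  # Convert before comparing: a string such as "05" is never equal to the integer 5.
  [mon.to_i, day.to_i]
end

parsed.each { |mon, day| puts "month=#{mon} day=#{day}" }
```

Converting with to_i up front keeps later numeric comparisons from silently failing.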
Second, I created the weekOf array that you probably wanted but forgot to include.
I then iterate over date to push the right "06/#{day}/2014" strings into the weekOf array.
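As an alternative sketch, the week labels could be derived from the start date with Ruby's Date class and attached to each row directly, so the labels and the rows cannot drift out of step. This is not the approach used above. The helper names weeks_for and expand are hypothetical, the sample rows are taken from the question, and the code assumes a start date anywhere inside a Wednesday-to-Tuesday week (for example June 5 to June 10) counts toward that week.

```ruby
require "date"

# The four Wednesdays that open each reporting week in June 2014.
WEEK_STARTS = [4, 11, 18, 25].map { |d| Date.new(2014, 6, d) }

# Hypothetical helper: labels of every week whose last day (the following
# Tuesday, i.e. Wednesday + 6) falls on or after the client's start date.
def weeks_for(start_date)
  WEEK_STARTS.select { |wed| wed + 6 >= start_date }
             .map { |wed| wed.strftime("%m/%d/%Y") }
end

# Hypothetical helper: turns [client, start, end] rows into
# [week_of, client, start, end] rows, one per applicable week.
def expand(rows)
  rows.flat_map do |client, start_s, end_s|
    start_date = Date.strptime(start_s, "%m/%d/%Y")
    weeks_for(start_date).map { |week| [week, client, start_s, end_s] }
  end
end

sample = [
  ["companyrecordlabel", "05/20/2014", "07/09/2015"],
  ["chapoChaps", "06/11/2014", "01/16/2016"],
  ["Meds4U", "06/18/2014", nil]
]

expand(sample).each { |row| puts row.join(", ") }
```

Each expanded row could then be appended to the CSV with csv << row, without the UNION ALL duplication or the zip step.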
That's it! Hope this helps.