如何根据其他记录的值更新字段
How to update fields based on value of other record
我有一个类似于以下结构的 table:
City start_date end_date
Paris 1995-01-01 00:00:00 1997-10-01 23:59:59
Paris 1997-10-02 00:00:00 0001-01-01 00:00:00
Paris 2013-01-25 00:00:00 0001-01-01 00:00:00
Paris 2015-04-25 00:00:00 0001-01-01 00:00:00
Berlin 2014-11-01 00:00:00 0001-01-01 00:00:00
Berlin 2014-06-01 00:00:00 0001-01-01 00:00:00
Berlin 2015-09-11 00:00:00 0001-01-01 00:00:00
Berlin 2015-10-01 00:00:00 0001-01-01 00:00:00
Milan 2001-01-01 00:00:00 0001-01-01 00:00:00
Milan 2005-10-02 00:00:00 2006-10-02 23:59:59
Milan 2006-10-03 00:00:00 2015-04-24 23:59:59
Milan 2015-04-25 00:00:00 0001-01-01 00:00:00
数据包含基于城市的开始日期和结束日期的历史视图。一个城市的最新记录应该是开始日期最长的,结束日期为'0001-01-01 00:00:00',表示还没有结束日期。
我需要清理这些数据并确保每个城市的历史记录都有结束日期比下一个记录的开始日期早一秒,仅在 end_date 设置为“0001-01-01 00:00:00”。因此,在 end_date 有实际日期的情况下,该日期将被忽略。此外,最近 start_date 的城市记录不需要修改 end_date。
结果 table 应该如下所示:
City start_date end_date
Paris 1995-01-01 00:00:00 1997-10-01 23:59:59
Paris 1997-10-02 00:00:00 2013-01-24 23:59:59
Paris 2013-01-25 00:00:00 2015-04-24 23:59:59
Paris 2015-04-25 00:00:00 0001-01-01 00:00:00
Berlin 2014-11-01 00:00:00 2014-05-31 23:59:59
Berlin 2014-06-01 00:00:00 2015-09-10 23:59:59
Berlin 2015-09-11 00:00:00 2015-09-30 23:59:59
Berlin 2015-10-01 00:00:00 0001-01-01 23:59:59
Milan 2001-01-01 00:00:00 2005-10-01 23:59:59
Milan 2005-10-02 00:00:00 2006-10-02 23:59:59
Milan 2006-10-03 00:00:00 2015-04-24 23:59:59
Milan 2015-04-25 00:00:00 0001-01-01 00:00:00
我想过很多方法来以编程方式实现此目的,但是我喜欢一个通过 SQL 查询完全处理此问题的解决方案。我发现了一个类似的问题,答案是 here,但它不处理我的特殊情况。我怎样才能修改它以满足我的标准?
编辑:
我已根据以下陈述尝试了以下建议的答案:
update test join
(select t.*,
(select min(start_date)
from test t2
where t2.city = t.city and
t2.start_date > t.start_date
order by t2.start_date
limit 1
) as next_start_date
from test t
) tt
on tt.city = test.city and tt.start_date = test.start_date
set test.end_date = date_sub(tt.next_start_date, interval 1 second)
where test.end_date = '0001-01-01' and
next_start_date is not null;
不幸的是,从柏林记录开始,有些 end_date 并不符合预期(例如 ID 号 5 和 6)。如下所示:
以下是能够复制的创建和插入语句:
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`city` varchar(50) DEFAULT NULL,
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=utf8;
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','1995-01-01 00:00:00','1997-10-01 23:59:59');
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','1997-10-02 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','2013-01-25 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','2015-04-25 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2014-11-01 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2014-06-01 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2015-09-11 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2015-10-01 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2001-01-01 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2005-10-02 00:00:00','2006-10-02 23:59:59');
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2006-10-03 00:00:00','2015-04-24 23:59:59');
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2015-04-25 00:00:00','0001-01-01 00:00:00');
您只需要 lead()
功能,MySQL 中没有。在 update
中使用变量具有挑战性,因此这里有一个具有相关子查询的方法。
获取下一个开始日期:
select t.*,
(select min(start_date)
from t t2
where t2.city = t.city and
t2.start_date > t.start_date
order by t2.start_date
limit 1
) as next_start_date
from t;
您现在可以在 update
使用 join
:
update t join
(select t.*,
(select min(start_date)
from t t2
where t2.city = t.city and
t2.start_date > t.start_date
order by t2.start_date
limit 1
) as next_start_date
from t
) tt
on tt.city = t.city and tt.start_date = t.start_date
set t.end_date = date_sub(tt.next_start_date, interval 1 second)
where t.end_date = '0001-01-01' and
t.next_start_date is not null;
我有一个类似于以下结构的 table:
City start_date end_date
Paris 1995-01-01 00:00:00 1997-10-01 23:59:59
Paris 1997-10-02 00:00:00 0001-01-01 00:00:00
Paris 2013-01-25 00:00:00 0001-01-01 00:00:00
Paris 2015-04-25 00:00:00 0001-01-01 00:00:00
Berlin 2014-11-01 00:00:00 0001-01-01 00:00:00
Berlin 2014-06-01 00:00:00 0001-01-01 00:00:00
Berlin 2015-09-11 00:00:00 0001-01-01 00:00:00
Berlin 2015-10-01 00:00:00 0001-01-01 00:00:00
Milan 2001-01-01 00:00:00 0001-01-01 00:00:00
Milan 2005-10-02 00:00:00 2006-10-02 23:59:59
Milan 2006-10-03 00:00:00 2015-04-24 23:59:59
Milan 2015-04-25 00:00:00 0001-01-01 00:00:00
数据包含基于城市的开始日期和结束日期的历史视图。一个城市的最新记录应该是开始日期最长的,结束日期为'0001-01-01 00:00:00',表示还没有结束日期。
我需要清理这些数据并确保每个城市的历史记录都有结束日期比下一个记录的开始日期早一秒,仅在 end_date 设置为“0001-01-01 00:00:00”。因此,在 end_date 有实际日期的情况下,该日期将被忽略。此外,最近 start_date 的城市记录不需要修改 end_date。
结果 table 应该如下所示:
City start_date end_date
Paris 1995-01-01 00:00:00 1997-10-01 23:59:59
Paris 1997-10-02 00:00:00 2013-01-24 23:59:59
Paris 2013-01-25 00:00:00 2015-04-24 23:59:59
Paris 2015-04-25 00:00:00 0001-01-01 00:00:00
Berlin 2014-11-01 00:00:00 2014-05-31 23:59:59
Berlin 2014-06-01 00:00:00 2015-09-10 23:59:59
Berlin 2015-09-11 00:00:00 2015-09-30 23:59:59
Berlin 2015-10-01 00:00:00 0001-01-01 23:59:59
Milan 2001-01-01 00:00:00 2005-10-01 23:59:59
Milan 2005-10-02 00:00:00 2006-10-02 23:59:59
Milan 2006-10-03 00:00:00 2015-04-24 23:59:59
Milan 2015-04-25 00:00:00 0001-01-01 00:00:00
我想过很多方法来以编程方式实现此目的,但是我喜欢一个通过 SQL 查询完全处理此问题的解决方案。我发现了一个类似的问题,答案是 here,但它不处理我的特殊情况。我怎样才能修改它以满足我的标准?
编辑:
我已根据以下陈述尝试了以下建议的答案:
update test join
(select t.*,
(select min(start_date)
from test t2
where t2.city = t.city and
t2.start_date > t.start_date
order by t2.start_date
limit 1
) as next_start_date
from test t
) tt
on tt.city = test.city and tt.start_date = test.start_date
set test.end_date = date_sub(tt.next_start_date, interval 1 second)
where test.end_date = '0001-01-01' and
next_start_date is not null;
不幸的是,从柏林记录开始,有些 end_date 并不符合预期(例如 ID 号 5 和 6)。如下所示:
以下是能够复制的创建和插入语句:
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`city` varchar(50) DEFAULT NULL,
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=utf8;
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','1995-01-01 00:00:00','1997-10-01 23:59:59');
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','1997-10-02 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','2013-01-25 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','2015-04-25 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2014-11-01 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2014-06-01 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2015-09-11 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2015-10-01 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2001-01-01 00:00:00','0001-01-01 00:00:00');
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2005-10-02 00:00:00','2006-10-02 23:59:59');
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2006-10-03 00:00:00','2015-04-24 23:59:59');
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2015-04-25 00:00:00','0001-01-01 00:00:00');
您只需要 lead()
功能,MySQL 中没有。在 update
中使用变量具有挑战性,因此这里有一个具有相关子查询的方法。
获取下一个开始日期:
select t.*,
(select min(start_date)
from t t2
where t2.city = t.city and
t2.start_date > t.start_date
order by t2.start_date
limit 1
) as next_start_date
from t;
您现在可以在 update
使用 join
:
update t join
(select t.*,
(select min(start_date)
from t t2
where t2.city = t.city and
t2.start_date > t.start_date
order by t2.start_date
limit 1
) as next_start_date
from t
) tt
on tt.city = t.city and tt.start_date = t.start_date
set t.end_date = date_sub(tt.next_start_date, interval 1 second)
where t.end_date = '0001-01-01' and
t.next_start_date is not null;