How to add data to a field based on other fields in a SQL database
I have a SQLite table named wcvp, built from CSV files downloaded from the World Checklist of Vascular Plants (see https://wcvp.science.kew.org/ and http://sftp.kew.org/pub/data-repositories/WCVP/). When I run this query:
sqlite> SELECT kew_id, genus, species, infraspecies
FROM wcvp
WHERE genus = 'Quercus'
AND species = 'robur'
AND taxonomic_status = 'Accepted';
I get this result:
kew_id | genus | species | infraspecies |
---|---|---|---|
304293-2 | Quercus | robur | |
77189540-1 | Quercus | robur | broteroana |
77189379-1 | Quercus | robur | brutia |
77189383-1 | Quercus | robur | imeretina |
60459295-2 | Quercus | robur | pedunculiflo |
77171868-1 | Quercus | robur | robur |
I want to add a column named number_of_infraspecies to the table (which has several hundred thousand rows), so that it looks like this:
kew_id | genus | species | infraspecies | number_of_infraspecies |
---|---|---|---|---|
304293-2 | Quercus | robur | | 5 |
77189540-1 | Quercus | robur | broteroana | NULL |
77189379-1 | Quercus | robur | brutia | NULL |
77189383-1 | Quercus | robur | imeretina | NULL |
60459295-2 | Quercus | robur | pedunculiflo | NULL |
77171868-1 | Quercus | robur | robur | NULL |
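Alternatively, I could build a new table with two columns: kew_id as a foreign key and number_of_infraspecies as the other column.
Whichever approach I take, the only procedure I can think of requires a separate query for every row of the wcvp table, or at least for every row that has no value in the infraspecies column (AND taxonomic_status = 'Accepted').
Is there a way to do this with one or a few queries?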
I think you just want count(*) as a window function:
SELECT kew_id, genus, species, infraspecies,
COUNT(*) OVER (PARTITION BY genus, species) as infra_species
FROM wcvp
WHERE genus = 'Quercus' AND species = 'robur' AND
taxonomic_status = 'Accepted';
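To see what the window count returns for the rows in the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the missing infraspecies is stored as NULL (a CSV import may store an empty string instead) and that the underlying SQLite library is version 3.25 or later, which is when window functions were added. The table layout and the function name window_counts are illustrative assumptions, not part of the original data set. Note that COUNT(*) also counts the species-level row itself, while COUNT(infraspecies) counts only rows with a non-NULL infraspecies.

```python
import sqlite3

# Rows copied from the question's result set.
# Assumption: the species-level row stores infraspecies as NULL (None), not ''.
ROWS = [
    ("304293-2", "Quercus", "robur", None),
    ("77189540-1", "Quercus", "robur", "broteroana"),
    ("77189379-1", "Quercus", "robur", "brutia"),
    ("77189383-1", "Quercus", "robur", "imeretina"),
    ("60459295-2", "Quercus", "robur", "pedunculiflo"),
    ("77171868-1", "Quercus", "robur", "robur"),
]


def window_counts():
    """Return (kew_id, COUNT(*), COUNT(infraspecies)) per row, per genus/species."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical minimal schema; the real wcvp table has many more columns.
    conn.execute(
        "CREATE TABLE wcvp (kew_id TEXT PRIMARY KEY, genus TEXT, species TEXT,"
        " infraspecies TEXT, taxonomic_status TEXT)"
    )
    conn.executemany("INSERT INTO wcvp VALUES (?, ?, ?, ?, 'Accepted')", ROWS)
    rows = conn.execute(
        """
        SELECT kew_id,
               COUNT(*)            OVER (PARTITION BY genus, species) AS all_rows,
               COUNT(infraspecies) OVER (PARTITION BY genus, species) AS infra_rows
        FROM wcvp
        WHERE taxonomic_status = 'Accepted'
        ORDER BY kew_id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in window_counts():
        print(row)  # every row carries all_rows = 6 and infra_rows = 5
```

With these six rows, COUNT(*) gives 6 on every row and COUNT(infraspecies) gives 5, so the second form matches the number the question asks for.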
Create a VIEW that returns the column number_of_infraspecies:
CREATE VIEW my_view AS
SELECT kew_id, genus, species, infraspecies,
CASE
WHEN infraspecies IS NULL
THEN COUNT(infraspecies) OVER (PARTITION BY genus, species)
END number_of_infraspecies
FROM wcvp
WHERE taxonomic_status = 'Accepted';
Then select from that VIEW for a specific genus and species:
SELECT kew_id, genus, species, infraspecies, number_of_infraspecies
FROM my_view
WHERE genus = 'Quercus' AND species = 'robur';
See the demo.
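If the count has to be stored in the table itself, as the question describes, one possible approach is a single UPDATE with a correlated subquery. The sketch below is an assumption-laden illustration rather than something taken from the answers: the helper names, the minimal schema, and the index name idx_wcvp_genus_species are all hypothetical. NULLIF(infraspecies, '') is used so that both NULL and empty-string values are treated as "no infraspecies", since a CSV import may produce either.

```python
import sqlite3

# Rows copied from the question's result set (None stands for a missing infraspecies).
ROWS = [
    ("304293-2", "Quercus", "robur", None),
    ("77189540-1", "Quercus", "robur", "broteroana"),
    ("77189379-1", "Quercus", "robur", "brutia"),
    ("77189383-1", "Quercus", "robur", "imeretina"),
    ("60459295-2", "Quercus", "robur", "pedunculiflo"),
    ("77171868-1", "Quercus", "robur", "robur"),
]


def make_sample_db():
    """Build an in-memory stand-in for wcvp (hypothetical minimal schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wcvp (kew_id TEXT PRIMARY KEY, genus TEXT, species TEXT,"
        " infraspecies TEXT, taxonomic_status TEXT)"
    )
    conn.executemany("INSERT INTO wcvp VALUES (?, ?, ?, ?, 'Accepted')", ROWS)
    return conn


def add_infraspecies_counts(conn):
    """Add number_of_infraspecies and fill it on species-level rows only."""
    conn.execute("ALTER TABLE wcvp ADD COLUMN number_of_infraspecies INTEGER")
    # An index on (genus, species) keeps the correlated subquery from
    # scanning the whole table once per updated row.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_wcvp_genus_species ON wcvp (genus, species)"
    )
    conn.execute(
        """
        UPDATE wcvp
        SET number_of_infraspecies = (
            SELECT COUNT(*)
            FROM wcvp AS i
            WHERE i.genus = wcvp.genus
              AND i.species = wcvp.species
              AND i.taxonomic_status = 'Accepted'
              AND NULLIF(i.infraspecies, '') IS NOT NULL
        )
        WHERE taxonomic_status = 'Accepted'
          AND NULLIF(infraspecies, '') IS NULL
        """
    )
    conn.commit()


if __name__ == "__main__":
    conn = make_sample_db()
    add_infraspecies_counts(conn)
    for row in conn.execute(
        "SELECT kew_id, infraspecies, number_of_infraspecies FROM wcvp"
    ):
        print(row)
```

Rows that have an infraspecies are left untouched, so their new column stays NULL, which matches the layout requested in the question. Unlike the view, the stored value does not update itself when rows are added or removed, so the UPDATE would need to be rerun after the table changes.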