Is there a way to load multiple values of one column onto new lines within a single row in SQL?
My data looks like this:
emp_id,skills
1234,python|java|sql|R|javascript
5639,C|HTML|php|perl
This is how the data needs to be loaded into the table:
emp_id   skills
         python
1234     java
         sql
         R
         perl
         C
5639     HTML
         php
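I have actually replaced | with \n, but the values are not loaded onto the next line; they are just loaded with a space added.
I am going to use a Python ETL to load the data into the table, so I can add post-processing.
Any suggestions?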
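That is simply how relational databases behave: a table cannot show emp_id only once per group with all of that group's elements one per line. Changing the way the data is displayed is the privilege and responsibility of the front end, not of the database. So do that part in Python.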
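As a rough illustration of doing the display part in Python, the sketch below takes rows that are already stored one skill per row and prints each emp_id only once per group. The function name `format_grouped`, the column width of 8, and the choice to put the emp_id on the first line of each group are assumptions made for this example, not something taken from the question.

```python
from itertools import groupby
from operator import itemgetter


def format_grouped(rows):
    """Render (emp_id, skill) rows, showing each emp_id only once per group."""
    lines = []
    # groupby() only groups adjacent items, so sort by emp_id first.
    # sorted() is stable, so the order of skills within a group is kept.
    ordered = sorted(rows, key=itemgetter(0))
    for emp_id, group in groupby(ordered, key=itemgetter(0)):
        for n, (_, skill) in enumerate(group):
            label = str(emp_id) if n == 0 else ""
            lines.append(f"{label:<8}{skill}")
    return lines


rows = [(1234, "python"), (1234, "java"), (5639, "C"), (5639, "HTML")]
print("\n".join(format_grouped(rows)))
```

The table itself keeps one (emp_id, skill) pair per row; only the rendering suppresses the repeated key.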
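Having said that, Impala has the SPLIT_PART() string function, which returns the n-th token of a string, with tokens delimited by the separator you pass as a parameter.
So CROSS JOIN the input with a series of consecutive integers, apply SPLIT_PART(skills,'|',i), and you get what you need.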
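To make the idea concrete outside the database, here is a small Python sketch of the same logic. The helper `split_part` is a hypothetical stand-in written for this example (it is not Impala's implementation), and it assumes that an out-of-range index yields an empty string, which is the behaviour a filter on empty tokens would rely on. `verticalize` and `max_tokens` are likewise names invented here.

```python
def split_part(s, delim, n):
    """Return the n-th (1-based) token of s, or '' if there is no such token.

    Hypothetical emulation for illustration only.
    """
    parts = s.split(delim)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


def verticalize(rows, delim="|", max_tokens=8):
    """Emulate the cross join with the integers 1..max_tokens."""
    out = []
    for emp_id, skills in rows:
        for i in range(1, max_tokens + 1):  # the "series of integers"
            skill = split_part(skills, delim, i)
            if skill != "":  # drop indexes past the last token
                out.append((emp_id, skill))
    return out


sample = [
    (1234, "python|java|sql|R|javascript"),
    (5639, "C|HTML|php|perl"),
]
result = verticalize(sample)
```

As with the integer series in SQL, `max_tokens` has to be at least as large as the longest list, otherwise trailing tokens are silently dropped.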
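Actually, in my (never too) humble opinion, that is what you should do whenever someone throws a file with such an unpleasant format at you to load into a database. Always verticalize comma/bar/semicolon/or_whatever separated lists of "values" using the technique below, and store the verticalized data: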
WITH
-- your input
input(emp_id,skills) AS (
            SELECT 1234,'python|java|sql|R|javascript'
  UNION ALL SELECT 5639,'C|HTML|php|perl'
)
,
-- a big enough series of integers ..
i(i) AS (
            SELECT 1
  UNION ALL SELECT 2
  UNION ALL SELECT 3
  UNION ALL SELECT 4
  UNION ALL SELECT 5
  UNION ALL SELECT 6
  UNION ALL SELECT 7
  UNION ALL SELECT 8
)
SELECT
  emp_id
, SPLIT_PART(skills,'|',i) AS skill
FROM input
CROSS JOIN i
WHERE SPLIT_PART(skills,'|',i) <> ''
ORDER BY
  emp_id
, i
;
-- out emp_id | skill
-- out --------+------------
-- out 1234 | python
-- out 1234 | java
-- out 1234 | sql
-- out 1234 | R
-- out 1234 | javascript
-- out 5639 | C
-- out 5639 | HTML
-- out 5639 | php
-- out 5639 | perl
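Since the load goes through a Python ETL anyway, the same verticalization could also happen before the data reaches the database. The following is a minimal sketch under that assumption; `explode_skills` is a name invented for this example, and the input is assumed to be CSV text with the `emp_id,skills` header shown in the question.

```python
import csv
import io


def explode_skills(csv_text, delim="|"):
    """Turn each 'emp_id,skills' record into one (emp_id, skill) row per skill."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for record in reader:
        for skill in record["skills"].split(delim):
            skill = skill.strip()
            if skill:  # skip empty tokens such as a trailing delimiter
                rows.append((int(record["emp_id"]), skill))
    return rows


sample_csv = (
    "emp_id,skills\n"
    "1234,python|java|sql|R|javascript\n"
    "5639,C|HTML|php|perl\n"
)
exploded = explode_skills(sample_csv)
for emp_id, skill in exploded:
    print(emp_id, skill)
```

The resulting list of tuples can then be handed to whatever bulk-insert mechanism the ETL already uses; unlike the integer-series approach, no upper bound on the list length is needed here.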
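With two bar/comma separated columns, it could look like this (note that every skill ends up paired with every pub of the same employee):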
WITH
-- your input, enhanced
input(emp_id,skills,pubs) AS (
            SELECT 1234,'python|java|sql|R|javascript','ship inn,anchor,stag'
  UNION ALL SELECT 5639,'C|HTML|php|perl'             ,'black horse,crown,mitre'
)
,
-- a big enough series of integers ..
i(i) AS (
            SELECT 1
  UNION ALL SELECT 2
  UNION ALL SELECT 3
  UNION ALL SELECT 4
  UNION ALL SELECT 5
  UNION ALL SELECT 6
  UNION ALL SELECT 7
  UNION ALL SELECT 8
)
,
-- another big enough series of integers ..
j(j) AS (
  SELECT
    i AS j
  FROM i
)
SELECT
  emp_id
, i AS skill_sequence
, SPLIT_PART(skills,'|',i) AS skill
, j AS pub_sequence
, SPLIT_PART(pubs,',',j) AS pub
FROM input
CROSS JOIN i
CROSS JOIN j
WHERE SPLIT_PART(skills,'|',i) <> ''
  AND SPLIT_PART(pubs, ',',j) <> ''
ORDER BY
  emp_id
, i
, j
;
-- out emp_id | skill_sequence | skill | pub_sequence | pub
-- out --------+----------------+------------+--------------+-------------
-- out 1234 | 1 | python | 1 | ship inn
-- out 1234 | 1 | python | 2 | anchor
-- out 1234 | 1 | python | 3 | stag
-- out 1234 | 2 | java | 1 | ship inn
-- out 1234 | 2 | java | 2 | anchor
-- out 1234 | 2 | java | 3 | stag
-- out 1234 | 3 | sql | 1 | ship inn
-- out 1234 | 3 | sql | 2 | anchor
-- out 1234 | 3 | sql | 3 | stag
-- out 1234 | 4 | R | 1 | ship inn
-- out 1234 | 4 | R | 2 | anchor
-- out 1234 | 4 | R | 3 | stag
-- out 1234 | 5 | javascript | 1 | ship inn
-- out 1234 | 5 | javascript | 2 | anchor
-- out 1234 | 5 | javascript | 3 | stag
-- out 5639 | 1 | C | 1 | black horse
-- out 5639 | 1 | C | 2 | crown
-- out 5639 | 1 | C | 3 | mitre
-- out 5639 | 2 | HTML | 1 | black horse
-- out 5639 | 2 | HTML | 2 | crown
-- out 5639 | 2 | HTML | 3 | mitre
-- out 5639 | 3 | php | 1 | black horse
-- out 5639 | 3 | php | 2 | crown
-- out 5639 | 3 | php | 3 | mitre
-- out 5639 | 4 | perl | 1 | black horse
-- out 5639 | 4 | perl | 2 | crown
-- out 5639 | 4 | perl | 3 | mitre
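For comparison, the two-column case can be sketched in Python with `itertools.product`, which yields the same skill/pub combinations per employee as the two cross joins. The function name `explode_two` and its parameter names are assumptions for this example.

```python
from itertools import product


def explode_two(rows, skill_delim="|", pub_delim=","):
    """Return (emp_id, skill_seq, skill, pub_seq, pub) for every skill/pub pair."""
    out = []
    for emp_id, skills, pubs in rows:
        # enumerate(..., start=1) plays the role of the integer series i and j
        skill_items = list(enumerate(skills.split(skill_delim), start=1))
        pub_items = list(enumerate(pubs.split(pub_delim), start=1))
        for (i, skill), (j, pub) in product(skill_items, pub_items):
            if skill and pub:  # same effect as the two <> '' filters
                out.append((emp_id, i, skill, j, pub))
    return out


data = [
    (1234, "python|java|sql|R|javascript", "ship inn,anchor,stag"),
    (5639, "C|HTML|php|perl", "black horse,crown,mitre"),
]
pairs = explode_two(data)
print(len(pairs))
```

The row count grows as the product of the two list lengths, so whether this combined shape is what you want to store depends on whether skills and pubs are actually related to each other.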