SQL - Convert/Transpose Rows with Coded Text Values to Columns
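I am looking for help converting a table that stores data in rows into a table that stores the same data in columns.
Background... I am working with a table containing hospital admission data. Let's call the table "Inpatients".
The data is currently formatted as a table with 3 columns and n rows. The 3 columns contain the following data:
"Patient_ID" = Unique patient/person identifier. Think of this as the patient's name;
"Event_ID" = Unique admission event identifier. Identifies a unique episode of hospital care;
"Diagnosis_Code" = ICD-10 code used to record the reason the patient was hospitalised
For a single patient (Patient_ID), each hospital stay (Event_ID) is represented as one or more rows in the table, with one row for each diagnosis recorded during that stay.
A given hospital stay may therefore be captured in a single row of the table (one recorded diagnosis) or in multiple rows (multiple diagnoses).
An example of the current "Inpatients" table is given below...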
-------------------------------------------
Patient_ID | Event_ID | Diagnosis_Code
-------------------------------------------
Pers001 | HospStay001 | C139
Pers001 | HospStay001 | I245
Pers001 | HospStay001 | D456
Pers001 | HospStay002 | C139
Pers001 | HospStay002 | J123
Pers555 | HospStay001 | D312
Pers999 | HospStay001 | C120
Pers999 | HospStay001 | E101
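This is what I really want to do: I want to transform the data so that I have only one row per patient per hospital stay, so that the table above is formatted as: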
----------------------------------------------------------------------------------------------------
Patient_ID | Event_ID | Diagnosis_Code_1 | Diagnosis_Code_2 | Diagnosis_Code_3 | Diagnosis_Code_n
----------------------------------------------------------------------------------------------------
Pers001 | HospStay001 | C139 | I245 | D456 |
Pers001 | HospStay002 | C139 | J123 | |
Pers555 | HospStay001 | D312 | | |
Pers999 | HospStay001 | C120 | E101 | |
I suspect the solution will require some dynamic SQL... which is not one of my strengths, I'm afraid.
Thanks!
CREATE table #source (Patient_ID varchar(100), Event_ID varchar (100) ,Diagnosis_Code VARCHAR(100),Dig_Number INT)
insert into #source (Patient_ID, Event_ID,Diagnosis_Code,Dig_Number) values
('Pers001','HospStay001','I245',2),
('Pers001','HospStay001','D456',3),
('Pers001','HospStay002','C139',1),
('Pers001','HospStay002','J123',2),
('Pers555','HospStay001','D312',1),
('Pers999','HospStay001','C120',1),
('Pers999','HospStay001','E101',2),
('Pers001','HospStay001','C139',1)
--DROP TABLE tempdb..#source
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
SELECT @cols = STUFF
(
(
SELECT ',' + QUOTENAME( CONVERT(VARCHAR(10),Dig_Number))
FROM #source
GROUP BY Dig_Number
ORDER BY Dig_Number
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'),
1,1,''
);
SET @query = 'SELECT Patient_ID,Event_ID,' + @cols + '
FROM
(
SELECT Patient_ID,Event_ID,Diagnosis_Code,dig_number
FROM #source
) x
PIVOT
(
MAX(Diagnosis_Code)
FOR Dig_Number IN (' + @cols + ')
) p ';
EXECUTE(@query);
Just add one more column, the diagnosis number (Dig_Number), and this will work.
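The original "Inpatients" table has no Dig_Number column, so it has to be derived before pivoting. The following is a minimal sketch of one way to do that, written in Python with the standard-library sqlite3 module rather than SQL Server, so it can be run anywhere. It assumes a SQLite build with window functions (3.25 or later) and assumes that insertion order (SQLite's rowid) is an acceptable ordering of diagnoses within a stay, which the question does not specify. The function name pivot_inpatients and the table name numbered are illustrative only.

```python
import sqlite3

ROWS = [
    ("Pers001", "HospStay001", "C139"),
    ("Pers001", "HospStay001", "I245"),
    ("Pers001", "HospStay001", "D456"),
    ("Pers001", "HospStay002", "C139"),
    ("Pers001", "HospStay002", "J123"),
    ("Pers555", "HospStay001", "D312"),
    ("Pers999", "HospStay001", "C120"),
    ("Pers999", "HospStay001", "E101"),
]


def pivot_inpatients(rows):
    """Return (column_names, result_rows) with one row per patient per stay."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Inpatients "
        "(Patient_ID TEXT, Event_ID TEXT, Diagnosis_Code TEXT)"
    )
    conn.executemany("INSERT INTO Inpatients VALUES (?, ?, ?)", rows)

    # Derive Dig_Number: 1, 2, 3, ... within each (patient, stay).
    # Assumption: rowid (insertion order) defines the order of diagnoses.
    conn.execute(
        """
        CREATE TABLE numbered AS
        SELECT Patient_ID, Event_ID, Diagnosis_Code,
               ROW_NUMBER() OVER (
                   PARTITION BY Patient_ID, Event_ID ORDER BY rowid
               ) AS Dig_Number
        FROM Inpatients
        """
    )

    # Build the column list dynamically, as the dynamic SQL above does.
    # Only integers computed here are interpolated into the statement.
    max_n = conn.execute("SELECT MAX(Dig_Number) FROM numbered").fetchone()[0] or 0
    select_list = ["Patient_ID", "Event_ID"] + [
        f"MAX(CASE WHEN Dig_Number = {i} THEN Diagnosis_Code END) "
        f"AS Diagnosis_Code_{i}"
        for i in range(1, max_n + 1)
    ]
    cur = conn.execute(
        f"SELECT {', '.join(select_list)} FROM numbered "
        "GROUP BY Patient_ID, Event_ID ORDER BY Patient_ID, Event_ID"
    )
    columns = [d[0] for d in cur.description]
    result = cur.fetchall()
    conn.close()
    return columns, result


if __name__ == "__main__":
    columns, result = pivot_inpatients(ROWS)
    print(columns)
    for row in result:
        print(row)
```

The sketch uses conditional aggregation (MAX over a CASE expression) because SQLite has no PIVOT operator; on SQL Server the same ROW_NUMBER() expression could feed the PIVOT query shown above instead.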
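Rajat is correct: you need some kind of column to create diagnosis_column_1, diagnosis_column_2, and so on.
To do this in MS Access, I would:
1. Create a dummy column to hold the diagnosis count
2. Populate it using VBA (faster for large databases), as follows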
Sub Update_Diagnosis_Code_ID()
    ' Numbers the diagnoses within each hospital stay (1, 2, 3, ...)
    ' and writes the number to Inpat.Dummy_id.
    Dim db As DAO.Database
    Dim rs As DAO.Recordset
    Dim strSQL As String
    Dim dummyId As Integer
    Dim patientID As String
    Dim eventID As String
    Dim lastPatientID As String
    Dim lastEventID As String
    Dim isFirst As Boolean

    ' Rows must be sorted so that all rows of one stay are adjacent
    strSQL = "SELECT Inpat.Dummy_id, Inpat.Patient_id, Inpat.Event_ID, Inpat.Diagnosis_Code FROM Inpat ORDER BY Inpat.Patient_id, Inpat.Event_ID;"
    Set db = CurrentDb
    Set rs = db.OpenRecordset(strSQL)

    isFirst = True
    With rs
        Do While Not .EOF
            patientID = .Fields(1)
            eventID = .Fields(2)
            If isFirst Or patientID <> lastPatientID Or eventID <> lastEventID Then
                ' first record, new patient or new hospital stay: restart at 1
                dummyId = 1
            Else
                dummyId = dummyId + 1
            End If
            .Edit
            .Fields(0) = dummyId
            .Update
            ' remember this record's keys for the next comparison
            lastPatientID = patientID
            lastEventID = eventID
            isFirst = False
            .MoveNext
        Loop
    End With

    rs.Close
    Set rs = Nothing
    Set db = Nothing
    MsgBox "Finished", vbInformation
End Sub
Then use a crosstab query to display the data:
TRANSFORM First(Inpat.[Diagnosis_Code]) AS FirstOfDiagnosis_Code
SELECT Inpat.[Patient_id], Inpat.[Event_ID], Count(Inpat.[Diagnosis_Code])
AS [Total Of Diagnosis_Code]
FROM Inpat
GROUP BY Inpat.[Patient_id], Inpat.[Event_ID]
PIVOT Inpat.[Dummy_id];
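For readers who want to check the numbering logic outside Access, here is a rough Python sketch of the same idea as the VBA routine: sort by patient and stay, then restart a counter at 1 whenever either key changes. The function name number_diagnoses is hypothetical, and the sketch assumes rows arrive as (Patient_id, Event_ID, Diagnosis_Code) tuples.

```python
from itertools import groupby


def number_diagnoses(rows):
    """Append a per-stay counter (1, 2, 3, ...) to each row.

    rows: iterable of (patient_id, event_id, diagnosis_code) tuples.
    Returns a list of (patient_id, event_id, diagnosis_code, counter).
    """
    def stay_key(row):
        return (row[0], row[1])

    # sorted() is stable, so diagnoses keep their original relative
    # order within each stay.
    ordered = sorted(rows, key=stay_key)

    numbered = []
    for _, stay_rows in groupby(ordered, key=stay_key):
        # The counter restarts at 1 for every new patient or stay.
        for counter, (patient_id, event_id, code) in enumerate(stay_rows, start=1):
            numbered.append((patient_id, event_id, code, counter))
    return numbered


if __name__ == "__main__":
    sample = [
        ("Pers001", "HospStay001", "C139"),
        ("Pers001", "HospStay001", "I245"),
        ("Pers001", "HospStay002", "C139"),
        ("Pers555", "HospStay001", "D312"),
    ]
    for row in number_diagnoses(sample):
        print(row)
```

The counter plays the role of Dummy_id, which the crosstab query above uses as its column heading.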
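Thanks to Rajat Jaiswal, LeasMaps and Tim Biegeleisen for their contributions. Much appreciated.
The suggestion to add an additional column to the original table, to serve as the column headings in the transformed table, was the key. This turned out to be relatively easy to do (I did it in MS Excel).
So my original table was edited to look like this...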
--------------------------------------------------------------
Patient_ID | Event_ID | Diagnosis_Code | DiagCode_Counter |
--------------------------------------------------------------
Pers001 | HospStay001 | C139 | Diagnosis_Code_1 |
Pers001 | HospStay001 | I245 | Diagnosis_Code_2 |
Pers001 | HospStay001 | D456 | Diagnosis_Code_3 |
Pers001 | HospStay002 | C139 | Diagnosis_Code_1 |
Pers001 | HospStay002 | J123 | Diagnosis_Code_2 |
Pers555 | HospStay001 | D312 | Diagnosis_Code_1 |
Pers999 | HospStay001 | C120 | Diagnosis_Code_1 |
Pers999 | HospStay001 | E101 | Diagnosis_Code_2 |
--------------------------------------------------------------
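In the newly added "DiagCode_Counter" field, the numeric suffix increases by 1 for each additional Diagnosis_Code value recorded against a unique "Event_ID".
I was then able to create a Crosstab query in MS Access, using the "Patient_ID" and "Event_ID" fields as ROW headings; the "DiagCode_Counter" field as COLUMN headings; and the "Diagnosis_Code" entries as VALUES.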
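As an illustration of what the crosstab step does with the labelled table, here is a small Python sketch that treats "Patient_ID" and "Event_ID" as row headings, "DiagCode_Counter" as column headings and "Diagnosis_Code" as values. It is not how Access implements crosstab queries; the function name crosstab is hypothetical, and it assumes every counter label ends in an underscore followed by a number.

```python
def crosstab(rows):
    """Pivot labelled rows into one row per (patient, stay).

    rows: iterable of (patient_id, event_id, diagnosis_code, counter_label),
    where counter_label looks like "Diagnosis_Code_1".
    Returns (header, body); missing diagnoses are empty strings.
    """
    rows = list(rows)

    # Sort headings by numeric suffix so Diagnosis_Code_10 follows
    # Diagnosis_Code_9, which a plain text sort would not guarantee.
    labels = sorted(
        {row[3] for row in rows},
        key=lambda label: int(label.rsplit("_", 1)[1]),
    )

    # Collect the diagnosis codes of each stay, keyed by column label.
    stays = {}
    for patient_id, event_id, code, label in rows:
        stays.setdefault((patient_id, event_id), {})[label] = code

    header = ["Patient_ID", "Event_ID"] + labels
    body = [
        [patient_id, event_id] + [cells.get(label, "") for label in labels]
        for (patient_id, event_id), cells in sorted(stays.items())
    ]
    return header, body


if __name__ == "__main__":
    labelled = [
        ("Pers001", "HospStay001", "C139", "Diagnosis_Code_1"),
        ("Pers001", "HospStay001", "I245", "Diagnosis_Code_2"),
        ("Pers001", "HospStay001", "D456", "Diagnosis_Code_3"),
        ("Pers555", "HospStay001", "D312", "Diagnosis_Code_1"),
    ]
    header, body = crosstab(labelled)
    print(" | ".join(header))
    for line in body:
        print(" | ".join(line))
```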