SQL - Convert/Transpose Rows with Coded Text Values to Columns

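I'm looking for help converting a table that stores data in rows into a table that stores the same data in columns.

Background... I am working with a table containing hospital admission data. Let's call the table "Inpatients".

The data is currently formatted as a table with 3 columns and n rows. The 3 columns contain the following data:

  • "Patient_ID" = unique patient/person identifier. Think of this as the patient's name;
  • "Event_ID" = unique admission event identifier. Identifies a distinct episode of hospital care;
  • "Diagnosis_Code" = ICD-10 code used to record the reason the patient was in hospital.

    For a single patient (Patient_ID), each hospital stay (Event_ID) is represented as one or more rows in the table, with one row for each diagnosis recorded during that stay.

    Any given hospital stay can therefore be captured in a single row of the table (one recorded diagnosis) or in multiple rows (multiple diagnoses).

    An example of the current "Inpatients" table is given below...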

    -------------------------------------------
    Patient_ID |  Event_ID   |  Diagnosis_Code
    -------------------------------------------
    Pers001    | HospStay001 |     C139
    Pers001    | HospStay001 |     I245
    Pers001    | HospStay001 |     D456
    Pers001    | HospStay002 |     C139
    Pers001    | HospStay002 |     J123
    Pers555    | HospStay001 |     D312
    Pers999    | HospStay001 |     C120
    Pers999    | HospStay001 |     E101
    

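    Here's what I actually want to do: I would like to transform the data so that I have only one row per patient per hospital stay, so that the table above is formatted as: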

    ----------------------------------------------------------------------------------------------------
    Patient_ID |  Event_ID   | Diagnosis_Code_1 | Diagnosis_Code_2 | Diagnosis_Code_3 | Diagnosis_Code_n
    ----------------------------------------------------------------------------------------------------
    Pers001    | HospStay001 |       C139       |       I245       |       D456       |
    Pers001    | HospStay002 |       C139       |       J123       |                  |
    Pers555    | HospStay001 |       D312       |                  |                  |
    Pers999    | HospStay001 |       C120       |       E101       |                  |
    

    I suspect the solution will require some dynamic SQL... which I'm afraid is not one of my strong points.

    Thanks!

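  • -- Sample data: Dig_Number is an extra column that numbers the diagnoses within each hospital stay
    CREATE TABLE #source (Patient_ID VARCHAR(100), Event_ID VARCHAR(100), Diagnosis_Code VARCHAR(100), Dig_Number INT);
    INSERT INTO #source (Patient_ID, Event_ID, Diagnosis_Code, Dig_Number) VALUES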
    ('Pers001','HospStay001','I245',2),
    ('Pers001','HospStay001','D456',3),
    ('Pers001','HospStay002','C139',1),
    ('Pers001','HospStay002','J123',2),
    ('Pers555','HospStay001','D312',1),
    ('Pers999','HospStay001','C120',1),
    ('Pers999','HospStay001','E101',2),
    ('Pers001','HospStay001','C139',1)
    
    
    -- To clean up when finished: DROP TABLE #source;
    
    
    DECLARE @cols AS NVARCHAR(MAX),
            @query AS NVARCHAR(MAX);
    
    -- Build the pivot column list, e.g. [1],[2],[3], from the distinct Dig_Number values
    SELECT @cols = STUFF
            (
              (
                SELECT ',' + QUOTENAME( CONVERT(VARCHAR(10),Dig_Number))
                FROM #source
                GROUP BY Dig_Number
    
                ORDER BY Dig_Number
                FOR XML PATH(''), TYPE
              ).value('.', 'NVARCHAR(MAX)'),
              1,1,''
            );
    
    -- Pivot Diagnosis_Code into one column per Dig_Number value
    SET @query = 'SELECT Patient_ID, Event_ID, ' + @cols + '
                  FROM
                  (
                    SELECT Patient_ID, Event_ID, Diagnosis_Code, Dig_Number
                    FROM #source
                  ) x
                  PIVOT
                  (
                    MAX(Diagnosis_Code)
                    FOR Dig_Number IN (' + @cols + ')
                  ) p ';
    
    EXECUTE(@query);
    

    Just add one more column, the diagnosis number (Dig_Number), and you're done.
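
    If the source table has no such column yet, one way to derive it in SQL Server is with ROW_NUMBER(). This is only a sketch: it assumes SQL Server 2005 or later and the three-column Inpatients table from the question. Because that table has no column recording the order in which diagnoses were entered, the sketch numbers them alphabetically by Diagnosis_Code. That ordering is an assumption and will not necessarily match the order shown in the question.

    ```sql
    -- Assumption: the table is named Inpatients and has only the three
    -- columns described in the question.
    -- Assumption: diagnoses are numbered alphabetically within each stay,
    -- since no entry-order column exists.
    SELECT Patient_ID,
           Event_ID,
           Diagnosis_Code,
           ROW_NUMBER() OVER (PARTITION BY Patient_ID, Event_ID
                              ORDER BY Diagnosis_Code) AS Dig_Number
    FROM Inpatients;
    ```

    The result of this query can take the place of the hand-filled #source table in the dynamic pivot above.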

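    Rajat is right - you need some kind of column from which to create Diagnosis_Code_1, Diagnosis_Code_2... and so on.

    To do this in MS Access, I would: 1. create a dummy column that numbers the diagnoses; 2. populate it using VBA (faster for large databases), as follows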

    Sub Update_Diagnosis_Code_ID()
    
    Dim db As DAO.Database
    Dim rs As DAO.Recordset
    Dim pstrSQL As String
    
    Dim dummyId As Integer
    Dim patientID As String
    Dim eventID As String
    
    Dim lastpatientID As String
    Dim lasteventID As String
    
    pstrSQL = "SELECT Inpat.Dummy_id, Inpat.Patient_id, Inpat.Event_ID, Inpat.Diagnosis_Code FROM Inpat ORDER BY Inpat.Patient_id, Inpat.Event_ID;"
    Set db = CurrentDb
    
    Set rs = db.OpenRecordset(pstrSQL)
    
    dummyId = 0
    
    
    With rs
        If Not .EOF Then
        'first record
          .MoveFirst
    
            patientID = rs.Fields(1) 'Patient_id
            eventID = rs.Fields(2) 'Event_ID
            .Edit
            rs.Fields(0) = dummyId + 1
            .Update
            .MoveNext
    
            Do While Not .EOF
              'store the values from the last record
              lastpatientID = patientID
              lasteventID = eventID
    
              'get the new values
    
              patientID = rs.Fields(1) 'Patient_id
              eventID = rs.Fields(2) 'Event_ID
    
              'new patient or new hospital stay
              If patientID <> lastpatientID Or eventID <> lasteventID Then
                dummyId = 0 'reset so that the write below (dummyId + 1) stores 1
              Else
                dummyId = dummyId + 1
              End If
    
              .Edit
              rs.Fields(0) = dummyId + 1
              .Update
              .MoveNext
    
            Loop
          End If
        End With
    
    rs.Close
    
    Set rs = Nothing
    Set db = Nothing
    
    MsgBox "Finished", vbExclamation
    
    End Sub
    

    Then use a crosstab query to display the data:

    TRANSFORM First(Inpat.[Diagnosis_Code]) AS FirstOfDiagnosis_Code
    SELECT Inpat.[Patient_id], Inpat.[Event_ID], Count(Inpat.[Diagnosis_Code]) 
    AS [Total Of Diagnosis_Code]
    FROM Inpat
    GROUP BY Inpat.[Patient_id], Inpat.[Event_ID]
    PIVOT Inpat.[Dummy_id];
    

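    Thanks to Rajat Jaiswal, LeasMaps and Tim Biegeleisen for your contributions. Much appreciated.

    The suggestion to add an extra column to the original table, to serve as the column headings in the transformed table, was the key. This turned out to be relatively easy to do (I did it in MS Excel).

    So my original table was edited to look like this...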

    --------------------------------------------------------------
    Patient_ID |  Event_ID   | Diagnosis_Code | DiagCode_Counter |
    --------------------------------------------------------------
    Pers001    | HospStay001 |     C139       | Diagnosis_Code_1 |
    Pers001    | HospStay001 |     I245       | Diagnosis_Code_2 |
    Pers001    | HospStay001 |     D456       | Diagnosis_Code_3 |
    Pers001    | HospStay002 |     C139       | Diagnosis_Code_1 |
    Pers001    | HospStay002 |     J123       | Diagnosis_Code_2 |
    Pers555    | HospStay001 |     D312       | Diagnosis_Code_1 |
    Pers999    | HospStay001 |     C120       | Diagnosis_Code_1 |
    Pers999    | HospStay001 |     E101       | Diagnosis_Code_2 |
    --------------------------------------------------------------
    

    In the newly added "DiagCode_Counter" field, the numeric suffix increases by 1 for each additional Diagnosis_Code value recorded against a unique "Event_ID".

    I was then able to create a Crosstab query in MS Access, using the "Patient_ID" and "Event_ID" fields as the ROW headings, the "DiagCode_Counter" field as the COLUMN headings, and the "Diagnosis_Code" entries as the VALUES.
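
    In Access SQL view, a crosstab like the one described might look roughly like the following. This is a sketch, not the exact query used: it assumes the edited table is named Inpatients in Access (the post does not say), and it picks First() as the aggregate on the assumption that each row/column combination holds a single code. Access SQL does not accept comments, so these assumptions are stated here rather than in the query.

    ```sql
    TRANSFORM First(Inpatients.Diagnosis_Code)
    SELECT Inpatients.Patient_ID, Inpatients.Event_ID
    FROM Inpatients
    GROUP BY Inpatients.Patient_ID, Inpatients.Event_ID
    PIVOT Inpatients.DiagCode_Counter;
    ```

    Because the column headings here are text, a stay with ten or more diagnoses would likely have Diagnosis_Code_10 sorted ahead of Diagnosis_Code_2. Listing the headings explicitly in a PIVOT ... IN (...) clause is one way to control the order.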