Convert a table with unknown structure into Key/Value

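The report data we receive from our analysts comes as tables with an arbitrary structure. All we know is that every row has a CustomerId column. Beyond that we know nothing, and the columns may differ every time.

The target system that receives this data only works with a Key/Value format, so we have to convert the report tables into Key/Value.

So, for example, if the source report table has the following structure: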

CREATE TABLE [dbo].[SampleSourceTable](
    [CustomerId] [bigint] NULL,
    [Column1] [nchar](10) NULL,
    [Column2] [int] NULL,
    [Column3] [datetime] NULL
) ON [PRIMARY]
GO
INSERT [dbo].[SampleSourceTable] ([CustomerId], [Column1], [Column2], [Column3]) VALUES (1, N'aaa', 123, CAST(N'2019-01-01T00:00:00.000' AS DateTime))
GO
INSERT [dbo].[SampleSourceTable] ([CustomerId], [Column1], [Column2], [Column3]) VALUES (2, N'bbb', 456, CAST(N'2018-01-01T00:00:00.000' AS DateTime))
GO

we would like to convert this data into the following structure:

CREATE TABLE [dbo].[SampleDestinationTable](
    [CustomerId] [bigint] NULL,
    [Attribute] [nvarchar](255) NULL,
    [Value] [nvarchar](max) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (1, N'Column1', N'aaa')
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (1, N'Column2', N'123')
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (1, N'Column3', N'2019-01-01 00:00:00.000')
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (2, N'Column1', N'bbb')
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (2, N'Column2', N'456')
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (2, N'Column3', N'2018-01-01 00:00:00.000')
GO

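The challenge, however, is that the source report table has no fixed structure.

At first I considered using a cursor to go through each row, with a nested cursor to go through all the columns of that row. But apparently there is no way of processing a row with an unknown structure using cursors. So now I am wondering whether PIVOT/UNPIVOT could be used. Then again, I think they need a column list as well.

I am running SQL Server 2017.

How can I convert data with an unknown structure?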

One possible approach is to generate a dynamic statement using the information from INFORMATION_SCHEMA.COLUMNS:

-- Declarations
DECLARE @stm nvarchar(max)

-- Dynamic part: one " UNION ALL SELECT ..." branch per non-key column
SELECT 
    @stm = STUFF((
        SELECT CONCAT(
            N' UNION ALL SELECT CustomerID, ''', 
            [COLUMN_NAME],
            N''' AS [Attribute], CONVERT(nvarchar(max), ',
            QUOTENAME([COLUMN_NAME]),
            CASE 
                WHEN DATA_TYPE = 'datetime' THEN N', 121'
                -- Add additional conversion rules for other data types
                ELSE N''
            END,
            N') AS [Value]', 
            N' FROM [SampleSourceTable]'
        )
        FROM INFORMATION_SCHEMA.COLUMNS
        -- Filter on the schema as well, so a same-named table elsewhere adds no extra branches
        WHERE (TABLE_SCHEMA = 'dbo') AND (TABLE_NAME = 'SampleSourceTable') AND (COLUMN_NAME <> 'CustomerId')
        ORDER BY ORDINAL_POSITION
    FOR XML PATH('')
    ), 1, 11, N'')  -- remove the leading ' UNION ALL ' (11 characters)

-- Whole statement and execution (note the leading space before ORDER BY)
SET @stm = @stm + N' ORDER BY CustomerID'
PRINT @stm 
EXEC (@stm)

Output:

CustomerID  Attribute   Value
1           Column1     aaa       
1           Column2     123
1           Column3     2019-01-01 00:00:00.000
2           Column3     2018-01-01 00:00:00.000
2           Column2     456
2           Column1     bbb       
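
To make the dynamic part easier to follow, here is roughly what the statement in @stm looks like for the sample table. This is a hand-formatted sketch, not captured PRINT output: the generated text is a single line, and the order of the branches is assumed to follow the column order of the table.

```sql
-- Approximate shape of the generated statement for SampleSourceTable
-- (line breaks added for readability).
SELECT CustomerID, 'Column1' AS [Attribute], CONVERT(nvarchar(max), [Column1]) AS [Value] FROM [SampleSourceTable]
UNION ALL
SELECT CustomerID, 'Column2' AS [Attribute], CONVERT(nvarchar(max), [Column2]) AS [Value] FROM [SampleSourceTable]
UNION ALL
-- datetime columns get style 121, i.e. yyyy-mm-dd hh:mi:ss.mmm
SELECT CustomerID, 'Column3' AS [Attribute], CONVERT(nvarchar(max), [Column3], 121) AS [Value] FROM [SampleSourceTable]
ORDER BY CustomerID
```

Because the statement orders by CustomerID only, the order of the attributes within one customer is not guaranteed, which is why the rows for customer 2 above come back in a different order. Ordering by CustomerID, [Attribute] should make it stable.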

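Ah, you opened a second question; I had just posted an answer on your first one...

So I will use this place to offer the same technique as in my other answer, which does not need dynamically created SQL. Try this: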

DECLARE @xml XML =(SELECT TOP 10 o.object_id,o.* FROM sys.objects o FOR XML RAW, ELEMENTS XSINIL);

SELECT r.value('*[1]/text()[1]','nvarchar(max)') AS RowID
        ,c.value('local-name(.)','nvarchar(max)') AS ColumnKey
        ,c.value('text()[1]','nvarchar(max)') AS ColumnValue
FROM @xml.nodes('/row') A(r)
CROSS APPLY A.r.nodes('*[position()>1]') B(c);

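The first column of the set is returned as the RowID. If that is not the right column, you can force it the same way I did above, by putting o.object_id first in the SELECT list. All columns of the result are then returned as EAV.

Partial result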

+-------+---------------------+-------------------------+
| RowID | ColumnKey           | ColumnValue             |
+-------+---------------------+-------------------------+
| 3     | name                | sysrscols               |
+-------+---------------------+-------------------------+
| 3     | object_id           | 3                       |
+-------+---------------------+-------------------------+
| 3     | principal_id        | NULL                    |
+-------+---------------------+-------------------------+
| 3     | schema_id           | 4                       |
+-------+---------------------+-------------------------+
| 3     | parent_object_id    | 0                       |
+-------+---------------------+-------------------------+
| 3     | type                | S                       |
+-------+---------------------+-------------------------+
| 3     | type_desc           | SYSTEM_TABLE            |
+-------+---------------------+-------------------------+
| 3     | create_date         | 2017-08-22T19:38:02.860 |
+-------+---------------------+-------------------------+
| 3     | modify_date         | 2017-08-22T19:38:02.867 |
+-------+---------------------+-------------------------+
| 3     | is_ms_shipped       | 1                       |
+-------+---------------------+-------------------------+
| 3     | is_published        | 0                       |
+-------+---------------------+-------------------------+
| 3     | is_schema_published | 0                       |
+-------+---------------------+-------------------------+
| 5     | name                | sysrowsets              |
+-------+---------------------+-------------------------+
| ... more rows ...
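
Applied to the sample table from the question, the same XML technique might look like the sketch below. It is untested against the asker's data and relies on the one thing the question guarantees, namely that a CustomerId column exists. The key is read by name instead of by position, and the CustomerId element is filtered out of the attribute list. The output column names are taken from the destination table in the question.

```sql
-- Sketch: shred every row of the source table into XML elements,
-- then return one (CustomerId, Attribute, Value) row per non-key element.
DECLARE @xml XML = (
    SELECT *
    FROM dbo.SampleSourceTable
    FOR XML RAW, ELEMENTS XSINIL   -- XSINIL keeps NULL columns as empty elements
);

SELECT r.value('(CustomerId/text())[1]', 'bigint')  AS CustomerId
      ,c.value('local-name(.)', 'nvarchar(255)')    AS [Attribute]
      ,c.value('text()[1]', 'nvarchar(max)')        AS [Value]
FROM @xml.nodes('/row') A(r)
CROSS APPLY A.r.nodes('*[local-name(.) != "CustomerId"]') B(c);
```

One difference from the dynamic SQL answer: values pass through XML serialization, so datetime values arrive in ISO 8601 form with a T separator (as in the create_date rows above) and not in the yyyy-mm-dd hh:mi:ss.mmm format shown in the question. If the target system needs that exact format, the value would have to be converted afterwards.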