Parsing a dynamic JSON variable in a SQL Server stored procedure

I get JSON data after calling an API from a stored procedure. When I try to parse this JSON, I get the following error:

JSON path is not properly formatted. Unexpected character '@' is found at position 2

I would like to know how to pass the dynamic variable @variant_description to my parser.

Thanks in advance.

ALTER PROCEDURE [dbo].[usp_CallAPI]
    @genome_build nvarchar(50),
    @variant_description nvarchar(50),
    @select_transcripts  nvarchar(50)
AS
BEGIN
    DECLARE @Object AS Int;
    DECLARE @hr int, @APIfunction nvarchar(max)
    DECLARE @json AS TABLE (Json_Table nvarchar(max))
    DECLARE @Json_response AS nvarchar(max) 

    SET @APIfunction = 'https://rest.variantvalidator.org/VariantValidator/variantvalidator/'+ @genome_build+'/'+@variant_description+'/'+@select_transcripts

    EXEC @hr = sp_OACreate 'MSXML2.ServerXMLHTTP.6.0', @Object OUT;

    IF @hr <> 0 
        EXEC sp_OAGetErrorInfo @Object

    EXEC @hr = sp_OAMethod @Object, 'open', NULL, 'get', @APIfunction
                 , --Your Web Service Url (invoked)
                 'false'
    IF @hr <> 0 
        EXEC sp_OAGetErrorInfo @Object

    EXEC @hr = sp_OAMethod @Object, 'send'

    IF @hr <> 0 
        EXEC sp_OAGetErrorInfo @Object

    EXEC @hr = sp_OAMethod @Object, 'responseText', @json OUTPUT

    IF @hr <> 0 
        EXEC sp_OAGetErrorInfo @Object

    INSERT INTO @json (Json_Table) 
        EXEC sp_OAGetProperty @Object, 'responseText'

    -- select the JSON string
    -- SELECT '"elements": ['+Json_Table+ ']' FROM @json
    -- SELECT * FROM @json

    -- Parse the JSON string
    SELECT * 
    FROM OPENJSON((SELECT CONCAT('{', QUOTENAME('elements', '"'), ':['+Json_Table+ ']}') FROM @json), N'$.elements')
    WITH (   
          -- [seqrepo_db] nvarchar(max) N'$.seqrepo_db'   ,
          [variant_description] nvarchar(max) N'$.@variant_description' AS JSON,
          [flag] nvarchar(max) N'$.flag',
          [metadata] nvarchar(max) N'$.metadata' AS JSON
         )

    EXEC sp_OADestroy @Object
END

For example, I execute the following:

EXEC [dbo].usp_CallAPI @genome_build ='GRCh37',@variant_description = 'NM_000088.3:c.589G>T', @select_transcripts ='all';

For example, this is the test JSON returned by my API request:
{
  "NM_000088.3:c.589G>T": {
    "alt_genomic_loci": [],
    "gene_ids": {
      "ccds_ids": [
        "CCDS11561"
      ],
      "ensembl_gene_id": "ENSG00000108821",
      "entrez_gene_id": "1277",
      "hgnc_id": "HGNC:2197",
      "omim_id": [
        "120150"
      ],
      "ucsc_id": "uc002iqm.4"
    },
    "gene_symbol": "COL1A1",
    "genome_context_intronic_sequence": "",
    "hgvs_lrg_transcript_variant": "LRG_1t1:c.589G>T",
    "hgvs_lrg_variant": "LRG_1:g.8638G>T",
    "hgvs_predicted_protein_consequence": {
      "slr": "NP_000079.2:p.(G197C)",
      "tlr": "NP_000079.2(LRG_1p1):p.(Gly197Cys)"
    },
    "hgvs_refseqgene_variant": "NG_007400.1:g.8638G>T",
    "hgvs_transcript_variant": "NM_000088.3:c.589G>T",
    "primary_assembly_loci": {
      "grch37": {
        "hgvs_genomic_description": "NC_000017.10:g.48275363C>A",
        "vcf": {
          "alt": "A",
          "chr": "17",
          "pos": "48275363",
          "ref": "C"
        }
      },
      "grch38": {
        "hgvs_genomic_description": "NC_000017.11:g.50198002C>A",
        "vcf": {
          "alt": "A",
          "chr": "17",
          "pos": "50198002",
          "ref": "C"
        }
      },
      "hg19": {
        "hgvs_genomic_description": "NC_000017.10:g.48275363C>A",
        "vcf": {
          "alt": "A",
          "chr": "chr17",
          "pos": "48275363",
          "ref": "C"
        }
      },
      "hg38": {
        "hgvs_genomic_description": "NC_000017.11:g.50198002C>A",
        "vcf": {
          "alt": "A",
          "chr": "chr17",
          "pos": "50198002",
          "ref": "C"
        }
      }
    },
    "reference_sequence_records": {
      "lrg": "http://ftp.ebi.ac.uk/pub/databases/lrgex/LRG_1.xml",
      "protein": "https://www.ncbi.nlm.nih.gov/nuccore/NP_000079.2",
      "refseqgene": "https://www.ncbi.nlm.nih.gov/nuccore/NG_007400.1",
      "transcript": "https://www.ncbi.nlm.nih.gov/nuccore/NM_000088.3"
    },
    "refseqgene_context_intronic_sequence": "",
    "submitted_variant": "NM_000088.3:c.589G>T",
    "transcript_description": "Homo sapiens collagen type I alpha 1 chain (COL1A1), mRNA",
    "validation_warnings": []
  },
  "flag": "gene_variant",
  "metadata": {
    "seqrepo_db": "2018-08-21",
    "uta_schema": "uta_20180821",
    "variantvalidator_hgvs_version": "1.2.5.vv1",
    "variantvalidator_version": "1.0.4.dev17+gd16b9ef.d20200422"
  }
}

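I think that when you use OPENJSON() with an explicit schema (the WITH clause), you cannot use a variable for the path expression. However, if the JSON response has a fixed structure, that is, a JSON object (not an array) with three key/value pairs (the first named after the variable, followed by "flag" and "metadata"), you can change the way you parse the JSON and use JSON_VALUE() and JSON_QUERY() instead. In that case, starting with SQL Server 2017, you can provide a variable as the value of path. You also do not need to build a new JSON object ({"elements":[...]}); just parse the returned JSON response directly.

Use the following statement at the end of the stored procedure: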

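-- @json is a table variable in the procedure, so read its Json_Table column
SELECT 
   variant_description = JSON_QUERY(Json_Table, '$."' + @variant_description + '"'),
   flag = JSON_VALUE(Json_Table, '$."flag"'),
   metadata = JSON_QUERY(Json_Table, '$."metadata"')
FROM @json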

instead of:

SELECT * 
FROM OPENJSON(
   (SELECT CONCAT('{', QUOTENAME('elements', '"'), ':['+Json_Table+ ']}') FROM @json), 
   N'$.elements'
) WITH (   
   -- [seqrepo_db] nvarchar(max) N'$.seqrepo_db'   ,
   [variant_description] nvarchar(max) N'$.@variant_description' AS JSON,
   [flag] nvarchar(max) N'$.flag',
   [metadata] nvarchar(max) N'$.metadata' AS JSON
)
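
As a quick way to check the JSON_VALUE() / JSON_QUERY() approach outside the procedure, here is a minimal sketch that uses a heavily trimmed copy of the sample response held in a scalar variable. The names @response and @path and the trimmed JSON are assumptions made for illustration only. Because the path is built from a variable, it needs SQL Server 2017 or later.

```sql
-- Illustration only: @response is a hypothetical scalar variable holding
-- a trimmed copy of the API response shown in the question.
DECLARE @variant_description nvarchar(50) = N'NM_000088.3:c.589G>T';
DECLARE @response nvarchar(max) = N'{
  "NM_000088.3:c.589G>T": { "gene_symbol": "COL1A1", "validation_warnings": [] },
  "flag": "gene_variant",
  "metadata": { "seqrepo_db": "2018-08-21" }
}';

-- The key is wrapped in double quotes because it contains '.', ':' and '>'.
DECLARE @path nvarchar(200) = N'$."' + @variant_description + N'"';

SELECT
   variant_description = JSON_QUERY(@response, @path),                    -- whole object as JSON text
   gene_symbol         = JSON_VALUE(@response, @path + N'.gene_symbol'),  -- scalar inside that object
   flag                = JSON_VALUE(@response, N'$.flag'),
   metadata            = JSON_QUERY(@response, N'$.metadata');
```

JSON_QUERY() is used for the object-valued keys and JSON_VALUE() for scalars; in the default lax mode, using the wrong one of the two returns NULL rather than raising an error.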

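As a side note (and for your future work), the actual reason for the error "JSON path is not properly formatted. Unexpected character '@' is found at position 2" is that you use N'$.@variant_description' as the path expression. As the documentation states, if ... the key name starts with a dollar sign or contains special characters such as spaces, surround it with quotes (for example N'$."@variant_description"'). Note that quoting only fixes the syntax error: the quoted path matches a key literally named @variant_description and does not substitute the value of the variable.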
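
The difference between a quoted literal path and a key taken from a variable can be sketched as follows. The first query shows that quoting the key lets the path parse but matches only a key literally named @variant_description. The second uses OPENJSON() with its default schema (no WITH clause), which returns one row per top-level key and can therefore be filtered by a variable; this alternative should also work on SQL Server 2016. The trimmed sample JSON and the variable @response are assumptions made for illustration.

```sql
-- Illustration only: @response is a hypothetical scalar variable holding
-- a trimmed copy of the API response.
DECLARE @variant_description nvarchar(50) = N'NM_000088.3:c.589G>T';
DECLARE @response nvarchar(max) = N'{
  "NM_000088.3:c.589G>T": { "gene_symbol": "COL1A1" },
  "flag": "gene_variant"
}';

-- Quoted literal key: the path parses without error, but it looks for a key
-- literally named "@variant_description", so that column is NULL here.
SELECT *
FROM OPENJSON(@response)
WITH (
   [variant_description] nvarchar(max) N'$."@variant_description"' AS JSON,
   [flag] nvarchar(max) N'$.flag'
);

-- Default schema: one row per top-level key with [key], [value] and [type]
-- columns, so the dynamic key can be selected with an ordinary WHERE clause.
SELECT [key], [value]
FROM OPENJSON(@response)
WHERE [key] = @variant_description;
```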