Crawl stock availability data from a website

I want to scrape the stock availability of a product from the following website.

[{"@type":"Offer","availability":"https://schema.org/InStock","price":"479.00","priceCurrency":"EUR","url":"https://www.mantel.com/cube-aim-pro&spec[]=9470&spec[]=2756&spec[]=285"},{"@type":"Offer","availability":"http://schema.org/OutOfStock","price":"479.00","priceCurrency":"EUR","url":"https://www.mantel.com/cube-aim-pro&spec[]=9470&spec[]=2768&spec[]=285"},{"@type":"Offer","availability":"http://schema.org/OutOfStock","price":"479.00","priceCurrency":"EUR","url":"https://www.mantel.com/cube-aim-pro&spec[]=9470&spec[]=2811&spec[]=285"},{"@type":"Offer","availability":"http://schema.org/OutOfStock","price":"479.00","priceCurrency":"EUR","url":"https://www.mantel.com/cube-aim-pro&spec[]=9470&spec[]=2757&spec[]=285"}],"aggregateRating":{"@type":"AggregateRating","ratingValue":"9.0","ratingCount":"6","bestRating":"10"}}

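I need the schema.org/InStock or schema.org/OutOfStock value so that I can be notified when the product comes back in stock and buy it. This is for personal use, since mountain bike availability is very limited at the moment, so I want to build a quick program that notifies me when my MTB size is in stock. If I had a script that fetches the data for this specific product, I could create an SSIS package with SQL Server and set up an email notification for when the availability field is "InStock". I am familiar with SSIS and SQL Server. Can someone help me get the data from the website?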

You can process the JSON directly in SSIS, but you can also use SQL Server.

Use SSIS to insert the JSON into a table, then parse it with OPENJSON.

Here I insert your sample JSON into a temp table and query it with T-SQL:

-- Sample data: the array of offers from the question.
-- The trailing "aggregateRating" fragment is left out because it sits
-- outside the array and would make the text invalid JSON.
DECLARE @json NVARCHAR(MAX) =
N'
[{"@type":"Offer","availability":"https://schema.org/InStock","price":"479.00","priceCurrency":"EUR","url":"https://www.mantel.com/cube-aim-pro&spec[]=9470&spec[]=2756&spec[]=285"}
,{"@type":"Offer","availability":"http://schema.org/OutOfStock","price":"479.00","priceCurrency":"EUR","url":"https://www.mantel.com/cube-aim-pro&spec[]=9470&spec[]=2768&spec[]=285"}
,{"@type":"Offer","availability":"http://schema.org/OutOfStock","price":"479.00","priceCurrency":"EUR","url":"https://www.mantel.com/cube-aim-pro&spec[]=9470&spec[]=2811&spec[]=285"}
,{"@type":"Offer","availability":"http://schema.org/OutOfStock","price":"479.00","priceCurrency":"EUR","url":"https://www.mantel.com/cube-aim-pro&spec[]=9470&spec[]=2757&spec[]=285"}]';


CREATE TABLE #tmp (
      id INT IDENTITY (1, 1) NOT NULL
    , json NVARCHAR(MAX) NOT NULL
)

INSERT INTO #tmp (json)
VALUES (@json)

SELECT [AdType]
     , [availability]
     , [price]
     , [priceCurrency]
     , [url]
FROM (
    SELECT TOP 1 json
    FROM #tmp
    ORDER BY id DESC
) a
    OUTER APPLY OPENJSON(a.json)
    WITH
    (
    AdType VARCHAR(100) '$."@type"'
    , availability NVARCHAR(256)
    , price DECIMAL(19, 2)
    , priceCurrency NVARCHAR(3)
    , url NVARCHAR(512)
    )

Your question is tagged python. If you fetch the data with Python, you can parse the JSON directly into Python objects, without needing SSIS or SQL Server.
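
As a rough illustration of the Python route, here is a minimal sketch using only the standard library. It assumes the page embeds the offers as JSON-LD inside a `<script type="application/ld+json">` tag, which the schema.org fields suggest but which I have not verified for this site. The names `JsonLdExtractor`, `find_offers`, `in_stock_urls` and the `SAMPLE_HTML` wrapper (a "Product" object with an "offers" list) are made up for the example.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_ld = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so append to the last block.
        if self._in_json_ld:
            self.blocks[-1] += data


def find_offers(node):
    """Recursively yield every dict whose "@type" is "Offer"."""
    if isinstance(node, dict):
        if node.get("@type") == "Offer":
            yield node
        for value in node.values():
            yield from find_offers(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_offers(item)


def in_stock_urls(html):
    """Return the url of every offer whose availability ends with /InStock."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    urls = []
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
        for offer in find_offers(data):
            # Matches both the http:// and https:// forms of the schema.org value.
            if str(offer.get("availability", "")).endswith("/InStock"):
                urls.append(offer.get("url"))
    return urls


# Hypothetical page snippet: the outer "Product"/"offers" wrapper is assumed,
# the two offers are copied from the question.
SAMPLE_HTML = """
<html><head><script type="application/ld+json">
{"@type": "Product", "offers": [
  {"@type": "Offer", "availability": "https://schema.org/InStock", "price": "479.00", "priceCurrency": "EUR", "url": "https://www.mantel.com/cube-aim-pro&spec[]=9470&spec[]=2756&spec[]=285"},
  {"@type": "Offer", "availability": "http://schema.org/OutOfStock", "price": "479.00", "priceCurrency": "EUR", "url": "https://www.mantel.com/cube-aim-pro&spec[]=9470&spec[]=2768&spec[]=285"}
]}
</script></head><body></body></html>
"""

print(in_stock_urls(SAMPLE_HTML))
```

For the live page you would pass the downloaded HTML instead of `SAMPLE_HTML`, for example the decoded result of `urllib.request.urlopen(url).read()`. Whether the site serves this data to a plain request is something you would have to check. A non-empty result list is then the trigger for your notification.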