Scraping data and combining it to form a single variable

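My goal is to scrape the sizes and data-sku values from this site and combine them into one variable for sizing and one for sizeID. The site lists 3 sizes, 8.5 to 9.5, and each size has its own unique data-sku. How can I combine the data so that sizing and sizeID each hold all 3 values?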

Desired result:

variable for sizing = 8.5,9,9.5
variable for sizeID = 16139989_jdsportssg.2905173,16139989_jdsportssg.2905175,16139989_jdsportssg.2905176

Current code:

import requests
from bs4 import BeautifulSoup

# scrape the page for the product size IDs
stocksource = requests.get('https://m.jdsports.com.sg/product/red-jordan-air-1-mid/16139989_jdsportssg/stock/').text
stockpage = BeautifulSoup(stocksource, "lxml")

for size in stockpage.select('#productSizeStock > button'):
    global sizing
    global sizeID
    sizing = size.text
    sizeID = size['data-sku']
    print(sizing)
    print(sizeID) 

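I'm also not sure why, but if you run my code as it is, the sizing results come back with a lot of whitespace at the beginning and end. Any help would be greatly appreciated! :)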

You can create a dictionary where the keys are the sizes and the values are the SKUs:

import requests
from bs4 import BeautifulSoup

stocksource = requests.get(
    "https://m.jdsports.com.sg/product/red-jordan-air-1-mid/16139989_jdsportssg/stock/"
).text
stockpage = BeautifulSoup(stocksource, "lxml")

out = {}
for button in stockpage.select("#productSizeStock > button"):
    out[button.get_text(strip=True)] = button["data-sku"]

print(out)

Prints:

{'8.5': '16139989_jdsportssg.2905173', 
 '9': '16139989_jdsportssg.2905175', 
 '9.5': '16139989_jdsportssg.2905176', 
 '10': '16139989_jdsportssg.2905179'}
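
As for the whitespace: `.text` returns everything between the opening and closing tags, including any newlines and indentation in the page's markup, while `get_text(strip=True)` strips each piece of text first. The following is only a sketch on a made-up HTML fragment (the markup is an assumption, not copied from the site, and it uses the built-in "html.parser" so it runs without lxml):

```python
from bs4 import BeautifulSoup

# Hypothetical fragment that imitates a button whose label is
# surrounded by newlines and indentation in the source markup.
html = """
<div id="productSizeStock">
  <button data-sku="16139989_jdsportssg.2905173">
      8.5
  </button>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
button = soup.select_one("#productSizeStock > button")

raw = button.text                     # keeps the surrounding whitespace
clean = button.get_text(strip=True)   # whitespace removed

print(repr(raw))
print(repr(clean))  # '8.5'
```

In the original loop, `size.text.strip()` would have the same effect.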

EDIT: To load the data into separate lists:

import requests
from bs4 import BeautifulSoup

stocksource = requests.get(
    "https://m.jdsports.com.sg/product/red-jordan-air-1-mid/16139989_jdsportssg/stock/"
).text
stockpage = BeautifulSoup(stocksource, "lxml")

sizes = []
skus = []
for button in stockpage.select("#productSizeStock > button"):
    sizes.append(button.get_text(strip=True))
    skus.append(button["data-sku"])

print(sizes)
print(skus)

Prints:

['8.5', '9', '9.5', '10']
['16139989_jdsportssg.2905173', '16139989_jdsportssg.2905175', '16139989_jdsportssg.2905176', '16139989_jdsportssg.2905179']
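
If you need each result as one comma-separated string, as in the desired result in the question, you can join the lists with `str.join`. A minimal sketch, with the two lists hard-coded from the output above so it runs without a network request (the names `sizing` and `sizeID` are taken from the question):

```python
# Values copied from the printed output above; in the real script
# these lists would be filled by the scraping loop.
sizes = ["8.5", "9", "9.5", "10"]
skus = [
    "16139989_jdsportssg.2905173",
    "16139989_jdsportssg.2905175",
    "16139989_jdsportssg.2905176",
    "16139989_jdsportssg.2905179",
]

# Join every element into a single comma-separated string.
sizing = ",".join(sizes)
sizeID = ",".join(skus)

print(sizing)  # 8.5,9,9.5,10
print(sizeID)
```

Because the strings are built after the loop has finished, no `global` statements are needed.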