Selenium scrapes only the first item that it finds
I am using the following code block to scrape a website:
from selenium import webdriver

driver = webdriver.Chrome(executable_path=r'C:/Users/USER/Downloads/chromedriver_win32/chromedriver.exe')
url = 'https://mamikos.com/cari/ugm/all/bulanan/0-15000000'
driver.get(url)
kamar = driver.find_elements_by_class_name('kost-rc__content')
for desc in kamar:
    nama = desc.find_element_by_xpath('//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[2]/div/span[1]').text
    kecamatan = desc.find_element_by_xpath('//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[2]/div/span[2]').text
    harga = desc.find_element_by_xpath('//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[4]/div/div[2]/div/span[1]').text
    print(nama, kecamatan, harga)
After running it, the output seems to print only the first result on the page. I tried changing the XPath to this:
for desc in kamar:
    nama = desc.find_element_by_xpath('.//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[2]/div/span[1]').text
    kecamatan = desc.find_element_by_xpath('.//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[2]/div/span[2]').text
    harga = desc.find_element_by_xpath('.//*[@id="app"]/div/div[5]/div/div[1]/div/div/div[1]/div[1]/div[1]/div/div[2]/div/div[2]/div[4]/div/div[2]/div/span[1]').text
    print(nama, kecamatan, harga)
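But that only raises an error. Please help.
Side note: Google Chrome is version 95.0.4638.69 (Official Build) (64-bit), and the driver used is ChromeDriver 95.0.4638.69.
To scrape the name, info, and price, you can use the following.
Code block: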
# Imports needed by the explicit waits below
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver.get("https://mamikos.com/cari/ugm/all/bulanan/0-15000000")
# Wait up to 20 seconds for each group of elements, then collect their text
names = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='kost-rc__info']//span[contains(@class, 'rc-info__name')]")))]
infos = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='kost-rc__info']//span[contains(@class, 'rc-info__location')]")))]
prices = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='rc-price__real']//span[contains(@class, 'rc-price__text')]")))]
for i, j, k in zip(names, infos, prices):
    print(f"Name:{i} Title:{j} Price:{k}")
driver.quit()
Console output:
Name:Kost Singgahsini Sakura Karanggayam Sleman Yogyakarta Title:Kecamatan Depok Price:Rp1.370.000
Name:Kost Singgahsini Granada UGM Yogyakarta Title:Kecamatan Depok Price:Rp1.790.000
Name:Kost Kurnia Terban Tipe A UGM Yogyakarta RMZ Title:Kecamatan Gondokusuman Price:Rp606.000
Name:Kost Singgahsini Maleo UGM Kaliurang Yogyakarta Title:Kecamatan Depok Price:Rp1.973.000
Name:Kost AB-AE Tipe B Gejayan Yogyakarta RMZ Title:Depok Price:Rp1.710.000
Name:Kost AB-AE Tipe A Gejayan Yogyakarta RMZ Title:Depok Price:Rp1.425.000
Name:Kost Pogung Familia Tipe C Sleman Yogyakarta RMZ Title:Mlati Price:Rp1.900.000
Name:Kost Pogung Familia Tipe B Sleman Yogyakarta RMZ Title:Mlati Price:Rp1.710.000
Name:Kost Pogung Familia Tipe A Sleman Yogyakarta RMZ Title:Mlati Price:Rp1.425.000
Name:Kost Hanung Tipe B UGM Yogyakarta RMZ Title:Mlati Price:Rp736.000
Name:Kost Apik Tapak Dara Tipe B Deresan Yogyakarta Title:Depok Price:Rp1.620.000
Name:Kost Singgahsini Putri Maoni Tipe A Gejayan Yogyakarta Title:Depok Price:Rp1.520.000
Name:Kost Singgahsini Omah Khiar Tipe F Karang Gayam Yogyakarta Title:Depok Price:Rp1.720.000
Name:Kost Apik Tapak Dara Tipe C Deresan Yogyakarta Title:Kecamatan Depok Price:Rp2.205.000
Name:Kost Singgahsini Putri Maoni Tipe B Gejayan Yogyakarta Title:Depok Price:Rp1.720.000
Name:Kost Wisma Yudhistira Tipe C Mlati Sleman Yogyakarta Title:Mlati Price:Rp2.250.000
Name:Kost Pondok Bugenvil 3 Caturtunggal Depok Sleman Title:Depok Price:Rp1.800.000
Name:Kost Pranasmara 34C Tipe B Depok Sleman Title:Depok Price:Rp1.200.000
Name:Kost Pondok Bugenvil 2 Caturtunggal Depok Sleman Yogyakarta Title:Depok Price:Rp1.800.000
Name:Kost Rahayu Residence Tipe C Depok Sleman Yogyakarta Title:Depok Price:Rp1.150.000
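One caveat about the code above: `zip` pairs the three lists purely by position and stops at the shortest one, so if any listing on the page lacks one of the fields, later rows could be mismatched or dropped without any error. A minimal sketch of a guard, using the hypothetical helper name `pair_fields` (not part of Selenium or of the answer above), might look like this:

```python
def pair_fields(names, infos, prices):
    """Pair three parallel lists, refusing to proceed if their lengths differ."""
    lengths = (len(names), len(infos), len(prices))
    if len(set(lengths)) != 1:
        # A mismatch means at least one listing is missing a field,
        # so positional pairing can no longer be trusted.
        raise ValueError(f"field counts differ: names/infos/prices = {lengths}")
    return list(zip(names, infos, prices))
```

It could be dropped in where the answer calls `zip(names, infos, prices)` directly.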
Here is complete C# code that solves your problem. You can adapt it to your language, especially the XPath part.
var els = driver.FindElements(By.XPath("//div[@class='kost-rc__content']"));
foreach (var el in els)
{
    // The leading "." keeps each search inside the current card.
    var nama = el.FindElement(By.XPath(".//span[@class='rc-info__name bg-c-text bg-c-text--title-4 ']"));
    Console.WriteLine("nama:" + nama.Text);
    var kecamatan = el.FindElement(By.XPath(".//span[@class='rc-info__location bg-c-text bg-c-text--body-1 ']"));
    Console.WriteLine("kecamatan:" + kecamatan.Text);
}
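For readers working in Python, the per-card approach above might be adapted roughly as in the sketch below. It assumes Selenium 4, where `find_element(by, value)` replaces the `find_element_by_*` helpers, and it assumes the class names `kost-rc__content`, `rc-info__name`, `rc-info__location`, and `rc-price__text` quoted in this thread still match the page and that the price span sits inside the same card. The names `extract_listings` and `main` are hypothetical. The key detail is the leading `.` in each XPath: `.//span` searches inside the current card, whereas `//span` searches the whole document and therefore returns the first match on the page on every iteration.

```python
XPATH = "xpath"  # the same locator string as selenium's By.XPATH

# Relative XPaths: the leading "." scopes each search to one card.
NAME_XPATH = ".//span[contains(@class, 'rc-info__name')]"
LOCATION_XPATH = ".//span[contains(@class, 'rc-info__location')]"
PRICE_XPATH = ".//span[contains(@class, 'rc-price__text')]"  # assumed to be inside the card


def extract_listings(cards):
    """Return one (name, location, price) tuple per card element."""
    rows = []
    for card in cards:
        name = card.find_element(XPATH, NAME_XPATH).text
        location = card.find_element(XPATH, LOCATION_XPATH).text
        price = card.find_element(XPATH, PRICE_XPATH).text
        rows.append((name, location, price))
    return rows


def main():
    # Imported here so extract_listings can be exercised without a browser.
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Chrome()  # assumes chromedriver can be found, e.g. on PATH
    try:
        driver.get("https://mamikos.com/cari/ugm/all/bulanan/0-15000000")
        # Wait for the listing cards to render before reading them.
        cards = WebDriverWait(driver, 20).until(
            EC.visibility_of_all_elements_located(
                (XPATH, "//div[@class='kost-rc__content']")
            )
        )
        for name, location, price in extract_listings(cards):
            print(name, location, price)
    finally:
        driver.quit()
```

Calling `main()` opens Chrome and prints one line per card. Because every field is read from its own card, a listing with a missing field raises `NoSuchElementException` for that card instead of silently shifting the other rows.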