Python Scraping a Web Page
I am trying to scrape all the 'a' tag values from this example link: https://www.fundoodata.com/companies-in/list-of-apparel-stores-companies-in-india-i239
What I need is to copy only the names inside the 'a' tags and save them to a CSV file.
I am a beginner; can anyone help me? Below is my broken code:
# importing the modules
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd
import time
import os

link = "https://www.fundoodata.com/companies-in/list-of-apparel-stores-companies-in-india-i239"

# instantiating empty lists
nameList = []

for i in range(1):
    driver = webdriver.Chrome(ChromeDriverManager().install())

    # fetching all the store details
    storeDetails = driver.find_elements_by_class_name('search-result')

    # iterating the storeDetails
    for j in range(len(storeDetails)):
        # fetching the name, address and contact for each entry
        name = storeDetails[j].find_element_by_class_name('heading').text
        myList = []
        nameList.append(name)

    driver.close()

# initialize data of lists.
data = {'Company Name': nameList,}

# Create DataFrame
df = pd.DataFrame(data)
print(df)

# Save Data as .csv
df.to_csv("D:\xxx\xxx\xx\xxx\demo.csv", mode='w+', header = False)
There are quite a few problems here. The first one is that you never open the page:
## to open webpage
driver.get(link)
Second, you don't need the first for loop at all. Finally:
from selenium.webdriver.common.by import By
## find the a tags inside the search results
storedetails = driver.find_elements(By.CSS_SELECTOR, 'div.heading a')
## iterating over elements list
for name in storedetails:
## appending the a tag text
nameList.append(name.text)
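Putting the pieces together, a minimal end-to-end sketch might look like the one below. The Service wrapper and the output path are my assumptions; adjust them to your Selenium version and to wherever you actually want the file:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd

link = "https://www.fundoodata.com/companies-in/list-of-apparel-stores-companies-in-india-i239"

# Selenium 4 expects the driver path wrapped in a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get(link)  # open the page before searching for elements

nameList = []
# each company name is the text of an <a> tag inside a div with class "heading"
for name in driver.find_elements(By.CSS_SELECTOR, 'div.heading a'):
    nameList.append(name.text)

driver.quit()

df = pd.DataFrame({'Company Name': nameList})
print(df)
# use a raw string so backslashes in the Windows path are not treated as escapes;
# r"D:\output\demo.csv" is only a placeholder path
df.to_csv(r"D:\output\demo.csv", index=False)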
I hope this helps! :)