How to use a globally defined variable in a Scrapy spider?
How can I use a globally defined variable, the pandas DataFrame df, in a Scrapy spider?
import scrapy
import pandas as pd
import re

df = pd.read_csv('home/test.csv')


class Spider(scrapy.Spider):
    name = 'test'
    start_urls = ['https://test.org']

    def parse(self, response):
        data = response.css('get-data-here').extract()
        for i in data:
            # Placeholder regex; it needs a capture group for group(1) to work.
            key = re.search(r'(test)', i).group(1)
            # Assumes the columns are labelled 0 and 1.
            final_output = df.loc[df[0] == key, 1].item()
Declare the variable inside the class; if you need to initialize it, do that in the constructor.
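import scrapy
import pandas as pd
import re


class Spider(scrapy.Spider):
    name = 'test'
    start_urls = ['https://test.org']

    def __init__(self, *args, **kwargs):
        # Let scrapy.Spider finish its own setup first.
        super().__init__(*args, **kwargs)
        # Load the CSV once, when the spider is created.
        self.df = pd.read_csv('home/test.csv')

    def parse(self, response):
        data = response.css('get-data-here').extract()
        # Use the DataFrame via self.df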
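The answer leaves the actual lookup against self.df open. Here is a minimal sketch of what that step might look like. It assumes the CSV has no header row, so the columns are labelled 0 and 1, and that the key can be pulled out of each scraped string with a regex containing one capture group. The helper name `lookup_value`, the pattern `id=(\w+)`, and the sample data are all hypothetical, made up for illustration.

```python
import io
import re

import pandas as pd

# Hypothetical pattern: the key is whatever follows "id=" in the scraped text.
KEY_PATTERN = re.compile(r'id=(\w+)')


def lookup_value(df, text):
    """Return the column-1 value whose column-0 key appears in `text`, or None."""
    match = KEY_PATTERN.search(text)
    if match is None:
        return None  # no key found in this string
    rows = df.loc[df[0] == match.group(1), 1]
    if rows.empty:
        return None  # key is not present in the table
    return rows.iloc[0]


# Stand-in for pd.read_csv('home/test.csv', header=None); the data is made up.
sample_df = pd.read_csv(io.StringIO("a1,apple\nb2,banana\n"), header=None)

found = lookup_value(sample_df, '<a href="/item?id=b2">')
missing = lookup_value(sample_df, '<a href="/item?id=zz">')
```

Inside the spider, parse would then call something like lookup_value(self.df, i) for each extracted string i. Returning None for a missing match or key avoids the exception that calling .group(1) on a failed search, or .item() on an empty selection, would raise.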