Scrapy: no item output | DEBUG: Crawled (200) ... (referer: None)

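I am trying to scrape bid information from this site. I am new to Scrapy and confused about why I get no output: the log shows only Crawled (200) ... (referer: None) and no items are scraped. I can't figure out what I am missing or what I need to change. Can anyone help me solve this?

Thanks!

Here is my spider code: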

from ..items import GovernmentItem
import scrapy, urllib.parse

class GeorgiaSpider(scrapy.Spider):
    name = 'georgia'
    allowed_domains = ['ssl.doas.state.ga.us']

    def start_requests(self):
        url = 'https://ssl.doas.state.ga.us/gpr/'

        yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        for row in response.xpath('//*[@class="table table-striped table-bordered"]//tbody//tr'):
            item = GovernmentItem()

            item['description'] = row.xpath('./td[@class=" all"][2]').extract_first()
            item['begin_date'] = row.xpath('./td[@class=" desktop"]').extract_first()
            item['end_date'] = row.xpath('./td[@class="desktop tablet mobile sorting_1"]').extract_first()
            item['file_urls'] = row.xpath('./td[@class=" all"]/a//@href').extract_first()

            yield item
            

Here is my crawl log:

    2021-07-23 05:49:13 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: government)
    2021-07-23 05:49:13 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.2.0, Python 3.8.10 (default, Jun  2 2021, 10:49:15) - [GCC 9.4.0], pyOpenSSL 20.0.1 (OpenSSL 1.1.1k  25 Mar 2021), cryptography 3.4.7, Platform Linux-5.8.0-63-generic-x86_64-with-glibc2.29
    2021-07-23 05:49:13 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
    2021-07-23 05:49:13 [scrapy.crawler] INFO: Overridden settings:
    {'BOT_NAME': 'government',
     'DOWNLOAD_DELAY': 1,
     'NEWSPIDER_MODULE': 'government.spiders',
     'SPIDER_MODULES': ['government.spiders']}
    2021-07-23 05:49:13 [scrapy.extensions.telnet] INFO: Telnet Password: 1196e88aa45a90c1
    2021-07-23 05:49:13 [scrapy.middleware] INFO: Enabled extensions:
    ['scrapy.extensions.corestats.CoreStats',
     'scrapy.extensions.telnet.TelnetConsole',
     'scrapy.extensions.memusage.MemoryUsage',
     'scrapy.extensions.feedexport.FeedExporter',
     'scrapy.extensions.logstats.LogStats']
    2021-07-23 05:49:13 [scrapy.middleware] INFO: Enabled downloader middlewares:
    ['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
     'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
     'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
     'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
     'scrapy.downloadermiddlewares.retry.RetryMiddleware',
     'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
     'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
     'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
     'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
     'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
     'scrapy.downloadermiddlewares.stats.DownloaderStats']
    2021-07-23 05:49:13 [scrapy.middleware] INFO: Enabled spider middlewares:
    ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
     'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
     'scrapy.spidermiddlewares.referer.RefererMiddleware',
     'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
     'scrapy.spidermiddlewares.depth.DepthMiddleware']
    2021-07-23 05:49:13 [scrapy.middleware] INFO: Enabled item pipelines:
    ['government.pipelines.GovernmentPipeline',
     'scrapy.pipelines.files.FilesPipeline']
    2021-07-23 05:49:13 [scrapy.core.engine] INFO: Spider opened
    2021-07-23 05:49:13 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
    2021-07-23 05:49:13 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
    2021-07-23 05:49:14 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ssl.doas.state.ga.us/gpr/unsupported?browser=> from <GET https://ssl.doas.state.ga.us/gpr/>
    2021-07-23 05:49:15 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://ssl.doas.state.ga.us/gpr/unsupported?browser=> (referer: None)
    2021-07-23 05:49:15 [scrapy.core.engine] INFO: Closing spider (finished)
    2021-07-23 05:49:15 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
    {'downloader/request_bytes': 468,
     'downloader/request_count': 2,
     'downloader/request_method_count/GET': 2,
     'downloader/response_bytes': 6169,
     'downloader/response_count': 2,
     'downloader/response_status_count/200': 1,
     'downloader/response_status_count/302': 1,
     'elapsed_time_seconds': 1.564505,
     'finish_reason': 'finished',
     'finish_time': datetime.datetime(2021, 7, 23, 10, 49, 15, 561300),
     'log_count/DEBUG': 2,
     'log_count/INFO': 10,
     'memusage/max': 55824384,
     'memusage/startup': 55824384,
     'response_received_count': 1,
     'scheduler/dequeued': 2,
     'scheduler/dequeued/memory': 2,
     'scheduler/enqueued': 2,
     'scheduler/enqueued/memory': 2,
     'start_time': datetime.datetime(2021, 7, 23, 10, 49, 13, 996795)}
    2021-07-23 05:49:15 [scrapy.core.engine] INFO: Spider closed (finished)

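As you can see in the log, the request was redirected and the response you received came from https://ssl.doas.state.ga.us/gpr/unsupported?browser=, so set your user agent accordingly (for example, to that of Chrome on a Windows machine).

In settings.py, uncomment USER_AGENT and change it to: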

USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36"
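The redirect seen in the log can also be caught inside the spider, so that a crawl which lands on the "unsupported browser" page fails loudly instead of silently yielding nothing. This is a minimal sketch using only the standard library; the helper name `is_unsupported_browser_page` is hypothetical, and it assumes the site keeps redirecting rejected clients to a path ending in `/unsupported`, as it does in the log above.

```python
from urllib.parse import urlsplit


def is_unsupported_browser_page(url):
    """Return True if the URL points at the site's 'unsupported browser' page.

    Assumption: rejected clients are redirected to a path ending in
    '/unsupported', as seen in the crawl log.
    """
    path = urlsplit(url).path.rstrip("/")
    return path.endswith("/unsupported")


# Inside a Scrapy callback this could be used as a guard, for example:
#
#     def parse(self, response):
#         if is_unsupported_browser_page(response.url):
#             self.logger.warning("Rejected user agent, got %s", response.url)
#             return
#         ...
```

If you prefer not to change the project-wide setting, a spider can also carry its own value through the `custom_settings` class attribute, for example `custom_settings = {'USER_AGENT': '...'}`.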

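As SuperUser said, your original URL is being redirected because the site expects requests from a real browser. To mimic a browser with Scrapy, pass a user-agent either in settings.py or as a request header in your spider file; the site will then return the page's HTML source.

Your XPath expressions still won't work, because the data you are looking for is generated dynamically. Use the browser's developer tools to find the API request the page makes, reproduce that request, and take the results from its response.

The code below returns a JSON response. For demonstration I extract only one field; the other fields can be obtained the same way.

Code: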
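One way to confirm that the rows are filled in by JavaScript is to count the table rows in the raw HTML that Scrapy received (for example `response.text`). The sketch below uses only the standard library; the class and function names are hypothetical, and `STATIC_HTML` is an invented stand-in for the page source, not the site's actual markup.

```python
from html.parser import HTMLParser


class RowCounter(HTMLParser):
    """Count <tr> elements that appear inside a <tbody>."""

    def __init__(self):
        super().__init__()
        self.in_tbody = False
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self.in_tbody = True
        elif tag == "tr" and self.in_tbody:
            self.rows += 1

    def handle_endtag(self, tag):
        if tag == "tbody":
            self.in_tbody = False


def count_body_rows(html):
    """Return the number of table body rows present in the given HTML."""
    parser = RowCounter()
    parser.feed(html)
    return parser.rows


# Hypothetical static markup: the table shell exists, but the body is empty
# because the rows are added later by a script in the browser.
STATIC_HTML = (
    '<table class="table table-striped table-bordered">'
    "<thead><tr><th>Title</th></tr></thead>"
    "<tbody></tbody>"
    "</table>"
)

print(count_body_rows(STATIC_HTML))
```

If the count is zero for the downloaded page while the browser shows a full table, the data is coming from a separate request, which is the one to look for in the Network tab.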

import scrapy
import json
from ..items import GovernmentItem

class Test(scrapy.Spider):
    name = 'test'

    headers = {
        "authority": "ssl.doas.state.ga.us",
        "pragma": "no-cache",
        "cache-control": "no-cache",
        "sec-ch-ua": "\"Chromium\";v=\"92\", \" Not A;Brand\";v=\"99\", \"Google Chrome\";v=\"92\"",
        "accept": "application/json, text/javascript, */*; q=0.01",
        "x-requested-with": "XMLHttpRequest",
        "sec-ch-ua-mobile": "?0",
        "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36",
        "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
        "origin": "https://ssl.doas.state.ga.us",
        "sec-fetch-site": "same-origin",
        "sec-fetch-mode": "cors",
        "sec-fetch-dest": "empty",
        "referer": "https://ssl.doas.state.ga.us/gpr/",
        "accept-language": "en-US,en;q=0.9"
    }

    body = 'draw=1&columns%5B0%5D%5Bdata%5D=function&columns%5B0%5D%5Bname%5D=&columns%5B0%5D%5Bsearchable%5D=true&columns%5B0%5D%5Borderable%5D=false&columns%5B0%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B0%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B1%5D%5Bdata%5D=function&columns%5B1%5D%5Bname%5D=&columns%5B1%5D%5Bsearchable%5D=true&columns%5B1%5D%5Borderable%5D=true&columns%5B1%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B1%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B2%5D%5Bdata%5D=title&columns%5B2%5D%5Bname%5D=&columns%5B2%5D%5Bsearchable%5D=true&columns%5B2%5D%5Borderable%5D=true&columns%5B2%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B2%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B3%5D%5Bdata%5D=agencyName&columns%5B3%5D%5Bname%5D=&columns%5B3%5D%5Bsearchable%5D=true&columns%5B3%5D%5Borderable%5D=true&columns%5B3%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B3%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B4%5D%5Bdata%5D=postingDateStr&columns%5B4%5D%5Bname%5D=&columns%5B4%5D%5Bsearchable%5D=true&columns%5B4%5D%5Borderable%5D=true&columns%5B4%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B4%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B5%5D%5Bdata%5D=closingDateStr&columns%5B5%5D%5Bname%5D=&columns%5B5%5D%5Bsearchable%5D=true&columns%5B5%5D%5Borderable%5D=true&columns%5B5%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B5%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B6%5D%5Bdata%5D=function&columns%5B6%5D%5Bname%5D=&columns%5B6%5D%5Bsearchable%5D=true&columns%5B6%5D%5Borderable%5D=false&columns%5B6%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B6%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B7%5D%5Bdata%5D=status&columns%5B7%5D%5Bname%5D=&columns%5B7%5D%5Bsearchable%5D=true&columns%5B7%5D%5Borderable%5D=false&columns%5B7%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B7%5D%5Bsearch%5D%5Bregex%5D=false&order%5B0%5D%5Bcolumn%5D=5&order%5B0%5D%5Bdir%5D=asc&start=0&length=50&search%5Bvalue%5D=&search%5Bregex%5D=false&responseType=ALL&eventStatus=OPEN&eventIdTitle=&govType=ALL&govEntity=&eventProcessType=ALL&dateRangeType=&rangeStartDate=&rangeEndDate=&isReset=false&persisted=&refreshSearchData=false'

    def start_requests(self):
        url = 'https://ssl.doas.state.ga.us/gpr/eventSearch'
        yield scrapy.Request(url=url, method='POST', headers=self.headers, body=self.body, callback=self.parse)

    def parse(self, response):
        data = json.loads(response.body)
        for row in data.get('data', []):
            # Create a fresh item per row so yielded items do not share state.
            # This assumes GovernmentItem declares a 'title' field.
            item = GovernmentItem()
            item['title'] = row.get('title')
            yield item

Here is the complete working solution:

    import scrapy
    import json
    # base_url = https://ssl.doas.state.ga.us/gpr/
    
    class GeorgiaSpider(scrapy.Spider):
    
        name = 'georgia'
        body = 'draw=1&columns%5B0%5D%5Bdata%5D=function&columns%5B0%5D%5Bname%5D=&columns%5B0%5D%5Bsearchable%5D=true&columns%5B0%5D%5Borderable%5D=false&columns%5B0%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B0%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B1%5D%5Bdata%5D=function&columns%5B1%5D%5Bname%5D=&columns%5B1%5D%5Bsearchable%5D=true&columns%5B1%5D%5Borderable%5D=true&columns%5B1%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B1%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B2%5D%5Bdata%5D=title&columns%5B2%5D%5Bname%5D=&columns%5B2%5D%5Bsearchable%5D=true&columns%5B2%5D%5Borderable%5D=true&columns%5B2%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B2%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B3%5D%5Bdata%5D=agencyName&columns%5B3%5D%5Bname%5D=&columns%5B3%5D%5Bsearchable%5D=true&columns%5B3%5D%5Borderable%5D=true&columns%5B3%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B3%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B4%5D%5Bdata%5D=postingDateStr&columns%5B4%5D%5Bname%5D=&columns%5B4%5D%5Bsearchable%5D=true&columns%5B4%5D%5Borderable%5D=true&columns%5B4%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B4%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B5%5D%5Bdata%5D=closingDateStr&columns%5B5%5D%5Bname%5D=&columns%5B5%5D%5Bsearchable%5D=true&columns%5B5%5D%5Borderable%5D=true&columns%5B5%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B5%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B6%5D%5Bdata%5D=function&columns%5B6%5D%5Bname%5D=&columns%5B6%5D%5Bsearchable%5D=true&columns%5B6%5D%5Borderable%5D=false&columns%5B6%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B6%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B7%5D%5Bdata%5D=status&columns%5B7%5D%5Bname%5D=&columns%5B7%5D%5Bsearchable%5D=true&columns%5B7%5D%5Borderable%5D=false&columns%5B7%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B7%5D%5Bsearch%5D%5Bregex%5D=false&order%5B0%5D%5Bcolumn%5D=5&order%5B0%5D%5Bdir%5D=asc&start=0&length=50&search%5Bvalue%5D=&search%5Bregex%5D=false&responseType=ALL&eventStatus=OPEN&eventIdTitle=&govType=ALL&govEntity=&eventProcessType=ALL&dateRangeType=&rangeStartDate=&rangeEndDate=&isReset=false&persisted=&refreshSearchData=false'
    
        def start_requests(self):
            yield scrapy.Request(
                url='https://ssl.doas.state.ga.us/gpr/eventSearch',
                callback=self.parse,
                body=self.body,
                method="POST",
                headers={
                    'authority': 'ssl.doas.state.ga.us',
                    'path': '/gpr/eventSearch',
                    'scheme': 'https',
                    'accept': 'application/json, text/javascript, */*; q=0.01',
                    'accept-encoding': 'gzip, deflate, br',
                    'accept-language': 'en-US,en;q=0.9,bn;q=0.8,es;q=0.7,ar;q=0.6',
                    'content-length': '2030',
                    'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
                    'origin': 'https://ssl.doas.state.ga.us',
                    'referer': 'https://ssl.doas.state.ga.us/gpr/',
                    'sec-ch-ua': '"Chromium";v="92", " Not A;Brand";v="99", "Google Chrome";v="92"',
                    'sec-ch-ua-mobile': '?0',
                    'sec-fetch-dest': 'empty',
                    'sec-fetch-mode': 'cors',
                    'pragma': 'no-cache',
                    'cache-control': 'no-cache',
                    'sec-fetch-site': 'same-origin',
                    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36',
                    'x-requested-with': 'XMLHttpRequest'
                    }
                )
    
        def parse(self, response):
            response = json.loads(response.body)
            for resp in response['data']:
                yield {
                    'title': resp['title'],
                    'begin_date':resp['postingDateStr'],
                    'end_date':resp['closingDateStr']
                    }
              

Output:

2021-07-25 10:48:01 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2021-07-25 10:48:04 [scrapy.core.engine] DEBUG: Crawled (200) <POST https://ssl.doas.state.ga.us/gpr/eventSearch> (referer: https://ssl.doas.state.ga.us/gpr/)
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>
{'title': 'Ford Transit Connect Cargo Van', 'begin_date': 'Jul 12, 2021 @ 05:19 PM', 'end_date': 'Jul 26, 2021 @ 09:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>
{'title': '2020 CDBG Water System Improvement', 'begin_date': 'Jun 14, 2021 @ 11:58 AM', 'end_date': 'Jul 26, 2021 @ 10:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Fire Station 20 Renovations', 'begin_date': 'Jun 30, 2021 @ 08:54 AM', 'end_date': 'Jul 26, 2021 @ 10:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Food Service Produce', 'begin_date': 'Jun 28, 2021 @ 07:24 AM', 'end_date': 'Jul 26, 2021 @ 10:00 AM'}   
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'LMIG 2021 ROAD RESURFACING PORJECT', 'begin_date': 'Jul 01, 2021 @ 02:36 PM', 'end_date': 'Jul 26, 2021 @ 10:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'MGA Robinson R44 Helicopter Cadet Overhaul', 'begin_date': 'Jul 07, 2021 @ 12:00 PM', 'end_date': 'Jul 26, 2021 @ 10:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Renovations to Old Hickory Flat Gymnasium', 'begin_date': 'Jun 24, 2021 @ 04:51 PM', 'end_date': 'Jul 26, 2021 @ 10:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'eCard Services', 'begin_date': 'Jul 09, 2021 @ 12:00 PM', 'end_date': 'Jul 26, 2021 @ 12:00 PM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Ford Police Interceptor', 'begin_date': 'Jul 01, 2021 @ 12:17 PM', 'end_date': 'Jul 26, 2021 @ 02:00 PM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>
{'title': 'North End Roadway Safety Analysis', 'begin_date': 'Jun 23, 2021 @ 10:18 AM', 'end_date': 'Jul 26, 2021 @ 02:00 PM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Offset Printing & Finishing Services for Bulldog Print + Design', 'begin_date': 'Jun 30, 2021 @ 03:15 PM', 'end_date': 'Jul 26, 2021 @ 02:00 PM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>
{'title': 'RECREATION, PARKS, HISTORIC AND CULTURAL AFFAIRS 5', 'begin_date': 'Jun 25, 2021 @ 10:37 AM', 'end_date': 'Jul 26, 2021 @ 02:00 PM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Surgical Sterilization of Adoptable Pets', 'begin_date': 'Jun 24, 2021 @ 03:12 PM', 'end_date': 'Jul 26, 2021 @ 03:00 PM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Remount Ambulance', 'begin_date': 'Jun 23, 2021 @ 10:43 AM', 'end_date': 'Jul 26, 2021 @ 04:00 PM'}      
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'DOCO CDBG Rehab/Elevation/Reconstruction', 'begin_date': 'Jun 23, 2021 @ 11:12 AM', 'end_date': 'Jul 26, 2021 @ 05:00 PM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'T32-D1-Veg Rem-SR 211 Barrow-121766', 'begin_date': 'Jul 08, 2021 @ 04:58 PM', 'end_date': 'Jul 26, 2021 @ 05:00 PM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Carrollton City Hall Renovation & Addition', 'begin_date': 'Jun 25, 2021 @ 09:14 AM', 'end_date': 'Jul 27, 2021 @ 10:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'HUB Transformation Project', 'begin_date': 'Jun 17, 2021 @ 05:01 PM', 'end_date': 'Jul 27, 2021 @ 10:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Extrication Equipment', 'begin_date': 'Jun 22, 2021 @ 12:08 AM', 'end_date': 'Jul 27, 2021 @ 11:00 AM'}  
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Goodyear Pressure Improvement Natural Gas Main Ext', 'begin_date': 'Jul 01, 2021 @ 02:46 PM', 'end_date': 'Jul 27, 2021 @ 11:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Septic Tank and Grease Trap Pumping', 'begin_date': 'Jun 29, 2021 @ 05:53 PM', 'end_date': 'Jul 27, 2021 @ 11:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'T-Hangar Area-Sitework', 'begin_date': 'Jun 29, 2021 @ 08:27 AM', 'end_date': 'Jul 27, 2021 @ 11:00 AM'} 
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Toccoa Airport - Apron / Ramp Seal Coat', 'begin_date': 'Jun 30, 2021 @ 04:03 PM', 'end_date': 'Jul 27, 2021 @ 11:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>
{'title': 'UWG Nursing Building Parking Lot Expansion', 'begin_date': 'Jun 25, 2021 @ 09:08 AM', 'end_date': 'Jul 27, 2021 @ 11:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'WYCKOFF RAW WATER PIPELINE REPLACEMENT', 'begin_date': 'Jun 25, 2021 @ 09:11 AM', 'end_date': 'Jul 27, 2021 @ 11:00 AM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Purchase 1 One Ton 4x2 Extended Cab Truck', 'begin_date': 'Jul 09, 2021 @ 09:22 AM', 'end_date': 'Jul 27, 2021 @ 12:00 PM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Purchase One 4x2 Two Ton Crew Cab Truck', 'begin_date': 'Jul 09, 2021 @ 09:33 AM', 'end_date': 'Jul 27, 2021 @ 12:00 PM'}
2021-07-25 10:48:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ssl.doas.state.ga.us/gpr/eventSearch>    
{'title': 'Purchase One 4x4 Crew Cab Two Ton Truck', 'begin_date': 'Jul 09, 2021 @ 09:28 AM', 'end_date': 'Jul 27, 2021 @ 12:00 PM'}
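The field extraction in the `parse` callback above can be pulled into a plain function, which makes it testable without running a crawl. This is a sketch under assumptions: the function name `extract_events` is hypothetical, it relies only on the three keys shown in the solution (`title`, `postingDateStr`, `closingDateStr`), and it uses `.get()` so that a row with a missing key yields `None` instead of raising `KeyError`. The second row of the sample payload is invented to show that case.

```python
import json


def extract_events(payload_text):
    """Yield one dict per row found under the 'data' key of the JSON payload."""
    payload = json.loads(payload_text)
    for row in payload.get("data") or []:
        yield {
            "title": row.get("title"),
            "begin_date": row.get("postingDateStr"),
            "end_date": row.get("closingDateStr"),
        }


# Small sample payload: the first row copies values from the output above,
# the second row is hypothetical and lacks the date keys.
SAMPLE = json.dumps({
    "data": [
        {
            "title": "eCard Services",
            "postingDateStr": "Jul 09, 2021 @ 12:00 PM",
            "closingDateStr": "Jul 26, 2021 @ 12:00 PM",
        },
        {"title": "Row without dates"},
    ]
})

for event in extract_events(SAMPLE):
    print(event)
```

Inside the spider, the callback would then reduce to `yield from extract_events(response.text)`.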