I want to make each url in a list a string

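I have 35000 URLs, and I can't add "" around each URL and a , at the end of each one by hand. Is there a shortcut in my editor that lets me select all the URLs in the code block inside the list and then string/unstring them, the same way we can comment/uncomment lines?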

I want the 35000 URLs to look like this:

start_urls=[
        'https://dawaai.pk/all-medicines/a',
        'https://dawaai.pk/all-medicines/b',
        'https://dawaai.pk/all-medicines/c',
        'https://dawaai.pk/all-medicines/d',
        'https://dawaai.pk/all-medicines/e',
        'https://dawaai.pk/all-medicines/f',
        'https://dawaai.pk/all-medicines/g',
        'https://dawaai.pk/all-medicines/h',
        'https://dawaai.pk/all-medicines/i',
        'https://dawaai.pk/all-medicines/j',
        'https://dawaai.pk/all-medicines/k',
        'https://dawaai.pk/all-medicines/l',
        'https://dawaai.pk/all-medicines/m',
        'https://dawaai.pk/all-medicines/n',
        'https://dawaai.pk/all-medicines/o',
        'https://dawaai.pk/all-medicines/p',
        'https://dawaai.pk/all-medicines/q',
        'https://dawaai.pk/all-medicines/r',
        'https://dawaai.pk/all-medicines/s',
        'https://dawaai.pk/all-medicines/t',
        'https://dawaai.pk/all-medicines/u',
        'https://dawaai.pk/all-medicines/v',
        'https://dawaai.pk/all-medicines/w',
        'https://dawaai.pk/all-medicines/x',
        'https://dawaai.pk/all-medicines/y',
        'https://dawaai.pk/all-medicines/z',
        # 'https://dawaai.pk/all-medicines/',

        ]

This is what the scraper's current code looks like:

import scrapy

class DawaaiSpider(scrapy.Spider):
    name='dawaai'
    start_urls=[
https://dawaai.pk/medicine/vitrum-1-38514.html
https://dawaai.pk/medicine/ventek-38552.html
https://dawaai.pk/medicine/valid-1-41158.html
https://dawaai.pk/medicine/verger-2-38699.html
https://dawaai.pk/medicine/valvin-1-38910.html
https://dawaai.pk/medicine/verger-5-38953.html
https://dawaai.pk/medicine/vexnil-8-39028.html
https://dawaai.pk/medicine/virocil-41083.html
https://dawaai.pk/medicine/voltral-emulgel-2-39942.html
https://dawaai.pk/medicine/vasocord-40099.html
https://dawaai.pk/medicine/vasocord-1-40100.html
https://dawaai.pk/medicine/Zestril-Tablet10-55.html
https://dawaai.pk/medicine/Zestril-Tablet20-56.html
https://dawaai.pk/medicine/zultra-1-12104.html
https://dawaai.pk/medicine/Zofrantab-Tablet8-128.html
https://dawaai.pk/medicine/Zeegapcap-Capsule50-176.html
https://dawaai.pk/medicine/Zeegapcap-Capsule75-177.html
https://dawaai.pk/medicine/Zeegapcap-Capsule150-178.html
https://dawaai.pk/medicine/zopent-40mg-590.html
https://dawaai.pk/medicine/zopent-40mg-591.html
https://dawaai.pk/medicine/zoloft-50mg-592.html
https://dawaai.pk/medicine/zocor-10mg-593.html
https://dawaai.pk/medicine/zocor-10mg-594.html

]

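    def parse(self, response):
        # Each medicine is rendered inside a div.card-body element.
        for medicine in response.css('div.card-body'):
            yield {
                'name': medicine.css('a::text').get(),
                # default='' avoids calling .replace() on None when no h4 is found
                'price_now': medicine.css('h4::text').get(default='').replace('Rs ', ''),
            }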
                

The problem is that start_urls will only start being scraped once every URL is a string and there is a comma between them.

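There are 3 ways to do this:

  1. Use a regular expression in VSCode (I am not a regex pro)
  2. Put the URLs in a text file and have a Python script iterate over it, adding " to the start and end of each line and a , after the closing quote.
  3. Use multi-select: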

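Go to the top-left corner of the URLs, hold Alt+Shift and drag to the bottom-right corner of the URLs. This gives you multiple cursors, so you can edit all the lines at once.

Then press the left arrow key to move all the cursors to the left, and type '

Now select all the URLs again; this time press the right arrow key to move all the cursors to the end of each line, then type ',

Done!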
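
For option 2, a minimal sketch of such a script could look like the one below. The helper names quote_urls and convert_file and the file names urls.txt and start_urls.py are only assumptions for illustration; nothing here comes from the original spider. It assumes one URL per line and skips blank lines.

```python
from pathlib import Path


def quote_urls(text: str) -> str:
    """Wrap every non-empty line in single quotes and append a comma."""
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(f"    '{line}'," for line in lines if line)


def convert_file(src: str, dst: str) -> None:
    """Read raw URLs from src and write a ready-to-paste start_urls list to dst.

    Both paths are hypothetical, e.g. 'urls.txt' and 'start_urls.py'.
    """
    quoted = quote_urls(Path(src).read_text(encoding="utf-8"))
    Path(dst).write_text(f"start_urls = [\n{quoted}\n]\n", encoding="utf-8")


# Small demonstration on two of the URLs from the question.
sample = (
    "https://dawaai.pk/medicine/vitrum-1-38514.html\n"
    "https://dawaai.pk/medicine/ventek-38552.html\n"
)
print(quote_urls(sample))
```

Calling convert_file('urls.txt', 'start_urls.py') would then produce a file whose contents can be pasted into the spider. For option 1, roughly the same transformation should be possible in VSCode's find-and-replace with regex mode enabled, searching for ^(.+)$ and replacing with '$1', but test it on a copy of the file first.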