TypeError: __init__() missing 1 required positional argument with scrapy passing params to pipeline

Question

Normally I know what this error means, but somehow I believe I did pass the argument.

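I am playing around with scrapy and its pipelines. If I scrape several different websites or pages, I want each of them to output a JSON file, but a different file for each, so that I can tell which JSON belongs to which website.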

So I created a services folder containing a file named pipeline.py, and inside that file I created the following class:

import json
import os

class JsonWriterPipeline(object):
    """
    write all items to a file, most likely json file
    """
    def __init__(self, filename):
        print(filename)  # this does print the filename, though
        self.file = open(filename, 'w')

    def open_spider(self, spider):
        self.file.write('[')

    def close_spider(self, spider):
        # remove the last two char which is ',\n' then add closing bracket ']'
        self.file.seek(self.file.seek(0, os.SEEK_END) - 2)
        self.file.write(']')

    def process_item(self, item, spider):
        line = json.dumps(dict(item)) + ",\n"
        self.file.write(line)
        return item

Then in the original pipeline.py in the root folder I have something like this:

from scrape.services.pipeline import JsonWriterPipeline



JsonWriterPipeline('testing.json')  # so I have passed the filename argument as `'testing.json'`

But I still keep getting the error message, even though, as mentioned above, print(filename) prints the filename correctly.

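If I do not pass the filename and hardcode a static filename instead, it works perfectly. But of course I want it to be dynamic, which is why I created a class that I can reuse.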

Does anyone have an idea?

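EDIT: As Gallaecio mentioned below, I then realized that pipelines do not take arguments. I did some googling, and the answers I found say a pipeline can accept parameters this way only if they are passed through the command line, not from inside the code itself.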
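The working print(filename) and the error can both be true at once. A minimal sketch, assuming the pipeline class is also listed in the ITEM_PIPELINES setting: the manual call JsonWriterPipeline('testing.json') succeeds and prints, but Scrapy then builds its own instance, through a from_crawler class method if the class defines one, otherwise by calling the class with no arguments, and that second path raises the TypeError. The class names, the setting name JSON_OUTPUT_FILE, and the stand-in crawler object below are hypothetical and used only for illustration; from_crawler and crawler.settings are the hooks Scrapy itself provides.

```python
from types import SimpleNamespace


class NoHookPipeline:
    """Mirrors the question's class: requires a filename, has no from_crawler."""

    def __init__(self, filename):
        self.filename = filename


class SettingsPipeline:
    """Same constructor, plus the from_crawler hook that Scrapy looks for."""

    def __init__(self, filename):
        self.filename = filename

    @classmethod
    def from_crawler(cls, crawler):
        # JSON_OUTPUT_FILE is a hypothetical setting name, not a built-in one.
        return cls(crawler.settings.get("JSON_OUTPUT_FILE", "items.json"))


# Stand-in for the crawler Scrapy would pass in; a plain dict also offers .get().
fake_crawler = SimpleNamespace(settings={"JSON_OUTPUT_FILE": "testing.json"})

try:
    NoHookPipeline()  # roughly what happens when the class has no from_crawler
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# With the hook in place, the filename comes from the settings instead.
pipeline = SettingsPipeline.from_crawler(fake_crawler)
```

With the real framework, the value could come from settings.py, from a spider's custom_settings, or from the command line via the -s NAME=VALUE option of scrapy crawl, which fits what those answers say about command-line parameters.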

Thanks for any suggestions.

Answer 1

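I thought of an alternative: instead of creating a new object and passing the argument at creation time, maybe try inheritance.

An example follows.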

Inside services/pipeline.py

import json
import os


class JsonWriterPipeline(object):
    """
    write all items to a file, most likely json file
    """
    filename = 'demo.json'  # class attribute instead of a constructor argument; subclasses override it

    def __init__(self):
        self.file = open(self.filename, 'w+')

    def open_spider(self, spider):
        self.file.write('[')

    def close_spider(self, spider):
        # step back over the trailing ',\n' (if any item was written), then add the closing ']'
        end = self.file.seek(0, os.SEEK_END)
        if end > 1:  # more than just the opening '['
            self.file.seek(end - 2)
        self.file.write(']')
        self.file.close()

    def process_item(self, item, spider):
        line = json.dumps(dict(item)) + ",\n"
        self.file.write(line)
        return item

Inside the original pipeline.py

from scrape.services.pipeline import JsonWriterPipeline

class JsonWriterPipelineA(JsonWriterPipeline):
    filename = 'a.json'

    def __init__(self):
        super().__init__()


class JsonWriterPipelineB(JsonWriterPipeline):
    filename = 'b.json'

    def __init__(self):
        super().__init__()

This is the alternative I could think of; I hope it helps.
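A different technique from the inheritance approach, sketched here under assumptions: a single pipeline could pick its output file from spider.name inside open_spider, so each spider gets its own JSON file without one subclass per site. The class name PerSpiderJsonPipeline and the "<name>.json" naming scheme are hypothetical; open_spider, process_item, and close_spider are the same hooks the classes in this thread already use.

```python
import json


class PerSpiderJsonPipeline:
    """Writes each spider's items to '<spider.name>.json' as one JSON array."""

    def open_spider(self, spider):
        # Hypothetical naming scheme: one output file per spider name.
        self.file = open(f"{spider.name}.json", "w", encoding="utf-8")
        self.file.write("[")
        self.first_item = True

    def process_item(self, item, spider):
        # Write the separator before every item except the first, so no
        # trailing comma has to be removed when the spider closes.
        if not self.first_item:
            self.file.write(",\n")
        self.first_item = False
        self.file.write(json.dumps(dict(item)))
        return item

    def close_spider(self, spider):
        self.file.write("]")
        self.file.close()
```

Such a class would be registered once in ITEM_PIPELINES rather than once per site, and because the separator is written before each item, the seek-and-overwrite step in close_spider is no longer needed.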
