Python and the zipfile module

According to the Python documentation:

ZipFile.extract(member[, path[, pwd]]) Extract a member from the archive to the current working directory; member must be its full name or a ZipInfo object. Its file information is extracted as accurately as possible. path specifies a different directory to extract to. member can be a filename or a ZipInfo object. pwd is the password used for encrypted files.
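As a rough illustration of the behavior quoted above, the following sketch builds a throwaway archive and extracts one member with an explicit path. The archive name, the `batch/` folder, and the member names are made up for the example; the point is only that extract() recreates any directory stored in the member's full name underneath path.

```python
import os
import tempfile
import zipfile

# Build a small throwaway archive; the member names are hypothetical.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "demo.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("batch/9593.txt", "first")
    zf.writestr("batch/9458.txt", "second")

target = os.path.join(workdir, "out")
with zipfile.ZipFile(zip_path) as zf:
    # extract() returns the path it created; the "batch" folder from the
    # member name is recreated under `target`.
    extracted = zf.extract("batch/9593.txt", path=target)

relative = os.path.relpath(extracted, target)
print(relative)
```

The printed path includes the `batch` component, so a member stored inside a folder in the archive ends up inside a folder of the same name under path.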

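I have a large number of zip files, each containing 1000 archived files. Using ZipFile.extract(), I can extract only the files I need from each zipped archive: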

import zipfile

def getAIDlist(aidlist_to_keep, ifile, folderName):
    archive = zipfile.ZipFile(ifile)
    aidlist = archive.namelist()  # names of all files in the zipped archive

    print("AIDs to keep", aidlist_to_keep)
    print("Number of AIDs in the zipped archive", len(aidlist))

    path = '/2015/MyCODE/' + folderName

    for j in aidlist_to_keep:
        for k in aidlist:
            if j in k:  # substring match against the full member name
                try:
                    archive.extract(k, path)
                except Exception:
                    print("Could not extract file", j)

if __name__ == '__main__':
    getAIDlist(['9593', '9458', '9389'], "0009001_0010000.zip", "TestingFolder")

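Ideally, I would like the extracted files to be stored in TestingFolder, but they end up in a newly created folder named 0009001_0010000.zip inside TestingFolder.

How can I extract the files directly into TestingFolder without creating the new folder 0009001_0010000.zip?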

Instead of using extract(), you can use ZipFile.open() and copy the file to a filename of your own choosing; use shutil.copyfileobj() to copy the data efficiently:

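import os
import shutil
import zipfile

archive = zipfile.ZipFile(ifile)
path = os.path.join('/2015/MyCODE', folderName)

for name in aidlist_to_keep:
    try:
        # name must be the exact member name as stored in the archive
        archivefile = archive.open(name)
    except KeyError:
        # no such file in the archive
        continue
    # copy the member's data to a file name of our own choosing
    with archivefile, open(os.path.join(path, name), 'wb') as targetfile:
        shutil.copyfileobj(archivefile, targetfile)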
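The snippet above looks each entry of aidlist_to_keep up as an exact member name, while the question matches IDs by substring. A minimal sketch that combines the two ideas might look like the following. The function name `extract_flat`, the demo archive, and its member names are hypothetical, and matching against the base name only (rather than the full member name, as in the question) is an assumption made so that an ID cannot accidentally match the folder name.

```python
import os
import shutil
import tempfile
import zipfile


def extract_flat(zip_path, wanted, dest_dir):
    """Copy members whose base name contains any string in `wanted`
    into dest_dir, dropping folders stored in the archive."""
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            filename = os.path.basename(info.filename)
            if not filename:
                continue  # directory entry, nothing to copy
            if not any(key in filename for key in wanted):
                continue
            target = os.path.join(dest_dir, filename)
            # Stream the member's bytes into a flat file in dest_dir.
            with archive.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written


# Demo with a throwaway archive; names are made up for illustration.
workdir = tempfile.mkdtemp()
demo_zip = os.path.join(workdir, "demo.zip")
with zipfile.ZipFile(demo_zip, "w") as zf:
    zf.writestr("batch/9593.txt", "first")
    zf.writestr("batch/9458.txt", "second")
    zf.writestr("batch/1111.txt", "not wanted")

out_dir = os.path.join(workdir, "TestingFolder")
written = extract_flat(demo_zip, ["9593", "9458"], out_dir)
print(sorted(os.listdir(out_dir)))
```

Because folders are dropped, two members with the same base name in different folders would overwrite each other; if that can happen in your archives, the target name needs to be made unique first.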