PyTorch DataLoader returning list instead of tensor on custom Dataset

I am experimenting with my own dataset. Here is my complete code:

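import os

from PIL import Image
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

data_root='D:/AuxiliaryDocuments/NYU/'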
raw_data_transforms=transforms.Compose([#transforms.ToPILImage(),
                           transforms.CenterCrop((224,101)),
                           transforms.ToTensor()])
depth_data_transforms=transforms.Compose([transforms.CenterCrop((74,55)),
                                      transforms.ToTensor()])

filename_txt={'image_train':'image_train.txt','image_test':'image_test.txt',
          'depth_train':'depth_train.txt','depth_test':'depth_test.txt'}


class Mydataset(Dataset):
    def __init__(self, data_root, transformation, data_type):
        self.transform = transformation
        self.image_path_txt = filename_txt[data_type]
        self.sample_list = []
        # Read one image path per line, dropping whitespace and any ';'
        list_file = data_root + '/' + data_type + '/' + self.image_path_txt
        with open(list_file) as f:
            for line in f:
                line = line.strip()
                line = line.replace(';', '')
                self.sample_list.append(line)

    def __getitem__(self, index):
        item = self.sample_list[index]
        img = Image.open(item)
        if self.transform is not None:
            img = self.transform(img)
        idx = index
        print(type(img))
        # Returning a pair means each batch is collated into two parts
        return idx, img

    def __len__(self):
        return len(self.sample_list)

The printed type of img is <class 'torch.Tensor'>. I then used the following code:

test=Mydataset(data_root,raw_data_transforms,'image_train')
test_1=Mydataset(data_root,depth_data_transforms,'depth_train')
test2=DataLoader(test,batch_size=4,num_workers=0,shuffle=False)
test_2=DataLoader(test_1,batch_size=4,num_workers=0,shuffle=False)

Printing the batches:

for idx,data in enumerate(test_2):
   print(idx,data)
   print(type(data))

But the type of data is <class 'list'>, and what I need is a tensor.

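This is the expected output. In your case the DataLoader should return a list: __getitem__ returns a pair (idx, img), so each batch is collated field by field into a list of two tensors. In general, the output of a DataLoader is (inputs batch, labels batch).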

For example:

for idx, data in enumerate(test_dataloader):
  if idx == 0:
    print(type(data))
    print(len(data), data[0].shape, data[1].shape)


<class 'list'>
2 torch.Size([64, 1, 28, 28]) torch.Size([64])

Here, the 64 labels correspond to the 64 images in the batch.
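The same behaviour can be reproduced without any image files. The sketch below is only an illustration: ToyDataset and its random 8x8 "images" are hypothetical stand-ins for the dataset in the question, returning (index, img) in the same order. It shows that the batch is a list whose entries are themselves tensors.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical stand-in: returns (index, random image tensor)."""

    def __init__(self, n=10):
        self.n = n

    def __getitem__(self, index):
        img = torch.rand(1, 8, 8)  # fake single-channel 8x8 image
        return index, img

    def __len__(self):
        return self.n


loader = DataLoader(ToyDataset(), batch_size=4, shuffle=False)
batch = next(iter(loader))

# The batch has one entry per value returned by __getitem__
idx_batch, img_batch = batch
print(type(batch))
print(type(img_batch), img_batch.shape)
print(idx_batch)
```

So the tensor you are looking for is already there; it is the second element of the list rather than the batch object itself.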

To pass it to the model, you can do this:

#If you return img first in your Dataset
    return img, idx

# Either
for idx, data in enumerate(test_dataloader):
    # pass inputs to model
    out = model(data[0])
    # your labels are data[1]

# Or
for idx, (inputs, labels) in enumerate(test_dataloader):
    # pass inputs to model
    out = model(inputs)
    # your labels are in "labels" variable
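If you keep the order from the question, where the Dataset returns (idx, img), the unpacking just has to follow that order. The following is a rough sketch under stated assumptions: the TensorDataset of random 16x16 images and the small linear model are placeholders invented for illustration, not part of the original code.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical data: 8 sample indices paired with 8 fake RGB 16x16 images,
# in the same (idx, img) order that the question's __getitem__ returns.
dataset = TensorDataset(torch.arange(8), torch.rand(8, 3, 16, 16))
loader = DataLoader(dataset, batch_size=4, shuffle=False)

# Placeholder model, only here so the loop has something to call.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 16 * 16, 2))

shapes = []
for batch_idx, (sample_idx, imgs) in enumerate(loader):
    out = model(imgs)  # imgs is a tensor of shape [4, 3, 16, 16]
    shapes.append(tuple(out.shape))
    print(batch_idx, sample_idx.tolist(), out.shape)
```

Naming the two parts in the for statement avoids indexing into the list by hand and makes it harder to pass the indices to the model by mistake.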