When I was studying deep learning algorithms before, I always used ready-made datasets from the Internet, all of which come with corresponding code. When I started writing papers and running experiments with my own image dataset, I found I had no idea where to start. I believe many newcomers run into the same problem.
The following code reads all images from a folder, applies normalization (and optionally standardization), and converts each image into a tensor. Finally, the first image is read and displayed.
# Data processing
import os
import torch
from torch.utils import data
from PIL import Image
import numpy as np
from torchvision import transforms

transform = transforms.Compose([
    transforms.ToTensor(),  # convert the image to a Tensor, normalized to [0, 1]
    # transforms.Normalize(mean=[.5, .5, .5], std=[.5, .5, .5])  # standardize to [-1, 1]
])

# Define your own dataset
class FlameSet(data.Dataset):
    def __init__(self, root):
        # absolute paths of all images
        imgs = os.listdir(root)
        self.imgs = [os.path.join(root, k) for k in imgs]
        self.transforms = transform

    def __getitem__(self, index):
        img_path = self.imgs[index]
        pil_img = Image.open(img_path)
        if self.transforms:
            data = self.transforms(pil_img)
        else:
            pil_img = np.asarray(pil_img)
            data = torch.from_numpy(pil_img)
        return data

    def __len__(self):
        return len(self.imgs)

if __name__ == '__main__':
    dataSet = FlameSet('./test')
    print(dataSet[0])
Result: running the script prints the tensor of the first image in ./test.
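Since FlameSet only returns a single tensor per index, in practice you would wrap it in a DataLoader to get mini-batches. A minimal sketch (assuming all images in ./test have the same size; otherwise add a transforms.Resize to the pipeline above, since the default collate function cannot stack tensors of different shapes):

from torch.utils.data import DataLoader

loader = DataLoader(dataSet, batch_size=4, shuffle=True)
for batch in loader:
    print(batch.shape)  # e.g. torch.Size([4, 3, H, W]) for RGB images of size H x W
    break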
Supplementary knowledge: using PyTorch to read and load the local MNIST dataset
The MNIST dataset in PyTorch can be obtained with a direct call, or you can define your own Dataset class to read and initialize the local data.
1. Directly use the MNIST downloader that comes with PyTorch (torchvision):
Disadvantage: the download is slow, and if it fails partway through, you usually have to re-run the code to download again:
# Download the training data and test data
import torchvision
from torchvision import transforms

trainDataset = torchvision.datasets.MNIST(  # torchvision can download the training and test sets of the dataset
    root="./data",  # download the data into the ./data folder
    train=True,  # train specifies which part to load after downloading: True loads the training set, False loads the test set
    transform=transforms.ToTensor(),  # data standardization and similar operations all live in transforms; here we only convert to a tensor
    download=True  # in older torchvision versions, an interrupted download, or re-running after the download completed, could raise an error
)
testDataset = torchvision.datasets.MNIST(
    root="./data",
    train=False,
    transform=transforms.ToTensor(),
    download=True
)
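If the automatic download keeps failing, one workaround (a sketch, not part of the original post) is to fetch the four MNIST files manually, place them under the folder torchvision uses, and only request a download when that folder is missing:

import os

already_there = os.path.exists("./data/MNIST")  # the folder torchvision creates on first download
trainDataset = torchvision.datasets.MNIST(
    root="./data",
    train=True,
    transform=transforms.ToTensor(),
    download=not already_there  # skip the download step entirely when the data is in place
)

Recent torchvision versions also skip the download by themselves when the files already exist, so this guard mainly matters for older versions.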
2. Customize the dataset class to read and initialize the data.
The downloaded MNIST dataset consists of four files: train-images-idx3-ubyte.gz, train-labels-idx1-ubyte.gz, t10k-images-idx3-ubyte.gz and t10k-labels-idx1-ubyte.gz.
The dataset class you define must inherit from Dataset and implement the necessary magic methods:
read the data files in the __init__ magic method
__getitem__ magic method to support subscript (index) access
__len__ magic method to return the size of the custom dataset, which makes later traversal easier
An example is as follows:
import torch
from torch.utils.data import Dataset, DataLoader
import load_minist_data  # our own helper module, see below

class DealDataset(Dataset):
    """
    Read the data and initialize the dataset
    """
    def __init__(self, folder, data_name, label_name, transform=None):
        (train_set, train_labels) = load_minist_data.load_data(folder, data_name, label_name)  # our own reader (defined below); the result comes back as numpy arrays
        self.train_set = train_set
        self.train_labels = train_labels
        self.transform = transform

    def __getitem__(self, index):
        img, target = self.train_set[index], int(self.train_labels[index])
        if self.transform is not None:
            img = self.transform(img)
        return img, target

    def __len__(self):
        return len(self.train_set)
Here, load_minist_data.load_data is also a function we write ourselves to read the data files; it is the load_data function placed in load_minist_data.py. The specific implementation is as follows:
import os
import gzip
import numpy as np

def load_data(data_folder, data_name, label_name):
    """
    data_folder: file directory
    data_name: image data file name
    label_name: label data file name
    """
    with gzip.open(os.path.join(data_folder, label_name), 'rb') as lbpath:  # 'rb' means reading binary data
        y_train = np.frombuffer(lbpath.read(), np.uint8, offset=8)  # offset=8 skips the IDX label-file header
    with gzip.open(os.path.join(data_folder, data_name), 'rb') as imgpath:
        x_train = np.frombuffer(
            imgpath.read(), np.uint8, offset=16).reshape(len(y_train), 28, 28)  # offset=16 skips the IDX image-file header
    return (x_train, y_train)
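A quick way to check that the reader works is to load the training files and inspect the array shapes. A small sketch, assuming the standard MNIST gzip archives sit in MNIST_data/:

x_train, y_train = load_data('MNIST_data/',
                             'train-images-idx3-ubyte.gz',
                             'train-labels-idx1-ubyte.gz')
print(x_train.shape)  # expected: (60000, 28, 28)
print(y_train.shape)  # expected: (60000,)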
After writing the custom dataset class, you can instantiate it and load the data:
# Instantiating this class gives us Dataset-type data; passing it on to a DataLoader is then all we need.
trainDataset = DealDataset('MNIST_data/', "train-images-idx3-ubyte.gz", "train-labels-idx1-ubyte.gz", transform=transforms.ToTensor())
testDataset = DealDataset('MNIST_data/', "t10k-images-idx3-ubyte.gz", "t10k-labels-idx1-ubyte.gz", transform=transforms.ToTensor())

# Load the training data and test data
train_loader = DataLoader(
    dataset=trainDataset,
    batch_size=100,  # a batch can be thought of as a package; each package contains 100 images
    shuffle=False,
)
test_loader = DataLoader(
    dataset=testDataset,
    batch_size=100,
    shuffle=False,
)
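Before training, it is worth pulling a single batch from the loader to confirm the shapes, for example:

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([100, 1, 28, 28]): ToTensor() adds the channel dimension
print(labels.shape)  # torch.Size([100])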
Build a simple neural network, then train and test it:
import torch.nn as nn
import torch.optim as optim

class NeuralNet(nn.Module):
    def __init__(self, input_num, hidden_num, output_num):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_num, hidden_num)
        self.fc2 = nn.Linear(hidden_num, output_num)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        y = self.fc2(x)
        return y
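# (A quick sanity check, not in the original post: feed a dummy batch of 4
#  flattened images through an untrained net and confirm the output shape.)
_check = NeuralNet(784, 500, 10)
print(_check(torch.randn(4, 784)).shape)  # expected: torch.Size([4, 10])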
# Parameter initialization
epoches = 5
lr = 0.001
input_num = 784
hidden_num = 500
output_num = 10
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create the model object and define the loss function and optimizer
model = NeuralNet(input_num, hidden_num, output_num)
model.to(device)
criteria = nn.CrossEntropyLoss()  # use cross entropy as the loss function
optimizer = optim.Adam(model.parameters(), lr=lr)  # Adam is used here; any torch.optim optimizer would work
# Start the training loop
for epoch in range(epoches):  # one epoch is one full pass over the training set
    for i, data in enumerate(train_loader):
        (images, labels) = data
        images = images.reshape(-1, 28*28).to(device)  # flatten each 28x28 image into a 784-dim vector
        labels = labels.to(device)
        output = model(images)  # the output is produced by the model object
        loss = criteria(output, labels)  # arguments: output (predicted values), actual values (labels)
        optimizer.zero_grad()  # clear the gradients
        loss.backward()
        optimizer.step()
        if (i + 1) % 100 == 0:  # i is the index of the batch
            print('Epoch [{}/{}], Loss: {:.4f}'
                  .format(epoch + 1, epoches, loss.item()))  # the {} placeholders take the variables passed to format
# Start the test
with torch.no_grad():  # disable gradient tracking during evaluation
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, 28*28).to(device)  # -1 means this dimension is inferred automatically: the column count is fixed at 28*28, so the row count becomes the batch size (100 here)
        labels = labels.to(device)
        output = model(images)
        _, predicted = torch.max(output, 1)  # take the class with the highest score in each row
        total += labels.size(0)  # size() here is similar to numpy's shape: labels.size(0) is the batch size
        correct += (predicted == labels).sum().item()
    print("The accuracy of total {} images: {}%".format(total, 100 * correct / total))
This concludes the example of loading your own image dataset with PyTorch. I hope it gives you a useful reference.