You have to learn to read the source code...
I can't reach Google lately, so I used Bing for this example. After reading various sample code, I noticed that torchvision's transforms module has a ToTensor in it. So I searched for it directly, opened the documentation, searched for to_tensor there, found the entry for this function, and then checked the source code, which is as follows:
def to_tensor(pic):
    """Convert a ``PIL Image`` or ``numpy.ndarray`` to tensor.

    See ``ToTensor`` for more details.

    Args:
        pic (PIL Image or numpy.ndarray): Image to be converted to tensor.

    Returns:
        Tensor: Converted image.
    """
    if not(_is_pil_image(pic) or _is_numpy_image(pic)):
        raise TypeError('pic should be PIL Image or ndarray. Got {}'.format(type(pic)))

    if isinstance(pic, np.ndarray):
        # handle numpy array
        if pic.ndim == 2:
            pic = pic[:, :, None]

        img = torch.from_numpy(pic.transpose((2, 0, 1)))
        # backward compatibility
        if isinstance(img, torch.ByteTensor):
            return img.float().div(255)
        else:
            return img

    if accimage is not None and isinstance(pic, accimage.Image):
        nppic = np.zeros([pic.channels, pic.height, pic.width], dtype=np.float32)
        pic.copyto(nppic)
        return torch.from_numpy(nppic)

    # handle PIL Image
    if pic.mode == 'I':
        img = torch.from_numpy(np.array(pic, np.int32, copy=False))
    elif pic.mode == 'I;16':
        img = torch.from_numpy(np.array(pic, np.int16, copy=False))
    elif pic.mode == 'F':
        img = torch.from_numpy(np.array(pic, np.float32, copy=False))
    elif pic.mode == '1':
        img = 255 * torch.from_numpy(np.array(pic, np.uint8, copy=False))
    else:
        img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
    # PIL image mode: L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK
    if pic.mode == 'YCbCr':
        nchannel = 3
    elif pic.mode == 'I;16':
        nchannel = 1
    else:
        nchannel = len(pic.mode)
    img = img.view(pic.size[1], pic.size[0], nchannel)
    # put it from HWC to CHW format
    # yikes, this transpose takes 80% of the loading time/CPU
    img = img.transpose(0, 1).transpose(0, 2).contiguous()
    if isinstance(img, torch.ByteTensor):
        return img.float().div(255)
    else:
        return img
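You can see that it basically comes down to a few steps: a NumPy array goes straight through torch.from_numpy(), while a PIL image is first turned into a tensor (via a NumPy array for the special modes, or via its raw bytes otherwise), then reshaped to height x width x channels with view, then rearranged from HWC to CHW with transpose, and finally, if the result is a byte tensor, converted to float and divided by 255.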
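To make those steps concrete, here is a minimal sketch of the generic PIL branch, written with NumPy only so it runs without torch. The helper name `hwc_bytes_to_chw_float` and its parameters are made up for illustration. This is not torchvision's code, just the same sequence of reinterpreting the bytes, reshaping, transposing and dividing by 255.

```python
import numpy as np


def hwc_bytes_to_chw_float(raw, width, height, nchannel):
    """Hypothetical helper: mirrors the generic PIL branch of to_tensor with NumPy."""
    # Interpret the raw byte buffer as a flat uint8 array
    # (plays the role of ByteTensor(ByteStorage.from_buffer(...))).
    flat = np.frombuffer(raw, dtype=np.uint8)
    # Reshape to height x width x channels (plays the role of view).
    hwc = flat.reshape(height, width, nchannel)
    # Reorder HWC -> CHW and force a contiguous copy
    # (plays the role of transpose(...).contiguous()).
    chw = np.ascontiguousarray(hwc.transpose(2, 0, 1))
    # Convert to float32 and scale from [0, 255] to [0, 1].
    return chw.astype(np.float32) / 255.0


# A 2x2 RGB image in row-major pixel order: red, green, blue, white.
raw = bytes([255, 0, 0,   0, 255, 0,   0, 0, 255,   255, 255, 255])
out = hwc_bytes_to_chw_float(raw, width=2, height=2, nchannel=3)
print(out.shape)  # (3, 2, 2)
print(out[0])     # red channel: 1.0 where the pixel is red or white
```

Note that the width and height are swapped when reshaping, just as the source uses pic.size[1] (height) before pic.size[0] (width).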
PS: An Easter egg. If you look at the latest torchvision, there is an interesting comment: "# yikes, this transpose takes 80% of the loading time/CPU". In other words, this code still has room for improvement :)
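One plausible reading of that comment, sketched here with NumPy as a stand-in (an assumption on my part, not a measurement of torchvision): the transpose itself only changes strides and returns a view, and the real work happens when the data is copied into contiguous CHW order.

```python
import numpy as np

# A small fake image in HWC layout: 4 rows, 5 columns, 3 channels.
hwc = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)

# The transpose is only a strided view over the same memory.
chw_view = hwc.transpose(2, 0, 1)
view_shares_memory = np.shares_memory(hwc, chw_view)
view_is_contiguous = chw_view.flags['C_CONTIGUOUS']

# Making it contiguous is where every element actually gets copied.
chw_copy = np.ascontiguousarray(chw_view)
copy_shares_memory = np.shares_memory(hwc, chw_copy)
copy_is_contiguous = chw_copy.flags['C_CONTIGUOUS']

print(view_shares_memory, view_is_contiguous)  # True False
print(copy_shares_memory, copy_is_contiguous)  # False True
```

If the same holds for the torch calls, the cost the comment complains about would sit in .contiguous() rather than in transpose() itself.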