jdit.dataset
Dataloaders_factory
class jdit.dataset.DataLoadersFactory(root: str, batch_size: int, num_workers=-1, shuffle=True, subdata_size=1)
This is the base class for dataloaders. It defines some basic attributes and methods.
- For training data: dataset_train, loader_train, nsteps_train. Others, such as valid_epoch and test, follow the same naming format.
- For transforms, you can define your own.
- If you don’t have a test set, the valid_epoch dataset is used in its place.
It builds the datasets and dataloaders in the following steps:
- build_transforms(): builds the transforms for the training and valid_epoch datasets. You can override this method to supply your own transforms. The result is used in build_datasets().
- build_datasets(): you must override this method to load your own datasets by assigning them to self.dataset_train and self.dataset_valid. self.dataset_test is optional. If you don’t assign a test dataset, self.dataset_valid is used in its place.
- build_loaders(): uses the datasets and the parameters passed to the constructor to build the dataloaders self.loader_train, self.loader_valid and self.loader_test.
Example:
    from torchvision import datasets, transforms

    def build_transforms(self, resize=32):
        self.train_transform_list = self.valid_transform_list = [
            transforms.Resize(resize),
            transforms.ToTensor(),
            transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]

    # Inherit this class and write this method.
    def build_datasets(self):
        self.dataset_train = datasets.CIFAR10(root, train=True, download=True,
                                              transform=transforms.Compose(self.train_transform_list))
        self.dataset_valid = datasets.CIFAR10(root, train=False, download=True,
                                              transform=transforms.Compose(self.valid_transform_list))
Parameters:
- root is the root path of the datasets.
- batch_size is the batch size of the dataloaders, i.e. the first dimension of a batch of shape (Batchsize, Channel, Height, Width).
- num_workers is the number of threads used to load data. If you pass -1, it uses the maximum number of threads available on your CPU. Default: -1
- shuffle controls whether the data is shuffled. Default: True
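The build order described above can be sketched without any deep-learning dependencies. This is a minimal stand-in, not jdit's actual implementation: the class names `SketchFactory` and `ToyData`, the use of `os.cpu_count()` to resolve `num_workers=-1`, the place where the test-set fallback happens, and the plain lists standing in for datasets are all assumptions made for illustration.

```python
import os


class SketchFactory:
    """Hypothetical stand-in that mimics the documented build order."""

    def __init__(self, root, batch_size, num_workers=-1, shuffle=True):
        self.root = root
        self.batch_size = batch_size
        self.shuffle = shuffle
        # Assumption: -1 means "use every CPU the machine reports".
        self.num_workers = (os.cpu_count() or 1) if num_workers == -1 else num_workers
        self.dataset_train = None
        self.dataset_valid = None
        self.dataset_test = None
        # The three documented steps, in order.
        self.build_transforms()
        self.build_datasets()
        self.build_loaders()

    def build_transforms(self):
        # Placeholder; a real subclass would register torchvision transforms here.
        self.train_transform_list = self.valid_transform_list = []

    def build_datasets(self):
        raise NotImplementedError("subclasses must assign dataset_train and dataset_valid")

    def build_loaders(self):
        # Assumption: fall back to the validation set when no test set was assigned.
        if self.dataset_test is None:
            self.dataset_test = self.dataset_valid


class ToyData(SketchFactory):
    def build_datasets(self):
        # Plain lists stand in for real dataset objects.
        self.dataset_train = list(range(10))
        self.dataset_valid = list(range(10, 14))


data = ToyData(root="unused", batch_size=4)
print(data.dataset_test is data.dataset_valid)  # True
```

The point of the sketch is the division of labour: the base class fixes the order of the steps, and a subclass only fills in build_datasets() (and optionally build_transforms()).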
build_datasets()
You must override this method to load your own datasets.
- self.dataset_train: assign a training dataset to this.
- self.dataset_valid: assign a valid_epoch dataset to this.
- self.dataset_test is optional: assign a test dataset to this. If you don’t, self.dataset_valid is used in its place.
Example:
    self.dataset_train = datasets.CIFAR10(root, train=True, download=True,
                                          transform=transforms.Compose(self.train_transform_list))
    self.dataset_valid = datasets.CIFAR10(root, train=False, download=True,
                                          transform=transforms.Compose(self.valid_transform_list))
build_loaders()
Build dataloaders.
The preceding call to self.build_datasets() has created the datasets. This method uses those datasets to build their dataloaders.
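To make the dataset-to-dataloader step concrete, here is a rough, dependency-free sketch of what a loader does with `batch_size` and `shuffle`. The helper `iter_batches` and its `seed` parameter are hypothetical and only imitate the general behaviour of a dataloader; they are not part of jdit.

```python
import random


def iter_batches(dataset, batch_size, shuffle=True, seed=0):
    """Yield lists of samples, mimicking how a dataloader groups a dataset."""
    indices = list(range(len(dataset)))
    if shuffle:
        # A fixed seed keeps this illustration reproducible.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [dataset[i] for i in indices[start:start + batch_size]]


toy_dataset = list(range(10))
batches = list(iter_batches(toy_dataset, batch_size=4))
print(len(batches))               # 3
print([len(b) for b in batches])  # [4, 4, 2]
```

Every sample appears exactly once per pass; only the order and the grouping change.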
build_transforms(resize: int = 32)
This builds the transforms for training and valid_epoch.
You can override this method to build your own transforms. Don’t forget to register your transforms in self.train_transform_list and self.valid_transform_list. The following is the default set.
    self.train_transform_list = self.valid_transform_list = [
        transforms.Resize(resize),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]
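As a quick check of what the default set does to pixel values: `ToTensor()` scales 8-bit pixel values into [0, 1], and `Normalize` with mean 0.5 and standard deviation 0.5 then maps that range onto [-1, 1]. The sketch below reproduces only the arithmetic in plain Python. The helper name `normalize_pixel` is made up for illustration.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Apply the per-channel formula (x - mean) / std to one 8-bit pixel value."""
    scaled = value / 255.0        # what ToTensor does to 8-bit pixel values
    return (scaled - mean) / std  # what Normalize does afterwards


print(normalize_pixel(0))    # -1.0
print(normalize_pixel(255))  # 1.0
```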
HandMNIST
class jdit.dataset.HandMNIST(root='datasets/hand_data', batch_size=64, num_workers=-1)
Handwritten MNIST dataset.
Example:
    >>> data = HandMNIST(r"../datasets/mnist")
    use 8 thread!
    Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
    Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
    Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
    Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
    Processing...
    Done!
    >>> data.dataset_train
    Dataset MNIST
        Number of datapoints: 60000
        Split: train
        Root Location: data
        Transforms (if any): Compose(
                                 Resize(size=32, interpolation=PIL.Image.BILINEAR)
                                 ToTensor()
                                 Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
                             )
        Target Transforms (if any): None
    >>> # We don't set a test dataset, so the two are the same object.
    >>> data.dataset_valid is data.dataset_test
    True
    >>> # Number of steps at batch size 128.
    >>> data.nsteps_train
    469
    >>> # Total number of samples in the training dataset.
    >>> len(data.dataset_train)
    60000
    >>> # The sample loader uses a batch size of 1, so its length equals the number of samples.
    >>> len(data.samples_train)
    6000
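The step count in the example is consistent with rounding up the number of samples divided by the batch size. A small sketch of that arithmetic, assuming the last partial batch is kept (the helper `steps_per_epoch` is hypothetical, not a jdit function):

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to cover the dataset, keeping the last partial batch."""
    return math.ceil(num_samples / batch_size)


print(steps_per_epoch(60000, 128))  # 469
```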
build_transforms(resize: int = 32)
This builds the transforms for training and valid_epoch.
You can override this method to build your own transforms. Don’t forget to register your transforms in self.train_transform_list and self.valid_transform_list. The following is the default set.
    self.train_transform_list = self.valid_transform_list = [
        transforms.Resize(resize),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]
FashionMNIST
class jdit.dataset.FashionMNIST(root='datasets/fashion_data', batch_size=64, num_workers=-1)
build_datasets()
You must override this method to load your own datasets.
- self.dataset_train: assign a training dataset to this.
- self.dataset_valid: assign a valid_epoch dataset to this.
- self.dataset_test is optional: assign a test dataset to this. If you don’t, self.dataset_valid is used in its place.
Example:
    self.dataset_train = datasets.CIFAR10(root, train=True, download=True,
                                          transform=transforms.Compose(self.train_transform_list))
    self.dataset_valid = datasets.CIFAR10(root, train=False, download=True,
                                          transform=transforms.Compose(self.valid_transform_list))
build_transforms(resize: int = 32)
This builds the transforms for training and valid_epoch.
You can override this method to build your own transforms. Don’t forget to register your transforms in self.train_transform_list and self.valid_transform_list. The following is the default set.
    self.train_transform_list = self.valid_transform_list = [
        transforms.Resize(resize),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]
Cifar10
class jdit.dataset.Cifar10(root='datasets/cifar10', batch_size=32, num_workers=-1)
build_datasets()
You must override this method to load your own datasets.
- self.dataset_train: assign a training dataset to this.
- self.dataset_valid: assign a valid_epoch dataset to this.
- self.dataset_test is optional: assign a test dataset to this. If you don’t, self.dataset_valid is used in its place.
Example:
    self.dataset_train = datasets.CIFAR10(root, train=True, download=True,
                                          transform=transforms.Compose(self.train_transform_list))
    self.dataset_valid = datasets.CIFAR10(root, train=False, download=True,
                                          transform=transforms.Compose(self.valid_transform_list))
Lsun
class jdit.dataset.Lsun(root, batch_size=32, num_workers=-1)
build_datasets()
You must override this method to load your own datasets.
- self.dataset_train: assign a training dataset to this.
- self.dataset_valid: assign a valid_epoch dataset to this.
- self.dataset_test is optional: assign a test dataset to this. If you don’t, self.dataset_valid is used in its place.
Example:
    self.dataset_train = datasets.CIFAR10(root, train=True, download=True,
                                          transform=transforms.Compose(self.train_transform_list))
    self.dataset_valid = datasets.CIFAR10(root, train=False, download=True,
                                          transform=transforms.Compose(self.valid_transform_list))
build_transforms(resize: int = 32)
This builds the transforms for training and valid_epoch.
You can override this method to build your own transforms. Don’t forget to register your transforms in self.train_transform_list and self.valid_transform_list. The following is the default set.
    self.train_transform_list = self.valid_transform_list = [
        transforms.Resize(resize),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]