Employing the albumentations library in PyTorch workflows. Bonus: Helper for selecting appropriate values!

This brief blog post presents modified versions of the previous segmentation and classification pipelines. These versions leverage an increasingly popular augmentation library called albumentations.


As you may know, augmentations are typically used when training a deep learning network to help prevent overfitting and improve the robustness of the classifier to variations in color, rotation, etc.

The torchvision library is typically employed for this process, and was used in our previous tutorials. Using torchvision as a basis, albumentations provides significant additional functionality, both in terms of additional augmentations and in code readability via improved function prototypes (see below). Some of its stated benefits:

  • The library is faster than other libraries on most of the transformations.
  • Based on numpy, OpenCV, imgaug picking the best from each of them.
  • Simple, flexible API that allows the library to be used in any computer vision pipeline.
  • Large, diverse set of transformations.
  • Easy to extend the library to wrap around other libraries.
  • Easy to extend to other tasks.
  • Supports transformations on images, masks, key points and bounding boxes.
  • Supports python 2.7-3.7
  • Easy integration with PyTorch.
  • Easy transfer from torchvision.
  • Was used to get top results in many DL competitions at Kaggle, topcoder, CVPR, MICCAI.
  • Written by Kaggle Masters.

Seems like a win!

We’ll briefly review the implementation in the context of our U-net segmentation tutorial.

As expected we replaced the previous augmentation code:

  #note that since we need the transformations to be reproducible for both masks and images
  #we do the spatial transformations first, and afterwards do any color augmentations
  img_transform = transforms.Compose([
      transforms.ToPILImage(),
      transforms.RandomVerticalFlip(),
      transforms.RandomHorizontalFlip(),
      transforms.RandomCrop(size=(patch_size,patch_size),pad_if_needed=True), #these need to be in a reproducible order, first affine transforms and then color
      transforms.RandomResizedCrop(size=patch_size),
      transforms.RandomRotation(180),
      transforms.ColorJitter(brightness=0, contrast=0, saturation=0, hue=.5),
      transforms.RandomGrayscale(),
      transforms.ToTensor()
      ])

  mask_transform = transforms.Compose([
      transforms.ToPILImage(),
      transforms.RandomVerticalFlip(),
      transforms.RandomHorizontalFlip(),
      transforms.RandomCrop(size=(patch_size,patch_size),pad_if_needed=True), #these need to be in a reproducible order, first affine transforms and then color
      transforms.RandomResizedCrop(size=patch_size,interpolation=PIL.Image.NEAREST),
      transforms.RandomRotation(180),
      ])

With the new albumentations code:

  #https://github.com/albu/albumentations/blob/master/notebooks/migrating_from_torchvision_to_albumentations.ipynb
  transforms = Compose([
         VerticalFlip(p=.5),
         HorizontalFlip(p=.5),
         HueSaturationValue(hue_shift_limit=(-25,0),sat_shift_limit=0,val_shift_limit=0,p=1),
         Rotate(p=1, border_mode=cv2.BORDER_CONSTANT,value=0),
         #ElasticTransform(always_apply=True, approximate=True, alpha=150, sigma=8,alpha_affine=50),
         RandomSizedCrop((patch_size,patch_size), patch_size,patch_size),
         ToTensor()
      ])

This is a limited example set of augmentations; for a full list, please refer here.

One important note here: the “ToTensor()” function is not the one provided by torchvision, but an albumentations-specific version which performs the same task within the augmentation pipeline.

I had mentioned improved code readability. Looking at the execution of the augmentation in the torchvision version:

  seed = random.randrange(sys.maxsize) #get a random seed so that we can reproducibly do the transformations
  if self.img_transform is not None:
      random.seed(seed) # apply this seed to img transforms
      img_new = self.img_transform(img)

  if self.mask_transform is not None:
      random.seed(seed)
      mask_new = self.mask_transform(mask)
      mask_new = np.asarray(mask_new)[:,:,0].squeeze()

      random.seed(seed)
      weight_new = self.mask_transform(weight)
      weight_new = np.asarray(weight_new)[:,:,0].squeeze()

We see that the mask and image had to be operated on separately, and that we had to take special care to ensure that the mask only underwent spatial augmentations and not color augmentations.
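The shared-seed trick can be demonstrated in isolation. This is a minimal sketch using Python's random module, with a coin flip standing in for the internal randomness of a torchvision transform (not the tutorial's actual dataset class):

```python
import random
import sys

def random_flip_decision():
    # stand-in for the coin flip a random transform performs internally
    return random.random() < 0.5

seed = random.randrange(sys.maxsize)

random.seed(seed)
img_flip = random_flip_decision()   # decision used for the image

random.seed(seed)                   # re-seed so the mask sees the same draw
mask_flip = random_flip_decision()  # decision used for the mask

assert img_flip == mask_flip  # image and mask are transformed identically
```

The fragility of this pattern is exactly what the albumentations interface below removes: forgetting a single re-seed silently desynchronizes image and mask.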

Albumentations provides us with a cleaner interface by allowing us to indicate both images and their associated masks within a single function call. The inner workings of albumentations address the special mask requirements for us, resulting in a much cleaner presentation:

  if self.transforms:
      augmented = self.transforms(image=img, masks=[mask,weight])
      img_new = augmented['image']
      mask_new,weight_new = augmented['masks']

Parameter tuning via visual inspection

One may very reasonably ask, how do I identify good parameters for these augmentation functions?

To aid in the parameter search, CCIPD Lab member Sacheth Chandramouli developed a very handy tool:


Note that most of these augmentation functions require specifying the minimum and maximum values from within which a uniform random selection is performed. For example, from the docs:

  • hue_shift_limit ((int, int) or int) – range for changing hue. If hue_shift_limit is a single int, the range will be (-hue_shift_limit, hue_shift_limit). Default: 20.

As such, it stands to reason that determining the operating extrema would allow for most randomly selected values within that range to be acceptable.
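The int-to-range expansion described in the docs can be sketched in plain Python. These are hypothetical helpers for illustration, not the library's internal code:

```python
import random

def to_range(limit):
    """Expand a single number into a symmetric (-limit, limit) range,
    mirroring how hue_shift_limit is documented; tuples pass through."""
    if isinstance(limit, (int, float)):
        return (-limit, limit)
    return tuple(limit)

def sample_shift(limit):
    """Uniformly draw one value from the (expanded) range."""
    lo, hi = to_range(limit)
    return random.uniform(lo, hi)

# the default hue_shift_limit=20 becomes the range (-20, 20),
# while an explicit tuple such as (-25, 0) is used as-is
```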

This tool provides a quick and efficient way to visually appreciate what the extrema of different augmentation values would look like. Additionally, with the “plot” option selected, a number of values are randomly drawn from within the parameter range and applied for review.

The code provides examples for HSV, Scaling, Rotating, and Elastic augmentations which can easily be adapted to the other transforms in the toolbox.
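As a rough illustration of the idea (a sketch, not Sacheth's actual tool), the following applies a stand-in brightness shift at both extrema plus a few uniformly drawn values in between, producing the panel of images one would tile for visual review:

```python
import numpy as np

def brightness_shift(img, value):
    """Stand-in augmentation: add `value` to a uint8 image, clipping to [0, 255]."""
    return np.clip(img.astype(np.int16) + value, 0, 255).astype(np.uint8)

def inspect_extrema(img, transform, vmin, vmax, n_samples=4, seed=0):
    """Apply `transform` at both extrema and at a few uniform draws in between,
    returning {value: augmented image} for side-by-side visual comparison."""
    rng = np.random.default_rng(seed)
    values = [vmin, vmax] + [float(v) for v in rng.uniform(vmin, vmax, size=n_samples)]
    return {v: transform(img, v) for v in values}

# a flat mid-gray image makes the effect of each shift easy to verify
gray = np.full((8, 8, 3), 128, dtype=np.uint8)
panels = inspect_extrema(gray, brightness_shift, -50, 50)
```

Swapping brightness_shift for a call into any albumentations transform (with the candidate parameter fixed to each sampled value) turns this into the same kind of extrema review the tool performs.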

Happy augmenting!

Code for segmentation, classification, and the tool are available at their respective links.




