Multiple image augmentation for training dataset #768

ronykalfarisi opened this issue Jul 11, 2018 · 35 comments

@ronykalfarisi

Dear all and @waleedka,
I've been using this repo to detect and mask crack damage on bridge structures. My training dataset has 850 images, and overall I got decent results. As you can see in the two images below, the model detects and segments horizontal and vertical cracks well, but it fails to detect diagonal cracks. So I'm thinking I could solve this problem if I had more images.
[Two example images of the segmentation results were attached here.]
I noticed that we can use image augmentation in the training stage through the augmentation keyword option, as follows:

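import imgaug.augmenters

# model, dataset_train, dataset_val and config are assumed to be set up already.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=30,
            layers='all',
            # Sequential applies every augmenter in the list, in order, to each image.
            augmentation=imgaug.augmenters.Sequential([
                imgaug.augmenters.Fliplr(1),
                imgaug.augmenters.Flipud(1),
                imgaug.augmenters.Affine(rotate=(-45, 45)),
                imgaug.augmenters.Affine(rotate=(-90, 90)),
                imgaug.augmenters.Affine(scale=(0.5, 1.5))]))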

However, from what I understand, these augmentations are applied consecutively to each image. In other words, each image gets a left-right flip, then an up-down flip, then a random rotation between -45 and 45 degrees, then another random rotation between -90 and 90 degrees, and finally a random scaling by a factor between 0.5 and 1.5.

So my question is: is there a way to apply each augmentation separately to each image? What I mean is that I want each augmentation to generate one extra image (and mask) alongside the original. If this can be achieved, applying 5 augmentations would give 6x the images, so the whole dataset would contain 5100 images.

Thank you, I really appreciate the help.

@zungam

zungam commented Jul 11, 2018

Is there a way to apply each augmentation separately for each image?

Don't use .Sequential([, as it means "do all these augmentations one after another".

I want each augmentation to generate one extra data (and mask) alongside with the original.

I personally couldn't get around the problem, so I made sure the database was used six times. In practice:
epoch = at least set to six times size of database. But more could be done.
shuffle = False
and I gave the augmentation a 5/6 probability of happening. This does not create exactly 5 extra copies of each image, but statistically it does the same over a large database.

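import imgaug.augmenters

# With probability 5/6, apply exactly one of the five augmenters;
# otherwise leave the image unchanged.
augmentation = imgaug.augmenters.Sometimes(5 / 6, imgaug.augmenters.OneOf([
    imgaug.augmenters.Fliplr(1),
    imgaug.augmenters.Flipud(1),
    imgaug.augmenters.Affine(rotate=(-45, 45)),
    imgaug.augmenters.Affine(rotate=(-90, 90)),
    imgaug.augmenters.Affine(scale=(0.5, 1.5)),
]))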

Note! What's also good about this is that the effect of the augmentations is random, which makes for a safer gradient descent. Training 6 times in a row on the same image would, I believe, make your training slower, as the gradient descent takes bigger leaps in the direction of each batch and thus zigzags down to a local minimum.
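
The statistical claim above can be checked with a little arithmetic: wrapping a uniform choice among five augmenters in a 5/6 probability leaves an image unchanged with probability 1/6 and selects each augmenter with probability 5/6 x 1/5 = 1/6. The following is a minimal pure-Python sketch of that reasoning. It simulates only the branch selection, not imgaug itself, and the names `BRANCHES`, `pick_branch` and `N_DRAWS` are made up for illustration.

```python
import random
from collections import Counter
from fractions import Fraction

# Hypothetical labels for the six possible outcomes each time an image is loaded.
BRANCHES = ["original", "fliplr", "flipud", "rotate45", "rotate90", "scale"]

P_AUGMENT = Fraction(5, 6)            # probability that any augmentation happens
P_EACH = P_AUGMENT * Fraction(1, 5)   # uniform choice among the five augmenters
P_ORIGINAL = 1 - P_AUGMENT            # image is passed through unchanged

def pick_branch(rng):
    """Mimic 'with probability 5/6 apply one of five augmenters'."""
    if rng.random() < float(P_AUGMENT):
        return rng.choice(BRANCHES[1:])
    return BRANCHES[0]

N_DRAWS = 60000
rng = random.Random(0)  # fixed seed so the run is repeatable
counts = Counter(pick_branch(rng) for _ in range(N_DRAWS))

print("exact probability per branch:", P_EACH, "and for the original:", P_ORIGINAL)
for name in BRANCHES:
    print(name, round(counts[name] / N_DRAWS, 3))
```

Each branch should come out close to 1/6, which is why the approach approximates "one original plus five augmented copies" only on average and over many passes through the data.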

@ronykalfarisi
Author

ronykalfarisi commented Jul 12, 2018

Dear Magnus (@zungam), thank you so much for your suggestion. I'll give it a try and let you know the result when it's done. However, there are a couple of things I'd like to clarify with you.

  1. As I understand it, one epoch is one forward and backward pass through all the images in the dataset. So when you said "epoch = at least set to six times size of database.", does that mean epoch = 6*850 (since I have 850 images in the dataset)?

  2. You said that "Training 6 times in a row on the same image, would make your training slower as the gradient descent take bigger leaps". While this may be true, it is not my intention. What I really want is, for example, image1 + flipLR(image1) + flipUD(image1) + rotate45(image1) + rotate90(image1) + scale(image1). I believe those 6 images are not really the same.

It would be nice if we could achieve this easily, for example by passing a list of augmentations to the augmentation keyword when we call the train method.

@zungam

zungam commented Jul 15, 2018

  1. Yes, this is what I meant.
  2. They are not the same, but they are so similar that I believe they will push the gradient in a similar direction. flipLR and rotate would not do this, however, so you are right.

If you really want to try this, you could go into load_image_gt and data_generator and make them repeat the same image 6 times in a row. You have to be creative to achieve this. Perhaps something like the following.

Initialize before the loop:

state = 0

Then, in the while loop of data_generator, use

state += 1
if state == 6:
    state = 0
    image_index = (image_index + 1) % len(image_ids)

instead of

image_index = (image_index + 1) % len(image_ids)

Then you can define 6 types of augmentation in load_image_gt and make it switch the augmentation it uses depending on the state it is in (0, 1, 2, 3, 4 or 5).
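
The state-cycling idea can be sketched on its own, outside the repo's data generator. The snippet below is only an illustration under assumptions: `cycle_with_augmentation_state` is a hypothetical helper, state 0 is taken to mean "no augmentation", and states 1 to 5 would each select one augmenter. It does not reproduce the actual loop in data_generator.

```python
from itertools import islice

def cycle_with_augmentation_state(image_ids, num_states=6):
    """Yield (image_id, state) so that each image appears num_states times in a row.

    Hypothetical helper: state 0 would mean "no augmentation" and states
    1..num_states-1 would each pick one augmenter.
    """
    image_index = 0
    state = 0
    while True:
        yield image_ids[image_index], state
        state += 1
        if state == num_states:
            # All variants of this image were produced; move to the next image.
            state = 0
            image_index = (image_index + 1) % len(image_ids)

# Usage: two images, six states each, then the cycle starts over.
gen = cycle_with_augmentation_state(["img_a", "img_b"])
first = list(islice(gen, 13))
for image_id, state in first:
    print(image_id, state)
```

Because consecutive samples come from the same source image, a batch built this way contains closely related images, which is the concern raised earlier in the thread.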

@ronykalfarisi
Author

Thanks @zungam, I'll give this a try.

@waspinator

@ronykalfarisi I'm facing a similar issue trying to train on plant stems. Did you manage to find a solution for diagonal objects?

The difference in my dataset is that all the stems are already diagonal in my training dataset, but it still can't detect them. I have a feeling that maybe it's because the detection box area is much larger than the actual object inside.

@ronykalfarisi
Author

Hi @waspinator,
I haven't found an effective solution for thin diagonal objects yet. However, I got relatively better results when I trained the network with a larger mask loss weight. I suggest you include some horizontal and vertical objects in your training dataset as well, to make it more general.

@fastlater

@ronykalfarisi how many cracks do you have in your dataset? Fewer than 2000?
I'm curious how you get such high accuracy with only 850 images.
I always work with little data (original + augmentation) and it is hard for me to get good results. Any advice would be appreciated.

@ronykalfarisi
Author

Hi @fastlater ,
My dataset has only 893 images (850 for training, 43 for testing). I used resnet101 and I increased the mask weight to 10 since I have unbalance dataset. I hope this helps; if you need more help, you'll need to describe your problem first.

@fastlater

fastlater commented Aug 28, 2018

@ronykalfarisi Well, I have only 75 images (65 for training + 5 for validation + 5 for testing). In those 65 training images I only have 1000 instances, and I thought that might be why I was not getting good results. But you are proving the opposite with your little data.
Do you think increasing the mask weight will help in my case too?

@ronykalfarisi
Author

@fastlater, I believe you need more images. I tried Faster RCNN before and it worked with 300 images. I haven't tried with less data, though.

@fastlater

If I set STEPS_PER_EPOCH larger than my original dataset, does the script generate extra augmented images to complete the steps? Does the model automatically take some augmented images each epoch?

@ronykalfarisi
Author

From my experience, increasing steps_per_epoch doesn't affect the performance, so it's better to leave it at the default (100 in my case). If you use the augmentation option, it depends on the probability. You can modify the script to change how often it applies the augmentation, but I think the default is 0.5; in other words, it only augments half of your dataset.

@patrickcgray

Hey @ronykalfarisi three questions for you if you have a moment:

  • how much did this change in augmentation method boost your performance?
  • why did @zungam have you set the epoch count to "at least set to six times size of database"? I would've thought he meant STEPS_PER_EPOCH not the number of epochs. Did this seem to increase the performance? Does each epoch just have a single augmentation applied and thus this makes it likely that all six variations will be applied? I'm just confused here because 6*850 seems like way too many epochs.
  • Finally, you said "I used resnet101 and I increased the mask weight to 10 since I have unbalance dataset." Can you clarify what you mean here? Did you increase the mrcnn_mask_loss value to 10 and keep all the rest at 1? Why would this help with an unbalanced dataset?

Thanks!!

@ronykalfarisi
Author

@patrickcgray,

  1. What change?
  2. Actually, I didn't do what zungam suggested, since it sounded very weird to me. I used the TensorFlow implementation instead.
  3. It's an imbalanced dataset, since the number of pixels inside the ground truth bounding box is much smaller than the number of background pixels. Yes, that's exactly what I did. It helped because it gives more penalty to wrongly classified pixels. In other words, we tell the algorithm to put more stress on the mask loss.

@patrickcgray

@ronykalfarisi

  1. I meant the augmentation code that @zungam suggested. I was curious if that improved your model.
  2. Okay that makes sense. I wish I knew what he meant. I assume he meant steps_per_epoch
  3. Okay thanks for the clarification, I think I need to do the same.
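
If the intent was indeed steps rather than epochs, the arithmetic would look roughly like the sketch below. It assumes that one step consumes one batch and that the batch size is IMAGES_PER_GPU * GPU_COUNT; the helper name `steps_to_cover` and the example values of 2 images per GPU on 1 GPU are illustrative and not taken from the thread.

```python
import math

def steps_to_cover(num_images, copies_per_image, images_per_gpu, gpu_count):
    """Number of steps needed to draw every image copies_per_image times."""
    batch_size = images_per_gpu * gpu_count
    return math.ceil(num_images * copies_per_image / batch_size)

# 850 training images, 6 draws each (original + 5 augmentations on average).
steps = steps_to_cover(850, 6, images_per_gpu=2, gpu_count=1)
print(steps)  # 2550
```

With a batch size of 1 the same call gives 5100 steps, which matches the 6*850 figure discussed above when read as steps rather than epochs.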

@xDzai94

xDzai94 commented Apr 18, 2019

Hi @ronykalfarisi, have you found any way to solve your problem of applying multiple image augmentations to the training dataset? If you have, I'd truly appreciate it if you could share the solution. Thanks in advance.

@soheilsadeghi90

Hi @ronykalfarisi! I'm working on a dataset of nearly the same size as yours and dealing with thin, long masks (horizontal, vertical, and diagonal). I was wondering how you modified the configs to get good performance. You already mentioned that the mrcnn_mask_loss weight is effective. But what about other config elements (the unmold_mask threshold or RPN_ANCHOR_RATIOS, for example)? I would appreciate it if you could advise me on that.

@ronykalfarisi
Author

@soheilsadeghi90 & @xDzai94 ,
I believe I wrote what I did in the previous comments. I moved toward TensorFlow implementation of Mask-RCNN. They have multiple data augmentations.

@jordanvandijk9

@ronykalfarisi What exactly do you mean by "I moved toward TensorFlow implementation of Mask-RCNN"? I believe I am also using the TensorFlow implementation. However, looking at the code, it only applies one augmentation (or one combination of augmentations) to each image. I cannot increase the total number of images with this data augmentation function.

Am I just not seeing something (e.g. missing a functionality in the TensorFlow implementation) or is the TensorFlow implementation something totally different than the standard way of using Mask-RCNN?

@ronykalfarisi
Author

ronykalfarisi commented Jun 19, 2019

@jordanvandijk9, I'm sorry, what I meant was the "TensorFlow Team" implementation. They have several implementations in their "research" folder, and one of them is the Object Detection API.

@Amrimn

Amrimn commented Oct 18, 2019

Hi @ronykalfarisi,
What do you mean by "I increased the mask weight to 10"?
Here is the mask loss I have:
mask_loss = KL.Lambda(lambda x: mrcnn_mask_loss_graph(*x), name="mrcnn_mask_loss")(
    [target_mask, target_class_ids, mrcnn_mask])

How can I increase it by 10?

@ronykalfarisi
Author

Hi @Amrimn ,
In config.py inside the mrcnn folder, you'll find something like this:

LOSS_WEIGHTS = {
    "rpn_class_loss": 1.,
    "rpn_bbox_loss": 1.,
    "mrcnn_class_loss": 1.,
    "mrcnn_bbox_loss": 1.,
    "mrcnn_mask_loss": 1.
}

Replace the numbers as you like.
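
Here is a minimal sketch of how such a weight could be overridden without editing config.py, and of what it does to the combined loss. `Config` below is a stand-in class so the snippet runs on its own; in practice one would subclass the config class from the mrcnn folder. `CrackConfig`, `weighted_total` and the per-term loss values are made up for illustration, and the weighted sum is an assumption about how the weights are combined.

```python
class Config:
    """Stand-in for the base config; only LOSS_WEIGHTS is reproduced here."""
    LOSS_WEIGHTS = {
        "rpn_class_loss": 1.,
        "rpn_bbox_loss": 1.,
        "mrcnn_class_loss": 1.,
        "mrcnn_bbox_loss": 1.,
        "mrcnn_mask_loss": 1.,
    }

class CrackConfig(Config):
    # Copy the defaults and raise only the mask term to 10.
    LOSS_WEIGHTS = dict(Config.LOSS_WEIGHTS, mrcnn_mask_loss=10.)

def weighted_total(losses, weights):
    """Sum every loss term scaled by its weight (a missing weight counts as 1)."""
    return sum(weights.get(name, 1.) * value for name, value in losses.items())

# Made-up per-term loss values, all equal, to make the effect easy to see.
example_losses = {name: 0.5 for name in Config.LOSS_WEIGHTS}

default_total = weighted_total(example_losses, Config.LOSS_WEIGHTS)
mask_heavy_total = weighted_total(example_losses, CrackConfig.LOSS_WEIGHTS)
print(default_total)     # 2.5
print(mask_heavy_total)  # 7.0
```

In this toy case the mask term grows from 0.5 of 2.5 to 5.0 of 7.0 of the total, so errors in the mask dominate the gradient.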

@rubeea

rubeea commented Nov 14, 2019

Hi, I am researching a similar problem. I would highly appreciate it if you could provide the link to the TensorFlow team's implementation of Mask-RCNN that improved accuracy in your case.

Thanks :)

@ronykalfarisi
Author

@kapil0kumar

Hi @zungam, can you explain why shuffle=False has to be set here?

@1chimaruGin

Dear @ronykalfarisi

Can I see your repo for this case?

@ronykalfarisi
Author

@1chimaruGin , sorry bro I used it for work so I can't show it. However, I used the repo from Tensorflow research team with several modifications.

@Altimis

Altimis commented May 27, 2020

Hi, thank you for suggesting the TensorFlow team's implementation of Mask RCNN. Can you confirm that this implementation is more accurate than the matterport one?

@Adithia99

Does anybody here know how to do augmentation for Mask RCNN, so that we get the augmented images and masks (labels) in our folder, and not just for the training case?

@rubeea

rubeea commented Oct 13, 2020

Didn't quite understand your question. You want to do augmentation for the training data?

@Adithia99

Both, the training case and the validation case.

@rubeea

rubeea commented Oct 14, 2020

You can use the Augmentor Python API to augment training and validation image data along with the respective masks. Link below:
https://github.com/mdbloice/Augmentor

@Adithia99

Thanks bro, but I need .json, not just masks like this.

@rubeea

rubeea commented Oct 14, 2020

If you want to label your own data, you can use any appropriate online annotation tool, such as https://labelbox.com/, which will give you all the images and their annotations in a JSON file.

@raficabral

raficabral commented Sep 11, 2021

Hey guys. I have a question regarding data augmentation using imgaug.
If I use @zungam's Sometimes(5/6, OneOf(...)) syntax, then by my interpretation (am I right?) the augmentation is applied to only 5/6 of my dataset, and only one transformation is applied ("OneOf").
Like @ronykalfarisi, I want to apply all five transformations to my whole dataset (5x more data). How can I do that?
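
One way to get exactly one extra copy per transformation, rather than a random choice at load time, is to expand the dataset offline and apply each geometric transform to the image and its mask together. The sketch below uses plain nested lists and only three transforms to stay dependency-free. `expand_sample` and `TRANSFORMS` are hypothetical names, and a real pipeline would use an image library and write the results to disk.

```python
def fliplr(grid):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in grid]

def flipud(grid):
    """Reverse the row order (up-down flip)."""
    return [row[:] for row in grid[::-1]]

def rot90(grid):
    """Rotate by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

# Hypothetical registry: one extra sample is produced per entry.
TRANSFORMS = {"fliplr": fliplr, "flipud": flipud, "rot90": rot90}

def expand_sample(image, mask):
    """Return the original pair plus one transformed (image, mask) pair per transform."""
    samples = [("original", image, mask)]
    for name, transform in TRANSFORMS.items():
        # The same transform is applied to both, so the mask stays aligned.
        samples.append((name, transform(image), transform(mask)))
    return samples

image = [[1, 2], [3, 4]]
mask = [[0, 1], [0, 0]]   # the labeled pixel sits where the image value is 2
samples = expand_sample(image, mask)
for name, img, msk in samples:
    print(name, img, msk)
```

With N source images this yields (1 + number of transforms) x N samples, and in every variant the labeled mask pixel still lands on the pixel that held the value 2.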
