Skip to main content

Perform an optimized Data Augmentation

Adopting AI is about proving to your clients and your stakeholders that risks can be managed, that is why we provide you tools to ensure proper validation and documentation.

What does data augmentation mean?

Data augmentation is used for improving deep learning robustness. For example, it can be applied on images for classification solutions. It is often necessary when confronted with a massive domain of use like an open world as it improves the robustness of a neural network. Data augmentation is useful but can be time-consuming thus slowing down your development. In order to scale down the cost of training, you need adequate data augmentation.

When performing your data augmentation Saimple can:
  • Tailor the process effectively in order to boost robustness

  • Control and document the benefits of each augmentation

  • Cut down the number of generated images to speed-up training

Customize your data augmentation techniques

Data augmentation can be done using python within a framework such as pytorch, tensorflow or keras. And it can help your neural network have a better performance on a specific kind of perturbation. It can also help it to generalize more easily on new data. But there are many ways to choose the data augmentation useful for your network implementation, and they usually come also with their own settings that need to be customized. Facing a multidimensional problem, data augmentation can suddenly become a much more difficult problem to solve than anticipated. One wrong setup of a parameter of one data augmentation can have an impact on the outcome. And when training time is not fast enough, you might end up doing blindly your data augmentation on your deep learning model in order to still manage to meet your deadline.

Saimple can help you dramatically decrease the efforts needed to tailor your data augmentation. First by observing the features learned by the neural network you can detect what is already learned and what needs to be added through data augmentation. Then you can see the overall impact on the robustness of your neural network of each data augmentation. Selecting its correct parameters independently will stabilize quickly the setting of your data augmentation process.

Measure the impact of each data augmentation on deep learning models

Each data augmentation can have a different impact in terms of the overall robustness it will bring to a neural network. Some data augmentations will improve the stability of decisions of your neural network, others can have a detrimental effect. For now, the only way to measure the adequacy of the data augmentation is to calculate again and again the accuracy/recall analysis and observe if there is an improvement on a testing dataset. However, in the case of images, testing the image augmentation output on a testing set is not enough to build a strong proof of the robustness of the neural network.

With Saimple you can observe how much your newly trained neural network has improved in terms of stability. The more your neural network can handle noise the more efficient your data augmentation has been. In the following figure, several variations of data augmentation are done to the neural network. The more the curve is shifted to the right the more the neural network can handle noise in its input. Data augmentation in that case has improved the overall robustness of the neural network.

data augmentation measuring robstness

Fig 1 – Different data augmentation can lead to very different stability performance for the neural network. The more the neural network can handle noise in its dataset the better the data augmentation had a positive impact during the training phase.

Limit the computation time spent in training

Using data augmentation with python framework like Keras, Tensorflow or Pytorch for instance, during the training phase will have an impact on the training duration. Indeed, each new data you generate by data augmentation will have to be learned and might render the overall training more difficult. Adding more data points to be learned could create artificial clusters on which the neural network might be efficient but not very good at generalizing outside. Each time the process is not satisfactory, the whole learning process has to be done again from scratch. Therefore, having the right amount of data augmentation for the desired robustness can help you save a lot of time.

With Saimple you can easily measure the impact of your data augmentation for deep learning neural network models and thus control the amount of it you want in your training process. Once the settings of your data augmentation are set, you do not need to change then and you can still control over time that there is no regression in terms of stability.

Related Use Cases

Measure robustness of data augmentation

Does data augmentation make a neural network more robust and how to measure it?Saimple allows us to measure the robustness of a neural network th…

Improve model robustness with data augmentation for e-commerce

With the development of online retailing, artificial intelligence is becoming increasingly important for the fashion industry.According to a repo…

Evaluation of model robustness for traffic sign classification

Among the current challenges facing the automotive industry, demonstrating the reliability of AI embedded systems in autonomous cars is paramount.

Discover more Saimple applications to help you build reliable and trustworthy AI

Image processing

Neural networks are a game changer technology for any use cases involving image processing. It might not be easy, but that’s clearly the way forward. Discover here how you can use our technology to improve all along the way your process to build your image processing systems.

Predictive maintenance

With the rise of the IoT, now is the time to consider predictive maintenance as a way to cut your costs and boost your profitability. For that, you need to process all the data and build a reliable AI system, as a lot will depend on it. Saimple technology can help you build trust in this predictive system.

Get AI You Can Trust: Faster & Easier.

Get a Personalized Demo: Secure & Streamline Your AI Delivery in 30 minutes.