How to Curate Data for Computer Vision Models

This guide introduces the steps to curate data for a new computer vision model. Suitable for those who are new to artificial intelligence.

Here at GOVTECH’s Data Science and Artificial Intelligence Division, we collaborate with Singapore government agencies to build Artificial Intelligence projects. Often, we work with public officers who see value in Artificial Intelligence adoption but are new to the technology. We learnt that for those new to Artificial Intelligence, starting a project can be difficult and questions usually centre on data. In this article, we will talk about the initial steps in building a Computer Vision model, focusing on data curation.

A Computer Vision project starts with the intent (what you want to gain) and the data (what images or videos you have). Data curation is a critical part of model development as Computer Vision models are derived by learning from the data they see. We define data curation as the process of selecting, preparing and organising a collection of data such that the value of the data can be maintained over time. Any experienced AI Engineer will tell you that it is common to spend 80% of computer vision model development time in data acquisition, curation and annotation.

The focus of data curation is on having good data and not just getting more data.

Good data is:

Proper data curation will ensure good data is used for model training, which in turn will help optimise the time required for model training and development.

People who are familiar with Computer Vision appear to be able to intuitively determine the nature of data to use and the volume of data required. The truth is that for any given problem, there is no fixed formula for data curation. Data curation requires experimentation and constant refinement to achieve better model training results. Fortunately, it is a skill that can be honed over time with a bit of willingness to try and data availability. Here are some tips from us to get started.

In this article, we assume that the intent or problem has been defined and that it can be addressed using Computer Vision techniques. The first step is to determine the suitable model type for the problem at hand.

Here, we will walk through the data curation process using two common computer vision problem types — Object Detection and Image Classification.

Object Detection models are typically used to count, locate, track or identify objects in the same scene. They address the question “Where is the [object of interest] in the scene?” and are suitable if there are multiple objects of interest to be identified in an image.

Image of artemia
(Image courtesy of Singapore Food Agency for illustration purposes)

One example is the identification and counting of rotifers or artemia as shown above. For this image, we are interested in the total count of artemia and shells. Intuitively, you can see that manual counting is tedious and time-consuming. Over an extended period of time, manual counting can be prone to errors and become inconsistent. Object Detection models can be used to automate this manual process at scale with consistent count accuracies.

Image Classification models are used to sort or identify images. They address the question “Is there an [object of interest] in the scene?”

Images of different manhole covers

For example, we use image classification to differentiate images of different manhole cover types, as above, without needing to locate exactly where the manhole cover is within the image. It can also be used to identify whether an image shows a bench or a city.

After determining the model type, it is useful to establish a consistent image annotation process. It is ideal to have an annotation guide to:

Whether you are an individual or part of a team, the guide will help to build consistent and relevant data to train and test a usable Computer Vision model.

Included below is a sample of what can be included in the guide. The guide should contain these elements:

Look through your data and group your images into distinctive scenes.

Distinctive scenes usually arise when there are different camera sources. You can think of different scenes as images with different backgrounds, for example, scenes capturing roads, buildings or grass patches. Each scene can also vary in environment (lighting condition, weather) or viewpoint (pose, field of view).

Total the amount of data for each scene and aim for parity in the amount of data per scene. This is to ensure that the data is distributed fairly.
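The tally described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the inventory `scene_by_image`, the file names and the helper names are all hypothetical, and the ratio of the largest to the smallest scene count is just one simple way to see how far the data is from parity.

```python
from collections import Counter

def count_per_scene(scene_by_image):
    """Return the number of images recorded for each scene."""
    return Counter(scene_by_image.values())

def imbalance_ratio(counts):
    """Largest scene count divided by the smallest (1.0 means perfect parity)."""
    return max(counts.values()) / min(counts.values())

# Hypothetical inventory: image file name -> scene label.
scene_by_image = {
    "img-0001.jpg": "road",
    "img-0002.jpg": "road",
    "img-0003.jpg": "road",
    "img-0004.jpg": "grass",
    "img-0005.jpg": "building",
    "img-0006.jpg": "building",
}

counts = count_per_scene(scene_by_image)
ratio = imbalance_ratio(counts)
print(counts, ratio)
```

A ratio well above 1.0 suggests collecting more images for the under-represented scenes before training.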

Prioritise data that are the same or similar to the scene used for operation or production. For example, if the problem is to identify the type of manhole covers on roads, prioritise having more data with scenes of manhole covers on roads rather than on grass patches.

Name the images with an intuitive convention. For example, an image name can have a convention location-date-img-number (“ubin-210101-img-0001.jpg”) or class-scene-number (“sewage-road-0001.png”). This will help to:

Sample images should be included as they help the team visualise how the classes should be labelled.
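A naming convention is easiest to keep when it can be checked automatically. The sketch below validates file names against the location-date-img-number convention mentioned above. The pattern is inferred from the single example “ubin-210101-img-0001.jpg”, so the lowercase location, six-digit date, four-digit number and allowed extensions are assumptions to adapt to your own guide.

```python
import re

# Pattern inferred from the single example "ubin-210101-img-0001.jpg";
# the field widths and allowed extensions are assumptions.
NAME_PATTERN = re.compile(
    r"^(?P<location>[a-z]+)-(?P<date>\d{6})-img-(?P<number>\d{4})\.(?P<ext>jpg|png)$"
)

def parse_image_name(name):
    """Return the fields of a conforming name, or None if it does not conform."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None

good = parse_image_name("ubin-210101-img-0001.jpg")
bad = parse_image_name("IMG_4032.JPG")  # a typical camera default name
print(good)
print(bad)
```

Running such a check before annotation starts flags files that still carry camera default names.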

For ease of reference, we have divided this section into Annotation Guidelines for Object Detection models, followed by Annotation Guidelines for Image Classification models.

Before you start annotating, you need to define object classes. When the images are being annotated, the objects of interest are labelled according to the classes defined.

It is advisable to use a descriptive name for each class for ease of identification. Rather than naming your classes A, B, etc., use names such as artemia or SewageCover.

As class names are usually treated as case sensitive, it is important to decide at the start whether the class naming convention is all lowercase, all uppercase or mixed case, to avoid unexpected errors.
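To illustrate why the case convention matters, here is a small sketch that scans a list of labels for names that differ only by letter case. The function name and the sample labels are hypothetical; the point is only that “Artemia” and “artemia” would otherwise be treated as two separate classes.

```python
def find_case_conflicts(labels):
    """Group labels that differ only by letter case."""
    variants = {}
    for label in labels:
        variants.setdefault(label.lower(), set()).add(label)
    # Keep only the groups that contain more than one spelling.
    return {key: sorted(names) for key, names in variants.items() if len(names) > 1}

# Hypothetical labels collected from several annotators.
labels = ["Artemia", "artemia", "Shell", "Shell", "SewageCover"]
conflicts = find_case_conflicts(labels)
print(conflicts)
```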

To annotate images for Object Detection models, we draw a bounding box over each object of interest. We then give that part of the image a pre-determined class name. In the following diagram, we annotated two classes of objects, naming each object either a ‘Shell’ or an ‘Artemia’.

Object Detection classes of Artemia and Shell
(Image courtesy of Singapore Food Agency for illustration purposes)
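In data terms, each annotation is a class name plus the coordinates of a box. The sketch below shows one possible representation and a basic sanity check. The pixel-coordinate layout (x_min, y_min, x_max, y_max), the `BoxAnnotation` name and the `validate` helper are assumptions for illustration; annotation tools each have their own export formats.

```python
from dataclasses import dataclass

ALLOWED_CLASSES = {"Artemia", "Shell"}  # the two classes from the example above

@dataclass(frozen=True)
class BoxAnnotation:
    class_name: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

def validate(box, image_width, image_height):
    """Return a list of problems found in one annotation (empty if none)."""
    problems = []
    if box.class_name not in ALLOWED_CLASSES:
        problems.append("unknown class")
    if not (0 <= box.x_min < box.x_max <= image_width):
        problems.append("bad x range")
    if not (0 <= box.y_min < box.y_max <= image_height):
        problems.append("bad y range")
    return problems

# Hypothetical annotations on a 640x480 image.
ok = validate(BoxAnnotation("Artemia", 10, 20, 50, 60), 640, 480)
bad = validate(BoxAnnotation("artemia", 600, 20, 700, 60), 640, 480)
print(ok, bad)
```

The second annotation fails twice: its class name uses the wrong case, and its box extends past the right edge of the image.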

When drawing the bounding boxes, we recommend the following guidelines:

Defining concise bounding boxes
(Image courtesy of Singapore Food Agency for illustration purposes)
Determining the minimum size
(Image courtesy of NParks for illustration purposes)
Defining what a boar looks like
(Image courtesy of NParks for illustration purposes)

Sometimes, it is time-consuming to annotate a busy scene of many objects. It is therefore tempting to draw a box over a cluster as one object. This is acceptable if the intent is to detect if the scene has the object or not. But if the intent is to count individual objects, then you should not annotate a cluster as one object — that is, you have to painstakingly annotate every one of them.

For Image Classification models, there is no need to draw bounding boxes on the training images. You only need to label the entire image as its corresponding class. For some applications like VideoIO, the process of annotation is as simple as tagging images with the same class.

We also recommend including these guidelines:

Image Classification classes of Sewage and Telecom
Poor image samples for classification

It is important to have a review process, especially if more than one person is annotating. A review process minimises annotation errors and helps the team to systematically produce good annotated data. The following flow can be considered:

Now, with a structure in place, the next question to answer is: how much data is required?

Typically, the volume of training data required to train a good model is in the tens of thousands or more. But with techniques like transfer learning and data augmentation, the quantity of training data required can be greatly reduced. That said, it is still necessary to build your own dataset progressively and incrementally so that you can observe the increase in model performance with each addition.

Example of building a model by adding data progressively and observing the performance using VideoIO
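As a concrete illustration of data augmentation, the sketch below applies a horizontal flip, one of the simplest augmentations, to an image and to its bounding box. It is a toy example under stated assumptions: the image is a plain list of pixel rows, the box is (x_min, y_min, x_max, y_max) in pixels with x_max and y_max exclusive, and the function names are hypothetical. In practice an image library would do this work, and whether flipping is appropriate depends on the problem.

```python
def flip_image_horizontally(image):
    """Mirror an image stored as a list of pixel rows."""
    return [list(reversed(row)) for row in image]

def flip_box_horizontally(box, image_width):
    """Mirror an (x_min, y_min, x_max, y_max) box across the vertical axis."""
    x_min, y_min, x_max, y_max = box
    return (image_width - x_max, y_min, image_width - x_min, y_max)

# Toy 4x2 image: the object (pixel value 9) sits in the right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
box = (2, 0, 4, 2)

flipped_image = flip_image_horizontally(image)
flipped_box = flip_box_horizontally(box, image_width=4)
print(flipped_image, flipped_box)
```

The box has to be transformed together with the image. Otherwise the annotation no longer covers the object.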

You need to split the data into non-overlapping training, validation and test sets. It is important to note that each dataset is to be used solely for one purpose, either training, validation or testing. Training data must not be used for validation or testing. Likewise, test data must not be used for training or validation. We suggest dividing your data into 70% (training set)-15% (validation set)-15% (test set) for each class. Other ratios such as 60%-20%-20% or 70%-20%-10% work as well.
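A per-class 70-15-15 split can be sketched as follows. This is a minimal example rather than a recommended tool: the function name, the fixed seed and the sample file names are assumptions, and the ratios are given as whole percentages so the set sizes are computed with integer arithmetic. Any remainder from rounding down goes to the test set.

```python
import random
from collections import defaultdict

def split_per_class(labelled_images, percents=(70, 15, 15), seed=0):
    """Split (image_name, class_name) pairs into train/val/test sets per class."""
    by_class = defaultdict(list)
    for name, class_name in labelled_images:
        by_class[class_name].append(name)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for class_name in sorted(by_class):
        names = sorted(by_class[class_name])
        rng.shuffle(names)
        n_train = len(names) * percents[0] // 100
        n_val = len(names) * percents[1] // 100
        splits["train"] += [(n, class_name) for n in names[:n_train]]
        splits["val"] += [(n, class_name) for n in names[n_train:n_train + n_val]]
        splits["test"] += [(n, class_name) for n in names[n_train + n_val:]]
    return splits

# Hypothetical dataset: 20 images for each of two classes.
labelled = [(f"sewage-road-{i:04d}.png", "Sewage") for i in range(20)]
labelled += [(f"telecom-road-{i:04d}.png", "Telecom") for i in range(20)]

splits = split_per_class(labelled)
print({key: len(value) for key, value in splits.items()})
```

Because each image name is placed in exactly one slice, the three sets cannot overlap.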

They are for the following purposes:

The training set is used to train (also known as fit) the model. The model will see and learn from this dataset during the training process.

The validation set is used to provide an unbiased evaluation of a model fit on the training set. It is used to ‘validate’ the model's accuracy as training progresses, and is usually used to fine-tune the model hyperparameters during training.

The test set is used to provide an unbiased evaluation of a final model fit on the training set. It is not for fine-tuning model hyperparameters but for understanding how well the trained model performs. A robust set of test data can help identify where the model performs well and where it does not.

It is important to note that a large quantity of similar image samples does not help the model training process and may in fact cause the trained model to be biased towards those image samples.
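One way to spot very similar samples is a perceptual hash. The sketch below uses a simple average hash: each pixel becomes a 1 if it is brighter than the image mean, and two images are compared by counting the bits that differ. It assumes the images have already been shrunk to the same small grayscale grid (commonly 8x8) by an image library; the 2x2 grids, the function names and any distance threshold you choose are illustrative only.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)

def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))

# Toy 2x2 grayscale grids standing in for down-scaled images.
image_a = [[10, 200], [220, 15]]
image_b = [[12, 190], [210, 20]]   # nearly identical to image_a
image_c = [[200, 10], [15, 220]]   # a different pattern

near = hamming_distance(average_hash(image_a), average_hash(image_b))
far = hamming_distance(average_hash(image_a), average_hash(image_c))
print(near, far)
```

Pairs with a small distance are candidates for review, so that only one of each near-duplicate group is kept.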

Training a good Computer Vision model takes time. Merely increasing samples does not always directly lead to a better trained model. With careful data curation, one will be able to reap the benefits of saving time and effort while maximising model performance. The steps outlined above are by no means a magic bullet — you still require experimentation and learning. However, this guide will help you in your model training journey in a calibrated way.
