
The importance of establishing annotation rules: Introduction to examples and key points

 


 

To improve the quality of annotation, it is important to establish clear annotation rules.
In AI development, annotation is the process of adding ‘meaning’ to data such as text, audio and images by attaching tags. Tagged data tells the AI what each piece represents, so it can be used as training data.

 

 

Providing clear and easy-to-understand rules is especially important when multiple annotators are involved in the annotation process, as this helps to avoid inconsistencies in the work performed by each annotator.

In this article, we will explain the necessity of annotation rules, give specific examples, and highlight key points for creating effective rules.

 

 

 

1. Why Annotation Rules Are Necessary



So, why is it necessary to establish rules for the annotation process?


Let's say you are developing an AI that can distinguish between good and bad products. You will need image data for both. However, images or videos captured naturally during operations, such as those from factory cameras, usually do not indicate whether the products are good or bad.
In such cases, it is necessary to label the image or video data with tags that specify whether the product is good or bad. This labelling process is called annotation.

Annotation can sometimes be automated, but more complex situations, such as when the boundary between good and bad products is difficult to determine, often require manual work. In addition, an AI's performance depends on its training data, so developing a high-performing AI requires a large amount of data. As a result, annotation work often needs to be shared among multiple people.

When a large number of annotators (the workers who perform annotation) are involved, it is difficult to produce high-quality annotations without clear rules. Without established rules, the tag names and judgement criteria will differ from one annotator to the next, and the work will be inconsistent. Clear annotation rules are therefore essential for annotators to deliver a consistent level of quality.

 

Nextremer offers data annotation services to achieve highly accurate AI models. If you are considering outsourcing annotation, free consultation is available. Please feel free to contact us.

 

2. Specific examples of annotation rules


So, what kind of content needs to be set as annotation rules?
The rules that need to be defined will vary, depending on the target data and the AI being developed. Here, we will use camera image data captured for the development of an autonomous driving AI as an example of how to set up annotation rules.


・Object detection



To achieve autonomous driving, the AI must detect and identify various objects around the vehicle. There are various objects on the road, such as cars, pedestrians, traffic lights and signs, all of which need to be recognised by the AI through annotation. Specifically, annotation involves distinguishing these objects by using bounding boxes and giving them names accordingly.

 

Example annotation rules:
Draw bounding boxes around “cars, buses, taxis, etc.” and label each as “vehicle.”
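
A rule like this can also be written down as a lookup table, so that every box is checked and labelled in the same way. The following is only a minimal sketch: the `BoundingBox` record, the `CLASS_MAP` table and the `apply_rule` function are hypothetical names introduced for illustration, not part of any particular annotation tool.

```python
from dataclasses import dataclass

# Hypothetical mapping from the object an annotator sees to the label
# the rule requires. The entries are illustrative only.
CLASS_MAP = {"car": "vehicle", "bus": "vehicle", "taxi": "vehicle"}


@dataclass
class BoundingBox:
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str


def apply_rule(box: BoundingBox) -> BoundingBox:
    """Return a copy of the box with its label normalised by CLASS_MAP."""
    # A box whose corners are swapped or coincide encloses nothing.
    if box.x_min >= box.x_max or box.y_min >= box.y_max:
        raise ValueError("bounding box has no area")
    key = box.label.strip().lower()
    # Fail loudly on object types the rules do not cover yet.
    if key not in CLASS_MAP:
        raise KeyError(f"no rule defined for object type: {box.label!r}")
    return BoundingBox(box.x_min, box.y_min, box.x_max, box.y_max, CLASS_MAP[key])
```

Raising an error for an unknown object type, rather than guessing, is a deliberate choice in this sketch: it surfaces cases the rules have not yet defined.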

 

・Region classification



With a few exceptions, cars can only drive on roads. The AI therefore needs to recognise not only objects, but also areas such as roads and sidewalks. As part of the annotation process, we give meaning to these regions by colour coding them in the image so that the AI can distinguish them.

 

Example annotation rules:
Identify sidewalk areas separated from the road by guardrails, kerbs, etc., and label them as “sidewalk.”
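
Colour coding only works if everyone uses the same colours, so the rule can be paired with a fixed colour table and a simple check. This is a minimal sketch under stated assumptions: the `REGION_COLOURS` table, its RGB values and the `undefined_colours` function are hypothetical, and the mask is assumed to be a nested list of RGB tuples.

```python
# Hypothetical colour coding. The classes and RGB values are illustrative only.
REGION_COLOURS = {
    "road": (128, 64, 128),
    "sidewalk": (244, 35, 232),
    "other": (0, 0, 0),
}


def undefined_colours(mask):
    """Return the set of pixel colours in the mask that no rule defines."""
    allowed = set(REGION_COLOURS.values())
    return {pixel for row in mask for pixel in row if pixel not in allowed}
```

An empty result means every pixel in the mask uses a colour that the rules define.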

 

・Image classification



The image itself also needs to be classified. For example, depending on whether the image was taken in the daytime or at night, the conditions under which the AI learns may need to be adjusted.

Example annotation rules:
Distinguish between “day” and “night” based on image brightness or whether street lights are on or off.
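
To show how such a rule can be made unambiguous, here is a minimal sketch of the day/night decision. The threshold value, the function names, and the choice to let a recorded street light state take precedence over brightness are all assumptions made for illustration. The image is assumed to be a nested list of greyscale values from 0 to 255.

```python
# Assumed threshold on a 0-255 greyscale. It would need tuning on real data.
BRIGHTNESS_THRESHOLD = 90


def mean_brightness(gray_image):
    """Average pixel value of a greyscale image given as a nested list."""
    pixels = [p for row in gray_image for p in row]
    if not pixels:
        raise ValueError("image contains no pixels")
    return sum(pixels) / len(pixels)


def classify_time_of_day(gray_image, street_lights_on=None):
    """Return "day" or "night" for one image.

    If the annotator recorded the street light state, use it.
    Otherwise fall back to the mean brightness of the image.
    """
    if street_lights_on is not None:
        return "night" if street_lights_on else "day"
    if mean_brightness(gray_image) >= BRIGHTNESS_THRESHOLD:
        return "day"
    return "night"
```

Writing the rule this way forces a decision on borderline cases such as dusk, which a purely verbal rule may leave to each annotator.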

As you can see, the annotation work assigns meaning to image data from a variety of perspectives, and tasks need to be standardised to ensure consistency across annotators. By establishing clear annotation guidelines, annotation work can be performed accurately, which in turn improves the accuracy of the AI. For the AI to be able to identify traffic lights of any shape as traffic lights, it is necessary to carry out appropriate annotation work that follows the annotation rules.

 

 

3. Things to keep in mind when establishing annotation rules


What should you be aware of when creating annotation rules? Here are some key points to consider:


Set clear rules

It is important to make sure that annotations are carried out in the same way by all annotators.
For example, when annotating “traffic lights”, different annotators may use different names, such as “traffic signals” and “signal lights”. We should also consider variations, such as whether to treat pedestrian signals as a separate class and how to handle arrow-type traffic lights. It is therefore important to define rules that account for the patterns likely to arise during annotation, so that all annotations are performed consistently under the same set of rules.
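
One way to pin down naming decisions like these is a shared vocabulary that maps every accepted spelling to a single canonical label. The sketch below is illustrative only: the `ALIASES` table and the `canonical_label` function are hypothetical, and the decisions encoded in it (arrow-type lights count as traffic lights, pedestrian signals are a separate class) are assumptions, not recommendations.

```python
# Hypothetical vocabulary: canonical label -> spellings annotators might use.
ALIASES = {
    "traffic_light": {
        "traffic light",
        "traffic signal",
        "signal light",
        "arrow type traffic light",
    },
    "pedestrian_signal": {"pedestrian signal", "pedestrian light"},
}

# Invert the table once so each spelling points at its canonical label.
LOOKUP = {
    alias: canonical
    for canonical, aliases in ALIASES.items()
    for alias in aliases
}


def canonical_label(raw: str) -> str:
    """Map a free-text label to its canonical form, or raise KeyError."""
    # Normalise case, hyphens, underscores and repeated spaces.
    key = " ".join(raw.lower().replace("-", " ").replace("_", " ").split())
    # Crude plural handling, adequate for this small vocabulary only.
    if key.endswith("s"):
        key = key[:-1]
    if key not in LOOKUP:
        raise KeyError(f"label not covered by the rules: {raw!r}")
    return LOOKUP[key]
```

A label that is missing from the table points to a pattern the rules have not yet addressed.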


Make the guidelines easy to understand

To improve annotation quality, the rules should be made in a way that is easy for annotators to understand.
For example, when annotating image data, including actual sample images in the guidelines makes annotators less likely to be confused when making decisions. If both arrow-type and regular traffic lights are to be treated the same way, it helps to include photos of both types to cover the range of patterns.
Annotation quality depends not only on how strict the rules are, but also on how clear they are.


Continuously improve based on annotation results

Even after annotation rules are initially established and annotation work begins, it is important to continue refining the rules.
In practice, you may encounter new classification criteria or irregular patterns that were not anticipated during the initial design of the rules. It can be particularly challenging to define all decision criteria at the outset, so it is often more effective to adjust the rules as you review actual data.

That is why it is essential to conduct periodic reviews after a certain volume of annotations has been completed. Check whether there are any rules that are difficult for annotators to interpret or any undefined conditions. Based on these findings, refine the annotation guidelines accordingly. Doing so will help ensure consistent, high-quality annotations.
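
One way to support such a review, assuming some items are labelled by more than one annotator, is to measure how often the annotators agree and to look first at the items where they do not. The sketch below is a simple illustration, not a standard metric implementation. The function names, the item IDs and the 0.8 threshold are hypothetical.

```python
from collections import Counter


def agreement(labels):
    """Fraction of annotators who chose the most common label for one item."""
    if not labels:
        raise ValueError("item has no labels")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def items_to_review(annotations, min_agreement=0.8):
    """Return the IDs of items whose agreement falls below min_agreement."""
    return sorted(
        item_id
        for item_id, labels in annotations.items()
        if agreement(labels) < min_agreement
    )


# Illustrative data: three annotators labelled each image.
annotations = {
    "img_001": ["traffic_light", "traffic_light", "traffic_light"],
    "img_002": ["traffic_light", "pedestrian_signal", "traffic_light"],
}
flagged = items_to_review(annotations)
```

Items that are flagged repeatedly for the same kind of disagreement usually indicate a rule that is hard to interpret or a condition the guidelines do not yet define.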

 

4. Conclusion

In this article, we introduced the importance of establishing annotation rules and provided concrete examples and key points to bear in mind when creating them.

Although annotation is a common process in AI development, its necessity and significance may not be widely recognised. When planning an AI project, it is also important to consider how annotation will be carried out. While setting up annotation rules is crucial, this can be challenging for teams with limited AI development experience, especially when doing so in-house. In such cases, it may be worth considering outsourcing not only the annotation work itself, but also the creation and maintenance of annotation guidelines.

 


 

 

Author

 


 

Toshiyuki Kita
Nextremer VP of Engineering

After graduating from the Graduate School of Science at Tohoku University in 2013, he joined Mitsui Knowledge Industry Co., Ltd. As an engineer in the SI and R&D departments, he was involved in time series forecasting, data analysis, and machine learning. Since 2017, he has been involved in system development for a wide range of industries and scales as a machine learning engineer at a group company of a major manufacturer. Since 2019, he has been in his current position as manager of the R&D department, responsible for the development of machine learning systems such as image recognition and dialogue systems.

 
