Annotation work is often a bottleneck in AI development: building AI requires a large amount of training data, and the annotation work that produces this data demands significant human resources and time.
Various research and development projects are underway to automate these high-load annotation tasks. At present, however, most research targets specific domains, and general-purpose automation of annotation work is not yet within reach.
Human intervention therefore remains essential for performing high-quality annotation, which makes the skill of the annotators a crucial factor in annotation quality.
This article explains the skills required when performing annotation tasks.
For more details regarding the necessity of manual annotation, please refer to the following article as well.
Human intervention is, as a rule, necessary when performing annotation work, and in that situation the skill level of the annotators becomes vital: annotation quality depends on annotator skill. So what specific skills do humans need to perform manual annotation? Below, we explain the skills necessary for carrying out manual annotation work.
The first requirement is knowledge of the subject matter being annotated. Because annotation work assigns a label to each target object, appropriate judgment is difficult without knowledge of the subject.
For example, consider developing AI to detect corrosion in factory equipment: images of piping facilities are labeled as "corroded" or "not corroded." Judgment rules and work manuals are of course prepared so that annotators can assess how far corrosion has progressed, but an annotator with actual factory work experience will still make more accurate judgments. With knowledge of factory equipment, not only quality but also work speed is likely to improve.
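To make this concrete, a labeled record for such a binary judgment task might look like the following. This is a minimal sketch; the field names ("image_id", "label", "annotator", "notes") are hypothetical, not taken from any particular annotation tool.

```python
import json

# Hypothetical label record for the corrosion-detection example above.
# All field names are illustrative, not from a specific tool's schema.
record = {
    "image_id": "pipe_0042.jpg",
    "label": "corroded",  # one of: "corroded", "not_corroded"
    "annotator": "A03",
    "notes": "rust visible around the flange joint",
}

# One record per line (JSON Lines) is a common way to store such labels.
print(json.dumps(record))
```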
Likewise, an annotation task such as extracting proper nouns from English text naturally requires English proficiency; without being able to read English sentences, performing the task would be difficult in the first place.
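For the proper-noun extraction example, such annotations are commonly recorded as character-offset spans over the text. A minimal sketch follows; the (start, end, label) tuple format is illustrative, as concrete tools each define their own schema.

```python
text = "Alice joined Acme Corp in London last year."

# Proper-noun annotations as (start, end, label) character-offset spans.
# Offsets follow Python slicing: text[start:end] is the annotated phrase.
spans = [
    (0, 5, "PERSON"),   # "Alice"
    (13, 22, "ORG"),    # "Acme Corp"
    (26, 32, "LOC"),    # "London"
]

for start, end, label in spans:
    print(f"{label}: {text[start:end]}")
```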
Annotation work on datasets in specialized fields thus requires a certain level of knowledge of the field in question.
In annotation projects where a large amount of data must be tagged, it is common for multiple annotators to share the work. Gathering highly skilled annotators is important here, but equally important is establishing rules for the annotation work that guarantee a consistent level of quality. Clear rules reduce the situations in which annotators are stuck on judgment calls and enable high-precision annotation work.
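One common way to check whether such rules are actually producing consistent judgments is to have two annotators label the same sample and measure their agreement; Cohen's kappa is a standard chance-corrected metric for this. The following is a minimal sketch with made-up data, not a method described in this article.

```python
from collections import Counter

def cohen_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two annotators labeling the same ten images (illustrative data).
a = ["corroded", "corroded", "not_corroded", "corroded", "not_corroded",
     "corroded", "not_corroded", "not_corroded", "corroded", "corroded"]
b = ["corroded", "not_corroded", "not_corroded", "corroded", "not_corroded",
     "corroded", "not_corroded", "corroded", "corroded", "corroded"]

print(f"kappa = {cohen_kappa(a, b):.2f}")  # 1.0 = perfect; here ~0.58
```

Low agreement on such a sample is a signal that the rules, not just the annotators, need attention.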
Deciding what those annotation rules should be, in turn, requires knowledge and experience of its own: a manager well versed in annotation work must define the rules clearly and understandably.
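Such rules mostly live in written manuals, but the mechanically checkable parts can also be encoded as a small validation script that flags records violating the guidelines. A minimal sketch, assuming the hypothetical record format from the earlier corrosion example:

```python
# Hypothetical rule checks for the corrosion-labeling records sketched above.
# Only mechanically verifiable rules are encoded; the judgment criteria
# themselves still belong in the written work manual.
ALLOWED_LABELS = {"corroded", "not_corroded", "needs_review"}

def validate(record: dict) -> list[str]:
    """Return a list of rule violations for one annotation record."""
    problems = []
    if record.get("label") not in ALLOWED_LABELS:
        problems.append(f"unknown label: {record.get('label')!r}")
    if not record.get("image_id"):
        problems.append("missing image_id")
    # Example project rule: uncertain cases must carry a note for review.
    if record.get("label") == "needs_review" and not record.get("notes"):
        problems.append("needs_review requires an explanatory note")
    return problems

print(validate({"image_id": "pipe_0042.jpg", "label": "corrded"}))
# -> ["unknown label: 'corrded'"]
```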
The procedure for annotation work varies with the target data, whether images, audio, or text, and even for image data it differs depending on whether the task is object detection or region segmentation (the sketch below illustrates the difference). Whether the team includes a manager experienced in these various types of annotation work is therefore another important point when proceeding with manual annotation.
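To make the image-task distinction concrete: object detection typically records one rectangular bounding box per object, while segmentation records a pixel-level region, often as a polygon outline. The schemas below are illustrative; real datasets such as COCO define their own formats.

```python
# Illustrative label shapes for the same object in one image.
# Coordinates are in pixels; exact schemas vary by tool and dataset.

# Object detection: one axis-aligned box as (x_min, y_min, width, height).
bbox_label = {
    "image_id": "street_0001.jpg",
    "category": "traffic_sign",
    "bbox": (120, 40, 64, 64),
}

# Segmentation: a polygon tracing the object's outline as (x, y) vertices.
segmentation_label = {
    "image_id": "street_0001.jpg",
    "category": "traffic_sign",
    "polygon": [(120, 40), (184, 40), (184, 104), (120, 104)],
}

print(bbox_label["bbox"], "vs", len(segmentation_label["polygon"]), "vertices")
```

Tracing a polygon takes far more annotator effort per object than drawing a box, which is one reason the work procedure and required experience differ between the two tasks.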
When performing annotation work, situations where the right judgment is unclear will always arise. In the corrosion-detection example above, it may be a subtle call whether a given image of piping shows corrosion that has progressed or is still acceptable.
In such cases, if annotators work by guesswork, quality will inevitably become unstable. Communicating appropriately so that annotators do not push ahead while harboring doubts is therefore also an important skill.
These doubts vary in nature: sometimes an annotator is simply unsure of a judgment, and sometimes the question touches on the original purpose of the AI being developed and requires a decision at that level.
For example, suppose image data is being annotated for sign recognition in an autonomous driving system, and a type of sign appears that was not covered by the work requirements. Deciding how to handle that sign may require confirmation with the developer or the business owner of the AI system. Based on their decision, whether to add the sign as a new requirement or to exclude it as an exception, new annotation rules are established and the work proceeds on that basis.
In this way, appropriate communication improves not only annotation quality but also the quality of the AI itself.
Many of the skills required for annotation are uniquely human and cannot be replaced by machines. While research and development on annotation automation is progressing, human intervention remains essential for performing high-quality annotation and developing high-precision AI.
At Tesla in the U.S., a team of roughly 1,000 annotators was reportedly formed for AI development※. This is one example of how highly annotators are valued by leading companies.
Even in technically advanced initiatives like AI development, skilled humans must carry out the work steadily. Annotation tasks that must be performed at scale are commonly outsourced; in such cases, it is worth focusing on what skills the members who will actually perform the work possess.
※ TechCrunch, "Top four highlights of Elon Musk’s Tesla AI Day"
This article has introduced the skills required for performing annotation tasks. In AI development, preparing high-quality training data directly determines the quality of the resulting AI. Producing high-quality training data through skilled annotators is an important factor in the success or failure of an AI development project.