Challenges of annotation

There are three main challenges in annotation: first, developing the annotation schema, or deciding which existing annotations to apply to an object; second, applying those annotations to an object (or, more usually, to a group of objects); and third, ensuring the annotations are applied accurately.

As a case study, these issues are well illustrated by the implementation of ‘ReCAPTCHA’, which for years in its original form was used to establish that a user was human rather than a ‘bot’ when signing up for products or services such as an email account, or when posting to a blog (https://en.wikipedia.org/wiki/ReCAPTCHA; see also the description of the older version at https://en.wikipedia.org/wiki/CAPTCHA). Although developed by university researchers, ReCAPTCHA was eventually bought by Google, which used it as part of its book digitization project to decode words from scanned texts that could not be converted automatically by the usual OCR processes because the scanned image was poor or distorted.

Since the objective here is to ‘translate’ scanned images into language, the first issue, the annotation schema, is simply all the words of that language (English or otherwise), with each image of a word annotated with that word in machine- (and human-) readable text. (More usually, the schema would be a formal cataloguing system, or ontology, which we will cover in more detail in the next part of this module.)

The second issue, applying annotations to the object (in this case the image of a word), is addressed by providing an incentive for the task: the user wants to access the service protected by the ReCAPTCHA, and so needs to annotate the image in order to prove they are human.

The third issue, the agreement or accuracy of annotations, was in this case addressed by presenting two word images for annotation, one of which had already been annotated (known), and one which had not (unknown). If the annotation the user provided matched the ‘known’ annotation, then the annotation provided for the ‘unknown’ image was also likely to be accurate (there was also likely to be an additional stage of validating these annotations across multiple users).
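The known/unknown checking and cross-user validation described above can be sketched in code. The following is a minimal illustration, not ReCAPTCHA's actual implementation; the function names, the case-insensitive matching, and the agreement threshold are all assumptions made for the example.

```python
from collections import Counter

def validate_response(known_answer, known_label, unknown_answer):
    """If the user transcribed the 'known' word correctly, provisionally
    accept their transcription of the 'unknown' word; otherwise discard it.
    (Illustrative logic only, not ReCAPTCHA's actual algorithm.)"""
    if known_answer.strip().lower() == known_label.strip().lower():
        return unknown_answer.strip().lower()
    return None

def aggregate(provisional_answers, min_agreement=3):
    """Promote an 'unknown' word to 'known' once enough independent users
    agree on the same transcription (threshold is an assumption here)."""
    counts = Counter(a for a in provisional_answers if a is not None)
    if counts:
        label, n = counts.most_common(1)[0]
        if n >= min_agreement:
            return label
    return None

# Three users pass the known-word check and agree; one fails it.
answers = [
    validate_response("morning", "morning", "dusty"),
    validate_response("mornin", "morning", "rusty"),   # failed the known check
    validate_response("Morning", "morning", "dusty"),
    validate_response("morning", "morning", "dusty"),
]
print(aggregate(answers))  # -> dusty
```

Note how a user who fails the known-word check contributes nothing, so a single careless or malicious response cannot corrupt the annotation of the unknown image.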

To learn more about the issues and approaches relating to agreement and validity across different annotators, the introduction to the following text is recommended:

Gwet, Kilem L. (2014) Handbook of Inter-Rater Reliability, Fourth Edition (Gaithersburg, MD: Advanced Analytics, LLC). ISBN 978-0970806284. (The introductory chapter can be found at: http://www.agreestat.com/book4/9780970806284_prelim_chapter1.pdf)
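One widely used measure of the inter-rater reliability discussed in such texts is Cohen's kappa, which corrects raw agreement between two annotators for the agreement expected by chance. The following is a small self-contained sketch; the label data is invented for illustration.

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters over the same items:
    (observed agreement - chance agreement) / (1 - chance agreement).
    Undefined (division by zero) if chance agreement is exactly 1."""
    assert len(rater1) == len(rater2) and rater1
    n = len(rater1)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement from each rater's marginal label frequencies.
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / (n * n)
    return (observed - expected) / (1 - expected)

# Two annotators labelling five word images (invented data).
rater1 = ["cat", "cat", "dog", "cat", "dog"]
rater2 = ["cat", "dog", "dog", "cat", "dog"]
print(round(cohens_kappa(rater1, rater2), 3))  # -> 0.615
```

Here the raters agree on 4 of 5 items (80%), but because both use the labels with similar frequencies, chance agreement is 48%, so kappa is only about 0.62 rather than 0.8. Gwet's handbook covers this statistic alongside alternatives that behave better when label frequencies are very skewed.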