Note that in the following, the names PeriCoDe and SALIC may be used interchangeably since they refer to the same software.
PeriCoDe is a sophisticated piece of software designed to overcome one of the major issues in automatic image annotation: the shortage of available training data. In the section ‘Challenges of image annotation’, we noted that the machine learning techniques used in automatic image recognition require large amounts of training data to learn patterns that they can then generalise to new, unseen images that the user wants to annotate. It is possible to collect more high-quality, human-annotated images, but these are expensive and time-consuming to produce. Massive data sets of annotated images are available from social media, but these may be of low quality and unfit for the purpose at hand.
PeriCoDe uses active learning to overcome this lack of training data, combining the quality of human annotations with the scale of social media data. Specifically, it uses the human annotations to train an initial classifier, which is then used to iteratively select further relevant images from a larger pool data set (e.g. from social media). Together, the human-annotated and the selected images form an expanded, high-quality training data set for use by machine learning algorithms, and PeriCoDe tests the result against a data set supplied by the user.
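The loop described above can be illustrated with a small sketch. This is not PeriCoDe's actual code: the nearest-centroid classifier, the uncertainty-based selection rule, the function names and the toy two-dimensional "image features" are all assumptions chosen to keep the example short. It only shows the general shape of the process: train on a small human-annotated seed set, repeatedly pick pool items the current classifier is least sure about, add them to the training set, and finally check the result against a separate test set.

```python
from statistics import mean

def train_centroids(samples):
    """Fit a nearest-centroid classifier: one mean feature vector per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(dim) for dim in zip(*vectors))
        for label, vectors in by_label.items()
    }

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def predict(centroids, features):
    """Return the label of the closest centroid."""
    return min(centroids, key=lambda label: distance(centroids[label], features))

def uncertainty(centroids, features):
    """Higher value = smaller margin between the two nearest centroids."""
    d = sorted(distance(c, features) for c in centroids.values())
    return -(d[1] - d[0])

def active_learning(seed, pool, rounds, batch_size):
    """Grow the training set by repeatedly adding the most uncertain pool items.

    Assumption: the pool labels (e.g. social media tags) are used as-is.
    """
    training = list(seed)
    remaining = list(pool)
    for _ in range(rounds):
        if not remaining:
            break
        centroids = train_centroids(training)
        remaining.sort(key=lambda item: uncertainty(centroids, item[0]), reverse=True)
        selected, remaining = remaining[:batch_size], remaining[batch_size:]
        training.extend(selected)
    return train_centroids(training), training

def accuracy(centroids, test_set):
    """Fraction of test items whose predicted label matches the given label."""
    return sum(predict(centroids, f) == label for f, label in test_set) / len(test_set)

# Hypothetical toy data: (feature vector, label) pairs.
seed = [((0.0, 0.0), "cat"), ((4.0, 4.0), "dog")]            # human-annotated
pool = [((1.0, 1.5), "cat"), ((3.0, 2.5), "dog"),            # social media pool
        ((0.5, 0.2), "cat"), ((3.8, 4.1), "dog"),
        ((1.8, 1.9), "cat"), ((2.4, 2.2), "dog")]
test_set = [((0.8, 1.0), "cat"), ((3.2, 3.0), "dog"),        # user-supplied
            ((1.5, 1.2), "cat"), ((2.6, 2.9), "dog")]

model, training = active_learning(seed, pool, rounds=2, batch_size=2)
print(len(training), accuracy(model, test_set))
```

In this sketch the items nearest the boundary between the two classes are picked first, because they tell the classifier the most about where that boundary lies. The selection criterion used by SALIC itself is described in the paper cited below.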
More information about the technical details of PeriCoDe/SALIC is presented in the following paper:
Chatzilari, E., S. Nikolopoulos, Y. Kompatsiaris and J. Kittler, “SALIC: Social Active Learning for Image Classification,” in IEEE Transactions on Multimedia, vol. 18, no. 8, pp. 1488-1503, Aug. 2016.
doi: 10.1109/TMM.2016.2565440
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7466862&isnumber=7514289
Further detail can be found in deliverable D4.3, Content Semantics and Use Context Analysis Techniques [Maronidis, A. et al. (2016)]. Please refer to the further reading list for the names referenced in [ ].
Relevant highlights from the deliverable (p. 29) are presented here:
Analysis of Visual Content
- SALIC (Social Active Learning for Image Classification) is an approach that automatically gathers training data without requiring significant annotation efforts, while at the same time minimizing the number of the required training instances and increasing the performance of the classification models by utilizing a smart sampling approach.
- PCS (Product Compressive Sampling) is a very fast method for dimensionality reduction, which yields results similar to those of the popular CS (Compressive Sampling) but requires only a small percentage of the time.
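To give a feel for what dimensionality reduction means in this context, the sketch below shows a plain random projection, the basic idea behind compressive sampling: a long feature vector is multiplied by a random matrix to give a much shorter one. This is a generic illustration, not the PCS method, whose details are in the deliverable; the function names, the Gaussian matrix and the vector sizes are assumptions made for the example.

```python
import random

def random_projection_matrix(input_dim, output_dim, seed=0):
    """Build an output_dim x input_dim matrix of scaled Gaussian random values."""
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    scale = 1.0 / output_dim ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(input_dim)]
            for _ in range(output_dim)]

def project(matrix, vector):
    """Reduce a vector by multiplying it with the projection matrix."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# Hypothetical 8-dimensional image feature vector reduced to 3 dimensions.
features = [float(i) for i in range(8)]
matrix = random_projection_matrix(input_dim=8, output_dim=3, seed=42)
reduced = project(matrix, features)
print(len(features), "->", len(reduced))
```

Shorter vectors make later steps such as training a classifier faster, at the cost of some loss of detail.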