Wednesday, 12 September 2018

Facebook introduces Rosetta AI to identify text in images and videos

Because moderators cannot realistically look at every image or video posted on a platform of this size, the company has built an AI system, called Rosetta, to help them. Rosetta is said to extract text from more than a billion public Facebook and Instagram images and video frames every day, in real time and in a wide variety of languages. The extracted text is then fed into a text recognition model that has been trained on classifiers to understand the context of the text and the image together.

Rosetta performs text extraction on an image in two independent steps: detection and recognition. In the first step, it detects rectangular regions that potentially contain text. In the second step, it runs a convolutional neural network (CNN) on each detected region to recognize and transcribe the word in that region.

For text detection, Facebook said that it has adopted an approach based on Faster R-CNN. In a nutshell, Faster R-CNN learns a CNN that represents an image as a convolutional feature map, and it uses that shared feature map to both locate candidate regions and classify them. The whole detection system is trained jointly in a supervised, end-to-end manner. Facebook's text detection ...
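The two-step flow described above (first find rectangular regions, then transcribe the word in each one) can be illustrated with a small Python sketch. This is not Rosetta's code: the names `Box`, `detect_regions`, `recognize_word` and `extract_text` are hypothetical, the "image" is a toy grid of characters, and both stages are trivial stand-ins for the Faster R-CNN detector and the CNN recognizer mentioned in the article. It only shows how the two independent stages fit together.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Toy "image": each row is a string, a space means "no text here".
Image = Sequence[str]


@dataclass(frozen=True)
class Box:
    """Hypothetical rectangular region; bottom and right are exclusive."""
    top: int
    left: int
    bottom: int
    right: int


def detect_regions(image: Image) -> List[Box]:
    """Step 1 (stand-in detector): one box per run of non-blank cells in a row.

    A real system would use a learned detector here, not this rule.
    """
    boxes: List[Box] = []
    for r, row in enumerate(image):
        start = None
        # Append a blank sentinel so a run ending at the row edge is closed.
        for c, cell in enumerate(row + " "):
            if cell != " " and start is None:
                start = c
            elif cell == " " and start is not None:
                boxes.append(Box(r, start, r + 1, c))
                start = None
    return boxes


def recognize_word(image: Image, box: Box) -> str:
    """Step 2 (stand-in recognizer): read the characters inside the box.

    A real system would run a CNN on the cropped pixels instead.
    """
    return image[box.top][box.left:box.right]


def extract_text(
    image: Image,
    detector: Callable[[Image], List[Box]] = detect_regions,
    recognizer: Callable[[Image, Box], str] = recognize_word,
) -> List[Tuple[Box, str]]:
    """Run detection, then recognition on each detected region."""
    return [(box, recognizer(image, box)) for box in detector(image)]


if __name__ == "__main__":
    toy_image = ["SALE  50%", "         ", " today   "]
    for box, word in extract_text(toy_image):
        print(box, word)
```

Because the detector and recognizer are passed in as plain callables, either stage can be swapped out without touching the other, which is the practical meaning of the two steps being independent.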