Automatically Detect and Recognize Text Using Pretrained CRAFT Network and OCR

This example shows how to perform text recognition by using a deep-learning-based text detector and OCR. In the example, you use a pretrained CRAFT (character region awareness for text) deep learning network to detect the text regions in the input image. You can modify the region threshold and the affinity threshold values of the CRAFT model to localize an entire paragraph, a sentence, or a word. Then, you use OCR to recognize the characters in the detected text regions.

Read Image

Read an image into the MATLAB® workspace.

I = imread("handicapSign.jpg");

Detect Text Regions

Detect text regions in the input image by using the detectTextCRAFT function. The CharacterThreshold value is the region threshold to use for localizing each character in the image. The LinkThreshold value is the affinity threshold that defines the score for grouping two detected texts into a single instance. You can fine-tune the detection results by modifying the region and affinity threshold values. Increase the value of the affinity threshold for more word-level and character-level detections. For information about the effect of the affinity threshold on the detection results, see the Detect Characters by Modifying Affinity Threshold example.

To detect each word on the parking sign, set the value of the region threshold to 0.3 and leave the affinity threshold at its default value of 0.4. The output is a set of bounding boxes that localize the words in the image scene. Each bounding box specifies the spatial coordinates of one detected text region in the image.

bbox = detectTextCRAFT(I,CharacterThreshold=0.3);
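
As a rough sketch of how the affinity threshold changes the result, the following lines rerun the detector with a higher LinkThreshold value and compare the number of detected boxes. The value 0.9 and the variable name bboxHighLink are illustrative assumptions, not settings taken from this example.

% Rerun the detector with a higher affinity threshold (0.9 is an assumed value).
bboxHighLink = detectTextCRAFT(I,CharacterThreshold=0.3,LinkThreshold=0.9);

% Compare the number of detected regions for the two settings.
fprintf("Default affinity threshold: %d boxes\n",size(bbox,1));
fprintf("Affinity threshold of 0.9: %d boxes\n",size(bboxHighLink,1));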

Draw the output bounding boxes on the image by using the insertShape function.

Iout = insertShape(I,"rectangle",bbox,LineWidth=4);

Display the input image and the output text detections.

figure(Position=[1 1 600 600]);
ax = gca;
montage({I;Iout},Parent=ax);
title("Input Image | Detected Text Regions")

[Figure: Input Image | Detected Text Regions]

Recognize Text

The ocr function performs best on images that contain dark text on a light background. The text on the sign is lighter than its background, so convert the input image to a binary image and then invert it to obtain dark text on a light background.

Igray = im2gray(I);
Ibinary = imbinarize(Igray);
Icomplement = imcomplement(Ibinary);
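
One way to sanity-check the polarity of the inverted image is to measure the fraction of white pixels inside each detected region. This is only a heuristic sketch, and the variable names whiteFraction and region are assumptions introduced for illustration. Values well above 0.5 suggest that the region is mostly light background with darker text.

% Fraction of white (true) pixels inside each detected box of the inverted image.
whiteFraction = zeros(size(bbox,1),1);
for k = 1:size(bbox,1)
    region = imcrop(Icomplement,bbox(k,:));   % bbox rows are [x y width height]
    whiteFraction(k) = mean(region(:));
end
disp(whiteFraction)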

Display the binary image and the inverted binary image.

figure(Position=[1 1 600 600]);
ax = gca;
montage({Ibinary;Icomplement},Parent=ax);
title("Binary Image | Inverted Binary Image")

[Figure: Binary Image | Inverted Binary Image]

Recognize the text within the bounding boxes by using the ocr function. Set the LayoutAnalysis name-value argument to "Word" because the bounding boxes supplied as the ROI input already delimit individual words.

output = ocr(Icomplement,bbox,LayoutAnalysis="Word");

Display the recognized words.

recognizedWords = cat(1,output(:).Words);

figure
imshow(I)
zoom(2)
showShape("rectangle",bbox,Label=recognizedWords,Color="yellow")
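
The ocrText objects returned by ocr also store a confidence value for each recognized word. As a minimal sketch, the following lines tabulate the words alongside their confidences and keep only the more reliable ones. The cutoff of 0.5 and the variable names wordConfidences, results, and reliableWords are illustrative assumptions; choose a cutoff that suits your images.

% Collect the per-word confidences in the same order as recognizedWords.
wordConfidences = cat(1,output(:).WordConfidences);

% Tabulate each word with its confidence.
results = table(recognizedWords,wordConfidences,VariableNames=["Word","Confidence"]);
disp(results)

% Keep only words above the assumed confidence cutoff.
reliableWords = results(results.Confidence > 0.5,:);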

