This issue recommends cnocr, an Optical Character Recognition (OCR) toolkit for Python 3.
cnocr is mainly intended for images of printed text with simple layouts, such as screenshots and scanned documents. Its current built-in text detection and line-splitting module cannot handle complex text layouts. To recognize text in scene images, it needs to be used in combination with a separate scene text detection engine.
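One way to combine cnocr with an external detector is to let the detector supply bounding boxes and pass each cropped region to cnocr as a single line of text. The sketch below assumes a hypothetical detector that returns boxes as (x1, y1, x2, y2) pixel coordinates, an image held as a NumPy-style array indexed as img[rows, cols], and the CnOcr.ocr_for_single_line() method described under Usage. The helper name recognize_regions is illustrative and not part of cnocr.

```python
def recognize_regions(ocr, img, boxes):
    """Run single-line recognition on each detected text region.

    ocr   -- an object exposing ocr_for_single_line(), e.g. a CnOcr instance
    img   -- image as a NumPy-style array, indexed img[rows, cols]
    boxes -- iterable of (x1, y1, x2, y2) boxes from an external detector
             (hypothetical format; adapt it to the detector actually used)
    """
    results = []
    for x1, y1, x2, y2 in boxes:
        if x2 <= x1 or y2 <= y1:
            # Skip empty or inverted boxes instead of passing them on.
            continue
        crop = img[y1:y2, x1:x2]  # rows are y, columns are x
        results.append(((x1, y1, x2, y2), ocr.ocr_for_single_line(crop)))
    return results
```

Each entry in the returned list pairs a box with whatever ocr_for_single_line() returned for that crop, so the caller can place the recognized text back onto the original image.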
Installation
pip install cnocr
If installation is slow, you can specify a package mirror in mainland China, such as the Douban mirror:
pip install cnocr -i https://pypi.doubanio.com/simple
Usage
Image recognition for multi-line text
If the image to be recognized contains, or may contain, multiple lines of text (as shown below), you can use CnOcr.ocr() for recognition.
from cnocr import CnOcr
ocr = CnOcr()
res = ocr.ocr('docs/examples/multi-line_cn1.png')
print("Predicted Chars:", res)
Alternatively, read the image first and pass the loaded array:
from cnocr.utils import read_img
from cnocr import CnOcr
ocr = CnOcr()
img_fp = 'docs/examples/multi-line_cn1.png'
img = read_img(img_fp)
res = ocr.ocr(img)
print("Predicted Chars:", res)
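The structure of res depends on the cnocr release. Assuming the format of early releases, where ocr() returns one list of characters per text line (an assumption to check against the installed version, since later releases return a different structure), the result could be turned into plain text as sketched below. The helper name lines_to_text is illustrative.

```python
def lines_to_text(res):
    """Join a list of per-line character lists into a multi-line string.

    Assumes res looks like [['h', 'i'], ['o', 'k']]; this format is an
    assumption and may not match the installed cnocr version.
    """
    return "\n".join("".join(chars) for chars in res)


sample = [["你", "好"], ["O", "C", "R"]]  # made-up data, not real OCR output
print(lines_to_text(sample))
```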
Image recognition for single line text
If you know that the image to be recognized contains a single line of text (as shown below), you can use CnOcr.ocr_for_single_line() for recognition.
from cnocr import CnOcr
ocr = CnOcr()
res = ocr.ocr_for_single_line('docs/examples/helloworld.jpg')
print("Predicted Chars:", res)
Alternatively, read the image first and pass the loaded array:
from cnocr.utils import read_img
from cnocr import CnOcr
ocr = CnOcr()
img_fp = 'docs/examples/helloworld.jpg'
img = read_img(img_fp)
res = ocr.ocr_for_single_line(img)
print("Predicted Chars:", res)
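Since the two methods differ only in whether the image is treated as one line or several, a small wrapper can choose between them. This is only a sketch: the function name recognize and its single_line flag are hypothetical, and the only cnocr calls assumed to exist are CnOcr.ocr() and CnOcr.ocr_for_single_line() as used above.

```python
def recognize(ocr, image, single_line=False):
    """Dispatch to the single-line or multi-line method of a CnOcr-like object.

    image may be a file path or an already loaded image array, matching
    the two call styles used in the examples above.
    """
    if single_line:
        return ocr.ocr_for_single_line(image)
    return ocr.ocr(image)
```

With a real instance this might be called as recognize(CnOcr(), 'docs/examples/helloworld.jpg', single_line=True).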
Example

Open source license: Apache 2.0