适合小白的几个入门级Python ocr识别库
适合小白的几个入门级Python ocr识别库_起不好名字就不起了的博客-CSDN博客_ddddocr库
工作生活中经常会遇到需要提取图片中文字信息的情况,以前都是手动自己把图片里的字敲出来,但随着这几年人工智能技术的愈发成熟,市面上有越来越多的ocr产品了,基本上能大部分正常图片的文字提取需求。当然有时候需要提取文字的图片数量较多或者有某个应用程序编写需求时,就需要借助代码来实现了,这里介绍几个比较适合新手小白的python ocr库,简单实用,可满足绝大多数常规的图片文字提取、验证码识别需求。
pytesseract需要配合安装在本地的tesseract-ocr.exe文件一起使用,tesseract-ocr.exe安装教程可参考这里:Tesseract Ocr文字识别,需要注意的是安装时一定要选中中文包,默认是只支持英文识别。
python库安装命令如下:
1 |
pip install pytesseract |
待识别图片如下:
代码实现:
1 2 3 4 5 |
import pytesseract from PIL import Image text = pytesseract.image_to_string(Image.open(r"d:\Desktop\39DEE621-40EA-4ad1-90CC-79EB51D39347.png")) print(text) |
识别结果输出:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
Using Tesseract OCR with Python from PIL import Image import pytesseract import ergperse import cv2 import os ap = argparse.ArgunentParser() ap.add_argument("-i", "--image", required-True, help="path to input image to be OCR'd") ap.add_argument("-p", "--preprocess", typesstr, default="thresh", helpe"type of preprocessing to be done") args = vars (ap.parse_args()) |
PaddleOCR是百度开源的一款基于深度学习的ocr识别库,对中文的识别精度相当不错,可以应付绝大多数的文字提取需求。
需要依次安装三个依赖库,安装命令如下,其中shapely库可能会受系统影响安装报错,具体解决方案参考这篇博客:百度OCR(文字识别)服务使用入坑指南
1 2 3 |
pip install paddlepaddle pip install shapely pip install paddleocr |
待识别图片如下:
代码实现:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
ocr = PaddleOCR(use_angle_cls=True, lang="ch") img_path = r"d:\Desktop\4A34A16F-6B12-4ffc-88C6-FC86E4DF6912.png" result = ocr.ocr(img_path, cls=True) for line in result: print(line) from PIL import Image image = Image.open(img_path).convert('RGB') boxes = [line[0] for line in result] txts = [line[1][0] for line in result] scores = [line[1][1] for line in result] im_show = draw_ocr(image, boxes, txts, scores) im_show = Image.fromarray(im_show) im_show.show() |
识别结果输出如下,会显示出每个区域字体识别的置信度,以及其坐标位置信息:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
Namespace(cls=False, cls_batch_num=30, cls_image_shape='3, 48, 192', cls_model_dir='C:\\Users\\Administrator/.paddleocr/cls', cls_thresh=0.9, det=True, det_algorithm='DB', det_db_box_thresh=0.5, det_db_thresh=0.3, det_db_unclip_ratio=2.0, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_max_side_len=960, det_model_dir='C:\\Users\\Administrator/.paddleocr/det', enable_mkldnn=False, gpu_mem=8000, image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', max_text_length=25, rec=True, rec_algorithm='CRNN', rec_batch_num=30, rec_char_dict_path='./ppocr/utils/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 32, 320', rec_model_dir='C:\\Users\\Administrator/.paddleocr/rec/ch', use_angle_cls=True, use_gpu=True, use_space_char=True, use_tensorrt=False, use_zero_copy_run=False) dt_boxes num : 16, elapse : 0.04799485206604004 cls num : 16, elapse : 0.1860027313232422 rec_res num : 16, elapse : 0.4859299659729004 [[[6.0, 2.0], [85.0, 2.0], [85.0, 31.0], [6.0, 31.0]], ['帮助文档', 0.99493873]] [[[309.0, 13.0], [324.0, 13.0], [324.0, 28.0], [309.0, 28.0]], ['X', 0.9667116]] [[[82.0, 50.0], [120.0, 50.0], [120.0, 71.0], [82.0, 71.0]], ['目录', 0.993418]] [[[136.0, 50.0], [176.0, 50.0], [176.0, 71.0], [136.0, 71.0]], ['标题', 0.99969745]] [[[13.0, 53.0], [60.0, 53.0], [60.0, 70.0], [13.0, 70.0]], ['快捷键', 0.9995322]] [[[191.0, 49.0], [314.0, 49.0], [314.0, 72.0], [191.0, 72.0]], ['文本样式列表', 0.9967863]] [[[61.0, 84.0], [120.0, 84.0], [120.0, 101.0], [61.0, 101.0]], ['代码片', 0.9997086]] [[[134.0, 81.0], [181.0, 84.0], [180.0, 104.0], [132.0, 101.0]], ['表格', 0.9891155]] [[[187.0, 84.0], [232.0, 84.0], [232.0, 101.0], [187.0, 101.0]], ['注脚', 0.99958]] [[[13.0, 115.0], [90.0, 115.0], [90.0, 135.0], [13.0, 135.0]], ['自定义列表', 0.99823236]] [[[109.0, 115.0], [219.0, 115.0], [219.0, 135.0], [109.0, 135.0]], ['LaTeX数学公式', 0.98812836]] [[[237.0, 115.0], [315.0, 115.0], [315.0, 135.0], [237.0, 135.0]], ['插入甘特图', 0.9982792]] [[[12.0, 148.0], [94.0, 148.0], [94.0, 167.0], [12.0, 167.0]], ['插入UML图', 0.9926085]] [[[113.0, 148.0], [249.0, 148.0], [249.0, 167.0], [113.0, 167.0]], ['插入Mermaid流程图', 0.996088]] [[[11.0, 176.0], [153.0, 176.0], [153.0, 200.0], [11.0, 200.0]], ['插入Flowchart流程图', 0.9780351]] [[[174.0, 179.0], [237.0, 179.0], [237.0, 200.0], [174.0, 200.0]], ['插入类图', 0.9519753]] |
github上一万多个star的开源ocr项目(github地址:EasyOCR),支持80多种语言的识别,识别精度超高。
python库安装命令如下:
1 |
pip install easyocr |
待识别图片如下:
代码实现:
1 2 3 4 5 |
import easyocr reader = easyocr.Reader(['ch_sim','en'], gpu = False) result = reader.readtext(r"d:\Desktop\4A34A16F-6B12-4ffc-88C6-FC86E4DF6912.png", detail = 0) print(result) |
初次运行需要在线下载检测模型和识别模型,建议在网速好点的环境运行:
1 2 3 |
Using CPU. Note: This module is much faster with a GPU. Downloading detection model, please wait. This may take several minutes depending upon your network connection. Downloading recognition model, please wait. This may take several minutes depending upon your network connection. |
识别结果输出如下,没有遗漏任何一个文字,精度甚至要优于前面的PaddleOCR:
1 |
<span class="token punctuation">[</span><span class="token string">'帮助文档'</span><span class="token punctuation">,</span> <span class="token string">'快捷键'</span><span class="token punctuation">,</span> <span class="token string">'目录'</span><span class="token punctuation">,</span> <span class="token string">'标题'</span><span class="token punctuation">,</span> <span class="token string">'文本样式'</span><span class="token punctuation">,</span> <span class="token string">'列表'</span><span class="token punctuation">,</span> <span class="token string">'链接'</span><span class="token punctuation">,</span> <span class="token string">'代码片'</span><span class="token punctuation">,</span> <span class="token string">'表格'</span><span class="token punctuation">,</span> <span class="token string">'注脚'</span><span class="token punctuation">,</span> <span class="token string">'注释'</span><span class="token punctuation">,</span> <span class="token string">'自定义列表'</span><span class="token punctuation">,</span> <span class="token string">'LaTex 数学公式'</span><span class="token punctuation">,</span> <span class="token string">'插入甘犄图'</span><span class="token punctuation">,</span> <span class="token string">'插入UML图'</span><span class="token punctuation">,</span> <span class="token string">'插入Mernaid流程图'</span><span class="token punctuation">,</span> <span class="token string">'插入 Flowchart流程图'</span><span class="token punctuation">,</span> <span class="token string">'插入类图'</span><span class="token punctuation">]</span> |
muggle_ocr是一款轻量级的ocr识别库,从名字也可以看出来,专为麻瓜设计!使用也非常简单,但其强项主要是用于识别各类验证码,一般文字提取效果就稍差了。
python库安装命令如下:
1 |
pip install muggle_ocr |
待识别验证码如下:
代码实现:
1 2 3 4 5 6 7 8 9 10 |
import muggle_ocr sdk = muggle_ocr.SDK(model_type=muggle_ocr.ModelType.Captcha) with open(r"d:\Desktop\四位验证码.png", "rb") as f: img = f.read() text = sdk.predict(image_bytes=img) print(text) |
识别结果输出如下:
1 2 3 |
MuggleOCR Session <span class="token punctuation">[</span>captcha<span class="token punctuation">]</span> Loaded<span class="token punctuation">.</span> 3n3d |
dddd_ocr也是一个用于识别验证码的开源库,又名带带弟弟ocr,爬虫界大佬sml2h3开发,识别效果也是非常不错,对一些常规的数字、字母验证码识别有奇效。
python库安装命令如下:
1 2 |
pip install dddd_ocr |
待识别验证码如下:
代码实现:
1 2 3 4 5 6 7 8 9 10 11 |
import ddddocr ocr = ddddocr.DdddOcr() with open("d:\Desktop\四位验证码2.png", 'rb') as f: img_bytes = f.read() res = ocr.classification(img_bytes) print(res) |
识别结果输出如下,可以看出即使有一些线条干扰,还是准确的识别出了四个字母:
1 |
jepv |
还有其他优秀的ocr识别库,以后慢慢更新
转载请注明:徐自远的乱七八糟小站 » 适合小白的几个入门级Python ocr识别库