A Few Beginner-Friendly Python OCR Libraries
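At work and in everyday life I often need to pull the text out of an image. I used to retype it by hand, but as AI has matured in recent years, more and more OCR products have appeared, and they cover most text-extraction needs for ordinary images. When there are many images to process, or when OCR has to be built into a program, you need code. Here are a few Python OCR libraries that suit beginners: simple, practical, and good enough for most routine image text extraction and captcha recognition.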
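pytesseract works together with a locally installed tesseract-ocr.exe. For installing tesseract-ocr.exe, see the tutorial "Tesseract Ocr文字识别" (Tesseract OCR text recognition). Be sure to select the Chinese language pack during installation, because only English recognition is supported by default.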
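Since pytesseract only wraps the external Tesseract program, it can help to check the installation before running any OCR. The following is a minimal sketch, assuming the `tesseract` executable is on PATH and relying on its `--list-langs` command-line option; the helper names `parse_langs` and `installed_langs` are invented for this example and are not part of pytesseract.

```python
import shutil
import subprocess


def parse_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first line is a header such as 'List of available languages ... (3):'
    return [line for line in lines if not line.startswith("List of")]


def installed_langs():
    """Return the installed language codes, or [] if tesseract is not on PATH."""
    exe = shutil.which("tesseract")
    if exe is None:
        return []
    proc = subprocess.run([exe, "--list-langs"], capture_output=True, text=True)
    # Some versions print the list to stderr instead of stdout, so read both.
    return parse_langs(proc.stdout + "\n" + proc.stderr)


if __name__ == "__main__":
    langs = installed_langs()
    print("chi_sim installed:", "chi_sim" in langs)
```

If `chi_sim` shows up in the list, passing `lang="chi_sim"` to `pytesseract.image_to_string` should select Simplified Chinese instead of the English default.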
    pip install pytesseract
    import pytesseract
    from PIL import Image

    # Open the image with Pillow and hand it to Tesseract
    text = pytesseract.image_to_string(Image.open(r"d:\Desktop\39DEE621-40EA-4ad1-90CC-79EB51D39347.png"))
    print(text)
Output (the recognition result, reproduced verbatim including the misrecognized characters):

    Using Tesseract OCR with Python

    from PIL import Image
    import pytesseract
    import ergperse
    import cv2
    import os

    ap = argparse.ArgunentParser()
    ap.add_argument("-i", "--image", required-True,
    help="path to input image to be OCR'd")
    ap.add_argument("-p", "--preprocess", typesstr, default="thresh",
    helpe"type of preprocessing to be done")
    args = vars (ap.parse_args())
    pip install paddlepaddle
    pip install shapely
    pip install paddleocr
    from paddleocr import PaddleOCR, draw_ocr
    from PIL import Image

    # Load the Chinese model with the text-angle classifier enabled
    ocr = PaddleOCR(use_angle_cls=True, lang="ch")
    img_path = r"d:\Desktop\4A34A16F-6B12-4ffc-88C6-FC86E4DF6912.png"
    result = ocr.ocr(img_path, cls=True)
    # Each line is [box, [text, score]]
    # (newer PaddleOCR releases may wrap the lines in one extra list per image)
    for line in result:
        print(line)

    # Draw the detected boxes and the recognized text on the image
    image = Image.open(img_path).convert('RGB')
    boxes = [line[0] for line in result]
    txts = [line[1][0] for line in result]
    scores = [line[1][1] for line in result]
    im_show = draw_ocr(image, boxes, txts, scores)
    im_show = Image.fromarray(im_show)
    im_show.show()
    Namespace(cls=False, cls_batch_num=30, cls_image_shape='3, 48, 192', cls_model_dir='C:\\Users\\Administrator/.paddleocr/cls', cls_thresh=0.9, det=True, det_algorithm='DB', det_db_box_thresh=0.5, det_db_thresh=0.3, det_db_unclip_ratio=2.0, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_max_side_len=960, det_model_dir='C:\\Users\\Administrator/.paddleocr/det', enable_mkldnn=False, gpu_mem=8000, image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', max_text_length=25, rec=True, rec_algorithm='CRNN', rec_batch_num=30, rec_char_dict_path='./ppocr/utils/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 32, 320', rec_model_dir='C:\\Users\\Administrator/.paddleocr/rec/ch', use_angle_cls=True, use_gpu=True, use_space_char=True, use_tensorrt=False, use_zero_copy_run=False)
    dt_boxes num : 16, elapse : 0.04799485206604004
    cls num : 16, elapse : 0.1860027313232422
    rec_res num : 16, elapse : 0.4859299659729004
    [[[6.0, 2.0], [85.0, 2.0], [85.0, 31.0], [6.0, 31.0]], ['帮助文档', 0.99493873]]
    [[[309.0, 13.0], [324.0, 13.0], [324.0, 28.0], [309.0, 28.0]], ['X', 0.9667116]]
    [[[82.0, 50.0], [120.0, 50.0], [120.0, 71.0], [82.0, 71.0]], ['目录', 0.993418]]
    [[[136.0, 50.0], [176.0, 50.0], [176.0, 71.0], [136.0, 71.0]], ['标题', 0.99969745]]
    [[[13.0, 53.0], [60.0, 53.0], [60.0, 70.0], [13.0, 70.0]], ['快捷键', 0.9995322]]
    [[[191.0, 49.0], [314.0, 49.0], [314.0, 72.0], [191.0, 72.0]], ['文本样式列表', 0.9967863]]
    [[[61.0, 84.0], [120.0, 84.0], [120.0, 101.0], [61.0, 101.0]], ['代码片', 0.9997086]]
    [[[134.0, 81.0], [181.0, 84.0], [180.0, 104.0], [132.0, 101.0]], ['表格', 0.9891155]]
    [[[187.0, 84.0], [232.0, 84.0], [232.0, 101.0], [187.0, 101.0]], ['注脚', 0.99958]]
    [[[13.0, 115.0], [90.0, 115.0], [90.0, 135.0], [13.0, 135.0]], ['自定义列表', 0.99823236]]
    [[[109.0, 115.0], [219.0, 115.0], [219.0, 135.0], [109.0, 135.0]], ['LaTeX数学公式', 0.98812836]]
    [[[237.0, 115.0], [315.0, 115.0], [315.0, 135.0], [237.0, 135.0]], ['插入甘特图', 0.9982792]]
    [[[12.0, 148.0], [94.0, 148.0], [94.0, 167.0], [12.0, 167.0]], ['插入UML图', 0.9926085]]
    [[[113.0, 148.0], [249.0, 148.0], [249.0, 167.0], [113.0, 167.0]], ['插入Mermaid流程图', 0.996088]]
    [[[11.0, 176.0], [153.0, 176.0], [153.0, 200.0], [11.0, 200.0]], ['插入Flowchart流程图', 0.9780351]]
    [[[174.0, 179.0], [237.0, 179.0], [237.0, 200.0], [174.0, 200.0]], ['插入类图', 0.9519753]]
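Each result line in the PaddleOCR output has the form `[box, [text, score]]`, where `box` holds the four corner points of the detected text region. Here is a minimal sketch of post-processing that structure, using three lines copied from that output; the helper name `extract_text` and the thresholds are arbitrary choices for illustration, not part of PaddleOCR.

```python
def extract_text(result, min_score=0.9):
    """Keep confident lines and return their text ordered by top-left corner."""
    kept = [(box, text) for box, (text, score) in result if score >= min_score]
    # Sort by the top-left corner: y first, then x.
    kept.sort(key=lambda item: (item[0][0][1], item[0][0][0]))
    return [text for _, text in kept]


# Three lines taken from the PaddleOCR output in this post
sample = [
    [[[6.0, 2.0], [85.0, 2.0], [85.0, 31.0], [6.0, 31.0]], ['帮助文档', 0.99493873]],
    [[[309.0, 13.0], [324.0, 13.0], [324.0, 28.0], [309.0, 28.0]], ['X', 0.9667116]],
    [[[82.0, 50.0], [120.0, 50.0], [120.0, 71.0], [82.0, 71.0]], ['目录', 0.993418]],
]
print(extract_text(sample, min_score=0.99))  # ['帮助文档', '目录']
```

Sorting by the raw top-left corner is a simplification: boxes on the same visual row can differ by a few pixels in y, so a real layout would probably need rows grouped with some tolerance first.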
    pip install easyocr
    import easyocr

    # Simplified Chinese plus English, running on the CPU
    reader = easyocr.Reader(['ch_sim','en'], gpu = False)
    # detail = 0 returns only the recognized strings
    result = reader.readtext(r"d:\Desktop\4A34A16F-6B12-4ffc-88C6-FC86E4DF6912.png", detail = 0)
    print(result)
    Using CPU. Note: This module is much faster with a GPU.
    Downloading detection model, please wait. This may take several minutes depending upon your network connection.
    Downloading recognition model, please wait. This may take several minutes depending upon your network connection.
    ['帮助文档', '快捷键', '目录', '标题', '文本样式', '列表', '链接', '代码片', '表格', '注脚', '注释', '自定义列表', 'LaTex 数学公式', '插入甘犄图', '插入UML图', '插入Mernaid流程图', '插入 Flowchart流程图', '插入类图']
    pip install muggle_ocr
    import muggle_ocr

    # ModelType.Captcha selects the captcha model
    sdk = muggle_ocr.SDK(model_type=muggle_ocr.ModelType.Captcha)

    # Read the captcha image as raw bytes
    with open(r"d:\Desktop\四位验证码.png", "rb") as f:
        img = f.read()

    text = sdk.predict(image_bytes=img)
    print(text)
    MuggleOCR Session [captcha] Loaded.
    3n3d
    pip install ddddocr
    import ddddocr

    ocr = ddddocr.DdddOcr()

    # Raw string so the backslashes in the Windows path are not treated as escapes
    with open(r"d:\Desktop\四位验证码2.png", 'rb') as f:
        img_bytes = f.read()

    res = ocr.classification(img_bytes)
    print(res)
    jepv
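For the many-images case mentioned at the start, any of the libraries above can be wrapped in a function and applied to a whole folder. The following is a rough sketch using only the standard library; the helper `ocr_folder`, the suffix list, and the `ocr_fn` callable are assumptions made for this example, not something any of the libraries provide.

```python
from pathlib import Path

# File extensions treated as images (an arbitrary choice for this sketch)
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def ocr_folder(folder, ocr_fn):
    """Run ocr_fn(path_string) on every image in folder; return {file name: result}."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            try:
                results[path.name] = ocr_fn(str(path))
            except Exception as exc:  # keep going if one image fails
                results[path.name] = f"ERROR: {exc}"
    return results
```

With the EasyOCR reader created earlier, a call could look like `ocr_folder(r"d:\Desktop", lambda p: reader.readtext(p, detail=0))`. Passing the OCR step in as a callable keeps the loop independent of whichever library is chosen.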
When reposting, please credit: 徐自远的乱七八糟小站 » A Few Beginner-Friendly Python OCR Libraries