pytesseractを使用して画像からテキストを認識する

Question

この画像からテキストを抽出するには、pytesseractを使用する必要があります。

そしてコード：

from PIL import Image, ImageEnhance, ImageFilter import pytesseract path = 'pic.gif' img = Image.open(path) img = img.convert('RGBA') pix = img.load() for y in range(img.size[1]): for x in range(img.size[0]): if pix[x, y][0] < 102 or pix[x, y][1] < 102 or pix[x, y][2] < 102: pix[x, y] = (0, 0, 0, 255) else: pix[x, y] = (255, 255, 255, 255) img.save('temp.jpg') text = pytesseract.image_to_string(Image.open('temp.jpg')) # os.remove('temp.jpg') print(text)

「temp.jpg」は

悪くありませんが、印刷の結果は,2 WWではなく、正しいテキスト2HHHなので、これらの黒い点を削除するにはどうすればよいですか？

Smith John · Answer

私の解決策は次のとおりです。

import pytesseract from PIL import Image, ImageEnhance, ImageFilter im = Image.open("temp.jpg") # the second one im = im.filter(ImageFilter.MedianFilter()) enhancer = ImageEnhance.Contrast(im) im = enhancer.enhance(2) im = im.convert('1') im.save('temp2.jpg') text = pytesseract.image_to_string(Image.open('temp2.jpg')) print(text)

Dinesh Chandra Kumawat · Answer

私たちのコミュニティには、何か別のpytesseractアプローチがあります。これが私のアプローチです

import pytesseract from PIL import Image text = pytesseract.image_to_string(Image.open("temp.jpg"), lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789') print(text)

SIM · Answer

Webからテキストを直接抽出するには、次の実装(making use of the first image)を試すことができます。

import io import requests import pytesseract from PIL import Image, ImageFilter, ImageEnhance response = requests.get('https://i.stack.imgur.com/HWLay.gif') img = Image.open(io.BytesIO(response.content)) img = img.convert('L') img = img.filter(ImageFilter.MedianFilter()) enhancer = ImageEnhance.Contrast(img) img = enhancer.enhance(2) img = img.convert('1') img.save('image.jpg') imagetext = pytesseract.image_to_string(img) print(imagetext)

nishit chittora · Answer

ノイズと特定の色周波数範囲内の任意のラインを除去する私の小さな進歩です。

import pytesseract from PIL import Image, ImageEnhance, ImageFilter im = Image.open(img) # img is the path of the image im = im.convert("RGBA") newimdata = [] datas = im.getdata() for item in datas: if item[0] < 112 or item[1] < 112 or item[2] < 112: newimdata.append(item) else: newimdata.append((255, 255, 255)) im.putdata(newimdata) im = im.filter(ImageFilter.MedianFilter()) enhancer = ImageEnhance.Contrast(im) im = enhancer.enhance(2) im = im.convert('1') im.save('temp2.jpg') text = pytesseract.image_to_string(Image.open('temp2.jpg'),config='-c tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz -psm 6', lang='eng') print(text)

nexoma · Answer

cv2.resizeだけ画像のサイズを大きくする必要があります

image = cv2.resize(image,(0,0),fx=7,fy=7)

私の写真200x40-> HZUBS

同じ画像のサイズを変更しました1400x300-> A 1234（したがって、これは正しいです）

その後、

retval, image = cv2.threshold(image,200,255, cv2.THRESH_BINARY) image = cv2.GaussianBlur(image,(11,11),0) image = cv2.medianBlur(image,9)

パラメータを変更して結果を向上させる

Page segmentation modes: 0 Orientation and script detection (OSD) only. 1 Automatic page segmentation with OSD. 2 Automatic page segmentation, but no OSD, or OCR. 3 Fully automatic page segmentation, but no OSD. (Default) 4 Assume a single column of text of variable sizes. 5 Assume a single uniform block of vertically aligned text. 6 Assume a single uniform block of text. 7 Treat the image as a single text line. 8 Treat the image as a single Word. 9 Treat the image as a single Word in a circle. 10 Treat the image as a single character. 11 Sparse text. Find as much text as possible in no particular order. 12 Sparse text with OSD. 13 Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.