Pytesseract: "TesseractNotFound Error: tesseract is not installed or it's not in your path", how do I fix this?

Question

I am trying to run a basic, very simple piece of code in Python.

    from PIL import Image
    import pytesseract

    im = Image.open("sample1.jpg")
    text = pytesseract.image_to_string(im, lang='eng')
    print(text)

Running it raises the TesseractNotFound error quoted in the title, even though I have installed tesseract for Windows through the installer. I am new to Python and not sure how to proceed.

Any guidance here would be really helpful. I tried restarting my Spyder application, but it did not help.

Nafeez Quraishi · Accepted Answer

I see the steps are scattered across different answers. Based on my recent experience with this pytesseract error on Windows, I am writing the steps out in sequence to make the error easier to resolve.

1. Install tesseract using the Windows installer available at https://github.com/UB-Mannheim/tesseract/wiki

2. Note the tesseract path from the installation. The default installation path at the time of this edit was C:\Users\USER\AppData\Local\Tesseract-OCR. It may change, so check the installation path.

3. pip install pytesseract

4. Set the tesseract path in the script before calling image_to_string:

    pytesseract.pytesseract.tesseract_cmd = r'C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe'
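Because the installation path can differ between machines, the path in step 4 can be looked up rather than hard-coded. The following is a minimal sketch, not part of the original answer: the helper name `resolve_tesseract` and the `CANDIDATES` list are hypothetical, and the candidate paths are only the locations mentioned in this thread. It uses the standard library alone, so it runs even before pytesseract is installed.

```python
import os
import shutil


def resolve_tesseract(candidates, search_path=True):
    """Return the first usable tesseract executable path, or None."""
    if search_path:
        # shutil.which looks the name up on PATH, like the shell does.
        found = shutil.which("tesseract")
        if found:
            return found
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


# Hypothetical candidate locations; adjust them to your own installation.
CANDIDATES = [
    r"C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
]

if __name__ == "__main__":
    path = resolve_tesseract(CANDIDATES)
    if path is None:
        print("tesseract executable not found; check the installation path")
    else:
        print("using", path)
        # With pytesseract installed, the result would be applied like this:
        # import pytesseract
        # pytesseract.pytesseract.tesseract_cmd = path
```

If the function returns None, the binary is most likely not installed or lives in a folder that is not in the candidate list.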

Kumar S · Answer

On Windows:

    pip install tesseract

    pip install tesseract-ocr

Then check the file stored on your system at usr/appdata/local/programs/site-pakages/python/python36/lib/pytesseract/pytesseract.py and compile the file.

Ali · Answer

First you should install the binary.

On Linux

    sudo apt update
    sudo apt install tesseract-ocr
    sudo apt install libtesseract-dev

Mac

    brew install tesseract

Windows

Download the binary from https://github.com/UB-Mannheim/tesseract/wiki. Then add pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe' to the script.

Then you should install the Python packages using pip:

    pip install tesseract
    pip install tesseract-ocr

References: https://pypi.org/project/pytesseract/ (INSTALLATION section) and https://github.com/tesseract-ocr/tesseract/wiki#installation
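A Windows path such as ...\Tesseract-OCR\tesseract.exe contains the sequence \t, which Python reads as a tab character inside an ordinary string literal. That is why the path should be written as a raw string (or with doubled backslashes). A small illustration, using a shortened path fragment chosen only for the demonstration:

```python
plain = "Tesseract-OCR\tesseract.exe"      # \t is parsed as a TAB character
raw = r"Tesseract-OCR\tesseract.exe"       # raw string keeps the backslash
escaped = "Tesseract-OCR\\tesseract.exe"   # a doubled backslash also works

print("\t" in plain)    # True: the path now contains a tab, so it is wrong
print("\t" in raw)      # False: the backslash survived
print(raw == escaped)   # True: both spell the same path
```

Forward slashes (for example 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe') avoid the problem as well, since they need no escaping.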

GeorgPoe · Answer

From https://pypi.org/project/pytesseract/:

    pytesseract.pytesseract.tesseract_cmd = '<full_path_to_your_tesseract_executable>'
    # Include the above line, if you don't have tesseract executable in your PATH
    # Example tesseract_cmd: r'C:\Program Files (x86)\Tesseract-OCR\tesseract'

Mujeeb Ishaque · Answer

For Windows only

1- You need to have Tesseract OCR installed on your computer.

Get it from here: https://github.com/UB-Mannheim/tesseract/wiki

Download the suitable version.

2- Add the tesseract path to your system environment, i.e. edit the system variables.

3- Run pip install pytesseract and pip install tesseract

4- Add this line to your Python script:

    pytesseract.pytesseract.tesseract_cmd = r'C:\OCR\Tesseract-OCR\tesseract.exe'

^ The path may be different.

5- Run the code.
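Editing the system variables in step 2 only takes effect for programs started afterwards, so an already running IDE may not see the change. As an alternative, the folder can be appended to PATH for the current Python process only. This is a sketch under assumptions: `add_to_path` is a hypothetical helper, the folder shown in the comment is the example path from step 4, and it relies on pytesseract launching the tesseract executable by name as a child process, which inherits this environment.

```python
import os


def add_to_path(directory, env=None):
    """Append directory to PATH in env (default: this process) if absent."""
    env = os.environ if env is None else env
    entries = [e for e in env.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        entries.append(directory)
        env["PATH"] = os.pathsep.join(entries)
    return env["PATH"]


# Example call (the folder is an assumption; use your own install folder):
# add_to_path(r"C:\OCR\Tesseract-OCR")
```

The change is not persistent: it disappears when the Python process exits, and it does not modify the Windows system variables.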

Kumar S · Answer

You can install this package: https://github.com/UB-Mannheim/tesseract/wiki. After that, go to the path C:\Program Files (x86)\Tesseract-OCR\ and run the tesseract file, tesseract.exe. I think this will help you.
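Running tesseract.exe directly is a quick way to confirm that the binary itself works before involving pytesseract. The same check can be scripted. A minimal sketch, assuming the binary accepts the `--version` flag; the helper name `tesseract_version` is hypothetical:

```python
import subprocess


def tesseract_version(cmd="tesseract"):
    """Return the first line of `<cmd> --version`, or None if cmd cannot be run."""
    try:
        result = subprocess.run(
            [cmd, "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # Covers FileNotFoundError: the executable is missing or not on PATH.
        return None
    # Some builds print the version to stderr, so fall back to it.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    version = tesseract_version()
    if version is None:
        print("tesseract could not be started; pass the full path to tesseract.exe")
    else:
        print(version)
```

Passing the full path, for example tesseract_version(r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'), tests a specific installation regardless of PATH.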

Kenstars · Answer

You need Tesseract installed.

https://github.com/tesseract-ocr/tesseract/wiki

Check the document above for installation instructions.

fuwiak · Answer

On Mac (works for me)

brew install tesseract

Sherifi · Answer

    # {Windows 10 instructions}
    # Before you use the script you need to install the dependencies.
    # 1. Download tesseract from the official link:
    #    https://github.com/UB-Mannheim/tesseract/wiki
    # 2. Install tesseract.
    #    I chose this path (replace "user" with the user name on your machine):
    #    C:\Users\user\AppData\Local\Tesseract-OCR\
    # 3. Install pillow for your Python version.
    #    * The best way for me is this form (I am using Python 3.7, and in my
    #      CMD I run this version of Python by typing py -3.7).
    #    * If you are using another version of Python, first check how you
    #      start Python from your CMD.
    #    * On some machines Python is started from the CMD differently.
    #    [examples]
    #    =================================
    #    PYTHON VERSION 3.7
    #      python, python3.7, python -3.7, python 3.7, python3, python -3,
    #      python 3, py3.7, py -3.7, py 3.7, py3, py -3, py 3
    #    PYTHON VERSION 3.6
    #      python, python3.6, python -3.6, python 3.6, python3, python -3,
    #      python 3, py3.6, py -3.6, py 3.6, py3, py -3, py 3
    #    PYTHON VERSION 2.7
    #      python, python2.7, python -2.7, python 2.7, python2, python -2,
    #      python 2, py2.7, py -2.7, py 2.7, py2, py -2, py 2
    #    ================================
    #    We use pip to install the dependencies. Because I start Python 3.7
    #    with py -3.7, open the CMD on the Windows machine and type:
    #      py -3.7 -m pip install pillow
    # 4. Install pytesseract and tesseract for your Python version, in the
    #    same way. Open the CMD on the Windows machine and type:
    #      py -3.7 -m pip install pytesseract
    #      py -3.7 -m pip install tesseract

    #!/usr/bin/python
    from PIL import Image
    import pytesseract
    import os
    import getpass


    def extract_text_from_image(image_file_name_arg):
        # IMPORTANT
        # This is the path on my machine if you followed the installation
        # instructions above. If you don't put the right path for
        # tesseract.exe the script will not work.
        username = getpass.getuser()  # gets the user name for your machine automatically
        tesseract_exe_path_installation = os.path.join(
            "C:\\Users", username, "AppData", "Local", "Tesseract-OCR", "tesseract.exe")
        pytesseract.pytesseract.tesseract_cmd = tesseract_exe_path_installation

        # Specify the directory of your image files manually, or use the line
        # below if the images are in a folder "images" in the script directory.
        # image_dir = r"D:\GIT\ai_example\extract_text_from_image\images"
        image_dir = os.path.join(os.getcwd(), "images")
        image_file_name = image_file_name_arg
        # If your images are in a different format, change the extension (e.g. ".png").
        image_ext = ".jpg"
        image_path_dir = os.path.join(image_dir, image_file_name + image_ext)

        print("=============================================================================")
        print("image used is in the following path dir:")
        print("\t" + image_path_dir)
        print("=============================================================================")

        img = Image.open(image_path_dir)
        text = pytesseract.image_to_string(img, lang="eng")
        print(text)


    # Change "image_1" to the name of your image, without the extension.
    # image_file_name_arg = "image_1"
    image_file_name_arg = "image_2"
    # image_file_name_arg = "image_3"
    # image_file_name_arg = "image_4"
    # image_file_name_arg = "image_5"
    extract_text_from_image(image_file_name_arg)

    # ==================================
    # CREATED BY: SHERIFI
    # e-mail: sherif_co@yahoo.com
    # git-link for script: https://github.com/sherifi/ai_example.git
    # ==================================

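The script above builds file paths from an images folder, a file name, and an extension. As a possible alternative to joining strings with backslash separators, here is a small sketch using pathlib, which avoids the backslash-escaping pitfalls. The helper name `image_path` and its parameters are hypothetical; the "images" folder name and the ".jpg" default mirror the script.

```python
from pathlib import Path


def image_path(name, ext=".jpg", base=None):
    """Build <base>/images/<name><ext>; base defaults to the working directory."""
    base = Path.cwd() if base is None else Path(base)
    return base / "images" / f"{name}{ext}"


if __name__ == "__main__":
    path = image_path("image_2")
    print(path)
    print("exists:", path.is_file())
```

A Path object can be passed directly to Image.open, so no conversion to a string is needed before opening the image.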
Codemaker · Answer

Install tesseract using the following command:

pip install tesseract

Desmond Tung · Answer

On Windows, for the default Windows Tesseract installation, you need to redirect the command path.

On a 32-bit system, add this line after the import commands:

    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

On a 64-bit system, add this line instead:

    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
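Instead of choosing between the two lines by hand, the candidate paths can be derived from the standard Windows environment variables. This is a sketch only: `default_windows_candidates` is a hypothetical helper, and it assumes the installer used the folder name Tesseract-OCR directly under one of these locations, as in the paths quoted in this thread. `ntpath` is used so that the joining behaves the same on any operating system.

```python
import ntpath
import os


def default_windows_candidates(env=None):
    """List likely tesseract.exe locations based on Windows environment variables."""
    env = os.environ if env is None else env
    candidates = []
    for var in ("ProgramFiles", "ProgramFiles(x86)", "LOCALAPPDATA"):
        base = env.get(var)
        if base:
            candidates.append(ntpath.join(base, "Tesseract-OCR", "tesseract.exe"))
    return candidates


if __name__ == "__main__":
    existing = [p for p in default_windows_candidates() if os.path.isfile(p)]
    if existing:
        print("found:", existing[0])
        # With pytesseract installed:
        # import pytesseract
        # pytesseract.pytesseract.tesseract_cmd = existing[0]
    else:
        print("no default installation found; set tesseract_cmd manually")
```

On a non-Windows machine the variables are usually unset, so the list is simply empty.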

user3642503 · Answer

Step 1:

Install tesseract on your system according to your OS. The latest installers can be found at https://github.com/UB-Mannheim/tesseract/wiki.

Step 2: Install the following dependency libraries using:

    pip install pytesseract
    pip install opencv-python
    pip install numpy

Step 3: Sample code

    import os

    import cv2
    import numpy as np
    import pytesseract
    from PIL import Image

    # Path of the working folder on disk. Replace with your working folder.
    src_path = r"C:\Users\<user>\PycharmProjects\ImageToText\input"

    # If you don't have the tesseract executable in your PATH, include the following:
    pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
    # Note: tesseract reads TESSDATA_PREFIX from the environment,
    # not from this Python variable.
    TESSDATA_PREFIX = 'C:/Program Files (x86)/Tesseract-OCR'


    def get_string(img_path):
        # Read image with opencv
        img = cv2.imread(img_path)

        # Convert to gray
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

        # Apply dilation and erosion to remove some noise
        kernel = np.ones((1, 1), np.uint8)
        img = cv2.dilate(img, kernel, iterations=1)
        img = cv2.erode(img, kernel, iterations=1)

        # Write the image after the noise has been removed
        cv2.imwrite(os.path.join(src_path, "removed_noise.png"), img)

        # Apply threshold to get an image with only black and white
        # img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)

        # Write the image after applying opencv to do some ...
        cv2.imwrite(os.path.join(src_path, "thres.png"), img)

        # Recognize text with tesseract for python
        result = pytesseract.image_to_string(Image.open(os.path.join(src_path, "thres.png")))

        # Remove template file
        # os.remove(temp)

        return result


    print('--- Start recognize text from image ---')
    print(get_string(os.path.join(src_path, "image.png")))
    print("------ Done -------")