Is Pytesseract actually supported by Repl.it?
Noisewerk (2)

Hi everyone,

I have been trying to run Pytesseract for the last few days to no avail, even with the most basic script possible:

Nobody else seems to have worked on this in Repl.it, which leads me to believe you cannot actually use Pytesseract there?

I hope someone knows more about this issue. Have a good day, and thanks!

Answered by 19wintersp (1121) [earned 5 cycles]
19wintersp (1121)

You can use Pytesseract, though you need to install the Tesseract executable. You also need to set the location in pytesseract.pytesseract.tesseract_cmd.

Noisewerk (2)

@19wintersp Yeah, but how can you do that on Linux? I don't believe it's possible.

19wintersp (1121)

@Noisewerk Run:

install-pkg tesseract-ocr

Then, in your Python code, do:

pytesseract.pytesseract.tesseract_cmd = "tesseract"
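
If you want to confirm that the executable is actually reachable before pointing pytesseract at it, something like the following might help. This is only a rough sketch: the helper name find_tesseract is made up for illustration, and it assumes install-pkg put the binary somewhere on PATH. shutil.which is from the standard library.

```python
import shutil


def find_tesseract(candidates=("tesseract",)):
    """Return the first candidate that resolves to an executable on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


tesseract_path = find_tesseract()
if tesseract_path is None:
    print("tesseract not found on PATH; install it first (e.g. install-pkg tesseract-ocr)")
else:
    print("tesseract found at", tesseract_path)
    # With pytesseract imported, the resolved path could then be assigned:
    # pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

If this prints "not found", the problem is the install step, not your Python code.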

Noisewerk (2)

@19wintersp I tried, but I get a language error:

I've also tried running "install-pkg tesseract-ocr-eng", but not even that fixes it.

Noisewerk (2)

@19wintersp I've been looking at that as well and did the following:

wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
mv -v /home/runner/Test/eng.traineddata /home/runner/.apt/usr/share/tesseract-ocr/4.00/tessdata/

Nevertheless, the same error seems to appear. Maybe the path needs to be changed?

19wintersp (1121)

@Noisewerk Oh, I would do:

git clone https://github.com/tesseract-ocr/tessdata

but you need to add the path to Tessdata to TESSDATA_PREFIX:

import os
os.environ["TESSDATA_PREFIX"] = "wherever you put the Tessdata folder"

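If you used git clone, that folder would be "~/Test/tessdata" for a repl called "Test", i.e. "/home/runner/Test/tessdata". Use the full path, because Python does not expand "~" when you assign to os.environ.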
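
To rule out a wrong path, you could resolve the folder and check that the language file is really there before running any OCR. This is a minimal sketch, not a definitive fix: set_tessdata_prefix is a hypothetical helper name, and it assumes the language data is a file called eng.traineddata sitting directly inside the folder.

```python
import os


def set_tessdata_prefix(folder, lang="eng"):
    """Point TESSDATA_PREFIX at `folder`; return True if <lang>.traineddata is in it."""
    # os.environ does not expand "~", so resolve it to an absolute path first.
    resolved = os.path.abspath(os.path.expanduser(folder))
    os.environ["TESSDATA_PREFIX"] = resolved
    return os.path.isfile(os.path.join(resolved, lang + ".traineddata"))


# "~/Test/tessdata" is the git clone location for a repl called "Test".
found = set_tessdata_prefix("~/Test/tessdata")
print("TESSDATA_PREFIX =", os.environ["TESSDATA_PREFIX"])
print("eng.traineddata present:", found)
```

Set the variable before the first pytesseract call. If the second line prints False, Tesseract will most likely keep reporting the same language error.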

Noisewerk (2)

@19wintersp In the end, the following steps did indeed make it work:

install-pkg tesseract-ocr
wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
mv -v /home/runner/<yourFolderName>/eng.traineddata /home/runner/.apt/usr/share/tesseract-ocr/4.00/tessdata/

import os
import pytesseract

pytesseract.pytesseract.tesseract_cmd = "tesseract"
os.environ["TESSDATA_PREFIX"] = "/home/runner/.apt/usr/share/tesseract-ocr/4.00/tessdata/"
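
For anyone landing here later, this is roughly how those two settings could be combined with an actual OCR call. Treat it as a sketch under assumptions: ocr_image is a made-up helper name, "sample.png" is a placeholder file, and Pillow is assumed to be installed alongside pytesseract.

```python
import os

# Path taken from the steps above; adjust it if your tessdata lives elsewhere.
TESSDATA_DIR = "/home/runner/.apt/usr/share/tesseract-ocr/4.00/tessdata/"


def ocr_image(image_path, lang="eng"):
    """Run Tesseract on one image file and return the recognized text."""
    # Imported inside the function so the module still loads where
    # pytesseract and Pillow are not installed.
    import pytesseract
    from PIL import Image

    pytesseract.pytesseract.tesseract_cmd = "tesseract"
    os.environ["TESSDATA_PREFIX"] = TESSDATA_DIR
    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang)


if __name__ == "__main__":
    # "sample.png" is a placeholder file name, not something from this thread.
    if os.path.exists("sample.png"):
        print(ocr_image("sample.png"))
    else:
        print("sample.png not found; nothing to OCR")
```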

Thank you so much for the help!

RYANTADIPARTHI (5999)

Try pip install pytesseract. If that doesn't work, then it's probably not available in Repl.it.

Noisewerk (2)

@RYANTADIPARTHI Yeah, Pytesseract is already installed, so I don't believe that's the problem. I guess it is not supported after all, even though the corresponding package exists. Thanks anyway.

Noisewerk (2)

@RYANTADIPARTHI I'm going to leave this open, since that does not really solve the problem and a solution may be found in the future.