Is Pytesseract actually supported by Repl.it?
Noisewerk

Hi everyone,

I have been trying to run Pytesseract for the last few days to no avail, even with the most basic script possible:

[image]

No one else seems to have worked on this in Repl.it, which leads me to believe you cannot actually use Pytesseract there.

Hope someone knows more about this issue. Have a good day, and thanks!

Answered by 19wintersp [earned 5 cycles]
Comments
19wintersp

You can use Pytesseract, though you need to install the Tesseract executable. You also need to set the location in pytesseract.pytesseract.tesseract_cmd.

Noisewerk

@19wintersp Yeah, but how can you do that on Linux? I don't believe it's possible.

19wintersp

@Noisewerk Run:

install-pkg tesseract-ocr

Then, in your Python code, do:

pytesseract.pytesseract.tesseract_cmd = "tesseract"
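
The two steps above can be checked from Python before calling Pytesseract. This is only a minimal sketch, assuming install-pkg put a `tesseract` binary on PATH; `find_tesseract` is a hypothetical helper name, and the pytesseract import is guarded so the check still runs when the package is missing:

```python
import shutil


def find_tesseract(name="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(name)


path = find_tesseract()
if path is None:
    print("tesseract not found on PATH; run install-pkg tesseract-ocr first")
else:
    try:
        import pytesseract
    except ImportError:
        print("pytesseract is not installed; run pip install pytesseract")
    else:
        # Point pytesseract at the executable that was actually found.
        pytesseract.pytesseract.tesseract_cmd = path
        print("using", path)
```

Using the resolved path instead of the bare name "tesseract" makes it obvious which of the two pieces (the executable or the Python package) is missing.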
Noisewerk

@19wintersp I tried but get a language error:

[image]

I've also tried running "install-pkg tesseract-ocr-eng", but even that doesn't fix it.

19wintersp

@Noisewerk I believe you have to clone Tessdata.

Noisewerk

@19wintersp I've been looking at that as well and did the following:

>>> wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
>>> mv -v /home/runner/Test/eng.traineddata /home/runner/.apt/usr/share/tesseract-ocr/4.00/tessdata/

Nevertheless, the same error seems to appear. Maybe the path needs to be changed?

19wintersp

@Noisewerk Oh, I would do:

git clone https://github.com/tesseract-ocr/tessdata

but you need to add the path to Tessdata to TESSDATA_PREFIX:

import os
os.environ["TESSDATA_PREFIX"] = "wherever you put the Tessdata folder"

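If you used git clone, that folder would be "/home/runner/Test/tessdata" (i.e. ~/Test/tessdata) for a repl called "Test". Use the full path in Python, since "~" is not expanded inside a string.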
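
As a rough sketch of that step, the snippet below sets TESSDATA_PREFIX and checks that the English data file is actually in the folder. `set_tessdata_prefix` is a hypothetical helper, and the folder name is just the "Test" example from this comment:

```python
import os


def set_tessdata_prefix(folder):
    """Set TESSDATA_PREFIX to an absolute path and report whether eng.traineddata is there."""
    # "~" is not expanded inside a plain string, so expand it explicitly.
    full = os.path.abspath(os.path.expanduser(folder))
    os.environ["TESSDATA_PREFIX"] = full
    return os.path.isfile(os.path.join(full, "eng.traineddata"))


# Example folder for a repl called "Test"; adjust to your own repl.
if not set_tessdata_prefix("~/Test/tessdata"):
    print("eng.traineddata not found in", os.environ["TESSDATA_PREFIX"])
```

The check is there because a wrong prefix usually shows up later only as a language error from Tesseract.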

Noisewerk

@19wintersp In the end, the following steps did indeed make it work:

>>> install-pkg tesseract-ocr
>>> wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
>>> mv -v /home/runner/<yourFolderName>/eng.traineddata /home/runner/.apt/usr/share/tesseract-ocr/4.00/tessdata/

import os
import pytesseract
pytesseract.pytesseract.tesseract_cmd = "tesseract"
os.environ["TESSDATA_PREFIX"] = "/home/runner/.apt/usr/share/tesseract-ocr/4.00/tessdata/"

Thank you so much for the help!

blacksmithop

@19wintersp The problem is that the entire tessdata exceeds the Repl.it disk quota.

19wintersp

@blacksmithop If that happens, I don't think there's really a way to fix it; sorry.
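
One possible way to stay under the quota, assuming only a few languages are needed, is to download individual .traineddata files instead of cloning the whole repository, as the wget command earlier in the thread does for English. The sketch below follows that URL pattern; `traineddata_url` and `fetch_languages` are hypothetical helper names, and whether a given language file fits in the quota is not something this thread confirms:

```python
import os
import urllib.request

# URL pattern taken from the wget command earlier in the thread.
TESSDATA_URL = "https://github.com/tesseract-ocr/tessdata/raw/master/{lang}.traineddata"


def traineddata_url(lang):
    """Build the download URL for one language file, e.g. "eng"."""
    return TESSDATA_URL.format(lang=lang)


def fetch_languages(langs, dest_dir):
    """Download one .traineddata file per language into dest_dir, skipping existing files."""
    os.makedirs(dest_dir, exist_ok=True)
    for lang in langs:
        target = os.path.join(dest_dir, lang + ".traineddata")
        if not os.path.exists(target):
            urllib.request.urlretrieve(traineddata_url(lang), target)
    return dest_dir
```

For example, `fetch_languages(["eng"], "tessdata")` would fetch only the English file, after which TESSDATA_PREFIX can point at that folder.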

Intenzi

@Noisewerk Can you please elaborate on the third shell command that you ran?

19wintersp

@Intenzi What do you mean?

Intenzi

@19wintersp

>>> mv -v /home/runner/<yourFolderName>/eng.traineddata /home/runner/.apt/usr/share/tesseract-ocr/4.00/tessdata/

What does this line do? I am unable to follow it.

19wintersp

@Intenzi If you're following their steps, this is what moves the download into the correct folder; just use this:

mv -v ~/$REPL_SLUG/eng.traineddata ~/.apt/usr/share/tesseract-ocr/4.00/tessdata/
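
For anyone who would rather do the move from Python, here is a rough equivalent of that mv command. `move_traineddata` is a hypothetical helper; unlike mv, it also creates the destination folder if it is missing:

```python
import os
import shutil


def move_traineddata(src_file, dest_dir):
    """Move a downloaded .traineddata file into dest_dir, like `mv -v src dest/`."""
    os.makedirs(dest_dir, exist_ok=True)  # mv itself would not create the folder
    dest = shutil.move(src_file, os.path.join(dest_dir, os.path.basename(src_file)))
    print(src_file, "->", dest)
    return dest


# Usage with the paths from this thread (not run here):
# home = os.path.expanduser("~")
# move_traineddata(
#     os.path.join(home, os.environ["REPL_SLUG"], "eng.traineddata"),
#     os.path.join(home, ".apt/usr/share/tesseract-ocr/4.00/tessdata"),
# )
```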
ChanDe

@blacksmithop @19wintersp @Noisewerk Do you think if I upgrade to the highest level, I would be able to clone the entire tessdata?

RYANTADIPARTHI

Try pip install pytesseract; if that doesn't work, then it's probably not available in Repl.it.

Noisewerk

@RYANTADIPARTHI Yeah, Pytesseract is already installed, so I don't believe that is the problem. I guess it is not supported after all, even though the corresponding package exists. Thanks anyway.

RYANTADIPARTHI

@Noisewerk no problem.

Noisewerk

@RYANTADIPARTHI I'm going to leave this open, since that does not really solve the problem and a solution may be found in the future.

linksafe

Thank you very much @RYANTADIPARTHI, it worked.

RYANTADIPARTHI

@linksafe my solution?

linksafe

No I can't, I didn't ask this question @RYANTADIPARTHI

gouvs1

@linksafe Were you able to process images in Python with Tesseract in Replit? I cannot seem to get it working, and so far I haven't seen anyone who could other than you.

linksafe

I'm getting an error: "Pytesseract is not in PATH."/"Pytesseract is not installed."/etc.
I can't find a way to get it to work.
@gouvs1