"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

May 21, 2018

Day#108 - OCR for Hindi


1. Download the Hindi trained data file (hin.traineddata) from https://github.com/tesseract-ocr/tessdata/blob/3.04.00/hin.traineddata

2. Copy it to C:\opencv\Tesseract-OCR\tessdata\hin.traineddata


3. Test Data



4. Output

Code
# https://theailearner.com/2019/05/29/creating-a-crnn-model-to-recognize-text-in-an-image-part-1/
import pytesseract
from PIL import Image

# Point pytesseract at the Tesseract executable.
pytesseract.pytesseract.tesseract_cmd = r"C:\opencv\Tesseract-OCR\tesseract"
# Update the path if it doesn't work
# pytesseract.pytesseract.tesseract_cmd = r"D:\Tesseract\Tesseract-OCR\tesseract.exe"

# Load the test image and run OCR with the Hindi language data (hin.traineddata).
img = Image.open(r"E:\Hindi2.jpg")
print(pytesseract.image_to_string(img, lang="hin"))
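
If hin.traineddata is missing from the tessdata folder, the OCR call above fails, so it can help to check for the file before running it. The sketch below is one possible way to script steps 1 and 2. The helper names raw_url and ensure_hindi_data are hypothetical, not part of pytesseract. It also assumes the file can be fetched directly by swapping /blob/ for /raw/ in the GitHub URL from step 1.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Page URL from step 1 of this post.
BLOB_URL = "https://github.com/tesseract-ocr/tessdata/blob/3.04.00/hin.traineddata"


def raw_url(blob_url):
    """Turn a GitHub 'blob' page URL into a direct-download 'raw' URL (assumed layout)."""
    return blob_url.replace("/blob/", "/raw/", 1)


def ensure_hindi_data(tessdata_dir, download=False):
    """Return the path to hin.traineddata inside tessdata_dir.

    Hypothetical helper: raises FileNotFoundError when the file is missing,
    unless download=True, in which case it tries to fetch the file first.
    """
    target = Path(tessdata_dir) / "hin.traineddata"
    if not target.is_file():
        if not download:
            raise FileNotFoundError(f"{target} not found; complete steps 1 and 2 first")
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(raw_url(BLOB_URL), str(target))
    return target
```

With the paths used in this post, calling ensure_hindi_data(r"C:\opencv\Tesseract-OCR\tessdata") before image_to_string would report a missing language file early. Passing download=True attempts the download instead of raising.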

More Reads - Link, Link1

Happy Learning!!!
