Senior Software Engineer with expertise in solution design for image processing and text extraction, leveraging deep learning and heuristic approaches to drive significant business improvements. Proven track record of implementing machine learning solutions in the banking sector, consistently delivering high-quality results within deadlines. Advocates for humanistic management techniques that enhance employee morale while optimizing production efficiency. Focused on balancing technical innovation with team well-being to foster long-term organizational success.
An application to extract text from images (TIFF, JPG, PNG) and documents (PDF, OXPS). The tool handles image pre-processing tasks such as orientation and skew correction, which in turn improve data extraction. It is built generically and is independent of any document type or template. It is used to extract data from various bank documents, after which entity extraction is performed according to business requirements. Based on the problem statement, support was added for further document types, including payment transaction documents (debit and credit), payslips, and tax documents.
Skill set: Pytesseract, CTPN, Python, OpenCV, EasyOCR, NLP, NER, and Git
A solution to extract critical fields from cheques (payee name, payee address, bank name, cheque date, cheque amount, routing number, cheque number, and account number), using HTR and Tesseract as the OCR engines and a deep learning-based object detection algorithm to detect the relevant regions of the cheque.
Skill set: Pytesseract, CNN, FRCNN, HTR, Tesseract, TensorFlow, and Git
A React tool that uses simple Python packages to extract data from documents uploaded through the UI, followed by a data-refining step that saves the output in a unified format, as per business requirements. The tool has API and UI components for user-friendly access.
Skill set: Tabula, Python, Django, and Git