An AI-powered character recognition system that uses deep learning to convert images of Brahmilipi script into Kannada Unicode characters.
Brahmilipi script, an ancient writing system, is difficult to digitize and preserve. Manual transcription of historical documents is slow, error-prone, and requires expert knowledge of the script. An automated system that recognizes Brahmilipi characters and converts them to modern Kannada Unicode would help preserve historical texts, support academic research, and make ancient manuscripts accessible to a broader audience. This project develops a deep learning-based character recognition system for Brahmilipi script images, with the goal of connecting ancient manuscripts to digital archives in support of cultural heritage preservation and linguistic research.
The system is built on a Convolutional Neural Network (CNN) implemented with TensorFlow and Keras. The model processes 64x64 grayscale images through two convolutional layers with BatchNormalization, followed by MaxPooling to reduce the size of the feature maps. Dropout layers with rates between 0.25 and 0.5 reduce overfitting, and the final softmax layer classifies each image as one of 7 Kannada characters: six vowels (ಅ, ಆ, ಇ, ಈ, ಉ, ಊ) and one consonant (ಕ).
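The architecture described above can be sketched in Keras roughly as follows. This is a minimal illustration, not the project's actual model definition: the filter counts (32 and 64), the 3x3 kernels, the 128-unit dense layer, the exact placement of each Dropout layer, and the name `build_model` are all assumptions chosen to match the description.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 7  # ಅ, ಆ, ಇ, ಈ, ಉ, ಊ, ಕ


def build_model(num_classes: int = NUM_CLASSES) -> tf.keras.Model:
    """Two-block CNN for 64x64 grayscale character images (hypothetical sizes)."""
    return models.Sequential([
        tf.keras.Input(shape=(64, 64, 1)),            # one grayscale channel
        # Block 1: convolution, batch normalization, 2x2 max pooling
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),                   # 64x64 -> 32x32
        layers.Dropout(0.25),
        # Block 2: same pattern with more filters (assumed)
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),                   # 32x32 -> 16x16
        layers.Dropout(0.25),
        # Classifier head
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])


model = build_model()
# Integer class labels (0..6) are assumed, hence the sparse loss.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

Calling `model.summary()` on this sketch shows how each pooling step halves the spatial resolution before the flattened features reach the softmax layer.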
A synthetic data generation pipeline creates training datasets with realistic variations. It applies noise injection, geometric transformations, and other preprocessing techniques using OpenCV to simulate real-world document conditions. Before reaching the model, images are preprocessed with grayscale conversion, noise reduction, and normalization so that the model input is consistent.
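To make the augmentation and preprocessing steps concrete, here is a small sketch of one geometric transformation (a random shift), Gaussian noise injection, grayscale conversion, and normalization. It deliberately uses NumPy only so that it runs without extra dependencies, whereas the project itself uses OpenCV. The function names, the shift range, the noise level, and the RGB channel order are illustrative assumptions, and the noise reduction step is left out.

```python
import numpy as np


def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 RGB image to HxW float32 using the BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return rgb.astype(np.float32) @ weights


def translate(image: np.ndarray, dx: int, dy: int, fill: float = 255.0) -> np.ndarray:
    """Shift an HxW image by (dx, dy) pixels, filling exposed borders with `fill`."""
    h, w = image.shape[:2]
    out = np.full_like(image, fill)
    if abs(dx) >= w or abs(dy) >= h:
        return out  # shifted entirely out of frame
    src_x0, src_x1 = max(0, -dx), min(w, w - dx)
    src_y0, src_y1 = max(0, -dy), min(h, h - dy)
    dst_x0, dst_y0 = max(0, dx), max(0, dy)
    out[dst_y0:dst_y0 + (src_y1 - src_y0),
        dst_x0:dst_x0 + (src_x1 - src_x0)] = image[src_y0:src_y1, src_x0:src_x1]
    return out


def augment(gray: np.ndarray, rng: np.random.Generator,
            max_shift: int = 4, sigma: float = 10.0) -> np.ndarray:
    """Apply a random shift and additive Gaussian noise (assumed parameters)."""
    dx = int(rng.integers(-max_shift, max_shift + 1))
    dy = int(rng.integers(-max_shift, max_shift + 1))
    shifted = translate(gray, dx, dy)
    noisy = shifted + rng.normal(0.0, sigma, size=shifted.shape)
    return np.clip(noisy, 0.0, 255.0).astype(np.float32)


def preprocess(gray: np.ndarray) -> np.ndarray:
    """Scale pixel values to [0, 1] and add the channel axis the CNN expects."""
    return (gray / 255.0).astype(np.float32)[..., np.newaxis]


# Synthetic stand-in for a scanned character: a dark stroke on a white page.
rng = np.random.default_rng(seed=0)
page = np.full((64, 64, 3), 255, dtype=np.uint8)
page[20:44, 30:34] = 0

sample = preprocess(augment(to_grayscale(page), rng))
print(sample.shape, sample.dtype)  # (64, 64, 1) float32
```

Generating many such variants from each source image with different random seeds is one way to build a larger training set from a small number of originals.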
A Flask-based web application provides the user interface for character recognition. Users upload Brahmilipi script images through an HTML5 interface, and the system returns the predicted Kannada Unicode character. The application can also display images and maintains a mapping between Brahmilipi representations and Kannada Unicode values.
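The mapping step can be illustrated with a short standard-library sketch that turns the model's softmax output into a Kannada character and its Unicode code point. The class order, the name `decode_prediction`, and the shape of the returned dictionary are assumptions for illustration. A Flask route handler could return such a dictionary as JSON, but the web layer itself is not shown here.

```python
from typing import Dict, Sequence, Union

# Assumed class order: index i of the softmax output corresponds to entry i.
CLASS_TO_KANNADA = (
    "\u0C85",  # ಅ  KANNADA LETTER A
    "\u0C86",  # ಆ  KANNADA LETTER AA
    "\u0C87",  # ಇ  KANNADA LETTER I
    "\u0C88",  # ಈ  KANNADA LETTER II
    "\u0C89",  # ಉ  KANNADA LETTER U
    "\u0C8A",  # ಊ  KANNADA LETTER UU
    "\u0C95",  # ಕ  KANNADA LETTER KA
)


def decode_prediction(probabilities: Sequence[float]) -> Dict[str, Union[str, float]]:
    """Map a 7-way probability vector to the most likely Kannada character."""
    if len(probabilities) != len(CLASS_TO_KANNADA):
        raise ValueError(
            f"expected {len(CLASS_TO_KANNADA)} probabilities, got {len(probabilities)}"
        )
    # Index of the highest-probability class (argmax without NumPy).
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    character = CLASS_TO_KANNADA[best]
    return {
        "character": character,
        "codepoint": f"U+{ord(character):04X}",
        "confidence": float(probabilities[best]),
    }


result = decode_prediction([0.01, 0.02, 0.90, 0.03, 0.02, 0.01, 0.01])
print(result["character"], result["codepoint"])  # ಇ U+0C87
```

Keeping the mapping in a single table makes it straightforward to extend the system when more characters are added to the classifier.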