Built in the open
Akshara is fully open source. The website, the OCR extraction pipeline, the scripts that turn crumbling PDFs into readable text — all of it lives on GitHub.
akshara
The website you're reading right now. A Hugo-powered digital library for India's forgotten literary heritage — hand-curated book pages, a clean reading interface, full-text search, and everything needed to make century-old texts feel alive on screen.
akshara-extract
A multi-pass archival digitization pipeline. Takes horribly scanned PDFs from Archive.org and turns them into clean, structured Markdown: Gemini Flash handles OCR extraction, Claude Haiku plans the assembly, and a deterministic verification step ensures nothing is lost or hallucinated. The engine behind every book on this site.
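To make the "deterministic verification" idea concrete, here is a minimal sketch of one way such a check could work: compare the words in the raw OCR pages against the words in the assembled output, and flag anything that went missing or appeared from nowhere. This is an illustration only, not the actual akshara-extract code; the function names `tokens` and `verify_assembly` and the word-counting approach are assumptions made for the example.

```python
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Count word tokens, ignoring case and punctuation (hypothetical helper)."""
    return Counter(re.findall(r"\w+", text.lower()))


def verify_assembly(extracted_pages: list[str], assembled: str) -> dict:
    """Compare OCR page text with the assembled document, word by word.

    Assumption: the assembly step only reorders and joins text, so the
    multiset of words should be identical before and after.
    """
    source = Counter()
    for page in extracted_pages:
        source += tokens(page)
    output = tokens(assembled)

    lost = source - output      # words in the OCR pages but not in the output
    invented = output - source  # words in the output with no OCR source
    return {
        "ok": not lost and not invented,
        "lost": dict(lost),
        "invented": dict(invented),
    }


pages = ["The old house", "stood on a hill."]
report = verify_assembly(pages, "The old house stood on a green hill.")
print(report["ok"], report["invented"])
```

Because the check uses no model and no randomness, the same inputs always give the same report. A real pipeline would presumably also need to allow for markup the assembly step adds on purpose, such as headings.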
Found a bug? Want to contribute a book? Have ideas for improvements?
Pull requests and issues are always welcome.