What is Indian-LIPI?

The CLARIN K-Centre for Indian Language Innovation and Publishing Infrastructure (Indian-LIPI) is a specialised knowledge centre that provides expertise in multilingual digital scholarship and digital cultural heritage. It is a joint initiative of digital humanists at the JPN Centre and the Digital Humanities and Publishing Research Group at IIT Indore. By combining expertise in language technologies, publishing studies, and digital humanities, Indian-LIPI supports researchers, archivists, and digital humanists working with Indian languages, textual archives, and publishing infrastructures. The Centre aims to function as a regional and international hub for collaboration, training, and methodological support in multilingual digital research.

Mission
The Centre offers guidance and expertise in multilingual digital scholarship and digital cultural heritage practices. It supports research projects that work with digital texts, archives, and tools, and provides access to resources, training, and documentation. The Centre also hosts the Digital Humanities Intersections (DHI) journal, which strengthens its role as a hub for digital humanities research and plans to integrate AI and ML tools to publish issues in Indian languages.
Planned Services
The Centre provides multilingual digital humanities support, including guidance on OCR, digitisation workflows, corpus creation, metadata enrichment, and text analysis. Users can access documentation, training materials, examples, and tools, as well as a helpdesk for project queries. This page also links to publications, datasets, ongoing initiatives, and service updates.
Physical Infrastructure

Housed within IIT Indore’s main campus, the Centre’s physical infrastructure spans specialised laboratory spaces, scanning facilities, and seminar rooms designed for both technical operations and humanistic inquiry.

Digital Imaging Studio Lab
Equipped with the i2S CopiBook OS A2 Scanner — a professional-grade book scanner producing archival-quality digital images of manuscripts, documents, maps, and artefacts up to A2 size.
Active · Digitisation
Computing & OCR Infrastructure
Supports LIMB Processing software for image processing and metadata management, and Tesseract OCR for converting digitised materials into searchable, machine-readable text across multiple Indian scripts.
Active · Processing
Seminar & Workshop Spaces
Flexible seminar infrastructure that has supported 9 major academic events since the Centre's founding, including hybrid international workshops, PhD colloquia, and hands-on training sessions for 50 or more participants.
Active · Events
Research & Reference Resources
Access to IIT Indore’s central library holdings, digital journal subscriptions, and specialised humanities databases, supplemented by Centre-specific collections in Digital Humanities and Environmental Humanities literature.
Active · Research Support
Publication Infrastructure — DHI Journal
Open-Access Journal · IIT Indore
Digital Humanities Intersections (DHI)
The Centre hosts the Digital Humanities Intersections journal — a peer-reviewed open-access publication advancing scholarship in multilingual digital humanities, publishing studies, and computational methods. Future issues will integrate AI and ML tools to publish in Indian languages, positioning DHI as a model for multilingual scholarly publishing infrastructure.
Digital Infrastructure

The Centre’s digital stack is designed for long-term preservation, open access, and interoperability — aligned with FAIR principles (Findable, Accessible, Interoperable, Reusable).

Digital Infrastructure Pipeline
01
Acquisition & Capture
  • A2 High-res scanning
  • Audio-visual recording
  • Field documentation
  • Community-sourced materials
02
Processing & Annotation
  • LIMB image processing
  • Tesseract OCR
  • Metadata tagging
  • Multilingual annotation
03
Archive & Storage
  • Multi-modal archive platform
  • Access control systems
  • Community governance
  • Long-term preservation
04
Publication & Access
  • DHI Open-Access Journal
  • Project websites
  • jpnnationalcentre.com
  • Public data repositories
05
Research & Re-use
  • NLP / computational analysis
  • Scholarly citation
  • Community access
  • Policy engagement
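The five stages listed above can be read as an ordered workflow that each digitised item moves through. The following is a minimal sketch of that idea, not a description of the Centre's actual software: the `ArchiveItem` record, its field names, and the sample identifier are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """The five pipeline stages, in the order listed above."""
    ACQUISITION = 1
    PROCESSING = 2
    ARCHIVE = 3
    PUBLICATION = 4
    REUSE = 5


@dataclass
class ArchiveItem:
    """Hypothetical record for one digitised object (illustrative fields)."""
    identifier: str
    title: str
    language: str                      # ISO 639-1 code, e.g. "ta"
    stage: Stage = Stage.ACQUISITION
    history: list = field(default_factory=list)

    def advance(self, note: str) -> None:
        """Log what was done at the current stage, then move to the next."""
        if self.stage is Stage.REUSE:
            raise ValueError("item is already at the final stage")
        self.history.append((self.stage.name, note))
        self.stage = Stage(self.stage + 1)


# Hypothetical item moving through the first two stages.
item = ArchiveItem("demo-0001", "Sample Tamil manuscript", "ta")
item.advance("scanned")
item.advance("OCR and metadata tagging done")
print(item.stage.name)  # ARCHIVE
```

Keeping a per-item history of completed stages is one simple way to retain provenance information, which helps when records are later made findable and reusable in line with FAIR principles.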
Server Infrastructure · HPE ProLiant
HPE ProLiant DL380 Gen11 · TPS1024 Configuration

The TPS1024-configured HPE ProLiant DL380 Gen11 server forms a core component of enterprise digital infrastructure, delivering high-performance compute, storage, and virtualisation capabilities. Designed for hybrid cloud, AI, and data-intensive workloads, it enables secure, scalable, and intelligently managed IT environments.

High-Performance Compute · Virtualisation · Hybrid Cloud Ready · AI & ML Workloads · Enterprise Storage
Human Infrastructure

The Centre’s most important infrastructure is its people — a multidisciplinary team spanning Digital Humanities, Environmental Humanities, Social Sciences, and technical support, guided by national and international advisory bodies.

Digital Humanities
Computational Text Analysis · Digital Archiving · NLP & Corpus Linguistics · Digital Mapping (GIS) · Multilingual DH · Critical Digital Studies
Environmental Humanities
Environmental Ethics · Political Ecology · Climate Studies · Ecological History · Environmental Justice · Indigenous Ecologies
Social Sciences
Labour Studies · Cultural Studies · Development Economics · Spatial Econometrics · Qualitative Methods · Community Research
Administrative & Technical
Project Management · Financial Administration · Archive Management · Digital Communications · Event Coordination
Advisory Structure
National Advisory Committee
  • Composition: Senior scholars from IITs, national institutes of eminence
  • Role: Strategic oversight
  • Key Contributions: Research priority-setting; curriculum design; project selection
International Expert Panel
  • Composition: Scholars from Oxford, ECU, FU Berlin, Heidelberg, KGU Japan
  • Role: International quality assurance
  • Key Contributions: Global benchmarking; collaborative workshops; peer review
Project Selection Committee
  • Composition: High-level scholars, double-blind external reviewers
  • Role: Research quality control
  • Key Contributions: Evaluation of proposals; merit-based selection across 2 funding rounds
JPN Faculty Team
  • Composition: Core faculty at IIT Indore (HSS Department)
  • Role: Day-to-day leadership
  • Key Contributions: Programme delivery; faculty-led research; external funding mobilisation
Services & Capabilities
Service Architecture
01
Access & Data
  • Access to documentation
  • Access to DH tools
  • Access to data
  • Research infrastructure
  • Annotated corpora
  • Parallel texts
02
OCR & Digitisation
  • OCR services (LIMB, Tesseract)
  • Book scanning
  • Metadata creation support
  • OCR post-correction
  • Multilingual text handling
  • Heritage digitisation
03
Training & Support
  • Training workshops
  • User assistance
  • Technical support
  • Helpdesk
  • Research consulting
  • Workflow design
04
Publishing & NLP
  • DHI journal platform
  • Data processing on demand
  • NLP & corpus annotation
  • Named entity recognition
  • Topic modelling
  • Text-to-speech
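For the OCR post-correction service listed above, a common way to judge how much correction a page needs is the character error rate (CER): the edit distance between the OCR output and a trusted reference transcription, divided by the reference length. The sketch below is a generic illustration of that measure and is not taken from the Centre's tooling; the function names and sample strings are assumptions.

```python
import unicodedata


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_error_rate(ocr_text: str, reference: str) -> float:
    """Edit distance divided by reference length, after NFC normalisation."""
    ocr_text = unicodedata.normalize("NFC", ocr_text)
    reference = unicodedata.normalize("NFC", reference)
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(ocr_text, reference) / len(reference)


# One wrong character in a 12-character reference.
print(round(character_error_rate("digitisatiom", "digitisation"), 3))  # 0.083
```

This version counts Unicode code points. For Indian scripts, where one visible character is often built from several code points, a grapheme-level comparison may give a more intuitive figure.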
Who We Serve
Research Community
DH Research Students · DH Faculty · Computational Linguists · DH Practitioners · Multilingual Digital Humanists
Cultural Heritage
Archivists · Librarians · Museum Technologists · Digital Curators · GLAM Professionals
Publishing & Journals
Editors of DH Journals · Contributors · Scholarly Publishers · Open Science Advocates
Expertise Areas
OCR & Digitisation · Corpus Methods · Text Analytics · Cultural Heritage Preservation · Digital Cartography · Spatial Literary Analysis
Support to Indian Languages

Indian-LIPI serves as the primary portal for the following Indian languages, covering language use, documentation, and digital infrastructure.

Tamil
Dravidian · ISO 639-1: ta
One of the oldest classical languages of India. Corpus building, OCR, and digitisation support for Tamil literary and archival materials.
Malayalam
Dravidian · ISO 639-1: ml
Support for digital humanities workflows and text processing in Malayalam, including metadata enrichment and corpus annotation.
Sindhi
Indo-Aryan · ISO 639-1: sd
Focused expertise in Sindhi literature in post-partition India, marginalisation studies, and digital interventions for preservation.
Hindi
Indo-Aryan · ISO 639-1: hi
Comprehensive coverage for India’s most widely spoken language including NLP resources, shared corpora, and pre-trained models.
Bengali
Indo-Aryan · ISO 639-1: bn
Support for Bengali corpus creation, literary digitisation, and multilingual digital scholarship including diaspora writing.
English
Indo-European · ISO 639-1: en
Primary documentation language for all K-Centre services; support for multilingual comparative and translation studies.
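The ISO 639-1 codes given for each language above are not the names Tesseract uses for its language data, which follow three-letter codes. The sketch below shows one way to bridge the two when preparing an OCR run. It only builds the command line and does not execute it. The mapping table, the function name, and the file names are assumptions for illustration, and each language data file is assumed to be installed locally. Note also that the `snd` data targets Sindhi in Arabic script, so Sindhi material in Devanagari may need a different model.

```python
# ISO 639-1 code -> Tesseract language data name (assumed installed locally).
TESSERACT_LANG = {
    "ta": "tam",  # Tamil
    "ml": "mal",  # Malayalam
    "sd": "snd",  # Sindhi (Arabic script)
    "hi": "hin",  # Hindi
    "bn": "ben",  # Bengali
    "en": "eng",  # English
}


def tesseract_command(image_path, output_base, iso_codes):
    """Build (but do not run) a tesseract CLI call for the given languages."""
    try:
        langs = [TESSERACT_LANG[code] for code in iso_codes]
    except KeyError as exc:
        raise ValueError(f"no Tesseract mapping for language code {exc}") from None
    if not langs:
        raise ValueError("at least one language code is required")
    # Multiple languages are joined with "+" in the -l option.
    return ["tesseract", image_path, output_base, "-l", "+".join(langs)]


# Hypothetical page containing Hindi and English text.
cmd = tesseract_command("page_001.tif", "page_001", ["hi", "en"])
print(" ".join(cmd))  # tesseract page_001.tif page_001 -l hin+eng
```

The returned list can be passed to `subprocess.run` on a machine where Tesseract is installed. Returning a list rather than a shell string avoids quoting problems with file names that contain spaces.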
Additional Language Expertise

The Centre has also supported projects in minority and marginalised languages, including languages outside the Eighth Schedule of the Indian Constitution:

Angika · Phalee Naga Languages · Assamese + Other Minority Languages
Data Modalities & Linguistic Topics
Modalities Covered
Text · Images (manuscripts, printed books) · Audio (oral histories) · Audio-visual (cultural practices) · Multimodal Data
Linguistic Topics
Language Documentation · Multilingualism · Translation Studies · Diachronic Language Studies · Corpus-based Analysis · Open Science Publishing
Selected Publications

The JPN Centre brings a strong track record of peer-reviewed publications, edited volumes, and open-access digital scholarship spanning multilingual DH, digital cartography, corpus linguistics, and publishing studies.

01
Practices of Digital Humanities in India: Learning by Doing
Nirmala Menon & Maya Dodd (Eds.) · Routledge UK · 2024
02
Sindhi Literature in Post-Partition India: Marginalisation, Challenges, and Digital Interventions
Govindani, V., & Menon, N. · Diaspora, Indigenous, and Minority Education · 2024
03
Applications and Developments of NLP Resources for Text Processing in Indian Languages
Joseph, J., Lalithsriram, S.R., & Menon, N. · Multilingual Digital Humanities, 48–58 · 2023
04
Digital Migration Infrastructure in Return-Writing: Visualizing the Migration Landscape of India
Mukherjee, P., & Menon, N. · Frontiers in Sociology, 9 · 2024
05
Digital Cartography and Feminist Geocriticism Case Study II: Kilvenmani Massacre
Justin, J., & Menon, N. · Cartographica, 59(2) · 2024
06
Challenges and Opportunities of Scholarly Publishing Landscape: A Case Study From India
Das, S., & Menon, N. · Learned Publishing, 38(4) · 2025
07
Spatial Hypertexts or Hypermaps: A Proposal for Using Maps as Hypertexts in Geo-Spatial Archives
Justin, J., & Menon, N. · Journal of Map & Geography Libraries, 19(1–2) · 2023
08
Indian Electronic Literature Anthology
Menon, N., Shanmugapriya, T., Joseph, J., & Sutton, D. · IIT Indore KSHIP · 2023 · Open Access
Helpdesk & Queries
Get in Touch with Indian-LIPI
Submit your project queries, request access to documentation, or explore collaboration opportunities. The K-Centre responds to all queries within 2 working days via a human agent.
Convener, JPN Centre, IIT Indore
Prof. Nirmala Menon
Institute Chair Professor, HSS
Chair, JPN National Centre
Affiliate Research Professor, University of Oxford
Editor, Digital Humanities Quarterly
nmenon@iiti.ac.in · convenerjpn@iiti.ac.in · jpnnationalcentre.com
Contact Person
Ms. Anima Singh
phd2401194002@iiti.ac.in