
Tulu Munisi
A transliteration-based keyboard enabling digital use of the Tulu language.
Docbook
GitHub Project
Although its script is now encoded in Unicode, Tulu lacks phonetic digital input tools, which makes everyday typing inconsistent, exclusionary, and unreliable for its speakers.
Duration
4 Months
Role
Solo Designer
Interaction Design
UX
Linguistics
HCI

Design Outline
The project seeks to address these gaps by building a fully custom, web-based transliteration keyboard system shaped by script-specific logic, dialectal variation, and user needs. This system includes:
A transliteration engine capable of phonetic-to-Unicode mapping.
A shaping-aware input pipeline congruent with Indic orthographic rules.
A multi-dialect predictive text engine capable of learning variants rather than standardising them away.
A Unicode-compliant font that renders the script reliably across browsers.
Together, these interventions aim to restore agency to Tulu speakers by giving them a consistent, self-determined, and decolonised means of writing their language digitally.
Project Timeline
Loom Walkthrough
Context
Tulu
What
Tulu is a Dravidian language spoken primarily in coastal Karnataka and northern Kerala, with a rich oral tradition and a deeply rooted cultural identity. While widely used in everyday life, performance, and ritual, Tulu has long faced challenges in written representation.
Historically, the language was written using the Tigalari script, a South Indian Brahmic script used across the region for centuries, especially in manuscripts related to religion, literature, and administration. Over time, the dominance of Kannada and Malayalam scripts in education and print gradually displaced Tigalari from everyday use.
Who
Tulu is spoken by approximately two to three million people, primarily within communities native to the coastal regions of Karnataka and northern Kerala. The language is used across diverse social and cultural groups, including Tuluva Hindus, Jains, Christians, and Muslims.
Within these communities, the language varies in vocabulary, pronunciation, and influence from neighboring languages, resulting in multiple dialectal forms shaped by history, occupation, caste, and regional interaction.
Where
Tulu is predominantly spoken in the coastal districts of Dakshina Kannada and Udupi in Karnataka, and in the Kasaragod region of northern Kerala—an area collectively known as Tulu Nadu.
Within this region, the language exists in several dialectal forms influenced by geography and community identity, including Brahmin Tulu, Jain Tulu, Harijan Tulu, and variants shaped by contact with languages such as Kannada, Malayalam, and Beary. These variations reflect the linguistic richness of Tulu but also pose challenges for standardized digital input systems.

Today, Tulu exists in an unusual linguistic position: vibrant in speech, yet fragmented in writing. The revival of Tulu-Tigalari has gained momentum in recent years, driven by scholars, practitioners, and communities seeking to reconnect with their literary heritage. However, digital support for the script remains limited. Many characters are newly encoded in Unicode, fonts are inconsistent, and input methods are still emerging.

Tulu Tigalari
Script of the Tulu Language
Tigalari is a Southern Brahmic script that evolved from the Grantha script. It was used to write the Tulu, Kannada, and Sanskrit languages, primarily for Vedic texts in Sanskrit.
Today, the majority of Tulu speakers are not literate in the Tigalari script. Consequently, the Kannada, Malayalam, and Latin (English) scripts are used to write the language digitally.
Research
Primary and Secondary Research
Stakeholder Interviews
Six voices on a script's digital future
Conversations with engineers, journalists, poets, and practitioners surfaced a linguistic landscape shaped by deep expertise, generational tension, and a shared anxiety about Tulu's survival in digital spaces.
01/06
Pioneering Software Engineer
Skeptic
Challenged the very foundations — arguing Unicode is structurally inadequate for Indic scripts, requiring what he described as "eleven dimensions" of complexity. Dismissed English-based keyboards, dialect-sensitive models, and younger speakers' language competency in equal measure.
"The only major Tulu corpus is publicly searchable but its source code stays closed — trapped by licensing and credit disputes."
02/06
Retired Tulu Journalist
Advocate
Offered the sharpest contrast — enthusiastic about digital efforts and confident that new tools can serve the language well. Validated key decisions around vowel representation and welcomed attempts to encode Tulu sounds normatively and accurately across scripts.
A reminder that not all experts resist change; some have been waiting for this work.
03/06
Tulu Poet
Cautious
Grounded the conversation in lived use: online transliteration and translation tools regularly distort meaning, especially for non-standard dialects. Meaning is fragile, and tools that flatten dialect variation cause real harm to the language's expressive range.
Reinforced the need for transparent, dialect-aware models over confident but opaque automation.
04/06
Senior Tulu Engineer
Pragmatist
Advised against prioritising a Tigalari keyboard given that most speakers today cannot read the script. Yet acknowledged a real and growing counter-trend: younger users increasingly want to learn Tigalari, and Roman script dominates online Tulu communication.
The script question is not resolved — it's generational. Design must hold space for both realities.
05/06
Tulu Filmmaker
Advocate
Having used Tulu fonts in a feature film, brought a practitioner's view of real transliteration workflows. Mapped current pain points and confirmed a broader cultural shift: Roman-script Tulu has become the de facto register for digital and informal expression.
Cultural production is already happening in Roman Tulu. The tool needs to meet speakers where they are.
06/06
Tulu Software Engineer
Critical
Confirmed the closed nature of existing lexical resources and gave concrete technical guidance: grammar sources, data-scraping constraints, and the hard limits of current Roman–Tulu fonts. Emphasised that meaningful progress requires a fully Unicode-native Tigalari font and open, expandable language data.
"Open data isn't just a preference — it's a prerequisite for any tool that claims to serve the community."
Persona Map
Designing for the fluent but unscripted speaker
She speaks Tulu every day but has never typed it. Aishwarya represents the majority: a generation navigating language without the tools to express it digitally.
Aishwarya
Primary Persona
Age
22
Occupation
College Student
Location
Mangalore, Karnataka
Language Fluency
Tulu
English
Kannada
Background
Aishwarya grew up speaking Tulu at home with her family and friends but was educated primarily in English and Kannada. While she is fluent in spoken Tulu, she has never formally learned to read or write the traditional Tulu script. Like many in her generation, she communicates digitally through messaging apps and social media, often switching between English, Kannada, and Tulu.
Goals
Preserve the language in everyday digital life
Communicate naturally in Tulu through messaging and social media
Type quickly without learning a new layout or script
Pain Points
No reliable keyboard for typing Tulu directly
Inconsistent transliteration in Latin characters
Keyboard switching breaks conversation flow
Predictive text has no knowledge of Tulu words
Core Need
A reliable keyboard for typing Tulu directly
Fast, predictable, and consistent output
Near-zero learning curve — feels like any other keyboard
Observed Behaviours
Types Tulu in Latin transliteration
Active on messaging apps and social media
Uses improvised or shortened spellings
Switches between English & Kannada keyboards
Code-switches mid-sentence
"She already speaks the language. She just needs a tool that speaks it back — in the scripts she grew up with, and the ones she's still discovering."
Secondary Research
Theoretical Framework
The project is grounded in critical design lenses to move beyond simple translation toward true inclusion:
Postcolonial Computing
Dismantling power structures that render certain languages "backend" and others "frontend".
Shadow Infrastructures
Recognizing the informal workarounds, such as hacked fonts and closed-source dictionaries, that Tulu speakers use to survive digitally.
Design Justice
Ensuring marginalized communities control and benefit from the design process through participatory methods.
The Digital Language Divide
Digital infrastructures such as keyboards and operating systems often enforce a standard of English supremacy. For Tulu speakers, this results in a layered accessibility problem:
Infrastructural Friction
Users must negotiate with autocorrect systems that view native Tulu words as "errors"
Infrastructural Bricolage
Tulu youth often force the Roman (English) keyboard to speak Tulu, leading to lost phonemes and a frustrating user experience.
Fragmented Writing
While vibrant in speech, Tulu has been displaced in writing by Kannada and Malayalam scripts, leaving the traditional Tigalari script with limited digital support.
Dialectal Exclusion
Generic predictive text models fail to recognize the linguistic diversity of the community, such as Brahmin, Jain, and Beary-influenced forms of Tulu.
Revised Definitions
Designing a Transliteration System
Problem Statement
Although the Tulu-Tigalari script has recently been encoded in the Unicode Standard, the surrounding digital ecosystem remains incomplete: mainstream keyboards, input methods, predictive text engines, and fonts do not yet support the script in a usable, everyday format.
As a result, Tulu speakers continue to rely on Latin transliteration, Kannada, or improvised hybrid typing systems that vary widely across dialects and social contexts.
This creates a layered accessibility problem:
Users cannot type Tulu consistently across devices.
The script renders unpredictably because shaping support is incomplete, and some sounds and glyphs are excluded altogether.
Generic predictive text models fail to recognise the linguistic diversity of the community, including Brahmin Tulu, Jain Tulu, Harijan Tulu, Beary-influenced forms, and rural vs. coastal variations. These failures disproportionately affect those whose dialects fall outside the informal “standard,” reinforcing inequalities within an already minoritised linguistic landscape.
The primary output is a web-based keyboard system designed to restore agency to Tulu speakers by providing a consistent, self-determined way to write their language digitally.

The Design Phase
Phonetic Inventory
Tulu has many unique sounds for which the scripts currently used to write the language have no glyphs.
A study of Tulu's phonological structure is therefore essential to developing the Phonetic Inventory, so that the unique sounds of the language are not excluded.




The phonological study of a language involves looking at data (phonetic transcriptions of the speech of native speakers) and trying to deduce what the underlying phonemes are and what the sound inventory of the language is.

Link to Spreadsheet
Starting out, it was useful to examine the transliteration schemes used for the dominant language of the region, Kannada. Doing so made it possible to identify cases where transliteration produced ambiguous output, i.e. instances where multiple sounds were mapped onto the same characters, which excludes unique dialectal sounds and misrepresents the language.
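The ambiguity check described here can be sketched in a few lines of Python. The snippet groups sounds by their informal Latin spelling and reports every spelling shared by more than one sound. The sample table is an illustrative assumption (IAST symbols paired with spellings common in casual romanisation), not the project's actual inventory, and `find_collisions` is a hypothetical helper name.

```python
from collections import defaultdict

# Illustrative sample only: IAST symbol -> informal Latin spelling.
# The project's real phonetic inventory is kept in its own spreadsheet.
INFORMAL_SPELLING = {
    "t": "t",   # dental stop
    "ṭ": "t",   # retroflex stop
    "l": "l",   # alveolar lateral
    "ḷ": "l",   # retroflex lateral
    "n": "n",   # dental nasal
    "ṇ": "n",   # retroflex nasal
    "k": "k",   # velar stop
}


def find_collisions(scheme):
    """Return {spelling: [sounds]} for spellings shared by several sounds."""
    by_spelling = defaultdict(list)
    for sound, spelling in scheme.items():
        by_spelling[spelling].append(sound)
    return {s: sounds for s, sounds in by_spelling.items() if len(sounds) > 1}


collisions = find_collisions(INFORMAL_SPELLING)
for spelling, sounds in sorted(collisions.items()):
    print(f"'{spelling}' is ambiguous: {', '.join(sounds)}")
```

Every spelling this reports is a place where a custom scheme would need a distinct key or key sequence to keep the sounds apart.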
The Design Phase
Keyboard Layout
Key Characteristics of a Good Input Method
Control: The user should be in full control. The input method should not "dictate" or guess what the user wants to type, a criticism aimed at probabilistic keyboards.
Privacy: The tool must be private and not "spy on you." The article states that being free and open-source is a prerequisite for ensuring this.
Availability & Ownership: A good IME should work offline, be available on all your devices (PC, phone), and not be a proprietary tool that can be withdrawn by its vendor (like Google's abandoned desktop IMEs).
Correctness: It must generate correct and standard Unicode. It should not, for example, substitute a visually similar character like the number '0' for the Malayalam anuswaram 'ം'.
Well-maintained: The software should have active maintainers, a public bug tracker, and regular updates to fix issues and adapt to new operating systems.
Documentation: It must have clear, up-to-date documentation for users.
Easy to Learn: While ease of learning is important, the author stresses that users should expect to put in a reasonable effort to master a lifelong skill, just as they did when learning to write by hand.
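The Correctness criterion lends itself to an automated check. The sketch below flags an ASCII digit that directly follows a Malayalam character, which is the kind of look-alike substitution mentioned in the list. It is a heuristic under stated assumptions: `find_lookalike_digits` is a hypothetical helper, and a real validator would need to allow legitimate digits and cover the target script's own block.

```python
ANUSVARA = "\N{MALAYALAM SIGN ANUSVARA}"  # the correct character, U+0D02
KA = "\N{MALAYALAM LETTER KA}"


def is_malayalam(ch):
    """True if the character sits in the Malayalam block (U+0D00 to U+0D7F)."""
    return "\u0d00" <= ch <= "\u0d7f"


def find_lookalike_digits(text):
    """Return indexes of ASCII digits that directly follow a Malayalam character."""
    return [
        i
        for i in range(1, len(text))
        if text[i] in "0123456789" and is_malayalam(text[i - 1])
    ]


correct = KA + ANUSVARA   # letter followed by the real anusvara
wrong = KA + "0"          # letter followed by a look-alike digit zero

print(find_lookalike_digits(correct))  # nothing suspicious
print(find_lookalike_digits(wrong))    # index of the suspicious digit
```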
Key Layout
Apple's design guidelines recommend that every tappable interface element measure at least 6.85 × 6.85 millimeters, because smaller targets yield very poor tap accuracy (Microsoft and Nokia also recommend a minimum hit area of approximately 7 × 7 millimeters). Keys smaller than this predictably result in misspellings.
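A quick calculation shows why this minimum matters for a script with many glyphs. The sketch below converts the 6.85 mm minimum to nominal CSS pixels and counts how many keys of that width fit in one row. The 65 mm usable width is an assumed figure for a phone held in portrait, not a measurement from this project, and real devices map CSS pixels to physical size only approximately.

```python
import math

MIN_KEY_MM = 6.85          # minimum target edge cited in the guideline
CSS_PX_PER_MM = 96 / 25.4  # CSS reference pixel: 96 px per inch


def min_key_px(min_mm=MIN_KEY_MM):
    """Minimum key edge expressed in nominal CSS pixels."""
    return min_mm * CSS_PX_PER_MM


def max_keys_per_row(usable_width_mm, min_mm=MIN_KEY_MM):
    """How many minimum-size keys fit side by side in the given width."""
    return math.floor(usable_width_mm / min_mm)


ASSUMED_WIDTH_MM = 65  # hypothetical usable keyboard width on a phone

print(f"Minimum key edge: {min_key_px():.1f} CSS px")
print(f"Keys per row at {ASSUMED_WIDTH_MM} mm: {max_keys_per_row(ASSUMED_WIDTH_MM)}")
```

With only a handful of keys per row at that width, a large glyph inventory cannot be placed on a single layer without shrinking keys below the recommended size.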
Genetic algorithms combined with multi-objective Pareto optimization can help search for a near-optimal keyboard layout, one that balances competing goals rather than maximising a single one.
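The Pareto step of such a search can be sketched on its own. A genetic algorithm would generate and mutate candidate layouts; the snippet below only shows how non-dominated candidates are kept once each has been scored. The layout names, the two objectives, and the scores are all hypothetical values chosen for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one.

    Lower scores are better for every objective.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def pareto_front(candidates):
    """Keep the layouts whose scores are not dominated by any other layout."""
    return {
        name: score
        for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    }


# Hypothetical scores: (average finger travel, learning cost), both minimised.
SCORES = {
    "layout_a": (4.0, 9.0),
    "layout_b": (6.0, 5.0),
    "layout_c": (7.0, 6.0),  # worse than layout_b on both objectives
    "layout_d": (9.0, 2.0),
}

front = pareto_front(SCORES)
print(sorted(front))
```

The surviving layouts represent different trade-offs; choosing among them remains a design decision rather than a purely mathematical one.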
Key Findings (Post Testing)
Designing keyboards for scripts with many glyphs benefits from phonetic and phonology-aware layouts (mapped by place of articulation or phoneme clusters) rather than purely visual or orthographic clustering, which helps learnability for novices. See work on Indic and phonetic keyboards (AKSHAR, Keylekh, IIT Bombay work).
For soft keyboards, direct mapping (one tap → target glyph) is generally faster than approaches that require extra keystrokes (dead keys) to produce diacritics, but direct mapping demands more screen real estate or layer switching. Bi et al. found K5-Direct faster than dead-key approaches for many languages.
Long-press (press & hold) is commonly used to expose diacritics on mobile soft keyboards and is a practical compromise — but discoverability and delay tuning matter a lot; novice users often find press-and-hold less intuitive. Empirical/UX evidence and community reports show long-press latency and discoverability strongly impact subjective flow.
For Indic languages, transliteration + predictive candidate bars (IMEs that convert Roman input to script) are powerful: they reduce keystrokes by letting the IME resolve phoneme sequences into glyphs and can hide complexity from users. Surveys and literature on machine transliteration / IMEs show this is a high-value approach for many users.
Net: long-press is a usable tool but not a silver bullet. It helps on small screens, but it can break typing flow if it is the only method or if the long-press delay is poorly tuned. Predictive transliteration and a small diacritic layer are strong complementary strategies.
The Design Phase
Transliteration Schemes
Transliteration schemes dictate how glyphs of one language are represented in another script.
IPA (International Phonetic Alphabet) and IAST (International Alphabet of Sanskrit Transliteration) are two of the most relevant systems used to represent Indian languages in the Latin script.
However, before creating a transliteration scheme, it was important to gain an understanding of existing transliteration systems.



For this, I looked at the Unicode proposals published by Vaishnavi Murthy and the transliteration scheme developed for the Tulu Lexicon project, which spanned several decades and comprises six volumes.
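To make the idea of a scheme concrete, here is a minimal sketch of a phonetic-to-Unicode converter that follows basic Indic orthographic rules: a vowel after a consonant becomes a dependent sign, and a consonant with no vowel receives a virama. Everything specific in it is an assumption for illustration: the key choices (capital `L` for the retroflex lateral, doubled letters for long vowels) are not the project's scheme, `transliterate` is a hypothetical function, and Kannada code points stand in for the target script because not every runtime knows the newer Tulu-Tigalari block.

```python
# Kannada code points are a stand-in target script (an assumption);
# a Tigalari build would swap in the Tulu-Tigalari block instead.
VIRAMA = "\N{KANNADA SIGN VIRAMA}"

CONSONANTS = {
    "k": "\N{KANNADA LETTER KA}",
    "t": "\N{KANNADA LETTER TA}",
    "n": "\N{KANNADA LETTER NA}",
    "m": "\N{KANNADA LETTER MA}",
    "l": "\N{KANNADA LETTER LA}",
    "L": "\N{KANNADA LETTER LLA}",  # retroflex lateral (illustrative key choice)
}

# Latin key -> (independent vowel letter, dependent vowel sign)
VOWELS = {
    "a": ("\N{KANNADA LETTER A}", ""),  # inherent vowel: no sign needed
    "aa": ("\N{KANNADA LETTER AA}", "\N{KANNADA VOWEL SIGN AA}"),
    "i": ("\N{KANNADA LETTER I}", "\N{KANNADA VOWEL SIGN I}"),
    "u": ("\N{KANNADA LETTER U}", "\N{KANNADA VOWEL SIGN U}"),
}

MAX_KEY_LEN = 2  # longest Latin key in the tables above


def transliterate(roman):
    """Greedy longest-match conversion from Latin keys to script characters."""
    out = []
    after_consonant = False  # True while a consonant still awaits its vowel
    i = 0
    while i < len(roman):
        for size in range(MAX_KEY_LEN, 0, -1):  # try the longest key first
            chunk = roman[i:i + size]
            if chunk in VOWELS:
                independent, sign = VOWELS[chunk]
                out.append(sign if after_consonant else independent)
                after_consonant = False
                break
            if chunk in CONSONANTS:
                if after_consonant:
                    out.append(VIRAMA)  # suppress the inherent vowel
                out.append(CONSONANTS[chunk])
                after_consonant = True
                break
        else:
            # Unmapped character (space, punctuation): pass it through.
            if after_consonant:
                out.append(VIRAMA)
                after_consonant = False
            chunk = roman[i]
            out.append(chunk)
        i += len(chunk)
    if after_consonant:
        out.append(VIRAMA)  # word-final bare consonant
    return "".join(out)


print(transliterate("tuLu"))  # TA + sign U + LLA + sign U
print(transliterate("amma"))  # A + MA + virama + MA
```

Keeping the tables separate from the matching loop means a dialect-specific or revised scheme only changes data, not logic.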
The Design Phase
Dialect-Sensitive Predictive Text Engine
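The goal of learning dialect variants rather than standardising them away can be illustrated with a small frequency-based predictor. This is a sketch under stated assumptions, not the project's engine: `DialectAwarePredictor` is a hypothetical class, the dialect labels are placeholders, and the words are synthetic strings rather than real Tulu vocabulary.

```python
from collections import Counter, defaultdict


class DialectAwarePredictor:
    """Prefix suggestions that rank the typist's own dialect first."""

    def __init__(self):
        self.by_dialect = defaultdict(Counter)  # dialect -> word counts
        self.overall = Counter()                # counts across all dialects

    def learn(self, word, dialect):
        """Record one observed use of a word in a given dialect."""
        self.by_dialect[dialect][word] += 1
        self.overall[word] += 1

    def suggest(self, prefix, dialect, k=3):
        """Rank matches by use in this dialect, then by overall use."""
        own = self.by_dialect[dialect]
        matches = [w for w in self.overall if w.startswith(prefix)]
        matches.sort(key=lambda w: (-own[w], -self.overall[w], w))
        return matches[:k]


predictor = DialectAwarePredictor()
# Synthetic placeholder data: "xaba" and "xabe" stand for two dialect
# variants of one word; they are not real Tulu words.
for _ in range(3):
    predictor.learn("xaba", "dialect_1")
predictor.learn("xabe", "dialect_2")
predictor.learn("xoro", "dialect_1")

print(predictor.suggest("xa", "dialect_1"))
print(predictor.suggest("xa", "dialect_2"))
```

The less frequent variant is never discarded: it ranks first for the dialect that uses it and remains available as a lower-ranked suggestion elsewhere.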
The Design Phase
Unicode-Optimised Tigalari Font
Prototype 1


