Transliteration, Transcription and
Pronunciation of Tamil and Sanskrit

  1. Transliteration Scheme and Pronunciation
  2. Transliteration and Transcription
  3. Transliteration of Tamil Script
  4. Transliteration of Devanagari Script
  5. PDF copy of this page

Transliteration Scheme and Pronunciation

The transliteration scheme that I use is based upon several closely related schemes, namely the International Alphabet of Sanskrit Transliteration (IAST), the scheme used in the Tamil Lexicon, the National Library at Kolkata romanization scheme, the American Library Association and the Library of Congress (ALA-LC) transliteration schemes and the more recent international standard known as ‘ISO 15919 Transliteration of Devanagari and related Indic scripts into Latin characters’ (a detailed description of which is available here).

Transliteration table

The following table summarises this transliteration scheme. In the first column I list all the diacritic and non-diacritic Latin characters that I use to transliterate the Tamil and Sanskrit alphabets; in the second column I give the Tamil letter that each such character represents [followed in square brackets where applicable by the Grantha letter that is optionally used in Tamil to denote the represented sound more precisely]; in the third column I give the Devanagari letter that it represents; and in the last column I give an indication of its pronunciation or articulation.

In the Tamil and Devanagari columns, a dash (–) indicates that there is no exact equivalent in that script for the concerned letter in the other script. In the Tamil column, round brackets enclosing a letter indicates that it is pronounced and transliterated as such only in words borrowed from Sanskrit or some other language. Likewise, in the Devanagari column, round brackets enclosing a letter indicates that it is not part of the alphabet of classical Sanskrit, though it does occur either in Vedic Sanskrit or in some other Indian languages written in Devanagari.

Vowels:
a அ अ Short ‘a’, pronounced like ‘u’ in cut
ā ஆ आ Long ‘a’, pronounced like ‘a’ in father
i இ इ Short ‘i’, pronounced like ‘e’ in English
ī ஈ ई Long ‘i’, pronounced like ‘ee’ in see
u உ उ Short ‘u’, pronounced like ‘u’ in put
ū ஊ ऊ Long ‘u’, pronounced like ‘oo’ in food
ṛ – ऋ Short vocalic ‘r’, pronounced like ‘ri’ in merrily
ṝ – ॠ Long vocalic ‘r’
ḷ – ऌ Short vocalic ‘l’, pronounced like ‘lry’ in revelry (not to be confused with the Tamil consonant ள், which is also transliterated as ḷ)
ḹ – ॡ Long vocalic ‘l’
e எ (ऎ) Short ‘e’, pronounced like ‘e’ in else
ē ஏ ए Long ‘e’, pronounced like ‘ai’ in aid
ai ஐ ऐ Diphthong ‘ai’, pronounced like ‘ai’ in aisle
o ஒ (ऒ) Short ‘o’, pronounced like ‘o’ in cot
ō ஓ ओ Long ‘o’, pronounced like ‘o’ in dote
au ஔ औ Diphthong ‘au’, pronounced like ‘ou’ in sound
Consonantal diacritics:
ḵ ஃ – Tamil āytam, indicating gutturalization of the preceding vowel, pronounced like ‘ch’ in loch
ṁ – ं Sanskrit anusvāra, indicating nasalization of the preceding vowel, pronounced like ‘m’ or (when followed by certain consonants) ‘ṅ’, ‘ñ’, ‘ṇ’ or ‘n’
ḥ – ः Sanskrit visarga, indicating frication (or lengthened aspiration) of the preceding vowel, pronounced like ‘h’ followed by a slight echo of the preceding vowel
Consonants:
k க் क् Velar plosive, unvoiced and unaspirated
kh (க்) ख् Velar plosive, unvoiced but aspirated
g க் ग् Velar plosive, voiced but unaspirated
gh (க்) घ् Velar plosive, voiced and aspirated
ṅ ங் ङ् Velar nasal
c ச் च् Palatal plosive, unvoiced and unaspirated (pronounced like ‘c’ in cello or ‘ch’ in chutney)
ch (ச்) छ् Palatal plosive, unvoiced but aspirated
j ச் [ஜ்] ज् Palatal plosive, voiced but unaspirated
jh (ச்) झ् Palatal plosive, voiced and aspirated
ñ ஞ் ञ् Palatal nasal
ṭ ட் ट् Retroflex plosive, unvoiced and unaspirated
ṭh (ட்) ठ् Retroflex plosive, unvoiced but aspirated
ḍ ட் ड् Retroflex plosive, voiced but unaspirated
ḍh (ட்) ढ् Retroflex plosive, voiced and aspirated
ṇ ண் ण् Retroflex nasal
t த் त् Dental plosive, unvoiced and unaspirated
th (த்) थ् Dental plosive, unvoiced but aspirated
d த் द् Dental plosive, voiced but unaspirated
dh (த்) ध् Dental plosive, voiced and aspirated
n ந் न् Dental nasal
p ப் प् Labial plosive, unvoiced and unaspirated
ph (ப்) फ् Labial plosive, unvoiced but aspirated
b ப் ब् Labial plosive, voiced but unaspirated
bh (ப்) भ् Labial plosive, voiced and aspirated
m ம் म् Labial nasal
y ய் य् Palatal semivowel
r ர் र् Dental tap (in Tamil phonology) or retroflex trill (in Sanskrit phonology)
l ல் ल् Dental lateral approximant
v வ் व् Labial semivowel
ṙ ழ் (ऴ्) Retroflex central approximant (transliterated as ḻ in the Tamil Lexicon, and commonly transcribed as zh)
ḷ ள் (ळ्) Retroflex lateral approximant
ṯ ற் (ऱ्) Alveolar plosive, unvoiced (pronunciation of ற only when it is muted, that is, not followed by a vowel)
ḏṟ ற் (ऱ्) Alveolar plosive, voiced (pronunciation of ற only when it follows ன்)
ṟ ற் (ऱ्) Alveolar trill (pronunciation of ற when it follows and precedes a vowel)
ṉ ன் (ऩ्) Alveolar nasal
ś (ச்) [ஶ்] श् Palatal aspirated sibilant, pronounced somewhat like ‘s’ in sure (or ‘sh’ in she)
ṣ (ச்) [ஷ்] ष् Retroflex aspirated sibilant, pronounced somewhat like ‘s’ in sure (or ‘sh’ in she), but with the tongue curled further back
s ச் [ஸ்] स् Dental aspirated sibilant, pronounced like ‘s’ in see
h க் [ஹ்] ह् Voiced glottal fricative

Pronunciation guide

In the following guide to the pronunciation of Tamil and Sanskrit words as represented by this transliteration scheme, I will not venture to go too deep into the science of phonetics, which is a subject of which my understanding is very limited, but will attempt to offer at least a simple guide.

For those who wish to learn more about the phonetics of Tamil and Sanskrit (and also their phonologies, scripts, transliteration, grammar and so on), there is abundant (but not always entirely reliable) information available online, particularly in Wikipedia articles that can be accessed through the language portal such as Tamil language, Tamil phonology, Tamil script, Tamil grammar, Sanskrit, Vedic Sanskrit, Śikṣā (the science of Sanskrit phonetics), Grantha script, Devanagari script, Devanagari transliteration and International Alphabet of Sanskrit Transliteration (IAST).

More detailed information about phonetics in general (including detailed explanations of many of the technical terms that I have used here) is also available in Wikipedia and can be accessed through the index of phonetics articles. In many of these articles the symbols of the International Phonetic Alphabet (IPA) are used, and a detailed list and explanation of these symbols are given in the Wikipedia:IPA article, which will be helpful to anyone who wants to learn exactly how Tamil and Sanskrit letters should be pronounced.

Each diacritic mark used in this transliteration scheme indicates a specific quality of pronunciation. A macron above any vowel (ā, ī, ū, ṝ, ḹ, ē and ō) indicates that it is long (and a breve above any vowel, such as ă, ĭ or ŭ, indicates that it is particularly short, though it does not actually occur as a diacritic in this scheme). Except in the case of ḥ, which indicates that the preceding vowel is aspirated, an underdot below any consonant (ṭ, ṭh, ḍ, ḍh, ṇ, ḷ and ṣ) or any ‘vowel’ (ṛ, ṝ, ḷ and ḹ) indicates that it is retroflex (as also does the overdot above the Tamil consonant ṙ). A macron below any consonant other than ḵ (namely ṯ, ḏ, ṟ and ṉ) in a Tamil word indicates that it is alveolar. An ‘h’ appended to any other consonant (kh, gh, ch, jh, ṭh, ḍh, th, dh, ph and bh) in a Sanskrit word indicates that it is aspirated.
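The general rules in this paragraph can be sketched in code. The following is only an illustration, not part of the scheme itself: the function name `qualities` is hypothetical, and the sketch assumes that the two exceptions to the general rules are the visarga (ḥ) and the āytam (ḵ). It decomposes a character with Python's standard `unicodedata` module and reports what each combining mark signals.

```python
import unicodedata

# Combining marks covered by the general rules, and what each one signals.
MARK_MEANING = {
    "\u0304": "long",       # COMBINING MACRON (above a vowel)
    "\u0323": "retroflex",  # COMBINING DOT BELOW
    "\u0331": "alveolar",   # COMBINING MACRON BELOW
}

# Assumption: the visarga and the aytam are the exceptions to these rules.
EXCEPTIONS = {"\u1e25", "\u1e35"}  # h with dot below, k with macron below


def qualities(char):
    """Return the qualities signalled by the diacritics on one character."""
    char = unicodedata.normalize("NFC", char)
    if char in EXCEPTIONS:
        return []
    decomposed = unicodedata.normalize("NFD", char)
    # decomposed[0] is the base letter; the rest are combining marks.
    return [MARK_MEANING[m] for m in decomposed[1:] if m in MARK_MEANING]
```

Marks that follow no general rule, such as the tilde on ñ or the acute accent on ś, simply yield an empty list, so they still need to be explained individually.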

However, no such general rule applies to any of the other diacritic marks, namely those on , , , , ñ, and ś, so I will explain below what each of them indicates while discussing each of these characters individually.

Vowels:

A short vowel is pronounced for a single unit of sound duration called a mātra, and a long vowel or diphthong is pronounced for two such units. The first vowel, a (அ, अ), is pronounced like ‘u’ in ‘pun’ or ‘a’ in ‘above’, and ā (ஆ, आ) is the same sound pronounced twice as long, like ‘a’ in ‘after’ or ‘father’. The next vowel, i (இ, इ), is pronounced like ‘i’ in ‘in’ or ‘e’ in ‘English’, and ī (ஈ, ई) is the same sound pronounced twice as long, like ‘ee’ in ‘see’. The next vowel, u (உ, उ), is pronounced like ‘u’ in ‘put’, and ū (ஊ, ऊ) is the same sound pronounced twice as long, like ‘oo’ in ‘food’. The next vowel in Tamil, e (எ), is pronounced like ‘e’ in ‘else’, and ē (ஏ, ए) is the same sound pronounced twice as long, like ‘ai’ in ‘aid’.
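As a rough illustration of the mātra rule, the sketch below counts the vowel durations in a transliterated word: one unit for each short vowel and two for each long vowel or diphthong. The function name `vowel_matras` is hypothetical, and the sketch makes two simplifying assumptions: every ‘ai’ or ‘au’ sequence is treated as a diphthong, and the vocalic ‘r’ and ‘l’ vowels are ignored.

```python
import unicodedata

SHORT_VOWELS = set("aiueo")
LONG_VOWELS = set(unicodedata.normalize("NFC", "āīūēō"))
DIPHTHONGS = ("ai", "au")


def vowel_matras(word):
    """Sum the vowel durations of a transliterated word, in matras."""
    word = unicodedata.normalize("NFC", word.lower())
    total = 0
    i = 0
    while i < len(word):
        if word[i:i + 2] in DIPHTHONGS:
            total += 2  # a diphthong lasts two units
            i += 2
        elif word[i] in LONG_VOWELS:
            total += 2  # a long vowel lasts two units
            i += 1
        elif word[i] in SHORT_VOWELS:
            total += 1  # a short vowel lasts one unit
            i += 1
        else:
            i += 1      # consonants are not counted here
    return total
```

For example, `vowel_matras("rāma")` counts two units for ā and one for the final a. This measures vowel length only; it is not a full account of syllable weight in verse.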

Unlike Tamil, in Sanskrit there is no short e, but prior to the long ē (ए) there are four other vowels (though the last of these is only classified for the sake of symmetry, since it is never actually used), namely ṛ (ऋ), which is pronounced almost like ‘ri’ (as in ‘merrily’) or somewhere between ‘ri’ and ‘ru’ (with a short ‘u’ as in ‘put’), ṝ (ॠ), which is the same sound pronounced twice as long (somewhat like ‘ri’ in ‘marine’), ḷ (ऌ), which is pronounced almost as ‘lri’ (like ‘lry’ in ‘revelry’), and ḹ (ॡ), which is theoretically the same sound pronounced twice as long.

In both Tamil and Sanskrit, ē (ஏ, ए) is followed by the diphthong ai (ஐ, ऐ), which is pronounced like ‘ai’ in ‘aisle’. In Tamil this is followed by o (ஒ), which is pronounced like ‘o’ in ‘cot’, and then in both languages comes ō (ஓ, ओ), which is the same sound pronounced twice as long, like ‘o’ in ‘dote’. The final vowel is another diphthong, au (ஔ, औ), which is pronounced like ‘ou’ in ‘sound’.

Consonantal diacritics:

In Tamil the next letter is the ‘hermaphrodite’ āytam, ḵ (ஃ), which is pronounced somewhat like a guttural ‘k’, ‘g’ or ‘h’ (or ‘ch’ in the Scottish word ‘loch’) appended to the preceding vowel, and which only occurs between a short vowel and one of the ‘hard class’ consonants.

In Sanskrit the fourteen vowels are followed by two consonantal diacritics, the anusvāra, ṁ (ं), which nasalises the vowel to which it is appended, and which may therefore be pronounced either like ‘m’ or (when it is not at the end of a sentence) like any form of ‘n’, depending upon which consonant follows it, and the visarga, ḥ (ः), which aspirates the vowel to which it is appended, and which is therefore pronounced somewhat like ‘h’, but often followed by a slight echo of the preceding vowel. Thus namaḥ is pronounced ‘namahă’, and śāntiḥ is pronounced ‘śāntihĭ’.
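The echo after a final visarga can be sketched as follows. This is an illustration under stated assumptions rather than a pronunciation rule: the function name `visarga_echo` is hypothetical, only a word-final visarga is handled, and the echo is always written as the preceding vowel's base letter with a breve, although the echo is described above as occurring often rather than always.

```python
import unicodedata

VISARGA = "\u1e25"  # h with dot below
BREVE = "\u0306"    # COMBINING BREVE, marking a particularly short vowel


def visarga_echo(word):
    """Spell out a word-final visarga as 'h' plus a short echo vowel."""
    word = unicodedata.normalize("NFC", word)
    if len(word) < 2 or not word.endswith(VISARGA):
        return word
    # Base letter of the preceding vowel, with any macron stripped off.
    base = unicodedata.normalize("NFD", word[-2])[0]
    if base not in "aiueo":
        return word[:-1] + "h"
    echo = unicodedata.normalize("NFC", base + BREVE)
    return word[:-1] + "h" + echo
```

Applied to the two examples above, the function turns namaḥ into ‘namahă’ and śāntiḥ into ‘śāntihĭ’.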

Consonants:

In Sanskrit the consonants begin with five groups of five stop consonants (known in Sanskrit as sparśa or ‘touch’ consonants, since they are formed by complete contact of the organs of utterance), each group consisting of four plosives (oral stops) followed by one nasal stop. Each group of four plosives consists of a pair of voiceless (unvoiced or aghōṣa) and a pair of voiced (ghōṣa) consonants, and each pair consists of one unaspirated (alpaprāṇa or ‘slight breath’) and one aspirated (mahāprāṇa or ‘great breath’) consonant. Thus the order of plosives within each group is unvoiced and unaspirated, unvoiced and aspirated, voiced and unaspirated, and voiced and aspirated.

In Tamil the consonants also begin with the same five groups of stop consonants, but each group of four plosives is represented by a single letter (all of which are collectively known as the val-l-iṉam or ‘hard class’ of consonants). As I will explain in more detail in the section on the transliteration of Tamil consonants, these five ‘hard class’ consonants (and the sixth one, ṟ [ற்], which is placed near the end of the Tamil alphabet) may be either unvoiced or voiced, because their exact pronunciation is determined by their position in a word and the letters that precede or follow them. In words of pure Tamil origin they are never aspirated, but in loanwords from Sanskrit and other languages they are aspirated where appropriate. In the Tamil alphabet each of these ‘hard class’ consonants is followed by its corresponding nasal (which are collectively known as the mel-l-iṉam or ‘soft class’ of consonants), so whereas Sanskrit has five nasal consonants (ṅ, ñ, ṇ, n and m), Tamil has six (ṅ, ñ, ṇ, n, m and ṉ).

With the exception of the sixth group of stop consonants in Tamil (namely ṟ [ற்] and ṉ [ன்], which are alveolar and therefore belong phonetically between the third and fourth group, but which are placed separately at the end of the Tamil alphabet), in both Sanskrit and Tamil the five groups of stop consonants are arranged phonetically according to their place of articulation, beginning from the back of the mouth (velar) and ending with the lips (labial), so their sequence is velar (kaṇṭhya or ‘guttural’), palatal (tālavya), retroflex (mūrdhanya or ‘cerebral’), dental (dantya) and labial (ōṣṭhya).
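The regular arrangement of the Sanskrit stop consonants lends itself to a small lookup. The sketch below is only an illustration (the names `STOPS`, `MANNER` and `describe` are hypothetical): each row is one place of articulation, from the back of the mouth to the lips, and the position within a row gives the voicing and aspiration, with the nasal last.

```python
import unicodedata

PLACES = ["velar", "palatal", "retroflex", "dental", "labial"]

# One row per place of articulation, in alphabetical order.
_ROWS = [
    ["k", "kh", "g", "gh", "ṅ"],
    ["c", "ch", "j", "jh", "ñ"],
    ["ṭ", "ṭh", "ḍ", "ḍh", "ṇ"],
    ["t", "th", "d", "dh", "n"],
    ["p", "ph", "b", "bh", "m"],
]
STOPS = [[unicodedata.normalize("NFC", c) for c in row] for row in _ROWS]

# The position within a row fixes the manner of articulation.
MANNER = [
    "unvoiced unaspirated plosive",
    "unvoiced aspirated plosive",
    "voiced unaspirated plosive",
    "voiced aspirated plosive",
    "nasal",
]


def describe(consonant):
    """Describe a transliterated stop consonant by its place and manner."""
    consonant = unicodedata.normalize("NFC", consonant)
    for place, row in zip(PLACES, STOPS):
        if consonant in row:
            return f"{place} {MANNER[row.index(consonant)]}"
    raise ValueError(f"not one of the twenty-five stop consonants: {consonant!r}")
```

For example, `describe("gh")` returns `"velar voiced aspirated plosive"`. The Tamil alveolar pair, which sits outside these five groups, is deliberately left out of the sketch.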

Thus in both languages the first group of stop consonants are velar, which means that they are pronounced with the back part of the tongue against the velum (the soft palate at the back of the mouth). The first velar consonant in both languages is k (க், क्), which is an unaspirated and unvoiced plosive, pronounced like ‘k’ in ‘skip’ (though in Tamil க் can also be pronounced as ‘g’ or ‘h’, according to the rules that I will explain later in the section on the transliteration of Tamil consonants). In Sanskrit this is followed by three more velar plosives, the aspirated unvoiced kh (ख्), which is pronounced like ‘k’ but with a stronger exhalation (like the more strongly aspirated ‘k’ in ‘kip’), the unaspirated voiced g (க், ग्), which is pronounced like ‘g’ in ‘game’, and the aspirated voiced gh (घ्), which is pronounced like ‘g’ but with a stronger exhalation. The final consonant in this group is the velar nasal, ṅ (ங், ङ्), which is pronounced like ‘ng’ in ‘sing’.

The second group of stop consonants are palatal, which means that they are pronounced with the body of the tongue raised against the hard palate (the middle part of the roof of the mouth). The first palatal consonant in both languages is c (ச், च्), which is an unaspirated and unvoiced plosive, pronounced like ‘ch’ in ‘church’ (though in Tamil ச் can also be pronounced as ‘j’ or ‘s’, according to the rules that I will explain in the section on the transliteration of Tamil consonants). In Sanskrit this is followed by three more palatal plosives, the aspirated unvoiced ch (छ्), which is pronounced like ‘ch’ but with a stronger exhalation, the unaspirated voiced j (ச், ஜ், ज्), which is pronounced like ‘j’ in ‘jug’, and the aspirated voiced jh (झ्), which is pronounced like ‘j’ but with a stronger exhalation. The final consonant in this group is the palatal nasal, ñ (ஞ், ञ्), which is pronounced like ‘ni’ in ‘onion’ or ‘ny’ in ‘canyon’.

The first of these palatal consonants, c (ச், च्), is often transcribed (both in Tamil and in Sanskrit words) as ‘ch’, since it is pronounced like ‘ch’ in many English words such as ‘chair’ (and also in ‘chutney’, which English has borrowed from an Urdu and Hindi word, caṭnī), but its correct transliteration is only ‘c’, since in all precise schemes for transliterating Indic scripts the post-consonantal ‘h’ is reserved for indicating that the consonant to which it is appended is aspirated. Therefore the transliteration ‘ch’ represents the second of the Sanskrit palatal consonants (छ्), which is aspirated. Thus, though some frequently used Sanskrit words such as aruṇācala, cit and vicāra are commonly transcribed as ‘Arunachala’, ‘chit’ and ‘vichara’ respectively, when transliterated precisely the ‘ch’ sound should be represented by ‘c’.

In Sanskrit a commonly occurring consonant cluster is jñ (for which the Devanagari character is ज्ञ्, which is a ligature of ज् [j] and ञ् [ñ]), but it is not pronounced exactly as it is spelt. In north India jña (ज्ञ) tends to be pronounced like ‘gya’, whereas in south India when it occurs in initial position (as for example in ज्ञान, jñāna) the j (ज्) is hardly pronounced (and hence in Tamil jñāna is spelt as it is pronounced, namely ஞானம் [ñānam]), whereas in the middle of a word (as for example in ajñāna or prajñāna) the j (ज्) is pronounced somewhat like g (which in Tamil is indicated by gemination of ஞ் [ñ], as for example in ajñāna, which is spelt அஞ்ஞானம் [aññānam]). In this respect jñ is similar to the cognate cluster ‘gn’ in English, because in initial position (as for example in ‘gnaw’ or ‘gnosis’) the ‘g’ is silent, whereas in the middle of a word (as for example in ‘agnostic’ or ‘diagnosis’) the ‘g’ is pronounced.

The third group of stop consonants are retroflex (as indicated in transliteration by the diacritic underdot), which means that they are pronounced by curling the tip of the tongue back to point up towards (or in the case of these stop consonants, to actually touch) the roof of the mouth, just behind the alveolar ridge (for which reason they are called in Sanskrit mūrdhanya, ‘head’ or ‘cerebral’ consonants). When articulating any of the five retroflex sparśa or ‘touch’ consonants (ṭ, ṭh, ḍ, ḍh and ṇ) or the Tamil retroflex lateral approximant, ḷ (ள்), the tip of the tongue actually touches the roof of the mouth, but when articulating the Tamil retroflex central approximant, ṙ (ழ்), or the Sanskrit retroflex sibilant, ṣ (ஷ், ष्), no contact is made.

The first retroflex consonant is ṭ (ட், ट्), which is an unaspirated and unvoiced retroflex plosive, pronounced like an English ‘t’ but with the tongue curled up (though in Tamil ட் can also be pronounced as ‘d’ with the tongue curled up, according to the rules that I will explain in the section on the transliteration of Tamil consonants). In Sanskrit this is followed by three more retroflex plosives, the aspirated unvoiced ṭh (ठ्), which is pronounced like an English ‘t’ but with the tongue curled up and with a stronger exhalation, the unaspirated voiced ḍ (ட், ड्), which is pronounced like an English ‘d’ but with the tongue curled up, and the aspirated voiced ḍh (ढ्), which is pronounced like an English ‘d’ but with the tongue curled up and with a stronger exhalation. The final consonant in this group is the retroflex nasal, ṇ (ண், ण्), which is pronounced like an English ‘n’ but with the tongue curled up.

The fourth group of stop consonants are dental, which means that they are pronounced with the tongue touching or close to the upper teeth (unlike the corresponding consonants in English, which are pronounced with the tongue touching or close to the alveolar ridge, just above the upper teeth). The first dental consonant is t (த், त्), which is an unaspirated and unvoiced dental plosive, pronounced like an English ‘t’ but with the tongue touching the teeth (though in Tamil த் can also be pronounced as ‘d’ with the tongue touching the teeth, according to the rules that I will explain in the section on the transliteration of Tamil consonants). In Sanskrit this is followed by three more dental plosives, the aspirated unvoiced th (थ्), which is pronounced like an English ‘t’ but with the tongue touching the teeth and with a stronger exhalation, the unaspirated voiced d (த், द्), which is pronounced like an English ‘d’ but with the tongue touching the teeth, and the aspirated voiced dh (ध्), which is pronounced like an English ‘d’ but with the tongue touching the teeth and with a stronger exhalation. The final consonant in this group is the dental nasal, n (ந், न्), which is pronounced like an English ‘n’ but with the tongue touching the teeth.

The fifth group of stop consonants are labial (or more precisely, bilabial), which means that they are pronounced with the lips. The first labial consonant is p (ப், प्), which is an unaspirated and unvoiced labial plosive, pronounced like ‘p’ in ‘spun’ (though in Tamil ப் can also be pronounced as ‘b’, according to the rules that I will explain in the section on the transliteration of Tamil consonants). In Sanskrit this is followed by three more labial plosives, the aspirated unvoiced ph (फ्), which is pronounced like ‘p’ but with a stronger exhalation (like the more strongly aspirated ‘p’ in ‘pun’), the unaspirated voiced b (ப், ब्), which is pronounced like ‘b’ in ‘bat’, and the aspirated voiced bh (भ्), which is pronounced like ‘b’ but with a stronger exhalation. The final consonant in this group is the bilabial nasal, m (ம், म्), which is pronounced like an English ‘m’.

These five groups of stop consonants are followed by a group of oral sonorants that in Sanskrit are called antastha (antaḥ-stha), which means ‘standing between’ (since they are sounds that are considered to be standing between vowels and true consonants), and in Tamil are called the iṭai-y-iṉam or ‘medial class’ of consonants. Since the Sanskrit term antastha is sometimes translated in English as ‘semivowel’, this group of oral sonorants are often loosely described as semivowels, but they are more accurately described as approximants, because only two of them (y and v) are truly semivowels and the rest (r, l, ṙ and ḷ) are liquids.

In Tamil there are six of these ‘medial class’ consonants (y, r, l, v, ṙ and ḷ), but in Sanskrit there are only the first four members of this group. The first of these is y (ய், य्), which is the palatal semivowel (also called the palatal approximant or palatal central approximant). The second is r (ர், र्), which according to Tamil phonology is a dental tap (though phonetically it is described as the alveolar tap), but according to Sanskrit phonology is a retroflex trill (though phonetically it is described as the alveolar trill). The third is l (ல், ल्), which is traditionally described as the dental ‘l’ or dental lateral approximant (though phonetically it is described as the alveolar lateral approximant). The fourth is v (வ், व्), which is the labiodental semivowel (also called the labiodental approximant).

Each of these four oral sonorants is pronounced more or less like its counterpart in English, except that the voiced labiodental approximant, v (வ், व्), is often pronounced slightly more like an English ‘w’ (the voiced labiovelar approximant) [or in Sanskrit loanwords in Tamil like ‘v’ with a slight ‘u’ sound before it], particularly when it follows a mute consonant in Sanskrit (so for example īśvara is pronounced ‘īśwara’ [or ‘īśŭvara’], and svāmi is pronounced ‘swāmi’ [or ‘sŭvāmi’]), or colloquially in certain Tamil words, such as the respectful greeting வணக்கம் (vaṇakkam), which is often pronounced waṇakkam.

In Tamil these four are followed by two more liquid sonorants, both of which are retroflex. The first is the retroflex central approximant, ṙ (ழ்), which is pronounced by simultaneously curling the tongue back so that its tip points up towards but does not touch the roof of the mouth, and spreading it sideways so that its sides touch the sides of the upper alveolar ridge, thereby causing the air to flow centrally (that is, over the centre of the tongue). The resulting sound (which is the final ‘l’ in ‘Tamil’ and which is often transcribed as ‘zh’) is somewhere between an ‘r’ and an ‘l’, but phonetically it is classified as an ‘r’ rather than an ‘l’, since it is not a lateral sound but a rhotic one (and hence, though the Tamil Lexicon transliterates it as ḻ, some scholars [such as the authors of A Dravidian Etymological Dictionary] transliterate it as an ‘r’ with either one or two dots below [ṛ or r̤], and I transliterate it as ṙ, for a reason that I will explain in the section on the transliteration of the Tamil script). The second is the retroflex lateral approximant, ḷ (ள், ळ्), which is pronounced by curling the tongue back so that its tip touches the roof of the mouth, thereby causing the air to flow laterally (that is, past both sides of the tongue).

Thus in Tamil there are two pairs of liquids, each of which consists of one rhotic (or ‘r’-like) sound and one lateral (or ‘l’-like) sound, the distinction between them being that the first pair, r (ர்) and l (ல்), are dental (that is, pronounced with the tongue close to or touching the upper teeth), whereas the second pair, ṙ (ழ்) and ḷ (ள்), are retroflex (that is, pronounced by curling the tip of the tongue back to point up towards or touch the roof of the mouth).

In Tamil these six oral sonorants or ‘medial class’ (iṭaiyiṉa) consonants (y, r, l, v, ṙ and ḷ) are followed by the last two letters of the Tamil alphabet, namely the sixth ‘hard class’ (valliṉa) consonant, ṟ (ற்), which is the alveolar trill, and the sixth ‘soft class’ (melliṉa) consonant, ṉ (ன்), which is the alveolar nasal. Thus, though the pronunciation of this ‘hard class’ ṟ (ற்) is similar to that of the ‘medial class’ r (ர்), it is somewhat harder, and hence it is phonetically classified as a trill, as opposed to the ‘medial class’ r, which is classified as a tap.

As I will explain in more detail in the section on the transliteration of Tamil consonants, when this hard ṟa (ற) is muted (that is, when its inherent vowel sound, a, is suppressed, as indicated by the addition of a puḷḷi or diacritic dot above it, ற்), it is pronounced ṯ (which is a voiceless alveolar plosive), when it is geminated (that is, doubled as ற்ற), it is pronounced ṯṟa, and when it follows the mute form of the final nasal, ṉ (ன்), it is pronounced ḏṟa (which is a combination of a voiced alveolar plosive and an alveolar trill).

The pronunciation of the final Tamil nasal, ṉ (ன்), is virtually the same as that of the dental nasal, n (ந்), though it is phonetically classified as an alveolar nasal, which means that it is pronounced by touching the tip of the tongue against the upper alveolar ridge rather than the upper teeth. However, in practice this distinction hardly exists (and hence in the International Phonetic Alphabet the dental and alveolar nasals are represented by the same symbol), and both these Tamil nasals are used to transcribe the Sanskrit dental nasal, n (न्) – the one that is used in each case depending upon which vowels or consonants precede or follow it – so when transliterating Tamil words of Sanskrit origin, I often do not distinguish ṉ (ன்) but transliterate it according to the Sanskrit spelling as n.

In the Tamil alphabet, after this final Tamil consonant, ṉa (ன), six Grantha characters (five consonants and one consonantal ligature) are appended for optional use when writing loanwords from Sanskrit or other languages. These six Grantha characters are ஜ (ja), ஶ (śa), ஷ (ṣa), ஸ (sa), ஹ (ha) and க்ஷ (kṣa), which are pronounced exactly like their Devanagari counterparts, ज (ja), श (śa), ष (ṣa), स (sa), ह (ha) and क्ष (kṣa).

In Sanskrit the four oral sonorants (y, r, l and v) are followed by four fricatives, of which the first three are sibilants, namely the voiceless palatal fricative, ś (ச், ஶ், श्), the voiceless retroflex fricative, ṣ (ச், ஷ், ष्), and the voiceless dental fricative, s (ச், ஸ், स्). The palatal ś (श्) is pronounced somewhat like ‘s’ in ‘sure’ or ‘sh’ in ‘she’; the retroflex ṣ (ष्) is pronounced like ś but with the tongue curled back to point up at the roof of the mouth; and the dental s (स्) is pronounced like ‘s’ in ‘see’.

The retroflex ṣ (ष्) is often transcribed as ‘sh’, and the palatal ś (श्) is sometimes transcribed thus, but since in all precise schemes for transliterating Indic scripts the post-consonantal ‘h’ is used only to distinguish aspirated (mahāprāṇa or ‘great breath’) consonants from their unaspirated (alpaprāṇa or ‘slight breath’) counterparts, and since all the three Sanskrit sibilants are aspirated and have no unaspirated counterparts, none of them should be transliterated as ‘sh’.

The final letter of the Sanskrit alphabet is the voiced glottal fricative, h (க், ஹ், ह्), which is pronounced somewhat like ‘h’ in ‘happy’, but with more resonance of the vocal cords.

Transliteration and Transcription

There are two basic methods that can be employed when writing in Latin script words from a language whose original script is not Latin based, such as Tamil or Sanskrit, namely precise transliteration or simple transcription.

Though the terms ‘transcription’ and ‘transliteration’ are often used interchangeably, in a technical sense transcription means the writing of the sounds of one language in the script of another language (and though strictly phonetic transcription employs the use of a technical code such as the International Phonetic Alphabet, simple transcription employs no code other than the basic alphabet of the language in which it is written and is therefore less precise), whereas transliteration means the writing of the script of one language in the script of another language using diacritic marks (or some other device) where necessary to indicate precisely how each word is spelt in the original script.

Thus when a word from a language such as Tamil or Sanskrit is transcribed in Latin script for English-speaking readers, no diacritical characters are used to indicate precisely how it is spelt in its original script (or exactly how it should be pronounced), so it is written using only the twenty-six Latin characters of the English alphabet to indicate approximately how it is pronounced. But when a word from such a language is transliterated in Latin script, a specific (and usually internationally recognised) code employing diacritical characters is used to indicate precisely how the word is spelt in its original script (and also ideally how it should be pronounced).
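The difference between the two methods can be sketched in code. The following is a rough illustration, not the procedure by which any particular transcription was actually chosen: the function name `simple_transcription` is hypothetical, and the sketch assumes that a simple transcription can be approximated by dropping every diacritic from the precise transliteration, writing the unaspirated ‘c’ as ‘ch’ (the common practice described earlier) and capitalising the result as a name.

```python
import re
import unicodedata


def simple_transcription(word):
    """Approximate a plain-English transcription of a transliterated name."""
    decomposed = unicodedata.normalize("NFD", word)
    # Keep only base letters, discarding every combining diacritic.
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Write unaspirated 'c' as 'ch'; an existing 'ch' is left untouched.
    plain = re.sub(r"c(?!h)", "ch", plain)
    return plain.capitalize()
```

With these assumptions, ramaṇa becomes ‘Ramana’ and aruṇācala becomes ‘Arunachala’. The conversion is lossy in one direction only: the transcription cannot be turned back into the original spelling, which is exactly why a precise transliteration needs diacritics.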

The principle that I generally follow is just to transcribe the names of people or places and their associated titles whenever I write them in plain type, and to transliterate them and all other Tamil or Sanskrit words, including the titles of poems, books and other texts, whenever I write them in italic type. For example, I transcribe names such as ரமண (ramaṇa), அருணாசல (aruṇācala) and திருவண்ணாமலை (tiruvaṇṇāmalai) and titles such as ஸ்ரீ (śrī) and பகவான் (bhagavān) in plain type as ‘Ramana’, ‘Arunachala’, ‘Tiruvannamalai’, ‘Sri’ and ‘Bhagavan’ respectively, except when they are part of the title of a text, but I transliterate all other Tamil or Sanskrit words according to the following principles:

Firstly, I try to strike a balance between precision and user-friendliness, because a precise system of transliteration does not always indicate how a non-Latin script should be pronounced, since many scripts include letters whose pronunciation varies according to context. This is particularly true of the six ‘hard class’ consonants in Tamil (as I will explain in more detail below), and also of the anusvāra (ṁ) in Sanskrit. Therefore I transliterate these consonants according to their actual pronunciation, rather than according to a strict one-to-one system of transliteration.

Moreover, Tamil has its own system of transcribing Sanskrit words, which in many cases involves certain euphonic changes that alter the original pronunciation, so if Sanskrit words used in Tamil are transliterated exactly as they are spelt in Tamil, they may become unrecognisable. Therefore I generally transliterate such words as they are spelt in Sanskrit, unless there is any particular reason for transliterating them as they are spelt in Tamil (as for example in the case of ahandai, the Tamil form of ahaṁtā, meaning ‘ego’, which I often transliterate as it is spelt in Tamil, since this is also how it is usually pronounced and since it is a word that Sri Ramana uses so frequently in his Tamil writings).

Secondly, since the scripts of Indian languages such as Tamil or Sanskrit do not have any capital letters, I avoid capitalising the initial character of any Tamil or Sanskrit word, except when such a word occurs in the title of a text or at the beginning of an English sentence.

(Incidentally, with the exception of a few words such as ‘God’, which are conventionally spelt with an initial capital, I also avoid capitalising the initial character of any English word that is used to denote the one absolute reality, such as ‘self’, ‘being’, ‘consciousness’ or ‘reality’, because there is something intrinsically dualistic about attempting to distinguish between a capitalised ‘Self’ and a lesser ‘self’, when in reality there are no two selves. When used by Sri Ramana, ‘self’ is a metaphysical term that often defies definition, and is intended to defy it. For example, when we translate the term ātma-vicāra as ‘self-investigation’, ‘self-scrutiny’ or ‘self-enquiry’, it is unnecessary and undesirable to define whether the word ‘self’ denotes our real self or our false self, because though we may initially imagine that the ‘I’ we are scrutinising is our finite mind or ego, which appears to be our ‘self’, we will end up discovering that what exists and shines as ‘I’ is in fact only the one infinite being-consciousness, which is our real ‘self’, just as a person may initially imagine that he is scrutinising a snake but will end up discovering that what he is actually looking at is only a rope.)

There are also no hyphens in the scripts of Indian languages, but when transliterating compound words, which occur frequently in such languages, I often use hyphens to separate the individual words within each compound. Thus for example in a compound term such as ātmajñāna (which means self-knowledge), I usually insert a hyphen between its two component words, ātma and jñāna, to indicate that it is a compound formed of these components.

I also follow the Tamil Lexicon (the comprehensive Tamil-Tamil-English dictionary published in seven volumes by the University of Madras between 1924 and 1939) in using hyphens to separate consonants that are inserted in Tamil compounds according to the rules of word-conjunction (known in Tamil as puṇarcci or sandhi, which is a Tamil form of a Sanskrit word that is spelt saṁdhi but pronounced sandhi). For example, in the compound name திருவண்ணாமலை (tiru-v-aṇṇāmalai), the ‘v’ is inserted to link euphonically the two consecutive vowels, ‘u’ and ‘a’. This insertion of consonants for the sake of euphonic conjunction often occurs not only in compound words but also between consecutive words within a sentence, when the first of two words ends with a vowel, so where appropriate I separate such inserted consonants by hyphens.

However, when transliterating Tamil, I sometimes omit such conjunctive consonants, particularly in the titles of poems or books, such as Upadēśa-v-Undiyār or Guru-Vācaka-k-Kōvai, which I transliterate simply as Upadēśa Undiyār and Guru Vācaka Kōvai respectively. Moreover, though it may be useful in some titles to separate certain words, it may not be necessary to separate other words, so for example I transliterate the title Upadēśa-t-Taṉi-p-Pākkaḷ (which means ‘Solitary Verses of Instruction’) as Upadēśa Taṉippākkaḷ, because it seems unnecessary and inelegant to split the compound word taṉippākkaḷ (which means ‘solitary verses’).

Moreover, as I mentioned above, I usually transliterate Tamil words of Sanskrit origin according to the spelling of the Sanskrit original, because this makes it easier for people who do not know Tamil to recognise them, and because in most cases this is more or less how they are pronounced in Tamil. Thus, for example, I often transliterate the title of the Tamil poem Ēkāṉma Pañcakam as Ēkātma Pañcakam, but I generally transliterate the title of the Tamil poem Āṉma-Viddai as it is spelt and pronounced in Tamil, rather than as Ātma-Vidyā, as it would be spelt and pronounced in Sanskrit.

Transliteration of Tamil Script

When transliterating Tamil words, I use a modified form of the transliteration scheme used in the Tamil Lexicon, which was later incorporated in the National Library at Kolkata romanization scheme for all Indic scripts and more recently in the international standard known as ‘ISO 15919 Transliteration of Devanagari and related Indic scripts into Latin characters’ (a detailed description of which is available here).

The reason why I use a modified form of the Tamil Lexicon transliteration scheme rather than its original form is that it uses a single Latin character (with or without an appropriate diacritic mark) or pair of Latin characters (in the case of the two diphthongs, ஐ [ai] and ஔ [au]) to represent each individual letter of the Tamil alphabet, and hence it is not a strictly phonetic scheme, because as I will explain in more detail below, the pronunciation of the six ‘hard class’ Tamil consonants varies according to the context in which each of them is used. Therefore, I transliterate each such consonant according to its actual pronunciation in each particular context.

I also differ from the Tamil Lexicon scheme in the transliteration of the retroflex consonant ழ, the mute form of which is the final ‘l’ in the word ‘Tamil’ (தமிழ், tamiṙ), but which in many other words is traditionally transcribed as zha, as in Tiruchuzhi (திருச்சுழி, tiru-c-cuṙi, the birthplace of Sri Ramana), vazhi (வழி, vaṙi, which means path, way or means) and ezhuttu (எழுத்து, eṙuttu, which means a letter or alphabetical character). Whereas the Tamil Lexicon transliterates this consonant as ḻa, I transliterate it as ṙa, because it is not a lateral approximant (a typical ‘l’-like sound) but a central approximant, which means that it is actually closer to an ‘r’-like sound than an ‘l’-like sound (though when muted it is often pronounced with a somewhat more ‘l’-like quality [almost as a fusion of a retroflex ‘r’ and ‘l’], and in some spoken dialects it is wrongly pronounced as ள [ḷa], which creates confusion between certain pairs of words such as அழி [aṙi, meaning ‘destroy’] and அளி [aḷi, meaning ‘give’]).

This consonant ழ (ṙa) is technically described as the retroflex central approximant, in contrast to the subsequent consonant, ள (ḷa), which is the retroflex lateral approximant. Therefore, since the standard diacritic used to distinguish the retroflex consonants is the underdot (as in ṭ, ṭh, ḍ, ḍh, ṇ, ṣ and ḷ), ழ should logically be transliterated as ṛa (as some scholars transliterate it), but since ṛ is used to transliterate the seventh vowel in Sanskrit (the retroflex ऋ), to avoid confusion I chose instead to transliterate ழ as ṙa.

Transliteration of Tamil vowels

Since the pronunciation of each of the twelve Tamil vowels is fixed, the Tamil Lexicon transliteration scheme is phonetic with respect to them, so I transliterate each vowel accordingly as follows: அ a, ஆ (ா) ā, இ (ி) i, ஈ (ீ) ī, உ (ு) u, ஊ (ூ) ū, எ (ெ) e, ஏ (ே) ē, ஐ (ை) ai, ஒ (ொ) o, ஓ (ோ) ō, ஔ (ௌ) au.

The five vowels with a macron above are each a long form of the same vowel without a macron, so for example அ (a) is a short sound while ஆ (ā) is a long sound. The symbol that I have given in brackets after each vowel except அ (a) is the typical diacritic form that that vowel takes when it is combined with a preceding consonant (though the actual diacritic form of some vowels varies according to the consonant with which it is combined), the dotted circle in each case representing the position of the consonant in the resulting ligature or compound set of characters. The first vowel, அ (a), has no such diacritic form, because it is inherent in the default form of each consonant, and whenever a consonant is not followed by any vowel sound, the absence of a vowel is indicated by a puḷḷi (்), a diacritic dot placed above the consonant. Thus, for example, the thirteen possible forms of the first consonant க (ka) are as follows: க் (k), க (ka), கா (kā), கி (ki), கீ (kī), கு (ku), கூ (kū), கெ (ke), கே (kē), கை (kai), கொ (ko), கோ (kō) and கௌ (kau).
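The way these thirteen forms are built up can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names PULLI, VOWEL_SIGNS and consonant_forms are hypothetical, and the code simply appends the Unicode dependent vowel sign (or the puḷḷi) to the consonant, leaving the rendering of the resulting combination to the font.

```python
# Hypothetical helper: builds the written forms of a Tamil consonant from
# Unicode code points. All names here are illustrative assumptions.

PULLI = "\u0BCD"  # the dot that marks a consonant as mute

# Dependent vowel signs in alphabetical order, paired with the transliteration
# of each vowel. The first vowel, a, is inherent, so its sign is empty.
VOWEL_SIGNS = [
    ("a", ""),
    ("ā", "\u0BBE"),
    ("i", "\u0BBF"),
    ("ī", "\u0BC0"),
    ("u", "\u0BC1"),
    ("ū", "\u0BC2"),
    ("e", "\u0BC6"),
    ("ē", "\u0BC7"),
    ("ai", "\u0BC8"),
    ("o", "\u0BCA"),
    ("ō", "\u0BCB"),
    ("au", "\u0BCC"),
]


def consonant_forms(letter, base):
    """Return (Tamil form, transliteration) pairs: the mute form, then one per vowel."""
    forms = [(letter + PULLI, base)]
    for vowel, sign in VOWEL_SIGNS:
        forms.append((letter + sign, base + vowel))
    return forms
```

Calling consonant_forms("\u0B95", "k"), where U+0B95 is the consonant க, gives the mute form first and then the twelve consonant-vowel combinations in the order listed above.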

The Tamil ‘hermaphrodite letter’

There is one Tamil letter called the āytam, which is considered to be neither a pure vowel nor a pure consonant, and is therefore described as the ali-y-eṙuttu or ‘hermaphrodite letter’. It is hence the thirteenth letter in the Tamil alphabet, being placed after the twelve vowels and before the eighteen consonants, and it is pronounced somewhat like a guttural ‘k’, ‘g’ or ‘h’ (or ‘ch’ in the Scottish word ‘loch’) appended to the preceding vowel. It is written as ஃ and it is transliterated as ḵ.

In words of Tamil origin it is rarely used, occurring only in about fifty such words, some of which are alternative spellings of other words, such as அஃது (aḵdu), which is an alternative spelling of அது (adu), meaning ‘that’, and it always occurs only after a short open initial syllable (that is, an initial syllable consisting of only a short vowel or a short consonant-vowel combination, without any closing mute consonant) and before a ‘hard class’ consonant.

Transliteration of Tamil consonants

There are eighteen pure Tamil consonants, plus another six consonantal characters borrowed from the Grantha script that are optionally used in Tamil to write words of Sanskrit origin. The original eighteen Tamil consonants are divided into three groups of six, namely the val-l-iṉam or ‘hard class’ of consonants, which consists of six plosives (oral stops), க (ka), ச (ca), ட (ṭa), த (ta), ப (pa) and ற (ṟa, a trill, whose muted form, ṯ, is a plosive), the mel-l-iṉam or ‘soft class’ of consonants, which consists of six corresponding nasals, ங (ṅa), ஞ (ña), ண (ṇa), ந (na), ம (ma) and ன (ṉa), and the iṭai-y-iṉam or ‘medial class’ of consonants, which consists of six oral sonorants (two semivowels and four liquids), ய (ya), ர (ra), ல (la), வ (va), ழ (ṙa) and ள (ḷa).

The pronunciation of each of the six soft consonants (ங, ஞ, ண, ந, ம and ன) and six medial consonants (ய, ர, ல, வ, ழ and ள) is more or less fixed, as is the pronunciation of each of the six Grantha characters, ஜ (ja), ஶ (śa), ஷ (ṣa), ஸ (sa), ஹ (ha) and க்ஷ (kṣa), but the pronunciation of each of the six hard consonants (க, ச, ட, த, ப and ற) varies, and hence I transliterate them as far as possible according to their actual pronunciation in each context.

In the case of words of Tamil origin, the actual pronunciation of each of these six hard consonants depends largely upon its position in a word and whether or not it is conjoined to another consonant, because each of them has an unvoiced and a voiced form of pronunciation, and some of them also have a fricative or trilled form of pronunciation. In simpler terms, the unvoiced pronunciation of each of these hard consonants (k, c, ṭ, t, p and ṯ respectively) is hardest, their voiced pronunciation (g, j, ḍ, d, b and ḏṟ respectively) is softer, and the fricative pronunciation of the first two (h and s respectively) and the trilled pronunciation of the final one (ṟ) are still softer.

More precisely, ‘unvoiced’ (or ‘voiceless’) means a sound that is pronounced without resonance of the vocal cords (as in the normal English pronunciation of ‘k’, ‘ch’, ‘t’ or ‘p’), whereas ‘voiced’ means one that is pronounced with resonance of the vocal cords (as in the normal English pronunciation of ‘g’, ‘j’, ‘d’ or ‘b’), and ‘fricative’ means one that is pronounced by forcing the breath through a narrow opening between two organs of articulation (as in the normal English pronunciation of ‘h’ or ‘s’), whereas ‘trilled’ means one that is pronounced with vibration of (in this case) the tongue.

The hardest (unvoiced) pronunciation of the velar க is ka, its softer (voiced) pronunciation is ga, and its softest (fricative) pronunciation is ha. The hardest (unvoiced) pronunciation of the palatal ச is ca (pronounced somewhat like the English ‘chu’ in ‘chum’ or ‘chutney’), its softer (voiced) pronunciation is ja, and its softest (fricative) pronunciation is sa. The hardest (unvoiced) pronunciation of the retroflex ட is ṭa and its softest (voiced) pronunciation is ḍa. The hardest (unvoiced) pronunciation of the dental த is ta and its softest (voiced) pronunciation is da. The hardest (unvoiced) pronunciation of the labial ப is pa and its softest (voiced) pronunciation is ba. And the hardest (unvoiced) pronunciation of the alveolar ற is ṯa (though this is actually used only when it is muted), its softer (voiced) pronunciation is ḏṟa, and its softest (trilled) pronunciation is ṟa.

The rules that determine how each of these ‘hard class’ consonants should be pronounced in any given context are defined in Tolkāppiyam, the most ancient extant Tamil grammar, and are based upon natural principles of euphony, which are as follows: The pronunciation of each of these consonants is hardest (unvoiced) when it is the initial letter of a word, when it is muted (that is, when its inherent vowel sound is suppressed), when it is geminated (that is, when its basic consonantal sound is lengthened by duplication) or when it follows any other muted ‘hard class’ consonant; it is softer (voiced) when it occurs ‘post-nasally’ (that is, after any muted ‘soft class’ consonant, ங், ஞ், ண், ந், ம் or ன்) or when it follows any muted ‘medial class’ consonant (such as ய் or ர்); and it is softest (either fricative or trilled, or if it has no such form, voiced) when it occurs intervocalically (that is, between two vowels).

Thus க is pronounced ka when it is the initial letter of a word, k when it is muted (க்), kka when it is geminated (க்க), ka when it follows any other muted hard consonant (such as ட் or ற்), ga when it follows a muted soft consonant (as in the frequently occurring cluster ங்க, which is pronounced ṅga) or a muted medial consonant (such as ய் or ர்), and ha when it follows a vowel. Likewise ச is pronounced ca (or arbitrarily sa, as in fact it is customarily pronounced in many if not most cases, though strictly speaking this contravenes the ancient rule described here) when it is the initial letter of a word, c when it is muted (ச்), cca when it is geminated (ச்ச), ca when it follows any other muted hard consonant, ja when it follows a muted soft consonant (as in the frequently occurring cluster ஞ்ச, which is pronounced ñja), and sa when it follows a vowel. ட is not the initial letter of any word of Tamil origin, but it is pronounced ṭ when it is muted (ட்), ṭṭa when it is geminated (ட்ட), and ḍa when it follows either a muted soft consonant (as in the frequently occurring cluster ண்ட, which is pronounced ṇḍa) or a vowel. த is pronounced ta when it is the initial letter of a word, t when it is muted (த்), tta when it is geminated (த்த), and da when it follows either a muted soft consonant (as in the frequently occurring cluster ந்த, which is pronounced nda), a muted medial consonant or a vowel. ப is pronounced pa when it is the initial letter of a word, p when it is muted (ப்), ppa when it is geminated (ப்ப), pa when it follows any other muted hard consonant, and ba when it follows either a muted soft consonant (as in the frequently occurring cluster ம்ப, which is pronounced mba, or in the clusters ண்ப and ன்ப, which are pronounced respectively ṇba and ṉba) or a vowel.
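The basic rule for the first five hard consonants can be summarised as a small lookup, sketched here in Python. It is a simplification under stated assumptions: the context labels and the names SOUNDS and hard_consonant_sound are hypothetical, the caller is assumed to know the context already, only words of Tamil origin are covered, and ற is left out because its geminated form does not follow the basic rule.

```python
# Illustrative lookup for the first five 'hard class' consonants.
# Each entry gives the (unvoiced, voiced, softest) sound of the consonant;
# where there is no fricative form, the softest sound is the voiced one.
SOUNDS = {
    "\u0B95": ("k", "g", "h"),   # க
    "\u0B9A": ("c", "j", "s"),   # ச
    "\u0B9F": ("ṭ", "ḍ", "ḍ"),   # ட
    "\u0BA4": ("t", "d", "d"),   # த
    "\u0BAA": ("p", "b", "b"),   # ப
}

# Hypothetical context labels, assumed to be supplied by the caller.
UNVOICED_CONTEXTS = {"initial", "muted", "geminated", "after_hard"}
VOICED_CONTEXTS = {"after_soft", "after_medial"}


def hard_consonant_sound(letter, context):
    """Return the consonant sound of a hard consonant in the given context."""
    unvoiced, voiced, softest = SOUNDS[letter]
    if context in UNVOICED_CONTEXTS:
        return unvoiced
    if context in VOICED_CONTEXTS:
        return voiced
    if context == "intervocalic":
        return softest
    raise ValueError("unknown context: " + context)
```

For example, hard_consonant_sound("\u0B95", "after_soft") returns g, matching the cluster ங்க (ṅga), while the same letter in the "intervocalic" context returns h.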

Like these other hard consonants, the final hard consonant, ற, also has several allophones or variant forms of pronunciation. Like ட (ṭa), it is never the initial letter of a word. Its default pronunciation is considered to be ṟa (in which ṟ is a trilled ‘r’, described technically as an alveolar trill), but its mute form (ற்) is pronounced ṯ (or sometimes slightly more like ṟ, depending upon which consonant it precedes, and when it is used in the transliteration of a word of Sanskrit origin, it can also be pronounced d or l). Its geminated form (ற்ற) is pronounced ṯṟa, and the cluster ன்ற is pronounced ṉḏṟa, the extra sound being a natural euphonic increment.

In earlier forms of Tamil this consonant ற was probably treated as being essentially the alveolar plosive, ṯ (ற்), phonetically belonging between the retroflex plosive, ṭ (ட்), and the dental plosive, t (த்), but since it never occurs at the beginning of a word and is therefore pronounced ṯ only when it is muted, it has come to be considered to be basically an alveolar ṟ, as it is pronounced when followed by any vowel, and hence it is commonly called the valliṉa ‘ṟa’ (hard class ‘ra’), or more colloquially the periya ‘ṟa’ (big ‘ra’). However, if we consider it to be essentially the alveolar ṯ, it is clearly a plosive like the other five valliṉa or ‘hard class’ consonants (k, c, ṭ, t and p), and it follows the same basic rule concerning its pronunciation, namely that it is unvoiced (ṯ) when it is muted, voiced (ḏṟ) when it occurs post-nasally, and trilled (ṟ) when it occurs intervocalically. According to this rule, it should also be unvoiced when it is geminated (ṯṯ), as it is in certain dialects such as Sri Lankan Tamil (and also in Malayalam), but in most Tamil dialects it is actually pronounced ṯṟ.

However, these rules for the pronunciation of the six hard consonants are usually followed only in the case of words of Tamil origin, and are seldom followed in the case of words of Sanskrit origin, which are usually pronounced more or less as they would be in Sanskrit. For example, if புத்தி (which means intellect) were a word of Tamil origin, it would be pronounced putti, but since it is of Sanskrit origin, it is pronounced buddhi.

The reason why the normal rules for the pronunciation of these six hard consonants are thus generally not followed in the case of words of Sanskrit origin is that each of the first five Tamil plosive consonants represents a series of four Sanskrit consonants, and some of them are also used to transcribe other Sanskrit consonants, as also is the muted alveolar plosive, ற் (ṯ). That is, க (ka) represents the series of four Sanskrit velar plosives, क (ka), ख (kha), ग (ga) and घ (gha), and is also used to transcribe the Sanskrit voiced glottal fricative, ह (ha); ச (ca) represents the series of four Sanskrit palatal (or postalveolar) plosives, च (ca), छ (cha), ज (ja) and झ (jha), and is also used to transcribe the three Sanskrit unvoiced sibilant fricatives, श (śa), ष (ṣa) and स (sa); ட (ṭa) represents the series of four Sanskrit retroflex plosives, ट (ṭa), ठ (ṭha), ड (ḍa) and ढ (ḍha), and its mute form (ட்) is often used to transcribe the muted Sanskrit retroflex sibilant, ष् (ṣ); த (ta) represents the series of four Sanskrit dental plosives, त (ta), थ (tha), द (da) and ध (dha); and ப (pa) represents the series of four Sanskrit bilabial plosives, प (pa), फ (pha), ब (ba) and भ (bha).

The Tamil alveolar plosive, ற (ṟa), does not have any equivalent in Sanskrit, but its mute form, ற் (ṯ), is used to produce a euphonic transcription of certain mute consonants in Sanskrit. For instance, the Sanskrit words sadguru (a compound of sat and guru, which means ‘real teacher’ or ‘teacher of reality’), sadbhāva (a compound of sat and bhāva, which means ‘real being’ or ‘state of being’), alpa (which means small, little or insignificant) and kalpanā (which means fabrication, imagination, mental creation or illusion) are transcribed in Tamil as சற்குரு (caṯkuru), சற்பாவம் (caṯpāvam), அற்பம் (aṯpam) and கற்பனை (kaṯpaṉai) respectively, but are pronounced sadguru, sadbhāvam, alpam and kalpanai.

Certain words that are spelt the same in Tamil are actually derived from different words in Sanskrit, and hence they are pronounced in each case according to the Sanskrit word from which they are derived. For example, பாவம் is pronounced bhāvam when it is the Tamil form of bhāva (a Sanskrit word with many meanings such as becoming, being, state, nature, feeling, thought, idea, imagination or meditation), but pāvam when it is the Tamil form of pāpa (a Sanskrit word that means a sinful action or the result of such an action).

One other Tamil consonant is commonly used to transcribe more than one sound in Sanskrit, namely the palatal nasal, ஞ (ña), which is used to transcribe both its Sanskrit equivalent, ञ (ña), and the consonant cluster ज्ञ (jña). Though ज्ञ (jña) is a ligature of ज् (j) and ञ (ña), when it occurs in initial position (as for example in jñāna) the ज् (j) is hardly pronounced, whereas in the middle of a word (as for example in ajñāna) the ज् (j) is pronounced somewhat like g (as I explained earlier in the section on pronunciation). This is reflected in Tamil by the fact that in initial position ज्ञ (jña) is transcribed as ஞ (ña), whereas in the middle of a word it is transcribed as ஞ்ஞ (ñña), so for example ज्ञान (jñāna) is transcribed as ஞானம் (ñānam) whereas अज्ञान (ajñāna) is transcribed as அஞ்ஞானம் (aññānam). However, to avoid confusion, when either ஞ (ña) or ஞ்ஞ (ñña) is used in Tamil to represent ज्ञ (jña) in a word of Sanskrit origin, I transliterate it as jña in accordance with the actual spelling of the original word.

In Tamil, loanwords from Sanskrit can take one of two forms, தற்சமம் (tatsamam) or தற்பவம் (tadbhavam), the former meaning words that are borrowed without any alteration in pronunciation (or in spelling, other than any adaptation necessitated by the availability of a different range of vowels and consonants in Tamil script, and by any appropriate change to the form of the nominative case-ending) and the latter meaning words that are borrowed with certain permitted alterations in pronunciation and spelling. Some examples of the permitted alterations that can occur in tadbhava loanwords are given in Indo-Aryan loanwords in Tamil, but many of the ‘changes’ listed there (such as ahaṃ to akam, agati to akati, adhikāra to atikāra, adhika to atika, aśuddha to acutta, ahaṃkāram to akankāram, ahambhāva to akampāvam, ārambha to ārampam, kārttika to kārttikai, śani to cani and budha to putan) are not actual tadbhava modifications but are merely adaptations necessitated by the restrictions imposed by Tamil script (and in a few cases by Tamil nominative case-endings) and therefore appear to be tadbhava changes only when each Tamil plosive (‘hard class’ consonant) is transliterated by a single unvoiced Latin consonant rather than by whichever one of the range of consonant sounds that they each truly represent.

Therefore, though the one-to-one transliteration scheme used in the Tamil Lexicon and recommended in the international standard ‘ISO 15919’ is useful in that it indicates exactly how a word is spelt in Tamil, its value is limited because it does not serve to indicate how a word is actually pronounced (except to people who already know both the principles that determine how each of the six hard consonants should be pronounced in any given context and the principles that determine how Sanskrit words should be transcribed in Tamil). Therefore, though I transliterate the twelve Tamil vowels, the six soft consonants (except ஞ) and the six medial consonants (except ழ) strictly according to the Tamil Lexicon scheme, I transliterate the six hard consonants and ஞ according to their actual pronunciation in each particular context, and ழ according to a more accurate phonetic representation.

Thus I transliterate the eighteen Tamil consonants as follows:

  • க (valliṉa velar plosive or fricative) as ka, (kha), ga, (gha) or ha
  • ங (melliṉa velar nasal) as ṅa
  • ச (valliṉa palatal plosive or fricative) as ca, (cha), ja, (jha), (śa), (ṣa) or sa
  • ஞ (melliṉa palatal nasal) as ña or (jña)
  • ட (valliṉa retroflex plosive) as ṭa, (ṭha), ḍa or (ḍha)
  • ண (melliṉa retroflex nasal) as ṇa
  • த (valliṉa dental plosive) as ta, (tha), da or (dha)
  • ந (melliṉa dental nasal) as na
  • ப (valliṉa bilabial plosive) as pa, (pha), ba or (bha)
  • ம (melliṉa bilabial nasal) as ma
  • ய (iṭaiyiṉa palatal central approximant) as ya
  • ர (iṭaiyiṉa dental tap) as ra
  • ல (iṭaiyiṉa dental lateral approximant) as la
  • வ (iṭaiyiṉa labiodental central approximant) as va
  • ழ (iṭaiyiṉa retroflex central approximant) as ṙa [though it is transliterated in the Tamil Lexicon as ḻa and is commonly transcribed as zha]
  • ள (iṭaiyiṉa retroflex lateral approximant) as ḷa
  • ற (valliṉa alveolar plosive or trill) as ṟa [ற் as ṯ, ṟ, (d) or (l); ற்ற as ṯṟa; ன்ற as ṉḏṟa]
  • ன (melliṉa alveolar nasal) as ṉa

In this list, the first transliteration for each consonant (which in the case of each of the first five plosives is its unvoiced form) other than ழ (ṙa) is the transliteration used in the Tamil Lexicon and recommended by ‘ISO 15919’, and each transliteration enclosed in brackets is one that I use only when transliterating a Tamil word of Sanskrit origin.

The main consonant sounds in Sanskrit and other Indian languages that are missing in Tamil are the ten aspirated plosives (kh, gh, ch, jh, ṭh, ḍh, th, dh, ph and bh) and the distinction made between the three varieties of sibilant, the palatal ś, the retroflex ṣ and the dental s. Though there are no separate Tamil letters for the unaspirated voiced plosives (g, j, ḍ, d and b), the voiced glottal fricative (h) or any basic sibilant (s), these sounds do exist in Tamil as allophones (variant forms) of the first five hard consonants, க் (k), ச் (c), ட் (ṭ), த் (t) and ப் (p).

Two consonant sounds, the retroflex ழ் (ṙ) and alveolar ற் (ṯ), occur only in Tamil and Malayalam (and perhaps in some other minor or older Dravidian languages) but not in Sanskrit or most other Indian languages, and one consonant, the alveolar nasal ன் (ṉ), is unique to Tamil (and older forms of Malayalam), though in pronunciation it is virtually the same as ந் (n).

Unlike Grantha, Devanagari and most other Indic scripts, Tamil script does not form consonant clusters into ligatures (single characters that each represent a cluster of two or more letters), so in this respect it is much easier to learn than most other Indic scripts.

Transliteration of Grantha consonants used in Tamil

As I mentioned above, there are also six Grantha characters that are used in Tamil, particularly in words of Sanskrit origin. Grantha is an ancient script that was used in south India to write Sanskrit, and it predates Devanagari, which is nowadays the standard script used for writing Sanskrit. The use of Grantha in south India was replaced by Devanagari only in the late nineteenth and early twentieth century, when it became the custom to print Sanskrit texts in Devanagari.

Since the Tamil script does not have separate characters to represent all the Sanskrit letters, it borrowed the following six characters from Grantha: ஜ (ja), ஶ (śa), ஷ (ṣa), ஸ (sa), ஹ (ha) and க்ஷ (kṣa). However the use of these Grantha letters is optional, because each of them can be transcribed using a Tamil letter (or pair of Tamil letters), and ஶ (śa) in particular is used very rarely nowadays. ஜ (ज or ja), ஶ (श or śa), ஷ (ष or ṣa) and ஸ (स or sa) are each usually transcribed using the Tamil letter ச (ca); the muted retroflex sibilant, ஷ் (ष् or ṣ), is often transcribed using the muted Tamil retroflex plosive, ட் (ṭ); the glottal fricative, ஹ (ह or ha), is usually transcribed using the Tamil velar fricative, க (ka); and க்ஷ (क्ष or kṣa), which is a ligature of க் (क् or k) and ஷ (ष or ṣa), is often transcribed using the Tamil cluster ட்ச (ṭca).

However, even when such Tamil characters are used to transcribe Sanskrit words containing these letters, I usually transliterate them according to the Sanskrit letters that they represent, in order to indicate clearly in each case what the original Sanskrit word is. For example, in Tamil the Sanskrit word मोक्ष (mōkṣa), which means ‘liberation’, can be written either as மோக்ஷம் (mōkṣam) or as மோட்சம் (mōṭcam), but whichever way it happens to be written, I would usually transliterate it as mōkṣam. Likewise, in Tamil the Sanskrit compound word आत्मनिष्ठा (ātma-niṣṭhā), which means ‘self-abidance’, can be written either as ஆத்மநிஷ்டை (ātma-niṣṭhai) or as ஆன்மநிட்டை (āṉma-niṭṭai), but whichever way it happens to be written, I would usually transliterate it as ātma-niṣṭhā (or ātma-niṣṭhai).
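The optional substitution of pure Tamil letters for the Grantha characters can be illustrated with a short Python sketch. It is only a rough approximation under stated assumptions: the names REPLACEMENTS and without_grantha are hypothetical, the substitutions are described here as 'usual' or 'often' rather than obligatory, the sketch does not reproduce further euphonic changes (such as ātma becoming āṉma), and it makes no attempt to handle the ligature ஸ்ரீ.

```python
# Illustrative substitutions, applied in order so that the cluster and the
# muted sibilant are handled before the single letters.
REPLACEMENTS = [
    ("\u0B95\u0BCD\u0BB7", "\u0B9F\u0BCD\u0B9A"),  # க்ஷ (kṣa) -> ட்ச (ṭca)
    ("\u0BB7\u0BCD", "\u0B9F\u0BCD"),              # ஷ் (muted ṣ) -> ட் (muted ṭ)
    ("\u0B9C", "\u0B9A"),                          # ஜ (ja) -> ச (ca)
    ("\u0BB6", "\u0B9A"),                          # ஶ (śa) -> ச (ca)
    ("\u0BB7", "\u0B9A"),                          # ஷ (ṣa) -> ச (ca)
    ("\u0BB8", "\u0B9A"),                          # ஸ (sa) -> ச (ca)
    ("\u0BB9", "\u0B95"),                          # ஹ (ha) -> க (ka)
]


def without_grantha(word):
    """Respell a Tamil word using only pure Tamil consonants (a rough sketch)."""
    for grantha, tamil in REPLACEMENTS:
        word = word.replace(grantha, tamil)
    return word
```

Applied to மோக்ஷம் (mōkṣam), this yields மோட்சம் (mōṭcam), the alternative spelling mentioned above.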

Besides க்ஷ (kṣa) and its various vowel-bound forms, there is one other consonantal ligature that Tamil has borrowed from Grantha, namely ஸ்ரீ (śrī), for which there is no exact equivalent in pure Tamil script (but which is sometimes transcribed as சீ [cī] in certain names such as Srinivasan).

Transliteration of Devanagari Script

As I explained above, since the late nineteenth century Devanagari has become the standard script for writing Sanskrit, partly due to standardisations that took place under the influence of the British education policy in India, and partly due to the subsequent fact that Devanagari was adopted as the principal script in which Sanskrit texts were printed. Prior to that, such texts were written not only in Devanagari (in many northern regions of India) and Grantha (in many southern regions of India), but also in several other regional scripts such as Bengali, Gujarati, Kannada, Malayalam, Oriya and Telugu, and in older scripts such as Brahmi, Gupta and Sarada, because most Indic scripts other than Tamil have characters that denote every sound in the Sanskrit alphabet.

Though the Sanskrit alphabet has more vowels than and nearly twice as many consonants as the Tamil alphabet, it is actually simpler to transliterate phonetically, because each vowel or consonant has a more or less fixed pronunciation, unlike the ‘hard class’ consonants in Tamil. However, the Tamil script is simpler to read and write than Devanagari, because it uses ligatures only for some consonant-vowel compounds, whereas Devanagari uses ligatures not only for consonant-vowel compounds but also for all consonant clusters, which are numerous and often quite complex, since a single ligature can represent a cluster of several consonants. However, though the numerous ligatures in Devanagari make it somewhat difficult to read or write, this is not a problem when it is transliterated into Latin script using diacritics, because each member of a ligature is then represented by a separate Latin letter or pair of letters.

When transliterating Sanskrit words in Latin script, I use a slightly modified form of the International Alphabet of Sanskrit Transliteration (IAST), upon which both the National Library at Kolkata romanization scheme and ‘ISO 15919’ are based. I deviate from IAST only in regard to two long Sanskrit vowels, ए and ओ, which I transliterate with a macron above as ē and ō respectively, and the anusvāra, ं (a diacritic dot placed above a vowel or consonant-vowel compound indicating that it is nasalised), which I transliterate either as ṁ or as some other nasal (ṅ, ñ, ṇ, n or m), depending upon which consonant follows it (as I will explain in more detail below).

In IAST these two long vowels, ए and ओ, are transliterated without a macron as e and o respectively, because in Sanskrit there are no short equivalents of them, but the National Library at Kolkata romanization scheme and ‘ISO 15919’ both recommend that they should each be transliterated with a macron above, as I do, both for the sake of uniformity and to avoid confusion with the short e and o in Dravidian languages such as Tamil, which have both a short e and o and a long ē and ō.

The anusvāra (ं) is transliterated in IAST as ṃ (with an underdot), but ‘ISO 15919’ recommends that it should be transliterated as ṁ (with an overdot), and hence I use ṁ as its default transliteration, though in many cases I transliterate it as whichever nasal it is actually pronounced in that context.

Transliteration of Sanskrit vowels

Thus I transliterate the fourteen Sanskrit vowels as follows: अ a, आ (ा) ā, इ (ि) i, ई (ी) ī, उ (ु) u, ऊ (ू) ū, ऋ (ृ) , ॠ (ॄ) , ऌ (ॢ) , ॡ (ॣ) , ए (े) ē, ऐ (ै) ai, ओ (ो) ō, औ (ौ) au.

The seven vowels with a macron above are each pronounced long, as are the two diphthongs, ऐ (ai) and औ (au), so for example अ (a) is a short sound while आ (ā) is a long sound. The symbol that I have given in brackets after each vowel except अ (a) is the diacritic form that that vowel takes when it is combined with a preceding consonant, the dotted circle in each case representing the position of the consonant in the resulting ligature. The first vowel अ (a) has no such diacritic form, because it is inherent in the default form of each consonant, and whenever a consonant is not followed by any vowel sound, the absence of a vowel is indicated by a virāma (्), an oblique diacritic stroke placed below the consonant (though this is used much less frequently than the equivalent diacritic puḷḷi or overdot [்] in Tamil, because except at the end of a sentence a mute consonant in Devanagari should always be joined to the following consonant to form a ligature). Thus, for example, the fifteen possible forms of the first consonant क (ka) are as follows: क् (k), क (ka), का (), कि (ki), की (), कु (ku), कू (), कृ (kṛ), कॄ (kṝ), कॢ (kḷ), कॣ (kḹ), के (), कै (kai), को () and कौ (kau).

Four of these Sanskrit vowels, ऋ (), ॠ (), ऌ () and ॡ (), are transliterated using a diacritic form of a Latin consonant, because they are vocalised retroflex forms of these consonant sounds. The two vocalic retroflex ‘r’s, ऋ () and ॠ (), are pronounced approximately as ‘ri’ (or sometimes more like ‘ru’) and ‘rī’ respectively, while the two vocalic retroflex ‘l’s, ऌ () and ॡ (), are pronounced approximately as ‘lri’ (somewhat like ‘lry’ in ‘revelry’) and ‘lrī’ respectively.

Of these four vowels, ऋ () occurs most frequently. Common examples of words in which it occurs include ऋषि (ṛṣi), which is commonly transcribed as ‘rishi’, कृष्ण (kṛṣṇa), which is commonly transcribed as ‘Krishna’, and various words derived from the verb दृश् (dṛś), which means to see, look at, understand, intuit or know. ॠ () occurs much less frequently (usually as the terminating vowel in a verbal root), and ऌ () still less frequently (mostly in certain forms of the verb कॢप्, kḷp, which means to be well ordered, be adapted to, be suitable, correspond, happen, arrange, produce, cause or create), but ॡ () does not occur in any word except ॡ (), which means Siva, mother or a divine female.

‘ISO 15919’ recommends that these four vowels should be transliterated with a small circle below instead of a dot below (as in IAST) in order to avoid confusion with two retroflex consonants, namely ṛa (for which the Devanagari character is ड़), which occurs in some languages such as Hindi and Panjabi, and ḷa (for which the Devanagari character is ळ), which occurs in Tamil (as ள) and in most Indian languages other than classical Sanskrit (though it does occur in the older Vedic Sanskrit). However, I transliterate them according to IAST with a dot below, because the characters ‘r’ and ‘l’ with a circle below are not yet available in any suitable font, and fortunately the Sanskrit retroflex vowel ऌ (ḷ) occurs so rarely that it is unlikely to be confused in any transliteration with the muted Tamil retroflex consonant ள் (ḷ).
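
For readers who prefer the ‘ISO 15919’ form, the difference between the two conventions can be sketched in Python as a swap of one combining mark. This is a minimal sketch, not part of the scheme used on this page: the function name `iast_vocalics_to_iso` is hypothetical, and it assumes that the input is purely Sanskrit text in which every ‘r’ or ‘l’ with a dot below is a vowel. It would wrongly convert the retroflex consonant ḷ in transliterated Tamil, which is exactly the ambiguity described above.

```python
import unicodedata

DOT_BELOW = "\u0323"   # combining dot below, as used in IAST
RING_BELOW = "\u0325"  # combining ring below, as recommended by ISO 15919


def iast_vocalics_to_iso(text):
    """Replace the dot below on r and l with a ring below.

    Assumption: the text is Sanskrit only, so a dotted r or l is
    always one of the four vocalic vowels.
    """
    # Decompose so that each diacritic is a separate combining character.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for i, ch in enumerate(decomposed):
        if ch == DOT_BELOW and i > 0 and decomposed[i - 1] in "rlRL":
            out.append(RING_BELOW)
        else:
            out.append(ch)
    # Recompose whatever still has a precomposed form (for example ṣ and ṇ).
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    print(iast_vocalics_to_iso("kṛṣṇa"))
```

Because the long vowels decompose into the base letter, the dot below and a macron, the same swap also covers the two long vocalic vowels without any extra rule.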

Transliteration of Sanskrit consonantal diacritics

In the Sanskrit alphabet, the fourteen vowels are followed by two consonantal diacritics, the anusvāra, ं (ṁ), and the visarga, ः (ḥ). As I explained above, the anusvāra indicates that the vowel or consonant-vowel syllable above which it is placed is nasalised, so at the end of a sentence it is pronounced more or less like ‘m’, whereas before a plosive consonant it is pronounced as the corresponding nasal. The visarga indicates an aspiration of the preceding vowel and is therefore pronounced somewhat like ‘h’ in English, but often followed by a slight echo of the preceding vowel (so after an ‘a’, for example, it may be pronounced ‘hă’, or after an ‘i’, ‘hĭ’). It seldom occurs except at the end of a sentence, because before most consonants it merges or mutates into some other letter according to the rules of sandhi or letter-conjunction.

Though the default transliteration of the anusvāra is ṁ (or ṃ in IAST), before many consonants I transliterate it as ṁ only when there is a particular need to indicate the precise spelling of a word. That is, when it occurs at the end of a word, before a labial consonant (p, ph, b, bh or m) or before any non-sparśa consonant (y, r, l, v, ś, ṣ, s or h), I usually transliterate it as ṁ (or sometimes simply as m), but when it occurs before a non-labial sparśa consonant (whether plosive or nasal) I usually transliterate it as the corresponding nasal, because that is how it is actually pronounced (and also how it is transcribed in Tamil script). For example, ahaṁkāra and ahaṁtā (which both mean ‘ego’) are pronounced ahaṅkāra and ahantā respectively, saṁcita (which means ‘piled’ or ‘accumulated’) is pronounced sañcita, saṁdhi (which means ‘conjunction’) is pronounced sandhi, and saṁnyāsa (which means ‘renunciation’) is pronounced sannyāsa. Thus before a velar consonant (k, kh, g, gh or ṅ) I usually transliterate the anusvāra as the velar nasal (ṅ), before a palatal consonant (c, ch, j, jh or ñ) as the palatal nasal (ñ), before a retroflex consonant (ṭ, ṭh, ḍ, ḍh or ṇ) as the retroflex nasal (ṇ), and before a dental consonant (t, th, d, dh or n) as the dental nasal (n).
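
The rule described above can be sketched in Python as follows. This is only a rough illustration, assuming that the input is a single transliterated word in which the anusvāra is written as ‘m’ with a dot above; the names `HOMORGANIC_NASAL` and `resolve_anusvara` are hypothetical. Only the lead letter of the following consonant needs to be inspected, because each aspirated consonant begins with the same letter as its unaspirated counterpart.

```python
import unicodedata

ANUSVARA = "\u1E41"  # m with dot above

# Lead letter of the following consonant -> nasal of the same group.
# Labials and non-sparsa consonants are absent, so the anusvara is
# left unchanged before them and at the end of a word.
HOMORGANIC_NASAL = {
    "k": "\u1E45", "g": "\u1E45", "\u1E45": "\u1E45",            # velar: n with dot above
    "c": "\u00F1", "j": "\u00F1", "\u00F1": "\u00F1",            # palatal: n with tilde
    "\u1E6D": "\u1E47", "\u1E0D": "\u1E47", "\u1E47": "\u1E47",  # retroflex: n with dot below
    "t": "n", "d": "n", "n": "n",                                # dental: plain n
}


def resolve_anusvara(word):
    """Replace each anusvara with the nasal that matches the next consonant."""
    word = unicodedata.normalize("NFC", word)  # one code point per letter
    out = []
    for i, ch in enumerate(word):
        if ch == ANUSVARA and i + 1 < len(word):
            out.append(HOMORGANIC_NASAL.get(word[i + 1], ANUSVARA))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for w in ["ahaṁkāra", "ahaṁtā", "saṁcita", "saṁdhi", "saṁnyāsa"]:
        print(w, "->", resolve_anusvara(w))
```

The lookup table deliberately omits the labial group, so the function leaves the anusvāra as it is in exactly the cases in which I usually keep it.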

A variant of the anusvāra is the candrabindu or ‘moon-dot’, ँ (m̐), which is used in Vedic Sanskrit in place of the anusvāra, and which like the anusvāra indicates nasalisation of the vowel or consonant-vowel syllable above which it is placed. The best known example of its use is in the sacred monosyllable ōm̐. In IAST it is transliterated like the anusvāra as ṃ, but ‘ISO 15919’ recommends that it should be transliterated as m̐ (‘m’ with a moon-dot centred above).

The classification of Sanskrit consonants

Like the Sanskrit vowels, the Sanskrit consonants are arranged phonetically, being divided into five groups of five stop consonants, each consisting of four plosives and one nasal, followed by two groups of four other consonants, the first consisting of four oral sonorants (which can be roughly described as approximants), two of which are semivowels (y and v) and the other two of which are liquids (r and l), and the second consisting of four fricatives, the first three of which are sibilants and the last of which is glottal.

The five groups of five stop consonants are called sparśa or ‘touch’ consonants, because they are formed by the organs of utterance (either the tongue and part of the mouth, or both the lips) actually touching each other. In each of these five groups, the first four are plosives (oral stops), of which the first two are unvoiced (or ‘voiceless’) while the second two are voiced, and in each of these two pairs the first is unaspirated while the second is aspirated (that is, pronounced with mahāprāṇa or a full breath, like ‘h’ in the English word ‘house’, and indicated in transliteration by an ‘h’ following the lead consonant). The fifth member of each of these five groups is a voiced and unaspirated nasal.

Thus for example in the first group, which are the velar or ‘throat’ (kaṇṭhya) stops, क (ka) is unvoiced and unaspirated, ख (kha) is unvoiced but aspirated, ग (ga) is voiced but unaspirated, घ (gha) is voiced and aspirated, and ङ (ṅa) is the velar nasal.

Each of the other four groups of sparśa consonants follows this same pattern. The second group are the palatal (tālavya) stops, च (ca), छ (cha), ज (ja), झ (jha) and ञ (ña). The third group are the retroflex or ‘cerebral’ (mūrdhanya) stops, ट (ṭa), ठ (ṭha), ड (ḍa), ढ (ḍha) and ण (ṇa). The fourth group are the dental (dantya) stops, त (ta), थ (tha), द (da), ध (dha) and न (na). The fifth group are the labial (ōṣṭhya) stops, प (pa), फ (pha), ब (ba), भ (bha) and म (ma).
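
Because every group follows the same pattern, the features of any sparśa consonant can be read off from its position within its group. The following Python sketch illustrates this; the names `VARGAS`, `POSITION_FEATURES` and `describe` are hypothetical, and the English group labels are used only for convenience.

```python
# The five groups of sparsa consonants, in alphabetical order.
VARGAS = [
    ("velar",     ["ka", "kha", "ga", "gha", "ṅa"]),
    ("palatal",   ["ca", "cha", "ja", "jha", "ña"]),
    ("retroflex", ["ṭa", "ṭha", "ḍa", "ḍha", "ṇa"]),
    ("dental",    ["ta", "tha", "da", "dha", "na"]),
    ("labial",    ["pa", "pha", "ba", "bha", "ma"]),
]

# Features shared by all groups, indexed by position within the group.
POSITION_FEATURES = [
    {"voiced": False, "aspirated": False, "nasal": False},
    {"voiced": False, "aspirated": True,  "nasal": False},
    {"voiced": True,  "aspirated": False, "nasal": False},
    {"voiced": True,  "aspirated": True,  "nasal": False},
    {"voiced": True,  "aspirated": False, "nasal": True},
]


def describe(syllable):
    """Return the place and features of a transliterated sparsa consonant."""
    for place, members in VARGAS:
        if syllable in members:
            features = dict(POSITION_FEATURES[members.index(syllable)])
            features["place"] = place
            return features
    raise KeyError(syllable)


if __name__ == "__main__":
    print(describe("gha"))
```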

In the Sanskrit alphabet, these five groups of sparśa consonants are followed by two other groups, each consisting of four consonants, namely the four oral sonorants or approximants (called the antastha or ‘standing between’ consonants), य (ya), र (ra), ल (la) and व (va), and the four fricatives (called the ūṣma or ‘hot’ consonants), which are the three sibilants, श (śa), ष (ṣa) and स (sa), and the glottal fricative, ह (ha).

Each of these final eight consonants is further classified along with one of the five groups of sparśa consonants according to its place of articulation, as follows:

  • Since ह (ha) is articulated with the glottis in the throat, it is classified along with the first group of sparśa consonants, the five velar stops, क (ka), ख (kha), ग (ga), घ (gha) and ङ (ṅa), which are articulated with the tongue touching the soft palate at the back of the mouth, and together they are called the kaṇṭhya or ‘throat’ consonants.
  • Since य (ya) and श (śa) are articulated with the body of the tongue close to the hard palate, they are classified along with the second group of sparśa consonants, the five palatal stops, च (ca), छ (cha), ज (ja), झ (jha) and ञ (ña), which are articulated with the body of the tongue touching the hard palate, and together they are called the tālavya or ‘palatal’ consonants.
  • Since र (ra) and ष (ṣa) are articulated with the tip of the tongue curled back close to the roof of the mouth, they are classified along with the third group of sparśa consonants, the five retroflex stops, ट (ṭa), ठ (ṭha), ड (ḍa), ढ (ḍha) and ण (ṇa), which are articulated with the tip of the tongue curled back so that it touches the roof of the mouth, and together they are called the mūrdhanya or ‘cerebral’ consonants.
  • Since ल (la) and स (sa) are articulated with the tongue close to the upper teeth, they are classified along with the fourth group of sparśa consonants, the five dental stops, त (ta), थ (tha), द (da), ध (dha) and न (na), which are articulated with the tongue touching the upper teeth, and together they are called the dantya or ‘dental’ consonants.
  • Since व (va) is articulated with the lower lip close to the upper teeth, it is classified along with the fifth group of sparśa consonants, the five bilabial stops, प (pa), फ (pha), ब (ba), भ (bha) and म (ma), which are articulated with both lips touching each other, and together they are called the ōṣṭhya or labial consonants.
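
The five classes listed above can be summarised as a simple lookup table, sketched here in Python. The names `PLACES` and `PLACE_OF` are hypothetical, and the English glosses serve as keys only for convenience.

```python
# Each class holds its five sparsa consonants followed by the
# sonorants and fricatives that share the same place of articulation.
PLACES = {
    "throat":   ["ka", "kha", "ga", "gha", "ṅa", "ha"],
    "palatal":  ["ca", "cha", "ja", "jha", "ña", "ya", "śa"],
    "cerebral": ["ṭa", "ṭha", "ḍa", "ḍha", "ṇa", "ra", "ṣa"],
    "dental":   ["ta", "tha", "da", "dha", "na", "la", "sa"],
    "labial":   ["pa", "pha", "ba", "bha", "ma", "va"],
}

# Reverse index: consonant -> class.
PLACE_OF = {
    syllable: place
    for place, syllables in PLACES.items()
    for syllable in syllables
}

if __name__ == "__main__":
    for syllable in ["ha", "ya", "ra", "la", "va"]:
        print(syllable, "->", PLACE_OF[syllable])
```

The reverse index contains all thirty-three consonants, namely the twenty-five sparśa consonants and the eight that follow them in the alphabet.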

PDF copy of this page

To read or download a PDF copy of this page, please click on the following link:

Transliteration, Transcription and Pronunciation — PDF copy