Module:Wp/nod/Translit2

From Wikimedia Incubator

English Version

Templates and Intended Direct Use

The prefix Wp/nod used in the Incubator has been omitted.

Recommended invocation Function Purpose Testcases
{{xlit2}} trpage Transliterates a page from the Tai Tham script to the Thai script, mapping consonants etymologically. {{xlit2/testcases/control}}
{{xlit3}} trpage Transliterates a page from the Tai Tham script to the Thai script, mapping consonants phonetically. It is very much in development. {{xlit2/testcases/control}}
{{ᨩᩨ᩵ᨲ᩠ᩅᩫ}} lettername Returns name of letter (consonant or independent vowel) as convenient for naming Unicode codepoint. See {{ᨩᩨ᩵1A60}} for combining marks. {{ᨩᩨ᩵ᨲ᩠ᩅᩫ/testcases}}
{{ᨿᩣ᩠ᨠ|word}} hardword Special handling for when transliteration rules fail.
{{#invoke:Translit2|tr|string}} tr Transliterates a string from the Tai Tham script to the Thai script, mapping consonants etymologically. {{xlit2/testcases}}
{{#invoke:Translit2|tr|string|true}} tr Transliterates a string from the Tai Tham script to the Thai script, mapping consonants phonetically. It is very much in development. {{xlit3/testcases}}

Algorithms of function tr

Word Boundaries

The design assumes that word boundaries will frequently not be indicated. However, it is assumed that marking a word boundary is preferred to invoking the template {{ᨿᩣ᩠ᨠ}}.

Dependent Vowels

The simplest analysis for transliteration is to treat final glottal stops as part of the vowel. Northern Thai therefore has 12 vowel qualities, each of which can be short or long and can occur in an open or a closed syllable. Three of these are diphthongs, which, unlike the system supported by the Standard Thai orthography, can also occur in short, closed syllables. There are three vowel-consonant combinations which may or must have special symbols; these appear to map straightforwardly to their Standard Thai equivalents. It appears that simply equating ᩂ and ᩄ with ฤ and ฦ works well enough.

The use of ᨠᩢ and ᨠᩡ to mark final /k/ also needs to be handled under the heading of vowels.

The transliteration process treats the vowels of ᨠᩣ and ᨣᩤ identically.

This table links to the chief area of discussion of the transliteration of each vowel.

Sound quality Short closed Short open Long closed Long open Other
/a/ ᨠᩢ อั Yes ᨠᩡ อะ Yes ᨠᩣ อา Yes ᨠᩣ อา Yes
/i/ ᨠᩥ อิ Yes ᨠᩥ อิ Yes ᨠᩦ อี Yes ᨠᩦ อี Yes
/ɯ/ ᨠᩧ อึ Yes ᨠᩧ อึ Yes ᨠᩨ อื ᨠᩨ อือ
/u/ ᨠᩩ อุ Yes ᨠᩩ อุ Yes ᨠᩪ อู Yes ᨠᩪ อู Yes
/e/ ᨠᩮᩢ เอ็ Yes ᨠᩮᩡ เอะ ᨠᩮ เอ Yes ᨠᩮ เอ Yes
/ɛ/ ᨠᩯᩢ แอ็ Yes ᨠᩯᩡ แอะ ᨠᩯ แอ Yes ᨠᩯ แอ Yes
/o/ ᨠᩫ อ Yes ᨠᩰᩡ โอะ ᨠᩰᩫ โอ Yes ᨠᩰ โอ Yes ᨠᩮᩣ โอ Yes
/ɔ/ ᨠᩬᩢ อ็อ ᨠᩰᩬᩡ เอาะ Yes ᨠᩬ ออ Yes ᨠᩬᩴ ออ Yes
/ɤ/ ᨠᩮᩥᩢ เอิ-็ Yes ᨠᩮᩬᩥᩡ เออะ ᨠᩮᩥ เอิ ᨠᩮᩬᩥ เออ
/ia/ ᨠ᩠ᨿᩢ เอีย็ ᨠ᩠ᨿᩮᩡ เอียะ ᨠ᩠ᨿ เอีย ᨠ᩠ᨿᩮ เอีย
/ɯa/ ᨠᩮᩬᩥᩢ เอือ็ Yes ᨠᩮᩬᩥᩋᩡ เอือะ ᨠᩮᩬᩥ เอือ ᨠᩮᩬᩥᩋ เอือ
/ua/ ᨠ᩠ᩅᩢ อ็ว No ᨠ᩠ᩅᩫᩡ อํวะ ᨠ᩠ᩅ อว ᨠ᩠ᩅᩫ อัว

ᨠᩬ is handled differently with and without a tone mark.

Mai kak in its various forms:

Vowel Sample Words
/a/ ᩁᩢ รัก
/aː/ ᨾᩢᩣ มาก Yes
/ua/ ᨻ᩠ᩅ᩶ᩡ พวก
/uː/ ᩃᩪᩢ ลูก
/ɔː/ ᨯᩬᩢ ดอก ᨾᩬᩡ ดอก ᨯᩬᩢᩡ ดอก

Other symbols:

Sound /ai/ /ai/ /au/ /au/ /au/ /am/ /rɯ/ /lɯ/
Spelling ᨠᩱ ไอ ᨠᩲ ใอ ᨠᩮᩢᩣ เอา Yes ᨠᩳ เอา ᨠᩪᩦ เอา ᨠᩣᩴ Yes

Preposed Vowels

The two parts of the short preposed vowels of open syllables (ᨠᩮᩡ ᨠᩯᩡ ᨠᩰᩡ) are treated independently. Any tone mark can be handled independently.

Explicitly short preposed closed vowels (ᨠᩮᩢ᩠ᨠ ᨠᩯᩢ᩠ᨠ) occur rarely if ever with tone marks, and the interaction can be ignored. They can therefore be treated as long open preposed vowel plus a mark converted to maitaikhu (อ็) in place. There appears to be a convention that implicitly short preposed closed vowels are not marked as short in transliteration.

The two parts of the closed vowel ᨠᩰᩫ᩠ᨠ are treated independently.

The compound vowel symbol ᨠᩮᩣ rarely if ever incorporates a tone mark; it can therefore be converted to ᨠᩰ.

The preposed vowels (ᨠᩮ ᨠᩯ ᨠᩰ ᨠᩱ ᨠᩲ) are then converted to the corresponding Thai vowels (เอ แอ โอ ไอ ใอ). Ideally, one would just swap them with every preceding medial consonant and combination of sakot + consonant. One problem that has not been addressed is that the Thai orthographic syllable boundary within a consonant cluster occurs later. For example, the common final element -ᨵᨾ᩠ᨾᩮᩣ in monks’ names transliterates to -ธัมโม using vernacular rules, or -ธมฺโม for academic Pali. The extreme solutions are to interchange with just one consonant and to interchange with the whole cluster. The latter rule works much better for normal text, and is what is currently implemented.
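
The two extreme solutions described above can be compared with a small standalone sketch. It is not the module's code: it is plain Lua without the `mw` library, and the function name `move_preposed` and its `whole_cluster` flag are hypothetical. It assumes an intermediate string in which consonants are already Thai, cluster members are still joined by sakot, and each preposed vowel is preceded by the marker ↶, as in the module's tables.

```lua
-- Sketch only: names and the exact intermediate form are assumptions.
local SAKOT = "\225\169\160"  -- U+1A60 TAI THAM SIGN SAKOT, still joining cluster members
local MARK  = "\226\134\182"  -- U+21B6 (↶), assumed to precede a vowel that must move left
local PREPOSED = { ["เ"] = true, ["แ"] = true, ["โ"] = true, ["ไ"] = true, ["ใ"] = true }

-- Split a UTF-8 string into an array of single-character strings.
local function chars(s)
  local t = {}
  for ch in s:gmatch("[\1-\127\194-\244][\128-\191]*") do
    t[#t + 1] = ch
  end
  return t
end

-- True for Thai consonants U+0E01..U+0E2E (UTF-8 bytes E0 B8 81..AE).
local function is_thai_consonant(ch)
  local b1, b2, b3 = ch:byte(1, 3)
  return b1 == 224 and b2 == 184 and b3 ~= nil and b3 >= 129 and b3 <= 174
end

-- Move each marked preposed vowel in front of the preceding consonant,
-- or in front of the whole sakot-joined cluster when whole_cluster is true.
local function move_preposed(s, whole_cluster)
  local out = {}
  local cs = chars(s)
  local i = 1
  while i <= #cs do
    local ch = cs[i]
    if ch == MARK and PREPOSED[cs[i + 1]] then
      local start = #out
      if start >= 1 and is_thai_consonant(out[start]) then
        if whole_cluster then
          -- Step back over "consonant + sakot" pairs to the start of the cluster.
          while start >= 3 and out[start - 1] == SAKOT
                and is_thai_consonant(out[start - 2]) do
            start = start - 2
          end
        end
        table.insert(out, start, cs[i + 1])
      else
        out[#out + 1] = cs[i + 1] -- nothing to swap with; keep the vowel in place
      end
      i = i + 2
    else
      out[#out + 1] = ch
      i = i + 1
    end
  end
  return table.concat(out)
end
```

With the cluster ม + sakot + ม followed by ↶โ, the whole-cluster rule places โ before the first ม, while the one-consonant rule places it before the second ม, which is the placement seen in -ธัมโม.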

Unambiguous Compound Vowels

Any combination of combining marks containing ᨠᩬ and a tone mark is treated as a compound vowel; these two marks must be swapped round in transliteration.
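
As an illustration of the swap described above, the following plain-Lua sketch moves a tone mark in front of a preceding VOWEL SIGN OA BELOW by matching UTF-8 bytes directly. It is only a sketch: the module does this through lookup tables and `mw.ustring`, the function name `swap_oa_tone` is hypothetical, and only the two tone marks U+1A75 and U+1A76 are assumed.

```lua
-- Sketch only: byte-level patterns stand in for mw.ustring character classes.
local OA    = "\225\169\172"           -- U+1A6C TAI THAM VOWEL SIGN OA BELOW
local TONES = "\225\169[\181-\182]"    -- U+1A75 TONE-1 or U+1A76 TONE-2
local KA    = "\225\168\160"           -- U+1A20 TAI THAM LETTER HIGH KA
local TONE1 = "\225\169\181"           -- U+1A75 TAI THAM SIGN TONE-1

-- Reorder "OA + tone" to "tone + OA", the order needed on the Thai side
-- (tone mark on the consonant, then อ).
local function swap_oa_tone(s)
  -- Parentheses drop gsub's second return value (the substitution count).
  return (s:gsub(OA .. "(" .. TONES .. ")", "%1" .. OA))
end

local swapped = swap_oa_tone(KA .. OA .. TONE1)
```

Here `swapped` holds the consonant, then the tone mark, then the vowel sign; a string without a tone mark is returned unchanged.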

The following vowels are handled by a context-free substitution of two characters: ᨠᩯᩢ ᨠᩮᩣ ᨠᩰᩫ ᨠᩣᩴ ᨠᩢᩣ

The following vowels are handled by substituting for a maximal sequence of vowel signs and tone marks excluding SIGN A (ᨠᩡ); the tone marks are not shown in the list: ᨠᩮᩢ ᨠᩰᩬᩡ ᨠᩮᩥᩢ ᨠᩮᩬᩥᩢ ᨠᩮᩢᩣ ᨠᩬᩴ ᨠᩬ (only with tone mark)

ᨠᩡ is excluded from the sequence with the aim of dealing with open short vowels as one deals with long (closed) vowels. The maitaikhu resulting from some of these vowel symbols will not fit above the consonant in some of these syllables; it is placed on the next Thai character, which should be a consonant.

Tricky encoding is used for ᨠᩰᩬᩡ; the substring ᨠᩰᩬ is converted to เกา and the ᨠᩡ is converted independently.

Unambiguous Explicit Simple Vowels

Once all compound vowels have been dealt with, most of the simple vowels shown by a single combining mark can be dealt with by simple substitution to a combining mark or nothing. If the order of the encoding proposals is followed, then with one exception, treated as a compound vowel, the order of vowel sign and tone mark will be the same in both scripts. This handles the vowels ᨠᩡ ᨠᩢ ᨠᩣ ᨠᩥ ᨠᩦ ᨠᩧ ᨠᩩ ᨠᩪ ᨠᩫ ᨠᩬ ᨠᩮᩥ ᨠᩳ.

Ambiguous Vowels

The 'medial' vowels, those represented by subscript ᨿ and ᩅ, are generally ambiguous. See the section on medial vowels for more information. Certain other vowels are also ambiguous when considered in isolation:

Although ᨠᩨ is unambiguous in the context of Northern Thai, it transliterates to อื in closed syllables but to อือ in open syllables.

The combination ᨠᩮᩬᩥ is ambiguous. In closed syllables, it represents /ɯːa/ and is transliterated as เอือ, but in open syllables it represents /ɜː/ and is transliterated as เออ unless it is part of the longer sequences ᨠᩮᩬᩥᩋ and ᨠᩮᩬᩥᩋᩡ which are the long and short syllables corresponding to /ɯːa/ and transliterate as เอือ and เอือะ. Because of word pairs such as ᩋᨶᩣᨳ and ᨶᩣᨳ, the ambiguity cannot always be resolved.

The combination ᨠᩬᩢ is also ambiguous. In an apparently open syllable, mai sat is actually mai kak, as in ᨯᩬᩢ = ᨯᩬᨠ which transliterates as ดอก. In a clearly closed syllable, ᨠᩬᩢ transliterates as อ็อ.

The most complicated logic distinguishing open and closed syllables is as follows:

  1. Split text into vowel sequence, tones plus sakot, and the next two characters.
  2. If the next two characters are ᩋᩡ and it needs special handling, apply the special handling and exit
  3. else if the next two characters are consonant and vowel, tone or sakot (should include medials!) then the syllable is open
  4. else if the next character is ᩋ and it needs special handling, apply the special handling and exit
  5. else if the next character is a consonant, the syllable is closed
  6. otherwise, the syllable is open.

This works on the principle that the syllable should be in a native Thai word, and therefore will not end in two consonants, albeit one silent, and that the final consonant will not be stacked with the initial syllable of the next. Therefore, stacked consonants following the vowel will start a new syllable.
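
The decision steps above can be sketched as a standalone function over the two characters that follow the vowel, tone and sakot. This is a simplified illustration in plain Lua, not the module's implementation: the names `classify` and `oa_is_special` are hypothetical, and treating the whole range U+1A60..U+1A79 as "vowel, tone or sakot" is an assumption that is broader than the module's own character list.

```lua
-- Sketch only: character classes are approximated by code point ranges.
local LETTER_A = "\225\169\139" -- U+1A4B TAI THAM LETTER A (ᩋ)
local SIGN_A   = "\225\169\161" -- U+1A61 TAI THAM VOWEL SIGN A (ᩡ)

-- Split a UTF-8 string into an array of single-character strings.
local function chars(s)
  local t = {}
  for ch in s:gmatch("[\1-\127\194-\244][\128-\191]*") do
    t[#t + 1] = ch
  end
  return t
end

-- Decode the first UTF-8 character of ch to its code point.
local function codepoint(ch)
  local b1, b2, b3, b4 = ch:byte(1, 4)
  if b1 < 128 then return b1 end
  if b1 < 224 then return (b1 - 192) * 64 + (b2 - 128) end
  if b1 < 240 then return (b1 - 224) * 4096 + (b2 - 128) * 64 + (b3 - 128) end
  return (b1 - 240) * 262144 + (b2 - 128) * 4096 + (b3 - 128) * 64 + (b4 - 128)
end

-- Tai Tham consonants: U+1A20..U+1A4C plus GREAT SA U+1A54.
local function is_consonant(ch)
  local cp = ch and codepoint(ch)
  return cp ~= nil and ((cp >= 0x1A20 and cp <= 0x1A4C) or cp == 0x1A54)
end

-- Assumed stand-in for "vowel, tone or sakot": U+1A60..U+1A79.
local function is_dependent(ch)
  local cp = ch and codepoint(ch)
  return cp ~= nil and cp >= 0x1A60 and cp <= 0x1A79
end

-- Returns "special", "open" or "closed" for the text following the vowel.
local function classify(following, oa_is_special)
  local c = chars(following)
  local c1, c2 = c[1], c[2]
  if oa_is_special and c1 == LETTER_A and c2 == SIGN_A then
    return "special"                       -- step 2
  elseif is_consonant(c1) and is_dependent(c2) then
    return "open"                          -- step 3: next syllable starts here
  elseif oa_is_special and c1 == LETTER_A then
    return "special"                       -- step 4
  elseif is_consonant(c1) then
    return "closed"                        -- step 5
  end
  return "open"                            -- step 6
end
```

As the note in step 3 says, a fuller version would also treat medial consonants as starting a new syllable.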

The processing for ᨠᩮᩬᩥ identifies ᨠᩮᩬᩥᩡ as containing it in an open syllable.

This principle breaks down with some loanword spellings, such as ᨣᩬᩢᨷ᩠ᨷᩦ᩶ 'copy'. For now, the simplest solution for ᨠᩬᩢ is to treat spellings such as ᨯᩬᩢ as anomalous.

Medial Vowels

Ambiguities

The sequence ᨠ᩠ᨿᩢ is ambiguous. ᨿ is part of the vowel symbol in: ᨻᩕ᩠ᨿᩢᨠ ᨽ᩠ᨿᩢᨠ but is merely an onset consonant in: ᨡ᩠ᨿᩢ᩶ᩁ ᨺᩪ᩶ᩉ᩠ᨿᩢ᩶ᩁ ᩈ᩠ᨿᩢᨾ᩠ᨽᩪ ᩉ᩠ᨿᩢᨦ ᩉ᩠ᨿᩢᩁ ᩉ᩠ᨿᩢᨷ ᩋᩉ᩠ᨿᩢᨦ

ᨻ᩠ᨿᩢᨷᨯᩯ᩠ᨯ and ᨻ᩠ᨿᩢᨻ look contentious – the MFL transliterates the first syllable as พยับ and พยัพ but records the pronunciation as เปี๊บ.

The sequence is currently treated as a compound vowel, but based on the statistics, ᨻᩕ᩠ᨿᩢᨠ and ᨽ᩠ᨿᩢᨠ (which are the same word) should be listed as exceptions.

The sequence ᨠ᩠ᨿ is also ambiguous between เอีย in closed syllables (e.g. > เรียน) and combinations with implicit vowels such as อยั and อยะ in ᨻ᩠ᨿᨬ᩠ᨩᨶ > พยัญชนะ and ᩋᩣᨶᨱ᩠ᨿ > อา⁠นัณ⁠ยะ.

The sequence ᨠ᩠ᩅᩢ is likewise ambiguous. ᩅ is part of the vowel symbol in ᩉᩖ᩠ᩅᩢᨠ, but is merely an onset consonant in ᨠᩕᩉ᩠ᩅᩢᨯ ᨠ᩠ᩅᩢᨠ ᨠ᩠ᩅᩢᨦ ᨠ᩠ᩅᩢᨯ ᨡ᩠ᩅᩢᨠ ᨡ᩠ᩅᩢᩁ ᨡ᩠ᩅᩢ᩶ᩁ ᨣᩕᩉ᩠ᩅᩢᨯ ᨣᩖᩢ᩵ᨦ᩻ᨧ᩠ᩅᩢᨦ᩻ ᨣ᩠ᩅᩢᨠ ᨣ᩠ᩅᩢᨠ᩻ᨩᩦ᩶᩻ ᨣ᩠ᩅᩢᨯ ᨣ᩠ᩅᩢ᩶ᩁ ᨣᩢ᩠᩵ᨦ᩻ᨧ᩠ᩅᩢᨦ᩻ ᨤ᩠ᩅᩢᨠ ᨤ᩠ᩅᩢᨯ ᨤ᩠ᩅᩢᩁ ᨤ᩠ᩅᩢ᩵ᩁ ᨧ᩠ᩅᩢᨦ ᨧᩢ᩠ᨦᩉ᩠ᩅᩢᨯ ᨩ᩠ᩅᩢᨠ ᨲᩕᩉ᩠ᩅᩢᩁ ᨲᩱ᩶ᩉ᩠ᩅᩢᩁ ᨴ᩠ᩅᩢᨠ ᩅᩯ᩠ᨯ᩻ᩉ᩠ᩅᩢᩁ᩻ ᩈ᩠ᩅᩢᨠ ᩈ᩠ᩅᩢᩁ᩠ᨣ᩺ ᩈ᩠ᩅᩢᩔᨯᩦ ᩈ᩠ᩅᩢᩈᨯᩦ ᩉ᩠ᩅᩢᨠ ᩉ᩠ᩅᩢᨦ ᩉ᩠ᩅᩢᨯ ᩉ᩠ᩅᩢᩁ ᩉ᩠ᩅᩢ᩵ᩁ.

ᩉᩖ᩠ᩅᩢᨠ is therefore treated as an exception.

The sequence ᨠ᩠ᩅ is also ambiguous between the long vowel in a closed syllable and word-final กวะ as in ᩋᩢᩆ᩠ᩅ. The context easily disambiguates.

Implementation

The medial vowels are converted by finding the maximal sequence of Tai Tham vowels other than SIGN A, tone marks and vowel killers following consonant + subscript WA or YA. The consonant is required so as to exclude WA and YA acting as final consonants. This handles the compound vowel symbols ᨠ᩠ᩅ ᨠ᩠ᩅᩫ ᨠ᩠ᨿ ᨠ᩠ᨿᩢ ᨠ᩠ᨿᩮ. Combinations with tone marks are not shown. The combination ᨠ᩠ᩅ without any tone marks is not handled explicitly, for the vowel and the interpretation as a consonant cluster are homographs in both Tai Tham and Thai. ᨠ᩠ᨿᩮᩡ is processed as a combination of the vowels ᨠ᩠ᨿᩮ and ᨠᩡ and ᨠ᩠ᩅᩫᩡ is processed as a combination of ᨠ᩠ᩅᩫ and ᨠᩡ.

Implicit Vowels

Practically, there are three cases where the implicit vowel becomes explicit upon transcribing:

  1. Final vowel in Pali/Sanskrit Words
  2. Open syllables in native words and Special Cases
  3. Closed syllables in Pali/Sanskrit words.

Final Vowel in Pali/Sanskrit Words

This requires the detection of the end of words; the final vowel is not written after the first element of a compound. Detection relies on the occurrence of a non-word character.

Consonant-sakot-consonant at the end of the word implies that the word is of Pali or Sanskrit origin, and has a final implicit vowel.

Mai kaa + consonant at the end of a word implies a final implicit vowel: if there were no final implicit vowel, the word would be written mai kaa + sakot + consonant.

Consonant + consonant at the end of a word implies a Pali/Sanskrit word, and so a final vowel, with one significant exception. The second consonant might be preceded by a medial vowel, so ᩅ and ᨿ are excluded from being the first consonant, unless they are not preceded by sakot.

In a final sequence of phonetic vowel plus consonant, the consonant will be converted to a subscript form unless there is very little space for it. That yields the condition: single-storey consonant + Pali vowel (not a vowel below) + consonant, except when that single-storey consonant is itself subscript.
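
The first of the rules in this section, consonant-sakot-consonant at the end of a word, can be sketched as follows. This is a hedged illustration in plain Lua rather than the module's code: the function name `add_final_vowel` is hypothetical, it takes a single word whose end has already been detected, and it covers only this one rule, not the mai kaa, consonant + consonant or single-storey conditions.

```lua
-- Sketch only: handles just the "consonant + sakot + consonant" ending.
local SAKOT  = "\225\169\160" -- U+1A60 TAI THAM SIGN SAKOT
local SARA_A = "\224\184\176" -- U+0E30 THAI CHARACTER SARA A (ะ)

-- Split a UTF-8 string into an array of single-character strings.
local function chars(s)
  local t = {}
  for ch in s:gmatch("[\1-\127\194-\244][\128-\191]*") do
    t[#t + 1] = ch
  end
  return t
end

-- Decode the first UTF-8 character of ch to its code point.
local function codepoint(ch)
  local b1, b2, b3, b4 = ch:byte(1, 4)
  if b1 < 128 then return b1 end
  if b1 < 224 then return (b1 - 192) * 64 + (b2 - 128) end
  if b1 < 240 then return (b1 - 224) * 4096 + (b2 - 128) * 64 + (b3 - 128) end
  return (b1 - 240) * 262144 + (b2 - 128) * 4096 + (b3 - 128) * 64 + (b4 - 128)
end

-- Tai Tham consonants: U+1A20..U+1A4C plus GREAT SA U+1A54.
local function is_consonant(ch)
  local cp = ch and codepoint(ch)
  return cp ~= nil and ((cp >= 0x1A20 and cp <= 0x1A4C) or cp == 0x1A54)
end

-- Append the explicit final vowel when the word ends in a two-member stack.
-- A stack of three (sakot before the first member) is left alone, mirroring
-- the requirement that the character before the cluster is not sakot.
local function add_final_vowel(word)
  local c = chars(word)
  local n = #c
  if n >= 3 and is_consonant(c[n]) and c[n - 1] == SAKOT
     and is_consonant(c[n - 2]) and c[n - 3] ~= SAKOT then
    return word .. SARA_A
  end
  return word
end
```

As in the module, the Thai vowel is appended to text that is still in Tai Tham; the consonants are converted in a later pass.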

Open syllables in native words and Special Cases

In general, in the MFL, native open syllables with the implicit vowel in Northern Thai have it made explicit, but it is not made explicit in words of Pali/Sanskrit origin. However, it is made explicit if it is immediately preceded by ra hong even if the word is of Pali/Sanskrit origin. This yields the fairly reliable rule:

Closed syllables in Pali/Sanskrit words

In principle, the rule is simple – implicit vowels are transcribed explicitly, and the syllables are closed by explicit clusters. Unfortunately, if for example ᨩ᩠ᨿᨦᩉ᩠ᨾᩲ᩵ were written without a word break, it would then be transliterated as เชียงัใหม่. The rule is therefore restricted to clusters that cannot occur at the starts of Tai Tham words. A whitelist of clusters is maintained.
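
A minimal sketch of the whitelist check, in plain Lua: the implicit vowel is made explicit (as mai sat) only when the following cluster is on the list. The three-entry list and the name `expose_vowel` are illustrative assumptions; the module's own list is much longer and also guards against confusion with mai kia.

```lua
-- Sketch only: a tiny illustrative whitelist, not the module's full list.
local SAKOT   = "\225\169\160" -- U+1A60 TAI THAM SIGN SAKOT
local MAI_SAT = "\225\169\162" -- U+1A62 TAI THAM VOWEL SIGN MAI SAT
local MA      = "\225\168\190" -- U+1A3E TAI THAM LETTER MA
local LOW_TA  = "\225\168\180" -- U+1A34 TAI THAM LETTER LOW TA
local LOW_THA = "\225\168\181" -- U+1A35 TAI THAM LETTER LOW THA
local HIGH_HA = "\225\169\137" -- U+1A49 TAI THAM LETTER HIGH HA

-- Clusters that cannot start a Tai Tham word, so they must close a syllable.
local whitelist = {
  [MA .. SAKOT .. MA] = true,
  [LOW_TA .. SAKOT .. LOW_TA] = true,
  [LOW_TA .. SAKOT .. LOW_THA] = true,
}

-- Insert mai sat between the onset and a whitelisted cluster; otherwise
-- leave the text alone, since the cluster may begin the next word.
local function expose_vowel(onset, cluster)
  if whitelist[cluster] then
    return onset .. MAI_SAT .. cluster
  end
  return onset .. cluster
end
```

Because a cluster such as ᩉ + sakot + ᨾ is not on the list, text like the unbroken place name above is left untouched by this step.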

Consonants

The general processing of consonants is mostly straightforward. There are a few special cases:

ᨷᩕ is mapped to ปร. ᨻᩛ ᨾᩛ ᨭᩛ ᨱᩛ are mapped to พพ มพ ฏฐ ณฐ, and the process is generalised to all obstruent and nasal labial and retroflex base consonants.
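
The generalised handling of the subscript sign ᩛ can be sketched in plain Lua as below. This is an illustration under assumptions, not the module's code: the name `expand_haang` is hypothetical, and the two base-consonant ranges (U+1A2D..U+1A31 for retroflexes, U+1A37..U+1A3E for labials) are taken from the character ranges used in the module's own substitutions.

```lua
-- Sketch only: expands the ambiguous subscript into sakot + full letter.
local SAKOT      = "\225\169\160" -- U+1A60 TAI THAM SIGN SAKOT
local HAANG      = "\225\169\155" -- U+1A5B TAI THAM CONSONANT SIGN HIGH RATHA OR LOW PA
local HIGH_RATHA = "\225\168\174" -- U+1A2E TAI THAM LETTER HIGH RATHA
local LOW_PA     = "\225\168\187" -- U+1A3B TAI THAM LETTER LOW PA

-- Split a UTF-8 string into an array of single-character strings.
local function chars(s)
  local t = {}
  for ch in s:gmatch("[\1-\127\194-\244][\128-\191]*") do
    t[#t + 1] = ch
  end
  return t
end

-- Decode the first UTF-8 character of ch to its code point.
local function codepoint(ch)
  local b1, b2, b3, b4 = ch:byte(1, 4)
  if b1 < 128 then return b1 end
  if b1 < 224 then return (b1 - 192) * 64 + (b2 - 128) end
  if b1 < 240 then return (b1 - 224) * 4096 + (b2 - 128) * 64 + (b3 - 128) end
  return (b1 - 240) * 262144 + (b2 - 128) * 4096 + (b3 - 128) * 64 + (b4 - 128)
end

-- Replace the sign by sakot + HIGH RATHA after a retroflex base, or by
-- sakot + LOW PA after a labial base; leave it alone after anything else.
local function expand_haang(s)
  local out = {}
  local prev = nil
  for _, ch in ipairs(chars(s)) do
    local piece = ch
    if ch == HAANG and prev then
      local cp = codepoint(prev)
      if cp >= 0x1A2D and cp <= 0x1A31 then
        piece = SAKOT .. HIGH_RATHA
      elseif cp >= 0x1A37 and cp <= 0x1A3E then
        piece = SAKOT .. LOW_PA
      end
    end
    out[#out + 1] = piece
    prev = ch
  end
  return table.concat(out)
end
```

Expanding the sign early means that later steps, such as the cluster whitelist, only have to deal with ordinary sakot + consonant sequences.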

There is one glaring omission – the mapping of final, non-subscript ᩁ to น.

Tones


local export = {}
local gsub = mw.ustring.gsub
local u = mw.ustring.char
local match = mw.ustring.match
local find = mw.ustring.find

local PAGENAME = mw.title.getCurrentTitle().prefixedText
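-- Remove mark bearers (ᨠ and ก), which are added to literals for readability.
-- The parentheses discard gsub's second return value (the substitution count).
local function sc(s)
    return (gsub(s, "[ᨠก]", ""))
end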
local data = mw.loadData('Module:Wp/nod/Translit_data')
local disruptor = data.disruptor

local tt = data.tt2

local sakot = sc("ᨠ᩠")

local kia = {
  [sc("ᨠ᩠ᩅ᩵")] = sc("ก่ว"),
  [sc("ᨠ᩠ᩅ᩶")] = sc("ก้ว"),
  [sc("ᨠ᩠ᩅᩫ")] = sc("กัว"),
  [sc("ᨠ᩠ᩅᩫ᩵")] = sc("กั่ว"),
  [sc("ᨠ᩠ᩅᩫ᩶")] = sc("กั้ว"),
  [sc("ᨠ᩠ᨿ")] = sc("↶เกีย"),
  [sc("ᨠ᩠ᨿ᩵")] = sc("↶เกี่ย"),
  [sc("ᨠ᩠ᨿ᩶")] = sc("↶เกี้ย"),
  [sc("ᨠ᩠ᨿᩢ")] = sc("↶เกีย็"),
  [sc("ᨠ᩠ᨿᩢ᩵")] = sc("↶เกี่ย็"),
  [sc("ᨠ᩠ᨿᩢ᩶")] = sc("↶เกี้ย็"),
  [sc("ᨠ᩠ᨿᩮ")] = sc("↶เกีย"),
  [sc("ᨠ᩠ᨿᩮ᩵")] = sc("↶เกี่ย"),
  [sc("ᨠ᩠ᨿᩮ᩶")] = sc("↶เกี้ย"),
-- Hybrid forms
  [sc("ᨠ᩠ᩅ้")] = sc("ก้ว"),
  [sc("ᨠ᩠ᩅ๊")] = sc("ก๊ว"),
  [sc("ᨠ᩠ᩅ๋")] = sc("ก๋ว"),
  [sc("ᨠ᩠ᩅᩫ้")] = sc("กั้ว"),
  [sc("ᨠ᩠ᩅᩫ๊")] = sc("กั๊ว"),
  [sc("ᨠ᩠ᩅᩫ๋")] = sc("กั๋ว"),
  [sc("ᨠ᩠ᨿ้")] = sc("↶เกี้ย"),
  [sc("ᨠ᩠ᨿ๊")] = sc("↶เกี๊ย"),
  [sc("ᨠ᩠ᨿ๋")] = sc("↶เกี๋ย"),
  [sc("ᨠ᩠ᨿᩢ้")] = sc("↶เกี้ย็"),
  [sc("ᨠ᩠ᨿᩢ๊")] = sc("↶เกี๊ย็"),
  [sc("ᨠ᩠ᨿᩢ๋")] = sc("↶เกีย็๋"),
  [sc("ᨠ᩠ᨿᩮ้")] = sc("↶เกี้ย"),
  [sc("ᨠ᩠ᨿᩮ๊")] = sc("↶เกี๊ย"),
  [sc("ᨠ᩠ᨿᩮ๋")] = sc("↶เกี๋ย"),
--  [sc("")] = sc(""),
}

local function pkia(m1, m2)
    local r2 = kia[m2]
    if r2 then
        return m1..r2
    else
        return m1..m2
    end
end

local tone=sc("ᨠ᩵᩶ก่ก้ก๊ก๋")
local pv=sc("ᨠᩰᩬᩢᨠᩱᩩᩥᩴᨠᩲᩪᩨᨠᩮᩧ᩵ᩤᨠᩯᩨᩣᨠᩫ")
local pvt  = pv..tone
local pvtk = pvt..sc("ᨠ᩺ᨠ᩼")
local vt   = pv..sc("ᨠᩢᩡกะ")..tone
local vts  = vt..sakot
local cons_not_wy = "ᨠ-ᨾᩀᩁᩃᩆ-ᩌᩔ" -- Remove ᨿᩂᩄᩅ
local cons = "ᨠ-ᩌᩔ"
local pure_cons = "ᨠ-ᩁᩃᩅ-ᩌᩔ"
local cons_squat = "ᨠ-ᨭᨯ-ᩉᩋᩓᩔ" -- Omit ᨮᩊᩌ
local medial=sc("ᨠᩕᩖᨠᩛ")
local Mw = u(0x10fffe) -- Non-characters
local My = u(0x10ffff)
local Td = u(0xefffe)

-- 2-character transformations.    
local kam = {
    [sc("ᨠᩯᩢ")] = sc("↶แก็"),
    [sc("ᨠᩮᩣ")] = sc("ᨠᩰ"), -- Need to manipulate later as single vowel.
    [sc("ᨠᩮᩤ")] = sc("ᨠᩰ"),
    [sc("ᨠᩰᩫ")] = sc("ᨠᩰ"),
    [sc("ᨠᩣᩴ")] = sc("กำ"),
    [sc("ᨠᩤᩴ")] = sc("กำ"),
    ["ᨷᩕ"] = "ᨸᩕ", -- Straight to Thai would mess up ᨷᩕᩰ
    ["ᩃᩖ"] = "ᩃ᩠ᩃ", -- Simpler to only generate vowels before clusters with sakot
    [sc("ᨠᩢᩣ")] = sc("กา").."ก"
}

-- Context-independent sequences starting with mai ke or mai ko:
local keoext = {
    [sc("ᨠᩯᩢ")] = sc("↶แก็"),
    [sc("ᨠᩮᩢ")] = sc("↶เก็"), -- Interferes with ᨠᩮᩢ᩵ᩣ
    [sc("ᨠᩮᩢ᩵")] = sc("↶เก่ก็↷"), -- Attestation?
    [sc("ᨠᩮᩢ᩶")] = sc("↶เก้ก็↷"), -- Attestation?
    [sc("ᨠᩰᩬ")] = sc("↶เกา"), -- For ᨠᩰᩬ᩶
    [sc("ᨠᩰᩬ᩵")] = sc("↶เก่า"),
    [sc("ᨠᩰᩬ᩶")] = sc("↶เก้า"),
    [sc("ᨠᩮᩬᩥᩡ")] = sc("↶เกอะ"),
    [sc("ᨠᩮᩬᩥ᩵ᩡ")] = sc("↶เก่อะ"),
    [sc("ᨠᩮᩬᩥ᩶ᩡ")] = sc("↶เก้อะ"),
    [sc("ᨠᩮᩥᩢ")] = sc("↶เกิก็↷"),
    [sc("ᨠᩮᩥᩢ᩵")] = sc("↶เกิ่ก็↷"),
    [sc("ᨠᩮᩥᩢ᩶")] = sc("↶เกิ้ก็↷"),
    [sc("ᨠᩮᩬᩥᩢ")] = sc("↶เกือ็"), -- There is no ᨠᩮᩬᩥᩢ=เกือก
    [sc("ᨠᩮᩬᩥᩢ᩵")] = sc("↶เกื่อ็"), -- Unattested
    [sc("ᨠᩮᩬᩥᩢ᩶")] = sc("↶เกื้อ็"), -- Unattested
    [sc("ᨠᩮᩢᩣ")] = sc("↶เกา"),
    [sc("ᨠᩮᩢᩤ")] = sc("↶เกา"),
    [sc("ᨠᩮᩢ᩵ᩣ")] = sc("↶เก่า"),
    [sc("ᨠᩮᩢ᩵ᩤ")] = sc("↶เก่า"),
    [sc("ᨠᩮᩢ᩶ᩣ")] = sc("↶เก้า"),
    [sc("ᨠᩮᩢ᩶ᩤ")] = sc("↶เก้า"),
    [sc("ᨠᩬᩴ")] = sc("กอ"),
    [sc("ᨠᩬᩴ᩵")] = sc("ก᩵อ"),
    [sc("ᨠᩬᩴ᩶")] = sc("ก᩶อ"),
    [sc("ᨠᩬ᩵")] = sc("ก᩵อ"),
    [sc("ᨠᩬ᩶")] = sc("ก᩶อ"),
}
local keoext_phonetic = {
    [sc("ᨠᩯᩢ")] = sc("↶แก็"),
    [sc("ᨠᩮᩢ")] = sc("↶เก็"), -- Interferes with ᨠᩮᩢ᩵ᩣ
    [sc("ᨠᩮᩢ᩵")] = sc("↶เก่ก"), -- Attestation?
    [sc("ᨠᩮᩢ᩶")] = sc("↶เก้ก"), -- Attestation?
    [sc("ᨠᩰᩬ")] = sc("↶เกา"), -- For ᨠᩰᩬ᩶
    [sc("ᨠᩰᩬ᩵")] = sc("↶เก่า"),
    [sc("ᨠᩰᩬ᩶")] = sc("↶เก้า"),
    [sc("ᨠᩮᩬᩥᩡ")] = sc("↶เกอะ"),
    [sc("ᨠᩮᩬᩥ᩵ᩡ")] = sc("↶เก่อะ"),
    [sc("ᨠᩮᩬᩥ᩶ᩡ")] = sc("↶เก้อะ"),
    [sc("ᨠᩮᩥᩢ")] = sc("↶เกิก"),
    [sc("ᨠᩮᩥᩢ᩵")] = sc("↶เกิ่ก"),
    [sc("ᨠᩮᩥᩢ᩶")] = sc("↶เกิ้ก"),
    [sc("ᨠᩮᩬᩥᩢ")] = sc("↶เกือ"), -- There is no ᨠᩮᩬᩥᩢ=เกือก
    [sc("ᨠᩮᩬᩥᩢ᩵")] = sc("↶เกื่อ"), -- Unattested
    [sc("ᨠᩮᩬᩥᩢ᩶")] = sc("↶เกื้อ"), -- Unattested
    [sc("ᨠᩮᩢᩣ")] = sc("↶เกา"),
    [sc("ᨠᩮᩢᩤ")] = sc("↶เกา"),
    [sc("ᨠᩮᩢ᩵ᩣ")] = sc("↶เก่า"),
    [sc("ᨠᩮᩢ᩵ᩤ")] = sc("↶เก่า"),
    [sc("ᨠᩮᩢ᩶ᩣ")] = sc("↶เก้า"),
    [sc("ᨠᩮᩢ᩶ᩤ")] = sc("↶เก้า"),
    [sc("ᨠᩬᩴ")] = sc("กอ"),
    [sc("ᨠᩬᩴ᩵")] = sc("ก᩵อ"),
    [sc("ᨠᩬᩴ᩶")] = sc("ก᩶อ"),
    [sc("ᨠᩬ᩵")] = sc("ก᩵อ"),
    [sc("ᨠᩬ᩶")] = sc("ก᩶อ"),
-- Hybrid forms, for phonetic transliteration
--    [sc("ᨠᩮᩢ้")] = sc("↶เก้ก็↷"),
--    [sc("ᨠᩮᩢ๊")] = sc("↶เก๊ก็↷"),
--    [sc("ᨠᩮᩢ๋")] = sc("↶เก๋ก็↷"),
--    [sc("ᨠᩯᩢ้")] = sc("↶แก้ก็↷"),
--    [sc("ᨠᩯᩢ๊")] = sc("↶แก๊ก็↷"),
--    [sc("ᨠᩯᩢ๋")] = sc("↶แก๋ก็↷"),
    [sc("ᨠᩮᩢ้")] = sc("↶เก๊ก"), -- Dirty cheat
    [sc("ᨠᩮᩢ๊")] = sc("↶เก๊ก"),
    [sc("ᨠᩮᩢ๋")] = sc("↶เก๋ก"),
    [sc("ᨠᩯᩢ้")] = sc("↶แก๊ก"), -- Dirty cheat
    [sc("ᨠᩯᩢ๊")] = sc("↶แก๊ก"),
    [sc("ᨠᩯᩢ๋")] = sc("↶แก๋ก"),
    [sc("ᨠᩰᩬ้")] = sc("↶เก้า"),
    [sc("ᨠᩰᩬ๊")] = sc("↶เก๊า"),
    [sc("ᨠᩰᩬ๋")] = sc("↶เก๋า"),
    [sc("ᨠᩮᩬᩥ้ᩡ")] = sc("↶เก้อะ"),
    [sc("ᨠᩮᩬᩥ๊ᩡ")] = sc("↶เก๊อะ"),
    [sc("ᨠᩮᩬᩥ๋ᩡ")] = sc("↶เก๋อะ"),
--    [sc("ᨠᩮᩥᩢ้")] = sc("↶เกิ้ก็↷"),
--    [sc("ᨠᩮᩥᩢ๊")] = sc("↶เกิ๊ก็↷"),
--    [sc("ᨠᩮᩥᩢ๋")] = sc("↶เกิ๋ก็↷"),
    [sc("ᨠᩮᩥᩢ้")] = sc("↶เกิ๊ก"), -- Dirty cheat
    [sc("ᨠᩮᩥᩢ๊")] = sc("↶เกิ๊ก"),
    [sc("ᨠᩮᩥᩢ๋")] = sc("↶เกิ๋ก"),
-- There is no ᨠᩮᩬᩥᩢ=เกือก
    [sc("ᨠᩮᩢ้ᩣ")] = sc("↶เก้า"),
    [sc("ᨠᩮᩢ้ᩤ")] = sc("↶เก้า"),
    [sc("ᨠᩮᩢ๊ᩣ")] = sc("↶เก๊า"),
    [sc("ᨠᩮᩢ๊ᩤ")] = sc("↶เก๊า"),
    [sc("ᨠᩮᩢ๋ᩣ")] = sc("↶เก๋า"),
    [sc("ᨠᩮᩢ๋ᩤ")] = sc("↶เก๋า"),
    [sc("ᨠᩬᩴ้")] = sc("ก้อ"),
    [sc("ᨠᩬᩴ๊")] = sc("ก๊อ"),
    [sc("ᨠᩬᩴ๋")] = sc("ก๋อ"),
    [sc("ᨠᩬ้")] = sc("ก้อ"),
    [sc("ᨠᩬ๊")] = sc("ก๊อ"),
    [sc("ᨠᩬ๋")] = sc("ก๋อ"),
}

local function pkuue(m1, m2, m3)
    if mw.ustring.match(m3, "["..cons.."]["..vts.."]")
    or not mw.ustring.match(m3, "^["..cons.."]") then -- Mai kuue is in open syllable
        local repl = mw.ustring.match(m2, "^["..tone.."]")
        if repl then
            return sc("กื")..repl.."อ"..m3 -- Tone substituted at end
        else
            return sc("กือ")..m3
        end
    else
        return m1..m2..m3 -- Use final character by character substitution
    end
end

local function pkuea(m1, m2, m3)
    local is_maikoe -- Open syllable, Thai เกอ
    local repl = mw.ustring.match(m2, "^["..tone.."]")
    if not repl then repl = ""; end
    if "ᩋᩡ" == m3 then
        return sc("↶เกื")..repl.."อะ" -- Tone substituted at end
    elseif mw.ustring.match(m3, "["..cons.."]["..vts.."]") then
        is_maikoe = true
    elseif mw.ustring.match(m3, "^["..cons.."]") then
        if "ᩋ" == mw.ustring.sub(m3, 1, 1) then
            return sc("↶เกื")..repl..m3 -- Tone and consonant substituted at end
        else
            is_maikoe = false -- Closed syllable
        end
    else
        is_maikoe = true
    end
    if is_maikoe then
        return sc("↶เก")..repl.."อ"..m3 -- Tone substituted at end
    else
        return sc("↶เกื")..repl.."อ"..m3 -- Tone substituted at end
    end
end

-- We need a white list of 2-consonant clusters that may occur within words and belong to different phonetic
-- syllables.
-- Clusters that start Northern Thai words are a problem - they need a word boundary hint
-- to handle properly.  They could also be a problem within compound words.  For now, they
-- require explicit handling.
local white_list = { -- Is there a better idiom for a quickly checked list?
    ["ᨠ᩠ᨠ"]=1, ["ᨠ᩠ᨡ"]=1, ["ᨣ᩠ᨣ"]=1, ["ᨣ᩠ᨥ"]=1, ["ᨦ᩠ᨠ"]=1, ["ᨦ᩠ᨡ"]=1, ["ᨦ᩠ᨣ"]=1, ["ᨦ᩠ᨥ"]=1, ["ᨦ᩠ᩈ"]=1, 
    ["ᨧ᩠ᨧ"]=1, ["ᨧ᩠ᨨ"]=1, ["ᨩ᩠ᨩ"]=1, ["ᨩ᩠ᨫ"]=1, ["ᨬ᩠ᨧ"]=1, ["ᨬ᩠ᨨ"]=1, ["ᨬ᩠ᨩ"]=1, ["ᨬ᩠ᨫ"]=1, ["ᨬ᩠ᨬ"]=1, 
-- When this is applied, ᩋᩘᨩ will have been converted to ᩋᨦ᩠ᨩ.
    ["ᨦ᩠ᨧ"]=1, ["ᨦ᩠ᨨ"]=1, ["ᨦ᩠ᨩ"]=1, ["ᨦ᩠ᨫ"]=1, ["ᨦ᩠ᩈ"]=1,
    ["ᨧ᩠ᩈ"]=1, ["ᨬ᩠ᨪ"]=1, ["ᨪ᩠ᨫ"]=1, ["ᨱ᩠ᨬ"]=1, -- Weird, but seen or perceived
-- When this is applied, ᨭᩛ and ᨱᩛ will have been converted to ᨭ᩠ᨮ and ᨱ᩠ᨮ.
-- ᨭᩛ/ᨭ᩠ᨮ may have to be black-listed, for ᨭᩛ is often used for ᨮ.
    ["ᨭ᩠ᨭ"]=1, ["ᨭ᩠ᨮ"]=1, ["ᨯ᩠ᨯ"]=1, ["ᨯ᩠ᨰ"]=1, ["ᨱ᩠ᨭ"]=1, ["ᨱ᩠ᨮ"]=1, ["ᨱ᩠ᨯ"]=1, ["ᨱ᩠ᨰ"]=1, ["ᨱ᩠ᨱ"]=1, 
    ["ᨲ᩠ᨲ"]=1, ["ᨲ᩠ᨳ"]=1, ["ᨴ᩠ᨴ"]=1, ["ᨴ᩠ᨵ"]=1, ["ᨶ᩠ᨯ"]=1, ["ᨶ᩠ᨲ"]=1, ["ᨶ᩠ᨳ"]=1, ["ᨶ᩠ᨴ"]=1, ["ᨶ᩠ᨵ"]=1, ["ᨶ᩠ᨶ"]=1, 
-- When this is applied, ᨻᩛ and ᨾᩛ will have been converted to ᨻ᩠ᨻ and ᨾ᩠ᨻ.
    ["ᨷ᩠ᨷ"]=1, ["ᨷ᩠ᨹ"]=1, ["ᨻ᩠ᨻ"]=1, ["ᨻ᩠ᨽ"]=1, ["ᨾ᩠ᨷ"]=1, ["ᨾ᩠ᨹ"]=1, ["ᨾ᩠ᨻ"]=1, ["ᨾ᩠ᨽ"]=1, ["ᨾ᩠ᨾ"]=1, 
    ["ᨸ᩠ᨸ"]=1, ["ᨸ᩠ᨹ"]=1, ["ᨾ"]=1, ["ᨷ᩠ᨸ"]=1, ["ᨸ᩠ᨷ"]=1, -- Lao & mixed
    ["ᨠ᩠ᩇ"]=1, ["ᨦ᩠ᩆ"]=1, ["ᨦ᩠ᩇ"]=1, -- Sanskrit
-- When this is applied, ᩔ will have been converted to ᩈ᩠ᩈ.
    ["ᨿ᩠ᩉ"]=1, ["ᩃ᩠ᩉ"]=1, ["ᩅ᩠ᩉ"]=1, ["ᨿ᩠ᨿ"]=1, ["ᩃ᩠ᩃ"]=1, ["ᩈ᩠ᩈ"]=1, ["ᨱ᩠ᩉ"]=1, ["ᨬ᩠ᩉ"]=1, ["ᩊ᩠ᩉ"]=1, 
-- [""]=1, [""]=1, [""]=1, [""]=1, [""]=1, [""]=1, [""]=1, [""]=1, [""]=1,
}
local function satwl(m1, m2)
    if white_list[m2] then
        if m1 == "ᨿ" then -- prevent interpretation as mai kia.
            return disruptor.."ᨿ"..sc("ᨠᩢ")..m2
        else
            return m1..sc("ᨠᩢ")..m2
        end
    else
        return m1..m2	
    end
end

local esf = {
    [sc("ᨠᩘᨠ")] = sc("ᨦ᩠ᨠ"), [sc("ᨠᩜ")] = sc("ᨠ᩠ᨾ"), [sc("ᨠᩞ")] = sc("ᨠ᩠ᩈ"), ["ᩔ"] ="ᩈ᩠ᩈ" ,
}

local function pkra(m1, m2)
    if "ᨻᩕ" == m1 and "ᩉ" == m2 then
        return m1..m2 -- e.g. ᨻᩕᩉᩫ᩠ᨾ and ᨻᩕᩉᩢ᩠ᩈ
    else
         return m1..sc("ᨠᩡ")..m2
    end
end

local function pword(m1, m2)
-- Conceptually this is just
--	return m1..export.hardword(m2, "ᨻ")
-- but that was too slow.
	local entry = data.hard[m2]
	return m1..(entry and entry[1] or m2)
end

local function pword3(m1, m2)
-- Conceptually this is just
--	return m1..export.hardword(m2, "ᩈ")
-- but that would be too slow.
	local entry = data.hard[m2]
	return m1..(entry and (entry[2] or entry[1]) or m2)
end

local dbg3 = ''
local function padd_tone(m1, m2, m3, m4, m5)
	dbg3 = dbg3..m1..'s'..m2..'s'..m3..'s'..m4..'s'..m5..' => '
	local new_tone = Td -- To allow next syllable to be processed at next call.
--	local new_word = {m1, m2, m3, new_tone, m4, m5}
	local class = data.class[m1]
	local Cres = 'ᨦᨬᨱᨶᨾᨿᩁᩃᩅᩊ'
	local has_coda = false
	local branch = 0
	local sb2c = sc("^[ᨠᩩᩥᨠᩧᩢᨠᩫ]$")
	local sb2o = sc("^[ᨠᩩᩥᨠᩧᩢ]$")
	if class == 'R' then
		local sb15 = sc("^[ᨠᩡᨠ᩺ᨠ᩼]")
		if find(m5, '^'..u(0x1A60)) then
			if find(m5, u(0x1A60)..'['..Cres..']') then
				new_tone = sc("ก๋")
			end
			has_coda = true
			branch = 11
		elseif find(m5, "["..cons.."]["..vts.."]") then -- Sakot over-restrictive
			has_coda = false
			branch = 12
		elseif find(m5, "^["..cons.."]") then
			has_coda = true
			branch = 13
			if find(m5, "^["..Cres.."]") then
				new_tone = sc("ก๋")
				branch = 14
			end
		elseif find(m5, sb15) then
			has_coda = true
			branch = 15
		elseif find(m5, "^ำ") then
			has_coda = true -- Built in!
			new_tone = sc("ก๋")
			branch = 16
		end
		if not has_coda then
			if #m4 > 0 then
				new_tone = sc("ก๋")
				branch = 1
			elseif find(m3, sb2o) then
				branch = 2
			elseif #m2 == 0 and #m3 == 0 then
				branch = 3
			elseif #m2 == 3 and #m3 == 0 then -- count UTF-8 code units.
				branch = 4
			else
				new_tone = sc("ก๋")
				branch = 5
			end
		end
	elseif class == 'F' then
		local sb25 = sc("^[ᨠ᩺ᨠ᩼]")
		local dead = nil
		local short = nil
		if find(m5, '^'..u(0x1A60)) then
			dead = not find(m5, u(0x1A60)..'['..Cres..']')
			has_coda = true
			branch = 21
			if #m2 == 0 and #m3 == 0 and #m4 == 0 then
				dead = false -- not a dead syllable
				has_coda = true;
				branch = 27
			end
		elseif find(m5, "["..cons.."]["..vts.."]") then  -- Sakot over-restrictive
			has_coda = false
			branch = 22
		elseif find(m5, "^["..cons.."]") then
			if #m2 == 0 and #m3==0 and #m4==0 then
				has_coda = true
				dead = false -- Not a dead syllable
				branch = 28
			else
				has_coda = true
				branch = 23
				dead = not find(m5, "^["..Cres.."]")
			end
		elseif find(m5, "^ᩡ") then
			has_coda = true
			dead = true
			short = true
			branch = 24
		elseif find(m5, sb25) then
			has_coda = true
			dead = false -- Not a dead syllable
			branch = 25
		elseif find(m5, "^ำ") then
			has_coda = true -- Built in!
			dead = false
			branch = 26
		end
		if not has_coda then
			if #m4 > 0 then
				dead = false
				branch = 31
			elseif find(m3, sb2o) then
				dead = true
				short = true
				branch = 32
			else
				dead = false -- Not a dead syllable
				branch = 33
			end
		end
		if dead then
-- Next fails for compound vowels, but gets corrected by a dirty cheat in keoext_phonetic.
			if short == nil then short = find(m3, sb2c) end
			if short then
				new_tone = sc("ก๊")
			else
				new_tone = sc("ก้")
			end
		end
	end
--	dbg3 = dbg3..m1..m2..m3..new_tone..m4..m5..branch..' '
	return
	m1..m2..m3..new_tone..m4..m5
--	table.concat(new_word) -- Slower!
end

function export.tr(text, phonetic)
	local tStart = os.clock()
	if type(text) == "table" then -- called directly from a template
		phonetic = text.args[2]
		text = text.args[1]
	end
	text = "cont"..text.."exts" -- Supply ASCII context

    local wordchar = u(0x1A20).."-"..u(0x1A7F)..u(0x0E01).."-"..u(0x0E3A)..u(0x0E40).."-"..u(0x0E4E).."↶↷"
    local lwordchar = u(0x1A20).."-"..u(0x1A7F)
-- Expect text to have been subjected to Unicode normalisation.
-- Process known 'hard' words.
	if phonetic then
		text = gsub(text, "([^"..lwordchar.."])(["..lwordchar.."]*)", pword3)	
	else
		text = gsub(text, "([^"..lwordchar.."])(["..lwordchar.."]*)", pword)
	end
-- Undo the curse of Davis
	text = gsub(text, "("..sakot..")(["..tone.."]+)", "%2%1")
-- Drop syllable-internal mai sam
	text = gsub(text, "("..u(0x1A7B)..")(["..vt..medial..sakot.."])", "%2")
-- Deal with haang "ᩛ"
	text = gsub(text, "([ᨭ-ᨱ])ᩛ", "%1᩠ᨮ")
	text = gsub(text, "([ᨷ-ᨾ])ᩛ", "%1᩠ᨻ")
-- Simplify letters that implicitly stack etc.
	text = gsub(text, sc("[ᩔᩜᩘᨠᩞ]"), esf) -- Need U+1A60 in replacements
-- 2-character transformations.
	if phonetic then
		text = gsub(text, sc("[ᩃᨷᩮᩤᨠᩣᨠᩰᩢ][ᨠᩖᩢᩣᨠᩫᩤᨠᩕᩴ]"), kam)
	else
		text = gsub(text, sc("[ᩃᨷᩮᩤᨠᩯᩣᨠᩰᩢ][ᨠᩖᩢᩣᨠᩫᩤᨠᩕᩴ]"), kam)
	end
-- Manifest mai sat as an implicit vowel.
	text = gsub(text, "(["..cons..medial..sc("])([")..cons..sc("]ᨠ᩠[")..cons.."])", satwl)
-- Generate final implicit vowel after cluster.  Depends on end of word indication
    text = gsub(text, "([^"..sakot.."])(["..cons..sc("]ᨠ᩠[")..cons..
                      "])([^"..wordchar.."])", "%1%2ะ%3")
    text = gsub(text, "([^"..sakot.."])(["..cons.."]["..medial..
                      "])([^"..wordchar.."])", "%1%2ะ%3")
-- Generate final vowel in other cases.
	text = gsub(text, "(["..sc("ᨠᩰᨠᩣᨣᩤ][")..cons.."])([^"..wordchar.."])", "%1ะ%2")
	text = gsub(text, "(["..cons_not_wy.."]["..pure_cons.."])([^"..wordchar.."])", "%1ะ%2")
	text = gsub(text, "([^"..sakot.."][ᨿᩅ]["..cons.."])([^"..wordchar.."])", "%1ะ%2")
	text = gsub(text, "([^"..sakot.."]["..cons_squat..sc("][ᨠᩥᨠᩦᨠᩮᨠᩰ][")..cons..
    	              "])([^"..wordchar.."])", "%1ะ%2")
	if phonetic then -- Insert/modify tones
		text = gsub(text, sc("ᨠ᩠[ᩅᨿ]"), {[sc("ᨠ᩠ᩅ")] = Mw, [sc("ᨠ᩠ᨿ")] = My})
		local med3 = sc("ᨠᩕᨠᩖ")..Mw..My
		local Fc = "ᨣᨩᨴᨻ" -- Consonants that transliterate from low to mid
		local Rc = "ᨠᨧᨭᨲᨸ" -- Consonants that change from high to mid.
		text = gsub(text, "["..Fc..sc("]ᨠᩕ"),
				{["ᨣᩕ"] = "คᩕ", ["ᨩᩕ"] = "ชᩕ", ["ᨴᩕ"] = "ทᩕ", ["ᨻᩕ"] = "พᩕ"})
		local ctone = {[sc("ᨠ᩵")] = sc("ก้"), [sc("ᨠ᩶")] = sc("ก๊")}
		local function ptone(m1, m2) return m1..ctone[m2] end
		local Vo = sc("ᩮᩰᩯᩱᩲᨠᩩᩢᨠᩪᩥᨠᩬᩦᨠᩧᨠᩨᨠᩫᨠᩴ")
		text = gsub(text, "(["..Fc.."]["..med3.."]*["..Vo.."]*)("..sc("[ᨠ᩵ᨠ᩶])"), ptone)
		local Vf = sc("ᩤᩣ")
		text = gsub(text, "(["..Fc..Rc.."])(["..med3.."]*)(["..Vo.."]*)([ᩤᩣ]?)"..
						"([^"..tone..Vo..Vf..Td.."].)", padd_tone)
		text = gsub(text, "(["..Fc..Rc.."])(["..med3.."]*)(["..Vo.."]*)([ᩤᩣ]?)"..
						"([^"..tone..Vo..Vf..Td.."].)", padd_tone)
		text = gsub(text, "["..Mw..My..Td.."]",
				{[Mw] = sc("ᨠ᩠ᩅ"), [My] = sc("ᨠ᩠ᨿ"), [Td] = ""})
	end
-- And turn most final ᩁ into ᨶ.
	text = gsub(text, "([^"..sc("ᨣᩬᨣ᩠ (").."])ᩁ([^"..wordchar.."])", "%1ᨶ%2")
-- Deal with ᨠ᩠ᨿᩮ and ᨠ᩠ᩅᩫ derivatives.
	text = gsub(text, "(["..cons..medial.."])("..sakot.."[ᨿᩅ]["..pvtk.."]*)", pkia)
-- Does not handle Tai Khün ᨠ᩠ᨿ᩺ and ᨠ᩠ᩅ᩺!
--        text = gsub(text, "(["..cons..medial.."])("..sakot.."[ᨿᩅ]["..pvt.."]*)%f[^"..sakot.."]", pkia)
--long transformations
	if phonetic then
		text = gsub(text, sc("[ᨠᩮᨠᩰᩬᨠᩯ][")..pvt.."]*", keoext_phonetic)
	else
		text = gsub(text, sc("[ᨠᩮᨠᩰᩬᨠᩯ][")..pvt.."]*", keoext)
	end
	text = gsub(text, sc("(.ᨠᩕ)([")..cons.."])", pkra)
-- Context-dependent transliterations
-- ᨠᩨ ᨠᩮᩬᩥ (including ᨷᩮᩬᩥ᩵ᩋᩡ)
	text = gsub(text, sc("(ᨠᩨ)(["..tone.."]?ᨠ᩠?)(.?.?)"), pkuue)
	text = gsub(text, sc("(ᨠᩨ)(["..tone.."]?ᨠ᩠?)(.?.?)"), pkuue) -- Run again for words like ᨾᩨᨳᩨ
	text = gsub(text, sc("(ᨠᩮᩬᩥ)(["..tone.."]?ᨠ᩠?)(.?.?)"), pkuea)
	text = gsub(text, sc("(ᨠᩮᩬᩥ)(["..tone.."]?ᨠ᩠?)(.?.?)"), pkuea) -- Run again for same reason

	if phonetic then
		text = gsub(text, ".", data.tt3)
	else
		text = gsub(text, ".", tt)
	end
	text = gsub(text, sc("([ᨠ᩠")..u(0x200D).."][ก-ฮ]̱?)↶([เแโไใ])", "↶%2%1")
	text = gsub(text, sc("(ᨠ᩠").."[ก-ฮ]̱?)↶([เแโไใ])", "↶%2%1") -- Words like ᩉ᩠ᨦ᩠ᩅᩯ᩶᩻ yielding แหงว้ ๆ
	text = gsub(text, "([ก-ฮ]̱?)↶([เแโใไ])", "%2%1")
	text = gsub(text, sakot, "")
	text = gsub(text, "("..sc("ก็")..")↷([ก-ฮ]̱?)", "%2%1")

	return
--	'\n '..(os.clock()-tStart)..' ᩅᩥᨶᩣᨴᩦ:\n'..
	string.sub(text, 5, -5) --..dbg3 -- discard added ASCII context
end

local debug = "+DEBUG="
local function plink(m1, m2, m3)
    m2 = gsub(m2, "^:", " :")
    if m3 == "|" then
        return m1.."{{Wp/nod/ᩀ᩵ᩣᨳᩬᨯ-ᨠ|"..m2.."}}|"
    else
        return m1.."{{Wp/nod/ᩀ᩵ᩣᨳᩬᨯ-ᨠ|"..m2.."}}|"..m2..m3
    end
end

local trans_mode
local function recur(m1, m2, m3)
    debug = debug..";"..m1..","..m2..","..m3
    if mw.ustring.find(m1, "^.%{%{%u%u") then -- Might be magic!
        debug = debug..",MAGIC "
        return m1..m2..m3
    elseif "|" == m2 then
        if mw.ustring.find(m3, "^ᩅᨳ=") then -- Don't override
            debug = debug..",NOOVERRIDE "
            return m1..m2..m3
        end
    end
    debug = debug..",DONE "
    return m1.."|ᩅᨳ="..trans_mode..m2..m3
end

local function pparam(m1, m2, m3)
	if mw.ustring.find(m2, "^|") or m2 == "" then
		return trans_mode
	else
		return m1..m2..m3
	end
end

function export.trpage(page, phonetic)
	if type(page) == "table" then -- called directly from a template
		page = page.args[1] or PAGENAME
		phonetic = false
	end

	local text = mw.title.new(page):getContent()
-- Remove invocations of whole page transliteration templates
	text = gsub(text, "%{%{Wp/nod/translit%}%}", "")
	text = gsub(text, "%{%{Wp/nod/xlit[23]%}%}", "")
-- Remove text that is not to be repeated.
	text = gsub(text, "%{%{Wp/nod/ᩀ᩵ᩣᨪ᩶ᩣᩴ%}%}.-%{%{Wp/nod/ᨿᩬᨾᨪ᩶ᩣᩴ%}%}", "")
-- Insert transliteration parameter in plausible looking templates.  More work may be
-- necessary to make this robust.  Only allow templates whose names begin with a letter,
-- and disallow names beginning with two capitals.  (Latter test is deferred to recur().)
	local debug_extra_parameter = false
	if debug_extra_parameter then debug = debug..text end
	if phonetic then
		trans_mode = 'ᩈ'
	else
		trans_mode = 'ᨠ'
	end
	text = gsub(text, "([^%{]%{%{%a[^%{]-)([%}%|])(.-%})", recur)
	text = gsub(text, "(%{%{%{ᩅᨳ)([^}]*)(%}%}%})", pparam)
	if debug_extra_parameter then debug = debug.." Expanded to:"..text end
-- Hide Wiki-specific markup
--	text = gsub(text, "<ref.-</ref>", "")
--	text = gsub(text, "<references/?>", "")

-- Calculation of frame filched from https://commons.wikimedia.org/wiki/Module:NationAndOccupation/sandbox
	local frame=mw.getCurrentFrame()
	text = frame:preprocess(text)
-- Protect links
	text = gsub(text, "(%[%[)([^%]|]+)([%]|])", plink)
	text = frame:preprocess(text) -- Expand Template:ᩀ᩵ᩣᨳᩬᨯ-ᨠ -- TO DO: Move task to plink.
	debug = ''
	text = export.tr(text, phonetic)
	return text -- ..debug
end

function export.lettername(letter, way)
    if type(letter) == "table" then -- called directly from a template
        way = letter.args["ᩅᨳ"]
        letter = letter.args[1]
    end
    local odd = data.oddname[letter]
    local prefix = nil
    if mw.ustring.find("ᨭᨮᨰᨱ", letter) then
    	prefix = 'ระ'
    else
    	prefix = ''
    end
    if odd then
    	return odd
    elseif 'ᩈ' == way -- planned for transcription)(transliteration
    or     'ᨠ' == way 
    then
        local class = data.class[letter]
        local newlet = data.tt3[letter]
        local tonerule = {["L"]=sc("ก"), ["F"]=sc("ก๊"), ["M"]=sc("ก๋"),
                          ["R"]=sc("ก๋"), ["H"]=sc("ก๋"), }
        local tone = tonerule[class]
        if class and tone and newlet then
            return prefix..newlet..tone.."ะ"
        end
    elseif 'ᨠ' == way then -- Unreachable while the branch above also accepts 'ᨠ'
        local class = data.class[letter]
        local newlet = data.tt2[letter]
        if class and newlet then
            return prefix..newlet.."ะ"
        end
    else
        return letter
    end
    return "{{Wp/nod/huge|{{Wp/nod/font color|red|white|{{Wp/nod/ᩀ᩵ᩣᨳᩬᨯ|"..letter..
           "}} ᨷ᩵ᨸᩮ᩠ᨶᩋᨠ᩠ᨡᩁ}}}}"
end 

function export.hardword(word, way)
	local advice = nil -- Set only when invoked from a template
	if type(word) == "table" then
		local frame = word
		word=frame.args[1]
		way=frame.args["ᩅᨳ"]
		advice = frame.args[way] or frame.args["ᨠ"]
	end
	if word == nil then
		return "" --.."x1"
	end -- An error message might be appropriate
	if way == nil then
		return word --.."x2"
	end -- Lazy, pointless invocation.
	if advice then
		return advice --.."x3"
	end
	local wordin = word
	word = gsub(word, u(0x200B), "") -- Strip ZWSP
	local entry = data.hard[word]
	if not entry then
		return wordin --.."x4"
	end -- Nothing useful achieved
	if way == "ᨠ" then -- Transliteration of Northern Thai
		-- This is the fallback assumption
	elseif way == "ᩈ" then -- Transcription of Northern Thai (i.e. Siamese sound values)
		if entry[2] then
			return entry[2] --.."x5"
		end	
	end
	if entry[1] then
		return entry[1] --.."x6"
	else
		return wordin --.."x7" -- Nothing useful achieved.  Consider logging.
	end
end

function export.trphage(page)
	if type(page) == "table" then -- called directly from a template
		page = page.args[1] or PAGENAME
	end
	return export.trpage(page, true)
end

return export