When it comes to the transcription of languages I can be very pedantic. Decisions that are made need to (1) be consistent and (2) actually reflect how the language is to be pronounced. This strikes me as self-evident, but time and time again, I find that people do a bad job at these things. This post is a summary of some common pitfalls in people's transcriptions of Classical Arabic, seemingly due to grammar books doing a poor job of explaining these things.
While I myself vary somewhat in which transcription system I use depending on the audience (using signs transparent to linguists when I write more linguistic articles), I personally find that the DMG transcription is mostly excellent, and I use it, with some minor tweaks, in works aimed at a more Arabist audience. The problem, however, is that the article that explains the transcription system does a poor job of explaining some of the common pitfalls.
The 3rd person masculine suffix
Most people, when transcribing Classical Arabic, tend to transcribe the pronominal suffix -hu/-hi without a long vowel in all contexts. This is incorrect. In Classical Arabic, the forms with a short vowel -hu/-hi are only used after heavy syllables: katabnāhu 'we wrote it', minhu 'from it', fīhi 'in it', ʿalayhi 'on it'. When the suffix follows a light syllable, it has a long vowel -hū/-hī, e.g. ʾinnahū 'verily, he', bihī 'with it'.
It is somewhat understandable that people get this wrong. Classical Arabic orthography in Naskh script (usually) does not have good tools to express this distinction, using هُ for -hu and -hū and هِ for -hi and -hī. In the Naskh style, we occasionally see that -hi and -hī are distinguished, where the former is spelled هِ and the latter هٖ. This becomes standard practice in Ottoman Qurans. It annoys me to no end that they never thought to come up with a similar visual distinction between -hu and -hū... but what can you do. Modern print Qurans from the Indian subcontinent did devise such a distinction, using an upside-down ḍammah for -hū: هٗ.
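The length rule above can be sketched in a few lines of Python. This is only an illustration, not a full syllabifier: the function name `third_masc_suffix` is made up for this example, it treats a stem ending in a short vowel as "light" and everything else (long vowel, diphthong, consonant) as "heavy", and it assumes the usual harmony of -hu to -hi after i, ī and ay, which the examples above follow but which is not spelled out in this post.

```python
import unicodedata

SHORT_VOWELS = set("aiu")
LONG_FORM = {"hu": "hū", "hi": "hī"}


def third_masc_suffix(stem: str) -> str:
    """Attach the 3rd person masculine suffix to a DMG-style stem.

    Hypothetical helper: the stem is expected without the suffix,
    e.g. "katabnā", "min", "bi".
    """
    # Use precomposed characters so that "ī" is a single code point.
    stem = unicodedata.normalize("NFC", stem)

    # Assumption: the suffix harmonizes to -hi after i, ī and ay.
    base = "hi" if stem.endswith(("i", "ī", "ay")) else "hu"

    # A stem ending in a short vowel ends in a light syllable,
    # which calls for the long form of the suffix.
    ends_light = stem[-1] in SHORT_VOWELS
    return stem + (LONG_FORM[base] if ends_light else base)


for stem in ["katabnā", "min", "fī", "ʿalay", "ʾinna", "bi"]:
    print(stem, "->", third_masc_suffix(stem))
```

The loop runs the function over the six stems behind the examples given above, so the output can be compared against them directly.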
In the Maghreb, in carefully vocalised manuscripts we see that for -hū a miniature wāw is written below the hāʾ and for -hī a miniature returning yāʾ is written above the hāʾ. A similar practice is found in modern print Qurans like the Cairo Edition and the Medina Mushaf, which write a wāw and yāʾ after the hāʾ.
The feminine demonstrative pronoun
Similar to the 3rd person masculine suffixes, the feminine demonstrative pronoun, despite being written هذه, in fact ends with a long vowel in connected speech: hāḏihī. Therefore the common transcription hāḏihi is wrong.
Iltiqāʾ al-Sākinayn
Very few people seem to know the proper rules for what to do with a word that ends in a consonant followed by a word that starts with two consonants with no intervening vowel. This is known as Iltiqāʾ al-Sākinayn. As a general rule, an epenthetic vowel is inserted and this epenthetic vowel is -i.
- qālat-i l-yahūdu
- ʾaw-i mraʾatun
- ʾiḏ-i ẓ-ẓālimūna
- man-i ttaḫaḏa
- raǧulun-i ftarā
One main exception to this is the plural pronominal forms that end in -m (-hum, -tum, -kum, hum, ʾantum). In these cases the epenthetic vowel is always -u.
- kānū hum-u ẓ-ẓālimīna
- ʾa-fa-raʾaytum-u l-māʾa
As to the treatment of the harmonized form -him, so few people get this right that I'm not even sure what the standard form "should" be. The grammarians say both -him-i and -him-u are acceptable. -him-u is the form used by the most common Quranic reading traditions, and it is probably safe to go with that.
When a word that ends in a consonant occurs before a word that starts with two consonants followed by a stem-vowel u, like the imperative (u)ḫruǧ, the grammarians allow both vowel harmony and no vowel harmony, e.g. ʾaw-u ḫruǧū and ʾaw-i ḫruǧū. The latter is the easier practice, so it is probably best to stick to that.
For the verbal plural ending -aw, such as in ʾātaw or taḫšaw, the connecting vowel is always -u. Here -u is the only option; -i is incorrect.
- ʾātaw-u z-zakāta
- taḫšaw-u n-nāsa
The preposition min has an exceptional status. Whenever it precedes the definite article its epenthetic vowel is -a, e.g. min-a n-nāsi. When it precedes another ʾalif al-waṣl, it is -i: min-i mriʾin.
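The rules of this section can be summed up as a small decision function. This is a sketch under stated assumptions: the names `connecting_vowel` and `join` and the keyword flags are invented for this example, and the caller has to say what kind of word it is dealing with, since a plural pronoun or a verbal -aw ending cannot be reliably recognised from the string alone. For harmonized -him it follows the -u recommendation above (pass `plural_pronoun=True`), and before a u-stem imperative it follows the simpler -i option.

```python
def connecting_vowel(
    word: str,
    *,
    plural_pronoun: bool = False,
    verbal_aw: bool = False,
    before_article: bool = False,
) -> str:
    """Return the epenthetic vowel inserted after `word` (hypothetical helper)."""
    if word == "min":
        # min takes -a before the definite article and -i before
        # any other ʾalif al-waṣl.
        return "a" if before_article else "i"
    if plural_pronoun or verbal_aw:
        # Plural pronominal forms in -m and the verbal plural
        # ending -aw take -u.
        return "u"
    # General rule; also the simpler permitted option before a
    # u-stem imperative such as (u)ḫruǧ.
    return "i"


def join(word: str, next_word: str, **flags: bool) -> str:
    """Join two words, marking the epenthetic vowel with a hyphen."""
    return f"{word}-{connecting_vowel(word, **flags)} {next_word}"


print(join("qālat", "l-yahūdu"))
print(join("hum", "ẓ-ẓālimīna", plural_pronoun=True))
print(join("ʾātaw", "z-zakāta", verbal_aw=True))
print(join("min", "n-nāsi", before_article=True))
print(join("min", "mriʾin"))
```

The second word is passed in its connected form (without its own initial vowel), exactly as it is written in the examples above.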
The first person pronoun
While the first person pronoun, from its spelling, might look like it is pronounced ʾanā, this is incorrect. It is pronounced ʾana with a short vowel. Only when pausing on this word is the form ʾanā. In modern prints of the Quran this is absolutely clear, as أنا consistently has a small circle on top of the second ʾalif to indicate that it is not pronounced.
The pausal indefinite accusative
Most people get the pausal forms of nouns right when dealing with any case form but the indefinite accusative. Thus, most know that it is raǧul for raǧulun, raǧulin and ar-raǧul for ar-raǧulu, ar-raǧuli and ar-raǧula. But for some reason it is fashionable to render the indefinite accusative raǧulan not as the correct raǧulā, but to keep the non-pausal form with the tanwīn. This is no doubt inspired by Modern Standard Arabic, where this is quite common even in pause (nobody says ʾahlan wa-sahlā). However, if the goal is to transcribe Classical Arabic, this is incorrect. raǧulā is the correct form.
The pausal feminine ending -at-
The feminine ending -at- in pause is frequently transcribed simply as -a. This is in line with the Modern Standard Arabic pronunciation, and thus quite defensible in that context. However, for Classical Arabic, the proper pronunciation is -ah and therefore should be transcribed as such. The -a spelling is especially annoying to me in the transcription of the Quran where the feminine ending -atun clearly rhymes with -a-hū in pause, thus obviously showing that they are both pronounced identically: -ah, rhymes with -a-h.
As a side-note, feminine nouns with a long vowel before the ending, like aṣ-ṣalātu, should have pausal forms in -āh: aṣ-ṣalāh. Again, in Modern Standard Arabic aṣ-ṣalāt is popular, but it is incorrect for Classical Arabic.
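The pausal rules from the last two sections can be put together in a short sketch. The function name `pausal` and the `feminine_ending` flag are made up for this illustration; the flag is needed because a final -t- may just as well be a root consonant, which the string alone does not tell us. The sketch only covers fully inflected nouns with the regular case endings, and madīnatun 'city' is an extra example word that is not taken from the post.

```python
CASE_ENDINGS = ("un", "in", "an", "u", "i", "a")


def pausal(word: str, feminine_ending: bool = False) -> str:
    """Return the Classical Arabic pausal form of an inflected noun.

    Hypothetical helper: set feminine_ending=True when the -t- before
    the case vowel is the feminine ending rather than a root consonant.
    """
    ending = next((e for e in CASE_ENDINGS if word.endswith(e)), None)
    if ending is None:
        return word  # nothing this sketch knows how to strip
    stem = word[: -len(ending)]

    if feminine_ending:
        if not stem.endswith("t"):
            raise ValueError(f"no feminine -t- found in {word!r}")
        # -at- becomes -ah and -āt- becomes -āh in pause, in every case form.
        return stem[:-1] + "h"
    if ending == "an":
        # The indefinite accusative keeps a long vowel: raǧulan -> raǧulā.
        return stem + "ā"
    # All other case endings are simply dropped.
    return stem


print(pausal("raǧulun"), pausal("raǧulan"), pausal("ar-raǧula"))
print(pausal("aṣ-ṣalātu", feminine_ending=True))
print(pausal("madīnatun", feminine_ending=True))
```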
Word-initial hamzah is phonemic and should be written
Because word-initial vowels in European languages are automatically preceded by a glottal stop (hamzah), transcribers tend to think of word-initial hamzah in Arabic as just being 'automatic'. However, it is phonemically salient, and therefore better written. This is a place where I deviate from the DMG transcription system. In absolute word-initial position not writing the hamzah is not something so terrible that it is a hill that I'll die on, but I find it useful to write it because it gives us a chance to indicate a difference in behaviour between ism (which loses its initial vowel in connected speech) and ʾiṯm (which does not lose its initial vowel).
After the definite article, not writing it becomes extremely confusing. الاسم is pronounced [alism] and الإثم is pronounced [alʔiθm]. They are spelled differently in Arabic orthography and they are pronounced distinctly, so transcribing them as al-ism and **al-iṯm is confusing and wrong. The latter should be spelled al-ʾiṯm.
The plural base ʾul-
Several forms use the plural pronominal base ʾul-, e.g. ʾulāʾika "those". Despite its spelling with wāw, أولئك, the base is not pronounced with a long vowel **ʾūl-, and therefore should not be transcribed as such.
ūw and uww, īy and iyy are not interchangeable
Sometimes we see that authors have the tendency to write uww (ḍammah followed by a wāw that carries a šaddah) as ūw, but the latter is proper to ḍammah followed by wāw followed by another wāw. These two sequences are contrastive and should be treated as such. Thus the passive of qawwala is quwwila قُوِّلَ and the passive of qāwala is qūwila قُووِلَ. Admittedly, it is a lot harder to find a contrastive pair between īy and iyy, but iyy still better reflects the orthography and pronunciation, so I see no reason why we would transcribe this as īy. It is better to use the spelling that parallels the ūw/uww contrast.
Aḷḷāh not Allāh!
Another pet peeve that I have with the transcription of Arabic is that the name of God is written with a normal lām. This would be fine if we were dealing with a transliteration; after all, the Arabic script does not distinguish it from a normal lām either. However, a transcription aimed at rendering the pronunciation of Classical Arabic should make this distinction, as it is a separate sound. The name of God is pronounced with a velarized long l [lˁː] whenever a or u precedes, and if you were to pronounce the name with a non-emphatic l, you would be pronouncing it wrong. Therefore, the convention of writing an underdot for emphatic consonants should be adhered to here as well, i.e. mina ḷḷāhi, li-llāhi, fa-zāda-hum-u ḷḷāhu.
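For those who generate transcriptions programmatically, the choice between ḷḷ and ll can be sketched as follows. The helper names `last_base_letter` and `allah_form` are invented for this example, and the sketch only handles the name in connected speech after a word ending in a vowel; it looks at the base letter of that vowel, so that ā counts as a and ū as u.

```python
import unicodedata


def last_base_letter(text: str) -> str:
    """Return the last letter of `text` with diacritics (e.g. macrons) removed."""
    decomposed = unicodedata.normalize("NFD", text)
    bases = [ch for ch in decomposed if not unicodedata.combining(ch)]
    return bases[-1] if bases else ""


def allah_form(preceding: str, case_vowel: str) -> str:
    """Return the name of God as it follows `preceding` in connected speech.

    Hypothetical helper: emphatic ḷḷ after a or u, plain ll otherwise.
    """
    emphatic = last_base_letter(preceding) in ("a", "u")
    lam = "ḷḷ" if emphatic else "ll"
    return f"{lam}āh{case_vowel}"


print("mina", allah_form("mina", "i"))
print("li-" + allah_form("li", "i"))
print("fa-zāda-hum-u", allah_form("fa-zāda-hum-u", "u"))
```

Whether the name is joined with a space or a hyphen is left to the caller, since that depends on whether the preceding element is an independent word or a proclitic such as li-.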
That is all for now. I may think of some more issues in transcription, and then will update accordingly. I will probably write another post soon on how to transcribe Quranic reading traditions.