Some first thoughts about Arabic i18n



During the 2nd Indico Workshop in fall 2017, it was mentioned that an Arabic translation of Indico would be desirable for 2018. Such a project would imply more work than for left-to-right languages, which are in that sense compatible with English and the other translations done so far. It would therefore have a strong impact on code development and require a commitment from developers to follow new rules: not only will character strings have to be translated, but the arrangement of HTML elements must also be adapted, if not systematically then at least along some main lines. I am trying to illustrate the complexity with this post.
It is not meant to discourage us from doing it; there is a huge potential for this language, which has the 5th most native speakers in the world. But the effort should not be underestimated and should be carefully planned, in order not to get stuck or be disappointed by the result.

Here is a tentative rendering of a Participant List in Indico:


It illustrates that titles, like all text, need to be right-aligned. But the alignment of column headers is not obvious if a mixture of Arabic “legend” and Latin-character “content” is allowed in the conference. In addition, a native Arabic writer would naturally change the order of the columns. The left-hand menu bar should appear on the right side.
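
As a rough illustration of the layout points above, here is a minimal sketch. It is not Indico's actual markup: the class names, the Arabic labels and the sample row are placeholders chosen for this example. It relies on the fact that, with dir="rtl" on the root element, table columns and flex rows are laid out from right to left, so the menu position and the column order mirror without extra rules.

```html
<!DOCTYPE html>
<html lang="ar" dir="rtl">
<head>
<meta charset="utf-8">
<style>
  /* Hypothetical class names, not taken from Indico. */
  .page { display: flex; }        /* flex rows follow dir: the first child sits on the right in RTL */
  .side-menu { flex: 0 0 12em; }  /* fixed-width menu column */
  .content { flex: 1; }           /* main area takes the remaining width */
  th, td { text-align: start; }   /* "start" resolves to right in RTL and left in LTR */
</style>
</head>
<body>
<div class="page">
  <nav class="side-menu">القائمة</nav>
  <main class="content">
    <h1>قائمة المشاركين</h1>
    <table>
      <!-- In an RTL table the first column is the rightmost one. -->
      <tr><th>الاسم</th><th>المؤسسة</th></tr>
      <!-- bdi isolates Latin content from the surrounding RTL text;
           the cell itself stays RTL and therefore right-aligned. -->
      <tr><td><bdi>Jane Doe</bdi></td><td><bdi>Example Institute</bdi></td></tr>
    </table>
  </main>
</div>
</body>
</html>
```

Whether mixed Arabic/Latin cell content should stay right-aligned, as it does here, is exactly the open question; giving a cell dir="ltr" instead would left-align it under the same text-align rule.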

The CSS 2 standard already contained the rtl/ltr (right-to-left/left-to-right) values for bidirectional writing. (And I believe it has been extended to positioning div and p elements accordingly as well.) But in order to give everybody something more to think about, here is an example containing more subtleties.
As you will notice, numbers are (always) written left-to-right in Arabic (i.e. opposite to the text direction)! Hence the rendering of this example is wrong, and I could not figure out why. (It works in Arabic Wikipedia.) In this special case the order of the words and of the group of digits also gets reversed (at least on my FF 52.4.0).

It might also be instructive to have a look at the genesis of the Arabic Wikipedia. By the way, they use “Western Arabic” numerals.

When it comes to the translation of dates, note that there are five different schemes for naming the months of the solar calendar in Arabic.

More reading about RTL/LTR (using Hebrew for the examples).


PS: As I cannot guarantee its permanent availability on the w3schools site, here is the HTML source used above for reference.
<style type="text/css">
p {unicode-bidi:bidi-override;}
.ltr {direction:ltr;}
.rtl {direction:rtl;}
</style>

<p class="ltr">Writing text from left to right: 0 1 2 3 4 5 6 7 8 9 </p>
<p class="rtl">٠١٢٣٤٥٦٧٨٩ من اليمين إلى اليسار <br/>
Writing text from right to left: 0 1 2 3 4 5 6 7 8 9.</p>
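
A possible explanation for the reversed digits and words, offered as an assumption that has not been checked on the Firefox version mentioned above: unicode-bidi:bidi-override switches off the implicit Unicode bidirectional algorithm, so every character inside the paragraph, digits and Latin letters included, is laid out strictly in the order given by direction. A variant of the source without that rule would look like this:

```html
<style type="text/css">
/* No unicode-bidi override here: the browser's bidirectional
   algorithm decides the order of digit and Latin runs itself. */
.ltr {direction:ltr;}
.rtl {direction:rtl;}
</style>

<p class="ltr">Writing text from left to right: 0 1 2 3 4 5 6 7 8 9 </p>
<p class="rtl">٠١٢٣٤٥٦٧٨٩ من اليمين إلى اليسار <br/>
Writing text from right to left: 0 1 2 3 4 5 6 7 8 9.</p>
```

With the override removed, digit groups and Latin words embedded in the right-to-left paragraph are expected to keep their left-to-right reading order, while the paragraph as a whole stays right-aligned.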