and Open Source Movements in Thailand
Theppitak KAROONBOONYANAN, Thaweesak KOANANTAKOOL
National Electronics and Computer Technology Center
National Science and Technology Development Agency
Ministry of Science, Technology and Environment, Thailand.
The need for IT standardization in Thailand has been recognized since 1984, when more than 26 sets of character codes were in use. Two years later, an agreed standard code for the Thai language was announced as a Thai Industrial Standard, TIS 620-2529/1986. At that time, however, only the codes were standardized; the input/output systems for computer processing have not yet been unified. Operating systems and applications have been localized individually, based on different conventions. The proprietary standard that gains the lion’s share of the market becomes the de facto standard, no matter how far its enhancements deviate from the industrial standards. Interoperability problems are therefore inevitable, especially now that different systems are connected through the Internet. Hence, standardization plays an important role in moderating the plethora of practices.
Recently, the open source paradigm has become widespread as another model for software development. The openness of the source code also makes it possible to check that the software conforms to the standards and meets users’ needs.
To shape consistent language support technology in the country, standardization activities and responses to open source movements are thus important, and will be described in this paper.
1. Character Sets
The national standard character set for use in computers is TIS 620-2533/1990, from which several character sets are derived, for example, IBM code page 874 (cp-874), Microsoft code page 874 (windows-874) and Apple Thai (MacThai). These character sets are widely adopted in proprietary software, causing conflicts in communication among different platforms on the Internet.
Ironically, the conflict is largely a matter of names. In practice, only the characters common to TIS 620 are exchanged, under different code set labels. How an application responds to a code set with an “unknown” name varies: some ignore the label and process the text with their default preferences, while others simply reject the text.
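The observation that only the common TIS 620 characters are exchanged can be checked with a minimal sketch. It assumes Python's standard codecs `tis_620`, `cp874` and `iso8859_11` as stand-ins for the labels discussed here; the sample word and the helper name `encode_all` are illustrative choices, not part of any standard.

```python
# Sample Thai text made only of TIS 620 characters (illustrative choice).
SAMPLE = "ภาษาไทย"

# Python codec names used as stand-ins for the code set labels in the text.
CODECS = ["tis_620", "cp874", "iso8859_11"]

def encode_all(text):
    """Return a mapping from codec name to the encoded bytes of `text`."""
    return {name: text.encode(name) for name in CODECS}

encoded = encode_all(SAMPLE)

# True when every codec produces exactly the same byte sequence.
same_bytes = len(set(encoded.values())) == 1
print(same_bytes)
```

For text limited to the common repertoire, the label is the only thing that differs between these encodings, which is why an application that rejects an unfamiliar label fails on data it could otherwise read.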
Ad hoc solutions are also ubiquitous, such as labeling Thai e-mails and web sites with the “iso-8859-1” or “x-user-defined” code name, which lets a Thai message slip through to the receiver in situations where the label is only loosely checked. But that is not always the case.
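Why the “iso-8859-1” workaround can succeed, and why it is fragile, can be illustrated with a short sketch. It assumes Python's `tis_620` and `iso-8859-1` codecs; the greeting used as sample text is arbitrary.

```python
thai = "สวัสดี"

# Bytes actually placed on the wire by a Thai-aware sender.
wire_bytes = thai.encode("tis_620")

# A receiver that trusts the "iso-8859-1" label sees unrelated Latin-1 characters.
as_latin1 = wire_bytes.decode("iso-8859-1")

# ISO 8859-1 assigns a character to every byte used here, so nothing is lost
# in transit; a receiver configured for Thai can still recover the text.
recovered = as_latin1.encode("iso-8859-1").decode("tis_620")

print(recovered == thai)
```

The message survives only because the bytes are carried through unchanged and the receiver happens to interpret them as Thai; a receiver that honors the label displays the Latin-1 characters instead.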
In September 1998, the “tis-620” MIME character set was registered by Trin Tantsetthi with the Internet Assigned Numbers Authority (IANA) of the Internet Engineering Task Force (IETF). A group of developers has set up a campaign to promote the use of the new standard MIME character set.
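As an illustration of what using the registered name looks like in practice, the following sketch builds a mail message labeled with the “tis-620” charset. It assumes Python's standard `email` package; the subject line and body text are made up for the example.

```python
from email.message import EmailMessage

body = "สวัสดีครับ"

msg = EmailMessage()
msg["Subject"] = "Thai charset label example"  # hypothetical subject
# Encode the body as TIS 620 and declare the registered MIME charset name.
msg.set_content(body, charset="tis-620")

declared = msg.get_content_charset()  # charset parameter of Content-Type
decoded = msg.get_content()           # body decoded using the declared charset

print(msg["Content-Type"])
```

A receiving application that recognizes the label can decode the body without guessing, which is the behavior the ad hoc labels cannot guarantee.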
In 1999, the international standard ISO/IEC 8859-11 Latin/Thai character set was reactivated by ISO/IEC JTC1/SC2/WG2, and it is becoming another potential choice for the standard MIME character set. Once it is adopted, “tis-620” and “iso-8859-11” are likely to be aliases of each other.
For multilingual documents, “utf-8” is another possible encoding. Nonetheless, the lack of UTF-8 editors is still a problem.
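The trade-off between the single-byte Thai encoding and UTF-8 can be sketched as follows. The helper name `tis620_to_utf8` and the mixed-language sample string are assumptions made for this illustration.

```python
def tis620_to_utf8(data: bytes) -> bytes:
    """Re-encode TIS 620 bytes as UTF-8."""
    return data.decode("tis_620").encode("utf-8")

thai_only = "ไทย"
legacy = thai_only.encode("tis_620")   # one byte per Thai character
converted = tis620_to_utf8(legacy)     # three bytes per Thai character

# A multilingual string (Thai plus Japanese) cannot be expressed in TIS 620.
mixed = "ไทย 日本語"
try:
    mixed.encode("tis_620")
    fits_in_tis620 = True
except UnicodeEncodeError:
    fits_in_tis620 = False

mixed_utf8 = mixed.encode("utf-8")     # UTF-8 covers both scripts
```

UTF-8 costs more bytes for Thai text, but it is the only one of the encodings discussed here that can carry Thai together with scripts outside the Latin/Thai repertoire.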
The third edition of ISO/IEC 14651 International String Ordering includes an informative annex describing Thai string ordering. Hopefully, the ordering of Thai in the standard will be satisfactory for Thai users.
A detailed principle for Thai string ordering has been proposed by a group of developers and, as a consequence, the LC_COLLATE category of the POSIX locale has been defined, followed later by the other categories.
In cooperation with the GNU C library project, the draft POSIX locale took effect with glibc 2.1.1, which is used in modern Linux distributions such as Red Hat 6.0. Applications known to be internationalized and to reflect the Thai locale include the Linux ‘date’ and ‘cal’ commands, GNOME calendar, GNOME panel clock, KDE panel clock, and Perl 5.
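How an application picks up the LC_COLLATE category can be sketched with Python's standard `locale` module. The locale names in `CANDIDATES` are assumptions, since the names installed vary from system to system, and the function `sort_thai` is a hypothetical helper that falls back to plain code point order when no Thai locale is available.

```python
import locale

# Candidate locale names (assumptions; availability depends on the system).
CANDIDATES = ["th_TH.UTF-8", "th_TH.TIS-620", "th_TH"]

def sort_thai(words):
    """Sort words using the Thai LC_COLLATE rules when a Thai locale exists.

    Returns (sorted_words, locale_name). locale_name is None when no Thai
    locale could be selected and code point order was used instead.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        for name in CANDIDATES:
            try:
                locale.setlocale(locale.LC_COLLATE, name)
            except locale.Error:
                continue  # this locale is not installed; try the next one
            return sorted(words, key=locale.strxfrm), name
        return sorted(words), None
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore the setting

words = ["ไก่", "กา", "เก"]
ordered, used_locale = sort_thai(words)
print(used_locale, ordered)
```

The application code contains no Thai-specific rules; the ordering comes entirely from the locale definition supplied by the C library, which is what makes a shared, standardized locale valuable.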
Thai fonts currently available in the market are designed based on Roman font metrics. This is not appropriate for Thai glyphs, since Thai characters are written on 4 levels. As a result, Thai glyphs are usually compressed to make room for the 4 levels, and look smaller than Roman letters of the same point size.
The National Electronics and Computer Technology Center (NECTEC) therefore set up a committee to draft standard metrics for Thai glyphs relative to Roman glyphs and to create prototype fonts to be placed in the public domain.
Three public-domain fonts, known as National Fonts (NF) 1, 2 and 3, are now available to the public. They are intended to be the default fonts on every platform. NF1 and NF3 are serif fonts; NF2 is sans serif. NF4 is planned as a “calligraphic” model font and NF5 as a “handwriting” model font. In December 1999, the official names of these fonts will be announced as part of the celebration of the 6th cycle anniversary (72nd birthday) of His Majesty The King of Thailand.
4. Tai Scripts Studies
The Thai language used in central Thailand belongs to the Tai language family. The scripts of this family have caught the interest of several standardization committees. For example, the New Tai Lue and Tai Dam scripts have been proposed for encoding in the ISO/IEC 10646-1 character set.
In Thailand, Mr. Thawee Sawangpanyangkoon has conducted research on Tai scripts and has created TrueType fonts for 13 Tai scripts, with funding from the Thailand Research Fund (TRF).
We expect that more effort will be put into studying the unification of these scripts with the Thai script.
5. Open Source Movements
Several developers in Thailand have adopted the philosophy of open-source software in their work and have joined the worldwide movement. Linux, the free OS initiated by Linus Torvalds, has become popular in Thailand, and many developers have joined together to boost the use of the Thai language in the OS, with the X Window System as the GUI environment.
There are currently four local Linux distributions in Thailand: Kaiwal Linux by Kaiwal Software, Linux School Internet Server (Linux SIS) and Linux with Thai Language Extension (Linux-TLE) by the National Electronics and Computer Technology Center (NECTEC), and Burapha Linux by Burapha University. The developers of these distributions meet regularly and take part in regular Linux/Open-Source Symposia. Some of the distributions are expected to merge in their new releases.
5.2 Development Projects
Several efforts are made to enable Thai language in open-source applications. Here are some examples:
Solutions and practices are usually one step ahead of the standards. In such a situation, interoperability problems call for new standards. The Internet has proven to be the main force in getting new standards and interoperability adopted much faster than in the past. More and more developers are now joining forces in making standards and putting these standards to work.
The open source model not only provides a means of cooperative development, but also allows the software to be standardized and the standard conventions to be realized. Therefore we take both streams as our means of developing our information technology for the future. We have illustrated the case of Thailand, which is now gaining tremendous trust from the open-source movement. The outcome is amazing: something real, usable and stable enough for mission-critical applications.