Friday 31 March 2017

Computer Science Internationalization - Adaptive URL

A URL can consist of a Domain Name and a pathname. In the examples below x.y.z represents the Domain Name, the remainder being the pathname. My experience of the internet is that the pathname is usually written in English or, more accurately, ASCII. The ASCII pathname below represents a multi-page website in the form of a journey from home to a hotel in Korea.

x.y.z/home/bus/airplane/korea/taxi/hotel

Websites, such as Google, adapt the language of their text content according to the browser's preferred display language (BL). This preferred language can be set by the user. Let's go one step further than Google and adapt the language of the URL pathname according to the BL. Here is the ASCII pathname rewritten into Chinese, Japanese and Korean.

x.y.z/家/公共汽车/飞机/韩国/出租车/饭店

x.y.z/ホーム/バス/飛行機/韓国/タクシー/ホテル

x.y.z/홈/버스/비행기/한국/택시/호텔

So, how do we implement these language adaptive URL pathnames? Firstly, we need to programmatically determine the BL. One way of achieving this is to examine the Accept-Language HTTP header sent from the browser to the server. This will contain one or more language tags. If there is more than one language tag they are usually listed in priority order, and each tag may carry a q-value weight that states the priority explicitly. Language tags can take many forms. They include: zh, zh-CN and cmn for Mandarin Chinese; ja for Japanese; and ko for Korean. Once we have determined the BL we can select the appropriate URL pathname, thus internationalizing our website with a language adaptive URL pathname.
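Here is a minimal sketch, in Python, of how a server might turn an Accept-Language value into one of the pathnames above. The names PATHNAMES, TAG_ALIASES, parse_accept_language and choose_pathname are my own illustrations, the English default is an assumption, and matching on the primary subtag only is a simplification rather than full language tag matching.

```python
# Hypothetical table: primary language subtag -> pathname from the examples above.
PATHNAMES = {
    "en": "/home/bus/airplane/korea/taxi/hotel",
    "zh": "/家/公共汽车/飞机/韩国/出租车/饭店",
    "ja": "/ホーム/バス/飛行機/韓国/タクシー/ホテル",
    "ko": "/홈/버스/비행기/한국/택시/호텔",
}

# Assumption: treat the cmn tag as Chinese, as described in the text.
TAG_ALIASES = {"cmn": "zh"}


def parse_accept_language(header):
    """Return the language tags in an Accept-Language value, most preferred first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = [part.strip() for part in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        quality = 1.0  # a tag without a q-value has the highest weight
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # ignore tags with a malformed weight
        if quality > 0:
            # Sort by descending weight, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_pathname(header, default="en"):
    """Pick the pathname for the first supported language in the header."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        primary = TAG_ALIASES.get(primary, primary)
        if primary in PATHNAMES:
            return PATHNAMES[primary]
    return PATHNAMES[default]
```

For example, choose_pathname("ko-KR,ko;q=0.9,en;q=0.8") selects the Korean pathname, while a header naming only unsupported languages falls back to the default.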

On a Linux machine, each component of the pathname will be a directory. In my schema I am assuming an index.html or index.php, per directory. A requirement of this schema is that we do not want a directory hierarchy for each language, nor do we want an index.html or index.php for each language.

My native language is English so I will make my master pathname directory names English, i.e. home, bus, airplane, korea, taxi and hotel. I will make the Chinese, Japanese and Korean directory names aliases of the English named master directories. This can be easily achieved on Linux with the ln -s command, where ln means link and the -s option means create a symbolic link, as opposed to a hard link.

ln -s home 家
ln -s home ホーム
ln -s home 홈

ln -s hotel 饭店
ln -s hotel ホテル
ln -s hotel 호텔
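The same links could also be created from a script. The following is a minimal sketch in Python using os.symlink; the names ALIASES and create_aliases are hypothetical, and it only handles the directories inside one parent directory, so it would need to be run for each level of the hierarchy.

```python
import os

# Master (English) directory name -> alias names, as in the ln -s commands above.
ALIASES = {
    "home": ["家", "ホーム", "홈"],
    "hotel": ["饭店", "ホテル", "호텔"],
}


def create_aliases(parent, aliases):
    """Create each master directory under parent and symlink its aliases to it."""
    for master, names in aliases.items():
        os.makedirs(os.path.join(parent, master), exist_ok=True)
        for name in names:
            link = os.path.join(parent, name)
            # Skip names that already exist so the function can be re-run safely.
            if not os.path.lexists(link):
                # Relative target, equivalent to running `ln -s home 家` inside parent.
                os.symlink(master, link)
```

Calling create_aliases(".", ALIASES) in the directory that should contain home and hotel gives the same result as the six ln -s commands.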

What if your native language is not English? In that case, create the master pathname directory names in your native language. If your native language is Korean then the master directory names will be 홈, 버스, 비행기, 한국, 택시 and 호텔 and your links will be:

ln -s 홈 home
ln -s 홈 家
ln -s 홈 ホーム

ln -s 호텔 hotel
ln -s 호텔 饭店
ln -s 호텔 ホテル

Emoji are hugely popular so letʼs construct a totally cool Emoji pathname.

x.y.z/🏡/🚌/🛩/🇰🇷/🚕/🏨

ln -s home 🏡
ln -s bus 🚌
ln -s airplane 🛩
ln -s korea 🇰🇷
ln -s taxi 🚕
ln -s hotel 🏨

I have never encountered an Emoji URL pathname on a website and so implementing such a pathname on your website would be both totally cool and unique. You could also use an Emoji pathname for those languages your website does not support. My schema only supports Chinese, English, Japanese and Korean. If the BL was an unsupported language, such as Arabic, then the Emoji pathname could be displayed in the browser address bar instead of, for example, defaulting to English.
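The Emoji fallback could be sketched in Python as below. This is an illustration under my own assumptions: the names SUPPORTED_PATHNAMES, EMOJI_PATHNAME, pathname_for and percent_encoded are hypothetical, and only the primary subtag of the language tag is examined. The percent_encoded helper shows the ASCII form of the pathname that would typically be used on the wire, for example in a redirect.

```python
from urllib.parse import quote

# The four languages supported by the schema described above.
SUPPORTED_PATHNAMES = {
    "en": "/home/bus/airplane/korea/taxi/hotel",
    "zh": "/家/公共汽车/飞机/韩国/出租车/饭店",
    "ja": "/ホーム/バス/飛行機/韓国/タクシー/ホテル",
    "ko": "/홈/버스/비행기/한국/택시/호텔",
}

EMOJI_PATHNAME = "/🏡/🚌/🛩/🇰🇷/🚕/🏨"


def pathname_for(language_tag):
    """Return the pathname for a language tag, or the Emoji pathname if unsupported."""
    primary = language_tag.lower().split("-")[0]
    return SUPPORTED_PATHNAMES.get(primary, EMOJI_PATHNAME)


def percent_encoded(pathname):
    """Percent-encode the UTF-8 bytes of a pathname, leaving the slashes as they are."""
    return quote(pathname)
```

With this sketch, pathname_for("ar") returns the Emoji pathname and pathname_for("ko-KR") returns the Korean one.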

I have used x.y.z to represent the Domain Name, the implication being it is ASCII. We can complete the language adaptive equation by having Domain Names in the supported BL languages. Thus my completed schema would have Chinese, Japanese and Korean Domain Names in addition to an ASCII Domain Name.
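A non-ASCII Domain Name is carried in DNS in an ASCII-compatible form whose labels start with xn--. As a small sketch, Python's built-in idna codec can convert between the two forms. The function names are hypothetical, the domain 电邮.在线 in the usage note is borrowed from my Chinese email address purely as an example, and the built-in codec follows the older IDNA 2003 rules, so a production site may prefer a newer library.

```python
def to_ascii_domain(domain):
    """Return the ASCII-compatible (xn--) form of an internationalized Domain Name."""
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(ascii_domain):
    """Return the Unicode form of an ASCII-compatible Domain Name."""
    return ascii_domain.encode("ascii").decode("idna")
```

For example, to_ascii_domain("电邮.在线") gives an ASCII name made of two xn-- labels, and to_unicode_domain turns it back into 电邮.在线.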

Friday 17 March 2017

Computer Science Internationalization - EAI

As I stated in schappo.blogspot.co.uk/2017/01/chinese-email-address.html, both DataMail and Google Mail support Email Address Internationalization (EAI). DataMail provides a complete EAI service: it both supports internationalized email addresses and allows them to be created. Google Mail provides a partial EAI service, in that it supports EAI but does not yet allow creation of internationalized email accounts with internationalized email addresses. Thus organisations using Google Mail have an advantage over organisations whose email service handles ASCII addresses only, and they have a head start in providing a complete EAI service.

Given the Domain name of an organisation, the Unix host command can be used to determine the mail service provider. Here are some of the organisations using Google Mail:

苹果电脑 ~: host spotify.com
spotify.com has address 194.132.198.198
spotify.com has address 194.132.197.198
spotify.com has address 194.132.198.149
spotify.com mail is handled by 10 ASPMX3.GOOGLEMAIL.com.
spotify.com mail is handled by 1 ASPMX.L.GOOGLE.com.
spotify.com mail is handled by 10 ASPMX2.GOOGLEMAIL.com.
spotify.com mail is handled by 5 ALT2.ASPMX.L.GOOGLE.com.
spotify.com mail is handled by 10 ASPMX5.GOOGLEMAIL.com.
spotify.com mail is handled by 5 ALT1.ASPMX.L.GOOGLE.com.
spotify.com mail is handled by 10 ASPMX4.GOOGLEMAIL.com.
苹果电脑 ~: host twitter.com
twitter.com has address 104.244.42.129
twitter.com has address 104.244.42.1
twitter.com mail is handled by 30 aspmx3.googlemail.com.
twitter.com mail is handled by 10 aspmx.l.google.com.
twitter.com mail is handled by 20 alt1.aspmx.l.google.com.
twitter.com mail is handled by 30 aspmx2.googlemail.com.
twitter.com mail is handled by 20 alt2.aspmx.l.google.com.
苹果电脑 ~: host mixi.jp # ミクシィ
mixi.jp has address 52.198.59.66
mixi.jp has address 54.92.71.226
mixi.jp has address 52.198.89.90
mixi.jp mail is handled by 30 aspmx2.googlemail.com.
mixi.jp mail is handled by 10 aspmx.l.google.com.
mixi.jp mail is handled by 20 alt2.aspmx.l.google.com.
mixi.jp mail is handled by 20 alt1.aspmx.l.google.com.
mixi.jp mail is handled by 30 aspmx3.googlemail.com.
苹果电脑 ~: host bristol.ac.uk # University of Bristol
bristol.ac.uk has address 137.222.0.38
bristol.ac.uk mail is handled by 5 ALT1.ASPMX.L.GOOGLE.COM.
bristol.ac.uk mail is handled by 10 ASPMX2.GOOGLEMAIL.COM.
bristol.ac.uk mail is handled by 1 ASPMX.L.GOOGLE.COM.
bristol.ac.uk mail is handled by 10 ASPMX3.GOOGLEMAIL.COM.
bristol.ac.uk mail is handled by 5 ALT2.ASPMX.L.GOOGLE.COM.
苹果电脑 ~: host bathspa.ac.uk # Bath Spa University
bathspa.ac.uk has address 194.83.160.0
bathspa.ac.uk has address 162.13.24.154
bathspa.ac.uk has address 72.47.217.0
bathspa.ac.uk mail is handled by 10 ALT4.ASPMX.L.GOOGLE.COM.
bathspa.ac.uk mail is handled by 5 ALT2.ASPMX.L.GOOGLE.COM.
bathspa.ac.uk mail is handled by 1 ASPMX.L.GOOGLE.COM.
bathspa.ac.uk mail is handled by 5 ALT1.ASPMX.L.GOOGLE.COM.
bathspa.ac.uk mail is handled by 10 ALT3.ASPMX.L.GOOGLE.COM.
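
The check I did by eye on the output above can be sketched in Python. The function names are hypothetical, the input is the text printed by the host command, and the rule that a name ending in .google.com or .googlemail.com means Google Mail is my own assumption based on the listings above.

```python
GOOGLE_SUFFIXES = (".google.com", ".googlemail.com")


def mail_exchangers(host_output):
    """Return (priority, server) pairs from host output, lowest priority number first."""
    marker = " mail is handled by "
    records = []
    for line in host_output.splitlines():
        if marker not in line:
            continue  # skip address lines and shell prompts
        fields = line.split(marker, 1)[1].split()
        if len(fields) != 2 or not fields[0].isdigit():
            continue  # skip lines that do not look like "<priority> <server>"
        # Normalise the server name: drop the trailing dot and lowercase it.
        records.append((int(fields[0]), fields[1].rstrip(".").lower()))
    return sorted(records)


def uses_google_mail(host_output):
    """Guess whether every listed mail exchanger belongs to Google."""
    records = mail_exchangers(host_output)
    return bool(records) and all(
        server.endswith(GOOGLE_SUFFIXES) for _, server in records
    )
```

Feeding it the bristol.ac.uk lines above would report Google Mail, with aspmx.l.google.com as the most preferred server.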

Providing a full EAI service involves going beyond ASCII. It entails supporting Unicode email addresses. My Chinese email address, 小山@电邮.在线, is one example.