My Interests include: International Computing, IDNs, Apple Macs, i18n, Chinese, Japanese & Korean.
Monday, 14 April 2014
Australian Universities on Weibo
There are quite a number of Australian Universities on Sina Weibo 新浪微博. Below I list those I have found. I only include those Australian Universities that have verified (the big blue V after the username) Weibo accounts. The text in square brackets is the username on Weibo.
- Australian Catholic University [@ACUInternational] weibo.com/acuinternational
- Charles Darwin University [@查尔斯达尔文大学] weibo.com/charlesdarwinuni
- Curtin University [@科廷大学CurtinUniversity] weibo.com/CurtinWestAustralia
- Deakin University [@澳大利亚迪肯大学] weibo.com/deakinuniversity
- Federation University [@澳大利亚联邦大学FedUni] weibo.com/FedUniAustralia
- Flinders University [@FlindersUni弗林德斯大学] weibo.com/flinders2011
- La Trobe University [@澳大利亚拉筹伯大学] weibo.com/latrobeuniaus
- Macquarie University [@澳大利亚麦考瑞大学] weibo.com/mquni
- Monash University [@MonashUni澳大利亚蒙纳士大学] weibo.com/monashuniversityaust
- Queensland University of Technology [@QUT昆士兰科技大学] weibo.com/qutbrisbane
- Southern Cross University [@澳大利亚南十字星大学] weibo.com/scuchina
- Swinburne University of Technology [@澳洲斯威本科技大学] weibo.com/swinburneuniversity
- University of Adelaide [@澳大利亚阿德莱德大学] weibo.com/uniadelaide
- University of Canberra [@堪培拉大学] weibo.com/unicanberra
- University of Melbourne [@墨尔本大学官微] weibo.com/melbourneuni
- University of New South Wales [@澳洲新南威尔士大学] weibo.com/ozunsw
- University of Queensland [@昆士兰大学] weibo.com/myuq
- University of South Australia [@南澳大学官方微博] weibo.com/studyatunisa
- University of Southern Queensland [@澳大利亚南昆士兰大学] weibo.com/usqchina
- University of Western Sydney [@西悉尼大学UWS] weibo.com/uwsinternational
- University of Wollongong [@澳大利亚卧龙岗大学UOW] weibo.com/uowaustralia
Friday, 11 April 2014
Regular Expressions
Regular Expressions are not just about ASCII. They are (or should be) about Unicode, with ASCII being a very small subset of Unicode.
The vast majority of Regular Expressions documentation and tutorials I have seen, only deal with ASCII. The consequence is that many/most will never consider non ASCII text strings.
If one considers Unicode text strings then one can process text strings consisting of non Latin Scripts and Symbols. Scripts such as: Cyrillic, Devanagari, Tamil, Georgian, Cherokee, Chinese and Sinhala. Symbols such as: Currency, Arrows, Mathematical Operators, Mahjong Tiles and Playing Cards. Unicode has a repertoire of over 100000 characters which can be processed with Regular Expressions.
Mostly, Regular Expressions are no different when using Unicode as compared to using the very limited ASCII. I will give some simple examples using Hangul, which is the Script used for writing Korean. The Hangul characters I will be using in the examples below are in Unicode block Hangul Syllables U+AC00-D7AF. I will intersperse other Unicode characters in my examples below. I present the examples in the form of a terminal session transcript.
苹果电脑 ~: egrep '바나나'
abcdef
abc바나나def
abc바나나def
苹果电脑 ~: egrep '바.나.나'
바诺丁汉나拉夫堡나
바拉나夫나堡
바拉나夫나堡
苹果电脑 ~: egrep '[바나다]'
abcdef
보노도고로
ДЖԶख나ખ༁
ДЖԶख나ખ༁
苹果电脑 ~: egrep '[가-힣]'
abcdef
abc현def
abc현def
苹果电脑 ~: egrep '^[ 가-힣]+$'
abcdef
abc서울def
서울은 아름답다
서울은 아름답다
Where you see a line duplicated that means there was a successful match with the Regular Expression. I have used egrep on OSX.
The transcript may look a bit odd because of the variety and unfamiliarity of the Unicode characters I have used. If, though, you carefully examine the above Regular Expressions you will see they have standard syntax and are actually elementary constructs. So, if you teach Regular Expressions, why not give your students an insight into processing Unicode strings and not just ASCII strings. Or, to put it another way, give your students an insight into processing multi-language strings and not just English strings. Or, to put it yet another way, code for the whole world and not just the English speaking world.
BTW — 苹果电脑 ~: is the prompt I setup for my iMac and the first four characters are Chinese for Apple Computer.
In the examples above, I have deliberately used one of the standard and common Regular Expression engines. I have accessed this engine via egrep. This type of engine is one which you will most likely encounter. Much less common, are the Regular Expression engines that have been extended with features specifically for Unicode. Such extensions, for instance, facilitate matching with Unicode characters having some specified property e.g. \p{Hangul} will match with any character belonging to the Hangul Script. More information on such engines is available at regular-expressions.info/unicode.html and unicode.org/reports/tr18/
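The same elementary constructs can be tried outside the terminal. Below is a minimal sketch in Python, assuming the standard `re` module as the engine. The patterns and sample lines are copied from the egrep session, while the helper name `matching_lines` is made up for illustration. Python's `re` does not understand `\p{Hangul}`, so the character range `[가-힣]` stands in for it here, just as it does with egrep.

```python
import re

# Patterns and input lines taken from the egrep transcript.
CASES = [
    ("바나나", ["abcdef", "abc바나나def"]),
    ("바.나.나", ["바诺丁汉나拉夫堡나", "바拉나夫나堡"]),
    ("[바나다]", ["abcdef", "보노도고로", "ДЖԶख나ખ༁"]),
    ("[가-힣]", ["abcdef", "abc현def"]),
    ("^[ 가-힣]+$", ["abcdef", "abc서울def", "서울은 아름답다"]),
]


def matching_lines(pattern, lines):
    """Return the lines containing a match, i.e. the lines egrep would echo."""
    compiled = re.compile(pattern)
    return [line for line in lines if compiled.search(line)]


# Map each pattern to the sample lines it matches.
results = {pattern: matching_lines(pattern, lines) for pattern, lines in CASES}

for pattern, matched in results.items():
    print(pattern, "->", matched)
```

Each pattern should select exactly the lines that appear duplicated in the transcript.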
Thursday, 16 January 2014
Japanese Domain Name
I believe はじめよう.みんな to be the world's first live fully Japanese Domain Name! It is written with the Japanese Hiragana script. みんな is one of Google's new gTLDs icannwiki.com/index.php/.みんな.
Google translates はじめよう to "Let's start with" and みんな to "Everyone" translate.google.co.uk/#ja/en/はじめよう%0Aみんな
One can use the Ideographic Full Stop rather than the ASCII Full Stop as the separator in Internationalized Domain Names ie はじめよう。みんな. This then gives us the rather cool translation to English "Let's start with. Everyone" translate.google.co.uk/#ja/en/はじめよう。みんな
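The separator equivalence can be checked programmatically. Here is a minimal sketch in Python, assuming the built-in `idna` codec (which implements the older IDNA 2003 rules) is an acceptable stand-in for what a Browser does. The function name `to_ascii` is my own. Both spellings should convert to the same ASCII form.

```python
# The domain name from the post, written with each of the two separators.
ascii_dot = "はじめよう.みんな"          # U+002E FULL STOP
ideographic_dot = "はじめよう。みんな"    # U+3002 IDEOGRAPHIC FULL STOP


def to_ascii(domain):
    """Convert a Unicode domain name to its ASCII (Punycode) form."""
    return domain.encode("idna")


ascii_form = to_ascii(ascii_dot)
ideographic_form = to_ascii(ideographic_dot)

print(ascii_form)                      # each label carries the xn-- prefix
print(ascii_form == ideographic_form)  # the separator makes no difference
print(ascii_form.decode("idna"))       # back to the Hiragana form
```

Decoding always yields the ASCII Full Stop as the separator, whichever separator was typed.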
Monday, 13 January 2014
Apple Color Emoji
In a previous post schappo.blogspot.co.uk/2014/01/localized-font-names.html I examined how well Browsers deal with language localized font names. My exploration, this time, concerns the number of localized names a font has. I decided upon a font that stands out from the crowd. A font that many will be aware of, namely, Apple Color Emoji. OSX Mavericks 10.9.1 has 33 system language localizations and I found that the Apple Color Emoji font has 16 unique language localizations, as in the table below.
| Language | Localized Name |
|---|---|
| English | Apple Color Emoji |
| Arabic | لون |
| Chinese (Simplified) | Apple 彩色表情符号 |
| Chinese (Traditional) | Apple 彩色表情符號 |
| Danish | Apple farve-emoji |
| Dutch | Apple Kleur-Emoji |
| Finnish | Applen väri-emoji |
| French | Apple Emoji couleur |
| German | Apple Farben-Emoji |
| Italian | Colore Emoji Apple |
| Japanese | Apple カラー絵文字 |
| Korean | Apple 컬러 이모티콘 |
| Norwegian | Apple farge-emoji |
| Portuguese | Apple Emoji em Cores |
| Russian | Цветные эмодзи Apple |
| Swedish | Apple färg-emoji |
Sunday, 12 January 2014
Localized Font Names
| English Name | Chinese Name |
|---|---|
| HanziPen SC | 翩翩体-简 |
| Wawati SC | 娃娃体-简 |
| Xingkai SC | 行楷-简 |
| Yuppy SC | 雅痞-简 |
My aim is to determine whether or not Browsers can select and use fonts by their localized, in this case Chinese, names. Here is the relevant HTML code:
<p style="font-family:'HanziPen SC'">
1a 拉夫堡,莱斯特,伦敦。</p>
<p style="font-family:'翩翩体-简'">
1b 拉夫堡,莱斯特,伦敦。</p><hr />
<p style="font-family:'Wawati SC'">
2a 拉夫堡,莱斯特,伦敦。</p>
<p style="font-family:'娃娃体-简'">
2b 拉夫堡,莱斯特,伦敦。</p><hr />
<p style="font-family:'Xingkai SC'">
3a 拉夫堡,莱斯特,伦敦。</p>
<p style="font-family:'行楷-简'">
3b 拉夫堡,莱斯特,伦敦。</p><hr />
<p style="font-family:'Yuppy SC'">
4a 拉夫堡,莱斯特,伦敦。</p>
<p style="font-family:'雅痞-简'">
4b 拉夫堡,莱斯特,伦敦。</p>
The text is in pairs labelled a and b. A Browser that recognises both the English and Chinese names for a font will render a and b text identically as it will be using the same font. A Browser that does not recognise the Chinese name will use a substitute font and hence a and b text will appear differently. It is expected that a Browser will always recognise an English font name but, as will be demonstrated, Chinese font names are often not recognised. Figure 1 shows correct Browser behaviour and Figure 2 shows incorrect Browser behaviour.
[Figure 1: Correct Browser Behaviour, strings a and b rendered with same font]
[Figure 2: Incorrect Browser Behaviour, strings a and b rendered with different fonts]
I tested with Chrome (31.0.1650.63), Firefox (v26.0) and Safari (v7.0.1). Test OS was OSX Mavericks 10.9.1. Only Firefox worked correctly!!! My OSX localization for the tests was English. I did switch my OSX to Chinese and repeated my tests but the results were the same.
Here is what W3C have to say in CSS Fonts Module Level 3 w3.org/TR/css3-fonts/#font-family-prop: "Some font formats allow fonts to carry multiple localizations of the family name. User agents must recognize and correctly match all of these names independent of the underlying platform localization, system API used or document encoding"
So, in due course, all Browsers should work with all Localized Font Names. As evidenced by my tests, that is not yet the case. So, what do we do in the meantime? I agree with Kendra Schaefer's recommendation (www.kendraschaefer.com/2012/06/chinese-standard-web-fonts-the-ultimate-guide-to-css-font-family-declarations-for-web-design-in-simplified-chinese/), which is to include all the Localized Font Names in the font-family declaration, eg
font-family: "Yuppy SC", "雅痞-简", sans-serif;
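If there are many such declarations to write, they can be generated. The following is a minimal sketch in Python, assuming the four name pairs from the table at the top of this article. The names `LOCALIZED_NAMES` and `font_family` are made up for illustration, and `sans-serif` is just the fallback used in the declaration above.

```python
# English name -> Chinese name, as listed in the table in this article.
LOCALIZED_NAMES = {
    "HanziPen SC": "翩翩体-简",
    "Wawati SC": "娃娃体-简",
    "Xingkai SC": "行楷-简",
    "Yuppy SC": "雅痞-简",
}


def font_family(english_name, fallback="sans-serif"):
    """Build a font-family declaration listing every known name of a font."""
    names = [english_name]
    localized = LOCALIZED_NAMES.get(english_name)
    if localized:
        names.append(localized)
    quoted = ", ".join(f'"{name}"' for name in names)
    return f"font-family: {quoted}, {fallback};"


print(font_family("Yuppy SC"))
```

A font with no entry in the mapping simply gets its English name followed by the fallback.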
Wednesday, 2 January 2013
Unicode and Hangul
One of the Unicode blocks is Hangul Syllables, codepoints U+AC00➜U+D7AF. Each character in this block has a formal Unicode name written in upper case Latin eg
- U+AC85 겅 HANGUL SYLLABLE GEONG
- U+B268 뉨 HANGUL SYLLABLE NWIM
I draw your attention to the last word in the Unicode name. This may appear to be a string of random Latin characters. If, though, you use the Mac OSX GongjinCheong Romaja Input Method, this string represents the sequence of key presses required to produce the Hangul Syllable. So, taking the first example above, typing the key sequence GEONG will produce the Hangul character 겅.
There are two cases where one needs to augment the key sequence in order to write the Hangul character.
- When the syllable begins with a vowel then one needs to prefix with the silent placeholder ㅇ, which in the GongjinCheong Romaja Input Method is produced by typing X. Thus, U+C54B 앋 HANGUL SYLLABLE AD, is produced by typing the key sequence XAD
- When the syllable ends with ㄲ or ㅆ then one needs to type ⇧G or ⇧S, respectively. Thus, U+AC14 갔 HANGUL SYLLABLE GASS, is produced by typing the key sequence GA⇧S
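The name lookup and the two augmentation rules can be sketched in Python, assuming the standard `unicodedata` module for the formal names. The function `key_sequence` and its constants are hypothetical helpers of my own. They encode only the two cases described above and have not been checked against the Input Method for every syllable. The index arithmetic is the standard decomposition of the Hangul Syllables block (21 vowels, 28 finals).

```python
import unicodedata

S_BASE = 0xAC00        # first codepoint of the Hangul Syllables block
V_COUNT = 21           # number of medial vowels
T_COUNT = 28           # number of finals, including "no final"
SILENT_INITIAL = 11    # index of the silent placeholder ㅇ among the initials
FINAL_GG = 2           # index of the final ㄲ
FINAL_SS = 20          # index of the final ㅆ


def key_sequence(syllable):
    """Derive the key sequence from the last word of the Unicode name."""
    romaja = unicodedata.name(syllable).rsplit(" ", 1)[-1]
    index = ord(syllable) - S_BASE
    initial = index // (V_COUNT * T_COUNT)
    final = index % T_COUNT
    if final == FINAL_GG:
        romaja = romaja[:-2] + "⇧G"   # syllable ends with ㄲ
    elif final == FINAL_SS:
        romaja = romaja[:-2] + "⇧S"   # syllable ends with ㅆ
    if initial == SILENT_INITIAL:
        romaja = "X" + romaja         # syllable begins with a vowel
    return romaja


for ch in "겅뉨앋갔":
    print(f"U+{ord(ch):04X}", ch, unicodedata.name(ch), key_sequence(ch))
```

For the four syllables used in this article the sketch gives GEONG, NWIM, XAD and GA⇧S.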
Notes:
- Unicode characters can be viewed on Mac OSX using Character Viewer
- The GongjinCheong Romaja Input Method is enabled in System Preferences➞Language & Text➞Input Sources
Saturday, 20 October 2012
Twitter Character Count
【Update: Do not know when it happened but Twitter no longer differentiates between BMP and non BMP characters WRT character count. All characters now have a count of 1. I may, at some stage, delete this article but, for the time being, I will leave it here as an historical record of the evolution of Twitter.】
In a previous article I examined Sina Wēibó 新浪微博 character count for a user post schappo.blogspot.co.uk/2012/10/weibo-character-count.html. Let's now examine Twitter. The stated and generally understood limit is 140 characters for a tweet. This is not strictly true. The actual tweet limit is variable and ranges from 70 to 140, inclusive. Different characters have different counts, as follows:
- Characters from Unicode range U+0000➜U+FFFF have a count of 1
- Characters from Unicode range ≥ U+010000 have a count of 2
Or, to put it another way — Characters in the Basic Multilingual Plane (BMP) have a count of 1 and characters in the other planes have a count of 2. The 2 Mahjong Tile characters used in the example below are from the Supplementary Multilingual Plane (SMP).
Let's illustrate with a made-up posting that contains characters from the 2 Unicode ranges above. The following text has a tweet character count of 17.
- one two 🀂一二三四五🀀
- 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 2 + 1 + 1 + 1 + 1 + 1 + 2 = 17
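The counting rule is simple enough to express in a few lines. This is a minimal sketch in Python of the rule as I have described it, not of Twitter's actual code. The function names are my own, and the sample text is the made-up posting with a Mahjong Tile at each end of the Chinese numerals. The count happens to equal the number of UTF-16 code units in the text, which the second function shows.

```python
def tweet_count(text):
    """Count 1 for each BMP character and 2 for each character beyond the BMP."""
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text)


def utf16_units(text):
    """Number of 16-bit code units needed to encode the text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


sample = "one two 🀂一二三四五🀀"

print(len(sample))          # number of Unicode characters
print(tweet_count(sample))  # count under the rule above
print(utf16_units(sample))  # same value as tweet_count
```

Under this rule a tweet made up entirely of characters outside the BMP reaches the 140 count after only 70 characters.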
Saturday, 6 October 2012
Weibo Character Count
Like all the other microblog systems I have encountered, Sina Wēibó 新浪微博 has a 140 character limit for a user post. This is not strictly accurate. The character limit is variable and ranges from 70 to 280, inclusive. It depends on which characters are included. Different characters have different counts, as follows:
- Characters from Unicode range U+0000➜U+00FF have a count of 0.5
- Characters from Unicode range U+0100➜U+FFFF have a count of 1
- Characters from Unicode range ≥ U+010000 have a count of 2
Some of the consequences of these differing counts are:
- If one writes in everyday English then one has up to 280 characters as these will be Latin characters in Unicode blocks Basic Latin and Latin-1 Supplement U+0000➜U+00FF. The Latin Script does though occur in several Unicode blocks en.wikipedia.org/wiki/Latin_characters_in_Unicode. Latin characters in Unicode blocks other than Basic Latin and Latin-1 Supplement will have counts of 1 or 2 and usage of them will reduce the 280 limit.
- For a Chinese-only post, if all the Chinese characters used are in the Unicode Basic Multilingual Plane (BMP) then the limit will be the accepted 140 characters. There are many Chinese characters outside of the BMP and, because they have a count of 2, usage of these will reduce the 140 limit. The extreme case is a limit of 70 if all characters used are Chinese characters outside of the BMP.
- In recent releases of OSX and iOS, Apple incorporated Emoji characters en.wikipedia.org/wiki/Emoji. The majority of these Emoji characters are outside the BMP (ie ≥ U+010000) and so will have a count of 2.
Let's illustrate with a nonsensical posting that contains characters from the 3 Unicode ranges above. The following text has a Weibo character count of 13.
- one two 🀂一二三四五🀀
- 0.5 + 0.5 + 0.5 + 0.5 + 0.5 + 0.5 + 0.5 + 0.5 + 2 + 1 + 1 + 1 + 1 + 1 + 2 = 13
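The three ranges translate directly into code. Below is a minimal sketch in Python of the rule as I have described it, not of Weibo's actual implementation. The function name `weibo_count` is made up, and how Weibo rounds a total that ends in a half is not something I have determined, so the sketch returns the raw sum.

```python
def weibo_count(text):
    """Sum the per-character counts for the three Unicode ranges."""
    total = 0.0
    for ch in text:
        codepoint = ord(ch)
        if codepoint <= 0x00FF:
            total += 0.5   # Basic Latin and Latin-1 Supplement
        elif codepoint <= 0xFFFF:
            total += 1     # rest of the Basic Multilingual Plane
        else:
            total += 2     # characters outside the BMP
    return total


sample = "one two 🀂一二三四五🀀"

print(weibo_count(sample))      # the nonsensical posting
print(weibo_count("a" * 280))   # everyday English: 280 characters fit
print(weibo_count("一" * 140))  # BMP Chinese: 140 characters fit
print(weibo_count("🀀" * 70))   # outside the BMP: only 70 characters fit
```

The last three lines correspond to the 280, 140 and 70 character limits discussed above. Each of those texts sums to exactly 140.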
Tuesday, 24 July 2012
My Adopted Chinese Name
In China there is a very famous Canadian by the name of Mark Rowswell dashan.com. One of the reasons he is so famous in China is that his Chinese is very very good. His adopted Chinese name is 大山 (dàshān) which means great or large mountain.
Several years ago I decided to also adopt a Chinese name. One day a name popped into my mind. Mark's Chinese is very good but my Chinese is only basic. Consequently, I chose the name 小山 (xiǎoshān) which means little mountain 😀
An advantage of having an adopted name is that one can change it and I can change it to reflect my progress in mastering the Chinese language. So as my Chinese improves I can change it to 中山 (zhōngshān) which means middle mountain. Then 大山 (dàshān) and finally, if I ever reach this level of proficiency, 巨山 (jùshān) which means gigantic mountain.
There are though, some days when I think my Chinese is so poor that maybe my adopted name should be 微山 (Wēishān) as this means micro mountain.
Monday, 27 February 2012
Western Brands on Weibo
The purpose of this article is to list some of the Western Companies/Brands that are using China's Sina Wēibó 新浪微博. The text in the square brackets is the Sina Wēibó 新浪微博 name. This article is a continuation of schappo.blogspot.co.uk/2011/08/companies-on-sina.html
- 7 For All Mankind [@7ForAllMankind] weibo.com/7forallmankind ☆
- Abercrombie & Fitch [@Abercrombie] weibo.com/abercrombieny ☆
- Accenture [@埃森哲中国] weibo.com/accenture ☆
- Accor Hotels [@雅高酒店AccorHotels] weibo.com/accorchina ☆
- Air Liquide [@液空中国] weibo.com/airliquidechina ☆
- AKG [@雅登-AKG中国] weibo.com/akgchina ☆
- AkzoNobel [@阿克苏诺贝尔中国] weibo.com/akzonobelinchina ☆
- Alberta Ferretti [@AlbertaFerretti] weibo.com/albertaferretti ☆
- ALDO [@ALDO1972] weibo.com/n/ALDO1972 ☆
- Alexander McQueen [@Alexander-McQueen] weibo.com/alexandermcqueen ☆
- Allen Edmonds [@AllenEdmonds中国] weibo.com/allenedmondschina ☆
- Allianz Insurance [@安联保险-Allianz] weibo.com/allianzone ☆
- Alpenliebe [@微有爱] weibo.com/alpenliebekindness ☆
- American Express [@美国运通中国官方微博] weibo.com/amexchina ☆
- Anya Hindmarch [@Anya_Hindmarch_Official] weibo.com/anyahindmarchlondon
- Argos [@Argos爱顾商城] weibo.com/2720491021 ☆
- ASOS [@ASOS] weibo.com/asosofficial
- Aspinal of London [@Aspinal-of-London] weibo.com/aspinaloflondonltd
- Associated Press [@美联社] weibo.com/apimages ☆
- Aston Martin [@阿斯顿马丁拉共达] weibo.com/astonmartinlagondacn ☆
- Aston Villa FC [@阿斯顿维拉足球俱乐部] weibo.com/AVFCOfficial ☆
- AVIS [@AVIS安飞士租车] weibo.com/avischina ☆
- Balenciaga [@Balenciaga] weibo.com/officialbalenciaga ☆
- Balmain [@瑞士宝曼手表] weibo.com/balmainwatches ☆
- Barbie [@Barbie芭比官方微博] weibo.com/barbieofficial ☆
- BASF [@巴斯夫大中华] weibo.com/basfinchina ☆
- Bayer [@拜耳中国官方微博] weibo.com/bayerchina ☆
- Bentley Motors [@宾利BentleyMotors] weibo.com/bentleymotorsuk ☆
- Bergdorf Goodman [@Bergdorfs] weibo.com/bergdorfs ☆
- Best Buy [@BestBuy百思买] weibo.com/bestbuycn ☆
- Bloomingdaleʼs [@Bloomingdales_USA] weibo.com/bloomingdalesusa
- Blue Nile Inc [@BlueNileInc] weibo.com/bluenileinc ☆
- Bobbi Brown [@BobbiBrownChina] weibo.com/bobbibrownchina ☆
- Bonpoint [@Bonpoint-中国] weibo.com/bonpoint ☆
- Bosch [@博世中国] weibo.com/boschauto ☆
- Boucheron [@Boucheron宝诗龙微博] weibo.com/boucheronparis ☆
- Breitling [@百年灵BREITLING] weibo.com/breitlingchina ☆
- Bremont [@Bremont宝名表] weibo.com/bremont ☆
- British Airways [@英国航空] weibo.com/britishairways ☆
- Brompton Bicycle [@Brompton_bicycle_伯龙腾] weibo.com/bromptonbicycle ☆
- BVLGARI [@BVLGARI宝格丽] weibo.com/bulgari ☆
- BVLGARI Perfume [@宝格丽香水] weibo.com/bulgariperfume ☆
- Cambridge Satchel Co. [@The_Cambridge_Satchel_Company] weibo.com/jianqiaobao ☆
- Campo Marzio Design [@CampoMarzio中国区] weibo.com/campomarzio ☆
- Camus [@卡慕CAMUS] weibo.com/camuschina ☆
- CARAT London [@CARAT官方微博] weibo.com/caratlondon
- Caterpillar [@Caterpillar官方微博] weibo.com/caterpillarinchina ☆
- Cath Kidston [@CathKidstonChina] weibo.com/cathkidstonchina ☆
- Champagne Taittinger [@泰亭哲香槟] weibo.com/champagnetaittinger
- Cheerios [@雀巢脆谷乐] weibo.com/nestlecheerios ☆
- Chopard [@萧邦Chopard] weibo.com/chopardchina ☆
- Christian Louboutin [@ChristianLouboutin官方微博] weibo.com/LouboutinWorld ☆
- Christie's [@佳士得国际] weibo.com/christies ☆
- Clarisonic [@Clarisonic科莱丽-欧莱雅] weibo.com/clarisonicchina .
- Club Monaco [@Club_Monaco] weibo.com/clubmonaco ☆
- CME Group [@CMEGroup] weibo.com/cmegroup ☆
- Cows Creamery [@COWS冰激凌] weibo.com/cowscreamery
- Decanter [@Decanter醇鉴] weibo.com/decantercn ☆
- Ducati [@杜卡迪中国] weibo.com/ducatichina ☆
- Dulux [@多乐士Lets_Colour] weibo.com/letscolor ☆
- DuPont [@杜邦公司] weibo.com/dupont ☆
- eBay [@eBay] weibo.com/ebay ☆
- Elizabeth Arden [@伊丽莎白雅顿美丽沙龙] weibo.com/elizabetharden ☆
- EMC Corporation [@EMC中国-云计算] weibo.com/emcgreatchina ☆
- Eppendorf [@eppendorf官方微博] weibo.com/eppendorfchina ☆
- Ernst & Young [@安永EY] weibo.com/eyernstyoung ☆
- Etro [@ETRO艾绰] weibo.com/etrochina ☆
- Eurostar [@欧洲之星_Eurostar] weibo.com/eurostarchina
- Fairmont Hotels & Resorts [@费尔蒙酒店] weibo.com/fairmonthotels ☆
- Fendi [@FENDI] weibo.com/fendi ☆
- Financial Times [@FT中文网] weibo.com/ftchinese ☆
- Finnair [@芬兰航空Finnair] weibo.com/finnaircom
- Firefox [@火狐] weibo.com/firefox ☆
- Fisher-Price [@费雪中国官方微博] weibo.com/fisherprice ☆
- Fisherman's Friend [@渔夫之宝官方微博] weibo.com/ffgfwb ☆
- Fissler [@德国菲仕乐] weibo.com/fisslerchina2013 ☆
- Flipboard [@Flipboard] weibo.com/flipboard ☆
- Freescale Semiconductor [@飞思卡尔] weibo.com/freescale ☆
- Furla [@Furla_孚勒] weibo.com/furlaofficial ☆
- G-Star RAW [@G-STARCHINA] weibo.com/gstarchina ☆
- Geox [@健乐士GEOX] weibo.com/jianleshigeox ☆
- Girard-Perregaux [@GP芝柏表] weibo.com/gpchina ☆
- Glenmorangie [@格兰杰单一麦芽威士忌] weibo.com/glenmorangiechina ☆
- GNC [@GNCLiveWell] weibo.com/gnclivewell ☆
- GRAFF [@格拉夫GRAFF] weibo.com/graff ☆
- Gregory Mountain Products [@Gregory官方微博] weibo.com/gregory1977 ☆
- Grey Goose [@法国灰雁GreyGoose] weibo.com/greygoosechina ☆
- Guinevere Launcelot [@Guinevere_Launcelot] weibo.com/gltlondon
- Gymboree [@金宝贝国际早教微课堂] weibo.com/gymboree ☆
- H2O+ [@H2O水芝澳官方微博] weibo.com/h2ochina ☆
- Hackett London [@Hackett-London] weibo.com/hackettlondon ☆
- Halma [@HALMA中国] weibo.com/halma ☆
- Hardy Amies [@HardyAmies赫迪雅曼] weibo.com/HardyAmies
- Harry Winston [@海瑞温斯顿HarryWinston] weibo.com/harrywinston ☆
- Hasbro [@孩之宝中国] weibo.com/hasbrochina ☆
- Holland & Barrett [@HollandAndBarrett] weibo.com/hollandandbarrett ☆
- Hollister [@Hollister] weibo.com/hollister ☆
- Hooters [@美国猫头鹰餐厅-中国] weibo.com/hooterschina ☆
- Hublot [@宇舶表] weibo.com/hublothanhan ☆
- Hyatt [@凯悦酒店集团HYATT] weibo.com/hyatthotelscorp ☆
- IBM [@IBM中国] weibo.com/ibm100 ☆
- IMAX [@IMAX] weibo.com/imax ☆
- Irregular Choice [@IrregularChoice香港] weibo.com/irregularchoicehk ☆
- IWC [@IWC万国表] weibo.com/iwcchina ☆
- J.Lindeberg [@JLINDEBERG林德伯格] weibo.com/jlindeberg ☆
- Jack Wolfskin [@JackWolfskin官方微博] weibo.com/jackwolfskingermany ☆
- Jaeger-LeCoultre [@积家官方微博] weibo.com/jaegerlecoultrechina ☆
- Jo Malone [@JoMaloneLondon祖玛珑] weibo.com/jomalonelondon
- Juniper Networks [@瞻博网络] weibo.com/junipernetworks ☆
- Kate Spade New York [@katespade官方微博] weibo.com/katespadeny ☆
- Kipsta [@KIPSTA中国] weibo.com/kipstachina ☆
- Kleenex [@舒洁kleenex] weibo.com/n/舒洁kleenex ☆
- Lagostina [@拉歌蒂尼] weibo.com/lagostina ☆
- Lana Marks [@LANA-MARKS-CHINA] weibo.com/lanamarks ☆
- Lancaster [@兰嘉丝汀] weibo.com/lancasterchina ☆
- Le Coq Sportif [@lecoqsportif中国] weibo.com/lecoqsportif ☆
- Lindt [@Lindt瑞士莲巧克力] weibo.com/lindtchina ☆
- Lonely Planet [@LonelyPlanet] weibo.com/lonelyplanet ☆
- Luisa Via Roma [@LUISAVIAROMA官方微博] weibo.com/luisaviaroma
- MAC Cosmetics [@MAC魅可] weibo.com/maccosmetics ☆
- Macy's [@美国梅西百货] weibo.com/MacysChina
- Manchester City FC [@曼城足球俱乐部MCFC] weibo.com/mcfcofficial
- Manchester United FC [@曼联足球俱乐部] weibo.com/manchesterunited ☆
- Mango [@MANGO中国官网] weibo.com/mangofashion ☆
- Marc Jacobs [@MarcJacobsIntl莫杰] weibo.com/marcjacobsintl ☆
- Maria Luisa [@MARIA_LUISA] weibo.com/marialuisa ☆
- Marimekko [@MARIMEKKO_玛莉美歌] weibo.com/marimekkoofficial ☆
- Marmot [@Marmot中国] weibo.com/marmot001 ☆
- Marni [@MARNI] weibo.com/officialmarni ☆
- Marvin Watches [@Marvin-瑞士摩纹表] weibo.com/marvinwatch ☆
- MasterCard [@万事达人] weibo.com/mastercardchina ☆
- Maxi-Cosi [@Maxi-Cosi] weibo.com/maxicosi ☆
- McLaren [@迈凯伦汽车] weibo.com/mclarenchina ☆
- Media Markt [@万得城电器] weibo.com/mediamarktchina ☆
- Medtronic [@美敦力中国] weibo.com/medtronicchina ☆
- Meltwater Group [@Meltwater] weibo.com/meltwater ☆
- Mettler Toledo [@梅特勒-托利多中国] weibo.com/mettlertoledo ☆
- Michael Kors [@Michael-Kors] weibo.com/michaelkors ☆
- Monster Cable [@Monster-魔声中国] weibo.com/monsterchina ☆
- Mothercare [@mothercare官方微博] weibo.com/mothercarechina ☆
- Movado [@摩凡陀Movado] weibo.com/movado ☆
- MTV [@MTV中文频道] weibo.com/mtvchina
- Mulberry [@Mulberry_Official] weibo.com/mulberryofficial ☆
- NASDAQ OMX [@纳斯达克交易所] weibo.com/nasdaqomx
- Neiman Marcus [@NeimanMarcus尼曼] weibo.com/neimanmarcuschina ☆
- NERF [@孩之宝NERF-热火] weibo.com/ilovenerf ☆
- New Balance [@新百伦newbalance] weibo.com/newbalanceofficial ☆
- New York Times [@纽约时报中文网] weibo.com/nytchinese ☆
- Nuxe Paris [@Nuxe欧树] weibo.com/nuxe ☆
- Old Navy [@OldNavyChina] weibo.com/oldnavychina
- Ovaltine [@阿华田Ovaltine] weibo.com/ovaltine001 ☆
- Oxford University Press [@牛津大学出版社全球学术出版] weibo.com/oupacademic ☆
- Pandora [@PANDORA珠宝] weibo.com/pandorajewellery ☆
- Papa John's Pizza [@棒约翰PapaJohns] weibo.com/papachina
- Paul Smith [@PaulSmith保罗史密斯] weibo.com/paulsmithofficial ☆
- Paula's Choice [@PaulasChoice宝拉珍选] weibo.com/paulaschoice01 ☆
- PayPal [@PayPal_China] weibo.com/paypalmarketing ☆
- Penguin Books [@企鹅出版社] weibo.com/penguinbooks ☆
- Perficient [@博克软件] weibo.com/perficientchina ☆
- Peugeot Scooters [@标致摩托] weibo.com/peugeotscooters ☆
- Pfizer [@辉瑞中国] weibo.com/pfizerchina
- Piaget [@PIAGET] weibo.com/piaget ☆
- Piaggio [@比亚乔机车] weibo.com/piaggio1884 ☆
- Pineider [@彼耐德Pineider] weibo.com/pineider ☆
- Pizza Hut [@必胜客欢乐餐厅] weibo.com/pizzahut ☆
- Pomellato [@Pomellato宝曼兰朵] weibo.com/pomellatoinchina ☆
- Pony [@ponychina] weibo.com/ponychina ☆
- Printemps [@春天百货Printemps] weibo.com/printempsparis ☆
- Pull-in [@PULLIN内衣] weibo.com/pullinasia ☆
- Razorfish [@RazorfishChina] weibo.com/razorfish
- Ritz-Carlton [@丽思卡尔顿酒店] weibo.com/ritzcarlton ☆
- Rockport [@ROCKPORT美国乐步] weibo.com/rockportchina ☆
- Roger Dubuis [@罗杰杜彼RogerDubuis] weibo.com/rogerdubuis ☆
- Roger Vivier [@RogerVivier_罗杰维维亚] weibo.com/rogervivier ☆
- Rovio Entertainment [@Rovio娱乐] weibo.com/rovioentertainment
- Rupert Sanderson [@RupertSanderson] weibo.com/rupertsanderson ☆
- Schneider Electric [@施耐德电气中国] weibo.com/schneidercn ☆
- SELECTED [@SELECTED中国官方微博] weibo.com/selectedchina
- Selfridges [@Selfridges] weibo.com/selfridgesuk .
- Sergio Rossi [@sergio_rossi] weibo.com/sergiorossi ☆
- Shell [@壳牌中国集团] weibo.com/shellinchina
- Sheraton Hotels & Resorts [@喜来登酒店及度假村Sheraton] weibo.com/sheratonhotels ☆
- Shopbop [@shopbop] weibo.com/shopbopchina ☆
- Sigma-Aldrich [@SigmaAldrich] weibo.com/sigmaaldrich ☆
- Skechers [@SKECHERS斯凯奇] weibo.com/skechers ☆
- Skyscanner [@Skyscanner天巡] weibo.com/skyscannertx ☆
- South Coast Plaza [@SouthCoastPlaza] weibo.com/southcoastplaza
- Standard Chartered Bank [@渣打银行中国] weibo.com/scbmainlandchina ☆
- Stickhouse [@Stickhouse] weibo.com/stickhouse
- Stiebel Eltron [@斯宝亚创StiebelEltron] weibo.com/stiebeleltron ☆
- Stroili Oro [@StroiliOro] weibo.com/stroilioro ☆
- TAG Heuer [@豪雅TAGHeuer] weibo.com/tagheuerchina ☆
- Ted Baker [@TedBakerLondon] weibo.com/tedbakerlondon ☆
- Tesco [@乐购中国官方微博] weibo.com/TESCOofficial ☆
- The Glenlivet [@格兰威特威士忌] weibo.com/theglenlivet ☆
- Thermo Fisher Scientific [@赛默飞] weibo.com/thermofishercn ☆
- Times Higher Education [@泰晤士报高等教育期刊] weibo.com/timeshighereducation ☆
- TLD Registry [@域通联达] weibo.com/tldregistry ☆
- Toblerone [@瑞士三角巧克力] weibo.com/toblerone ☆
- Tom & Jerry [@华纳兄弟-猫和老鼠] weibo.com/tomandjerryoffical ☆
- Topshop Shēnzhèn [@TOPSHOP深圳] weibo.com/topshopsz ☆
- Tottenham Hotspur [@热刺TottenhamHotspur] weibo.com/tottenhamhotspur ☆
- Truefitt & Hill [@TRUEFITT-HILL-CHINA] weibo.com/truefittandhill ☆
- Unisys [@优利中国] weibo.com/unisyschina
- Valentino [@Valentino官方微博] weibo.com/valentinoofficial ☆
- Van Cleef & Arpels [@VanCleefArpels梵克雅宝] weibo.com/vancleefarpelschina ☆
- Vichy Laboratoires [@薇姿医生] weibo.com/vichybrand ☆
- Visa [@Visa中国] weibo.com/visachina ☆
- VMware [@VMware中国] weibo.com/vmware ☆
- Volvo [@沃尔沃集团中国] weibo.com/volvogroupchina ☆
- Wall Street Journal [@华尔街日报中文网] weibo.com/chinesewsj
- Wallpaper* Magazine [@WallpaperMagazine] weibo.com/wallpapermag ☆
- Walmart [@沃尔玛中国官方微博] weibo.com/wmcsr
- West Bromwich Albion [@西布朗足球俱乐部官微] weibo.com/westbrom
- Westin Hotels & Resorts [@Westin] weibo.com/westinhotels ☆
- Wiggle [@Wiggle中国] weibo.com/wigglechina .
- William & Son [@WilliamandSon] weibo.com/williamandson
- Wolfram [@WolframChina] weibo.com/wolframchina
- YOOX [@YOOX网络概念店] weibo.com/yooxcn ☆
- Yves Rocher [@Yves-Rocher伊夫黎雪] weibo.com/yvesrocher1959 ☆
- Zatchels [@Zatchels] weibo.com/zatchelsuk ☆
- Zenith [@ZENITH真力时] weibo.com/zenithchina ☆
Monday, 20 February 2012
Language Characteristics
In this article I list some of the characteristics of natural languages and scripts as they are manifested and used in modern day IT. With languages there are always exceptions and so there will be some exceptions to these characteristics. I will not be delving into linguistic technicalities such as the distinction between mora and syllable or the distinction between logogram and ideogram. I will take a more broad brush approach.
Arabic
- Arabic is written in the Arabic script
- Written from right to left
- The space character (U+0020 SPACE) is used as a separator between words and sentences
- The sentence terminator full stop is the Unicode character U+002E FULL STOP
- Unicase ie no uppercase and lowercase letter forms
- A Keyboard Mapping is sufficient in order to write Arabic
- The Arabic script is inherently cursive and hence is presented/displayed in its cursive form.
- Letters change shape according to their position within a word. These different shapes are named Initial, Medial, Final and Isolated forms. en.wikipedia.org/wiki/Arabic_alphabet#Letter_forms
Chinese
- Chinese is written in the Chinese script which consists of hànzì (汉字) characters, of which there are tens of thousands
- Written from left to right. Once browsers implement CSS3 Writing Modes we may well see some return to the traditional vertical text in webpages dev.w3.org/csswg/css-writing-modes/#vertical-intro
- There is no space character separator between words and sentences
- The sentence terminator full stop is the Unicode character U+3002 IDEOGRAPHIC FULL STOP
- Unicase, i.e. no uppercase and lowercase letter forms
- An Input Method is required in order to write Chinese
- All characters, including punctuation, are monospaced. Thus, for example, the list items separator in the text string "北京，南京，东京" is the single character U+FF0C FULLWIDTH COMMA. The text string "北京、南京、东京" uses the single character U+3001 IDEOGRAPHIC COMMA as the list items separator.
- With respect to the number of characters required to communicate, Chinese is much more compact than English. Given a sentence written in English, the same sentence written in Chinese would require far fewer characters. This compactness gives Chinese a significant advantage over English for IDNs and when microblogging.
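As a minimal sketch of what the punctuation points above mean for code, the following Python splits a Chinese list on either separator and reports each separator's code point and East Asian width class. The helper names `split_items` and `describe` are made up for this illustration.

```python
import re
import unicodedata

# Either U+FF0C FULLWIDTH COMMA or U+3001 IDEOGRAPHIC COMMA may separate list items.
LIST_SEPARATORS = re.compile("[\uFF0C\u3001]")


def split_items(text):
    """Split a Chinese list on the fullwidth or ideographic comma."""
    return LIST_SEPARATORS.split(text)


def describe(char):
    """Return the code point, Unicode name and East Asian width class of a character."""
    return "U+%04X %s (width class %s)" % (
        ord(char),
        unicodedata.name(char),
        unicodedata.east_asian_width(char),
    )


print(split_items("北京\uFF0C南京\uFF0C东京"))
print(split_items("北京\u3001南京\u3001东京"))
print(describe("\uFF0C"))
print(describe("\u3001"))
print(describe("\u3002"))
```

Note that splitting on the ASCII comma U+002C would leave both strings unsplit, which is the kind of bug that ASCII-only thinking produces.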
English
- English is written in the Latin script
- Written from left to right
- The space character (U+0020 SPACE) is used as a separator between words and sentences
- The sentence terminator full stop is the Unicode character U+002E FULL STOP
- Has uppercase and lowercase letter forms
- A Keyboard Mapping is sufficient in order to write English
Japanese
- Japanese is written in the Japanese scripts Kanji (漢字), Hiragana (ひらがな) and Katakana (カタカナ)
- Written from left to right. Once browsers implement CSS3 Writing Modes we may well see some return to the traditional vertical text in webpages dev.w3.org/csswg/css-writing-modes/#vertical-intro
- There is no space character separator between words and sentences
- The sentence terminator full stop is the Unicode character U+3002 IDEOGRAPHIC FULL STOP
- Unicase, i.e. no uppercase and lowercase letter forms. Uppercase is sometimes used for emphasis in English; similarly, Katakana is sometimes used for emphasis in Japanese.
- An Input Method is required in order to write Japanese
- In general, Japanese, like Chinese, is monospaced. The exception is that there are half-width forms of Katakana and some punctuation characters. The half-width forms are in the Unicode block Halfwidth and Fullwidth Forms (U+FF00 to U+FFEF).
- With respect to the number of characters required to communicate, Japanese is much more compact than English. Given a sentence written in English, the same sentence written in Japanese would require far fewer characters. This compactness gives Japanese a significant advantage over English for IDNs and when microblogging.
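The half-width Katakana forms can be folded to the standard forms with Unicode compatibility normalisation. The sketch below assumes NFKC is acceptable for the text in question; be aware that NFKC also folds other compatibility characters (for example fullwidth Latin letters to ASCII), so it is not a Katakana-only operation. The function name `to_standard_kana` is hypothetical.

```python
import unicodedata


def to_standard_kana(text):
    """Fold half-width Katakana to the standard (wide) forms using NFKC."""
    return unicodedata.normalize("NFKC", text)


# The word "Katakana" written with half-width forms: U+FF76 U+FF80 U+FF76 U+FF85.
half_width = "\uFF76\uFF80\uFF76\uFF85"
standard = to_standard_kana(half_width)

print(standard)
# East Asian width class: "H" for the half-width form, "W" for the standard form.
print(unicodedata.east_asian_width(half_width[0]))
print(unicodedata.east_asian_width(standard[0]))
```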
Korean
- Korean is written in the Hangeul (한글) script
- Written from left to right
- The space character (U+0020 SPACE) is used as a separator between words and sentences
- The sentence terminator full stop is the Unicode character U+002E FULL STOP
- Unicase, i.e. no uppercase and lowercase letter forms
- An Input Method is required in order to write Korean
- The individual Korean letters (jamo/자모) are grouped into and displayed as syllabic blocks, e.g. the individual jamo ㅎ ㅏ ㄴ ㄱ ㅜ ㄱ are combined to form the two Korean characters 한국
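The relationship between jamo and syllabic blocks is built into Unicode normalisation, and can be sketched in Python as follows. One caveat: normalisation works with the conjoining jamo (the block starting at U+1100), not with the standalone compatibility jamo used for display in the bullet above. The function names are made up for this illustration.

```python
import unicodedata


def to_jamo(text):
    """Decompose precomposed Hangeul syllables into conjoining jamo (NFD)."""
    return unicodedata.normalize("NFD", text)


def to_syllables(jamo):
    """Compose conjoining jamo back into precomposed syllable blocks (NFC)."""
    return unicodedata.normalize("NFC", jamo)


word = "\uD55C\uAD6D"  # the two syllable blocks of the word for "Korea"
jamo = to_jamo(word)

# Two syllable blocks decompose into six jamo.
print(len(word), len(jamo))
print(["U+%04X" % ord(ch) for ch in jamo])
print(to_syllables(jamo) == word)
```

This matters for string length and for regular expressions: the same visible text can be two code points or six, depending on the normalisation form.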
Russian
- Russian is written in the Cyrillic (Кириллица) script
- Written from left to right
- The space character (U+0020 SPACE) is used as a separator between words and sentences
- The sentence terminator full stop is the Unicode character U+002E FULL STOP
- Has uppercase and lowercase letter forms
- A Keyboard Mapping is sufficient in order to write Russian
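Two of the characteristics listed above, letter case and the space separator, are easy to probe in code. The sketch below is a rough illustration only: the sample phrases are chosen here for demonstration, and the helper names are hypothetical.

```python
# Sample greetings chosen for illustration only.
SAMPLES = {
    "English": "Hello world",
    "Russian": "Привет мир",
    "Korean": "안녕하세요 세계",
    "Chinese": "你好世界",
    "Japanese": "こんにちは世界",
}


def has_case(text):
    """True if upper-casing and lower-casing the text give different results."""
    return text.upper() != text.lower()


def space_separated_words(text):
    """Split on U+0020 SPACE; this only finds words in scripts that use the space."""
    return text.split(" ")


for language, phrase in SAMPLES.items():
    print(language, has_case(phrase), len(space_separated_words(phrase)))
```

For Chinese and Japanese the split returns the whole phrase as one item, so word segmentation for those languages needs a dictionary or a dedicated segmenter rather than a split on spaces.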
Saturday, 31 December 2011
Browser Language
By Browser Language I do not mean the Browser User Interface Language. I am referring to the browser's preferred language for displaying pages, which I will abbreviate as BL.
An internationalised website will have pages in multiple languages. These pages can be displayed according to the BL, e.g. if the BL is Korean then the website will send its Korean pages to the browser.
With most browsers, the BL can be set in the preferences and can be set independently of the language settings of the OS. Some browsers do inherit their BL from the OS language setting.
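Browsers convey the BL to a website in the HTTP `Accept-Language` request header, as a list of language tags with optional `q` weights. As a minimal sketch of how a website might choose which pages to send, the Python below parses that header and picks the first preferred language the site supports. The function names, the set of site languages and the default of English are all assumptions made for this illustration; a production site would more likely use the language negotiation built into its web framework.

```python
def parse_accept_language(header):
    """Return the language tags in an Accept-Language header, most preferred first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without a q-value has the default weight 1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not acceptable"
        if quality > 0:
            weighted.append((-quality, position, tag))
    weighted.sort()  # highest weight first; ties keep the order of the header
    return [tag for _, _, tag in weighted]


def choose_page_language(header, available, default="en"):
    """Pick the first preferred language, or its primary subtag, that the site has."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        for candidate in (tag, primary):
            if candidate in available:
                return candidate
    return default


# Hypothetical site offering English, Korean and Japanese pages.
SITE_LANGUAGES = {"en", "ko", "ja"}
print(choose_page_language("ko-KR,ko;q=0.9,en;q=0.8", SITE_LANGUAGES))  # ko
```

Falling back from `ko-KR` to the primary subtag `ko` is a deliberate simplification; it means a site with only generic Korean pages still serves them to a browser that asks for Korean as used in Korea.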
This ability to change the BL has a great deal of potential, yet few are aware of it. Google are switched on to this potential. Google maps, if embedded correctly, will automatically adapt to the BL. You can try it out for yourself. Visit lboro.ac.uk/about/findus.html and you will see a Google map of Loughborough. Now change the BL in your browser preferences and refresh. You will see menus displayed in the BL you chose. If you had chosen Japanese as your BL you would also see some place names transliterated into Japanese.
This ability for the user to change the BL is a good thing, a very good thing. But this function is buried down in the preferences. It is my experience that few people are aware that the BL can be changed and even fewer are aware of the possibilities this opens up.
My recommendation to all the browser manufacturers is that the BL preferences should be made manifest by bringing them up front. Put a BL graphic in a prominent position on the browser window so that it is always visible. This BL graphic will serve to inform the user of the current BL and allow the user to change the BL (e.g. via a popup BL selection menu).
Such a BL graphic will:
- Raise user awareness of BL and the ability to change BL
- Encourage users to explore sites that adapt according to BL
- Encourage web developers to incorporate content, widgets and features that are BL adaptive
Here is an illustrative story. About a year ago, a Chinese person told me he had a problem when viewing some Google maps. His problem was that the map info was displayed in Chinese but he wanted to see the info displayed in English. This problem occurred when he viewed these maps from his own computer and he could not work out how to view the info in English.
I explained to him that what he was experiencing was not a problem but rather a symptom of a very powerful feature, namely Google maps automatically adapting to the BL. His computer had a Chinese OS and the browser he used had its BL set to Chinese. I told him how to change his browser's BL and then, of course, he could view the Google map info in any of the many supported languages.
A manifest BL graphic would have made it obvious what was happening and would have enabled him to explore and appreciate Google maps BL adaptation and BL adaptive websites in general.