- a variable name is prefixed with $
- the first character must be a letter (a-z, A-Z) or an underscore (_)
- subsequent characters can be any mix of letters, digits (0-9) and underscores
- $Andre $age $previous_total
- $André $小山 $エクセレント $우수한 $🐉
Whether a variable name is valid is determined at a low level: the byte level. A UTF-8 encoded character consists of 1 to 4 bytes. Only characters in the Basic Latin Unicode block (which is the same as ASCII) are encoded in 1 byte. All other characters require 2 to 4 bytes, and every byte in those encodings has a value ≥ 0x80 (128 decimal). Bytes in that range are accepted in any position, so characters outside Basic Latin are not restricted at all: any of them may appear anywhere in a name, including as the first character. Thus one can, for example, have Chinese, Japanese, Korean, Punjabi, Russian or Egyptian Hieroglyphs variable names. One can have Currency Symbols, Mathematical Operators or Emoji variable names. An opportunity to be creative.
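The byte-level rule can be sketched in a few lines of Python. This is only an illustration, not the engine's actual implementation: the pattern `[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*` is an assumption derived from the description above, and `is_valid_variable_name` is a hypothetical helper name.

```python
import re

# Assumed byte-level rule: the first byte is an ASCII letter, an underscore,
# or any byte >= 0x80; later bytes may additionally be ASCII digits.
NAME_PATTERN = re.compile(rb"[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*")


def is_valid_variable_name(name: str) -> bool:
    """Return True if `name` (including the leading $) fits the assumed rule."""
    if not name.startswith("$"):
        return False
    body = name[1:].encode("utf-8")  # the check works on bytes, not characters
    return NAME_PATTERN.fullmatch(body) is not None


for candidate in ["$age", "$_total", "$9lives", "$小山", "$🐉"]:
    encoded = candidate[1:].encode("utf-8")
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    print(candidate, "|", hex_bytes, "|", is_valid_variable_name(candidate))
```

Only `$9lives` is rejected here, because its first byte is an ASCII digit. Every byte of `$小山` and `$🐉` is ≥ 0x80, so the pattern never gets a chance to object to them.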
There are certain practices one should probably avoid when using Unicode in variable names. The three names below are all valid and all different, even though they appear visually identical.
- $André (uses U+00E9 LATIN SMALL LETTER E WITH ACUTE)
- $André (uses U+0065 LATIN SMALL LETTER E & U+0301 COMBINING ACUTE ACCENT)
- $Аndré (uses U+0410 CYRILLIC CAPITAL LETTER A)
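The difference between the three names is easy to make visible by looking at their bytes. The following Python sketch builds each name from explicit code points (so nothing depends on how this page was copied) and uses the standard `unicodedata` module; the variable names in the snippet are illustrative only.

```python
import unicodedata

# The three look-alike names, written with explicit code points.
precomposed = "$Andr\u00e9"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "$Andre\u0301"     # U+0065 followed by U+0301 COMBINING ACUTE ACCENT
cyrillic = "$\u0410ndr\u00e9"   # U+0410 CYRILLIC CAPITAL LETTER A as first letter

names = [precomposed, decomposed, cyrillic]

# Each name has a different UTF-8 byte sequence, so each is a different name.
for name in names:
    print(name, "|", name.encode("utf-8").hex())

distinct_count = len(set(names))


def nfc(text: str) -> str:
    """Normalize to NFC (canonical composition)."""
    return unicodedata.normalize("NFC", text)


# Normalization merges the two accent encodings, but not the Cyrillic letter.
same_after_nfc = nfc(precomposed) == nfc(decomposed)
cyrillic_still_differs = nfc(cyrillic) != nfc(precomposed)
print(distinct_count, same_after_nfc, cyrillic_still_differs)
```

Normalizing source text to NFC before saving would remove the first kind of ambiguity. The Cyrillic look-alike survives normalization, so the only defence there is to avoid mixing scripts within a name.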