Unicotastrophes
Unicode is a worldwide standard for mapping symbols (e.g. the letter “a”, the Chinese character “台”, the emoji “😃”) to numbers, which can be transmitted between computers. This is good because it facilitates cross-cultural communication: if the Cyrillic alphabet and the Latin alphabet didn’t have a shared encoding standard, it would be impossible for anybody to write you an email that begins:
Hello, my name is Иван
(This would be impossible because, for example, “H” belongs to the Latin character set, while “И” belongs to the Cyrillic character set.)
This tragedy would be compounded by the fact that the email could not continue:
I work with your bank. You are at risk of hackers,
please confirm security credentials at http://www.bаnk.com.
(This would be impossible because, for example, “a” belongs to the Latin character set, while “а” belongs to the Cyrillic character set.)
Let’s look at some more of the beautiful possibilities of Unicode.
Homographs
As we’ve already seen, some things look like other things. Awesome!
Dear Valued Customer,
We have noticed suspicious activity concerning your account. Please confirm your account information at https://hоmographbank.com/security.
Joshua Isaac
Homograph Bank security team
(Notice where your browser directs you when you click/hover over that link – it uses the Cyrillic о.)
hey, check out this google easter egg (had to log in)
http://www.google.com∕example.com
lol
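One way to catch lookalikes like these before clicking is to ask Python what the characters actually are. This is a minimal sketch, not a real homograph detector: `describe_non_ascii` and `scripts_used` are made-up helper names, and guessing a letter's script from the first word of its Unicode name is a rough heuristic, not the actual Unicode Script property.

```python
import unicodedata

def describe_non_ascii(text):
    """Return (character, Unicode name) for every non-ASCII character."""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in text if ord(ch) > 127]

def scripts_used(text):
    """Rough heuristic: the first word of each letter's Unicode name."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0] for ch in text if ch.isalpha()}

# Escapes are used so the lookalike characters are unambiguous in the source.
bank = "h\u043emographbank.com"            # second letter is Cyrillic
egg = "www.google.com\u2215example.com"    # the "slash" is U+2215, not "/"

print(describe_non_ascii(bank))   # [('о', 'CYRILLIC SMALL LETTER O')]
print(scripts_used(bank))         # contains both LATIN and CYRILLIC
print(describe_non_ascii(egg))    # [('∕', 'DIVISION SLASH')]
```

A hostname whose letters come from more than one script is not necessarily malicious, but it is a decent reason to look twice.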
String equivalence
Some strings are equivalent to other strings, even though they’re different. Awesome!
$ python
Python 3.4.3 (default, Jul 28 2015, 18:20:59)
>>> ﬃ = 1; print(ffi)
1
Maybe that’s just how ligatures work?
>>> œ = 1; print(oe)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'oe' is not defined
Nope!
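The difference comes from normalization: Python converts identifiers to NFKC form while parsing, and only some ligatures have an NFKC compatibility mapping. The sketch below reproduces that step with `unicodedata.normalize`; the helper name `as_python_identifier` is invented for illustration and only approximates what the parser does.

```python
import unicodedata

def as_python_identifier(name):
    """Approximate the parser's treatment of an identifier: NFKC-normalize it."""
    return unicodedata.normalize("NFKC", name)

ligature_ffi = "\ufb03"   # LATIN SMALL LIGATURE FFI, a single character
ligature_oe = "\u0153"    # LATIN SMALL LIGATURE OE, also a single character

print(as_python_identifier(ligature_ffi) == "ffi")   # True: it becomes three letters
print(as_python_identifier(ligature_oe) == "oe")     # False: it is left unchanged
```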
Search this page for the word “affinity.” Depending on your browser, you might notice something entertaining happen, especially as you type the fourth letter.
Hidden characters
Some characters don’t display at all. Awesome!
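If you suspect a string is hiding something, one rough check is to list every character in the Unicode "format" category (`Cf`), which includes the zero-width space and the bidirectional controls. This is a sketch under assumptions: `find_format_chars` and the sample string are invented here, and `Cf` does not cover every character that can render as nothing.

```python
import unicodedata

def find_format_chars(text):
    """List (index, code point, name) for each character in category Cf."""
    return [
        (i, "U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]

sample = "pass\u200bword"   # hypothetical string hiding a zero-width space
print(len(sample))           # 9, although only eight characters are visible
print(find_format_chars(sample))   # [(4, 'U+200B', 'ZERO WIDTH SPACE')]
```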
Right-to-left override
Some characters can make their surroundings appear in basically arbitrary orders. Awesome!
I solved your problem using sed. Just copy-paste this code, which is obviously safe and not malicious, since it just invokes sed:
sed -e 's~"libraries"~"pack/\0"~; s~"objects"~"pack/\0"~'~1\~+[9-0]a~s ;rm -rf echo "Done"
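Before pasting something like this into a terminal, it can help to make the direction-changing characters visible. The sketch below swaps each explicit bidirectional control for a `<U+XXXX>` marker; `reveal_bidi_controls` is a hypothetical helper, and it only looks at the control characters, not at text that is right-to-left by nature.

```python
import unicodedata

# Bidirectional classes of the explicit embedding, override, and isolate controls.
BIDI_CONTROLS = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def reveal_bidi_controls(text):
    """Replace each bidi control character with a visible <U+XXXX> marker."""
    return "".join(
        "<U+%04X>" % ord(ch) if unicodedata.bidirectional(ch) in BIDI_CONTROLS else ch
        for ch in text
    )

print(reveal_bidi_controls("echo \u202eABC"))   # echo <U+202E>ABC
```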
I don’t even know
Okay, so now we know that this U+202E character makes characters start being written right-to-left. So \u202eABC should display as CBA. Right? Here we go:
ABC
Lovely! It’s good to have rules. And \u202e[ABC] should display as ]CBA[. Right?
[ABC]
ARGLHARFL
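A likely explanation, offered as a guess about what the renderer is doing and not something established above: brackets carry the Bidi_Mirrored property, so in a right-to-left run their glyphs flip as well as swap places, and what "should" be ]CBA[ gets drawn as [CBA]. Python exposes the property through `unicodedata.mirrored`; the variable name below is just for illustration.

```python
import unicodedata

# 1 means the character is drawn mirrored in right-to-left text, 0 means it is not.
mirrored_flags = {ch: unicodedata.mirrored(ch) for ch in "[ABC]"}

for ch, flag in mirrored_flags.items():
    print(repr(ch), flag)   # the brackets report 1, the letters report 0
```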
Miscellaneous
Ruby requires that local variable names start with a lower-case letter.
2.1.5 :001 > 😃 = 1
=> 1
And that class names begin with an upper-case letter.
2.1.5 :002 > class 😃; end
SyntaxError: (irb):2: class/module name must be CONSTANT
class 😃; end
^
This case-convention allows you to tell at a glance whether an object (e.g. 💩) is a variable or a class – just by looking at the name!
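For comparison, asking Python about case hints at why a rule based on the first character's case gets strange here: an emoji is a symbol, so it is neither upper-case nor lower-case. This sketch uses a smiley as a stand-in and says nothing about how Ruby's parser actually classifies the character.

```python
import unicodedata

smiley = "\U0001F603"   # a smiley emoji, written as an escape to be unambiguous

print(unicodedata.category(smiley))          # So (Symbol, other)
print(smiley.isupper(), smiley.islower())    # False False
```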
In summary, 😖 😧 😱