

Confusable Characters
The pen may be mightier than the sword, but Unicode characters have the power to take down an organization when wielded by capable, evil hands. Homoglyph attacks are all too real, and attackers use them regularly. These attacks take advantage of the plethora of look-alike glyphs across global writing systems. The Unicode folks call the impostor glyphs "confusable characters" and define them as:
"[glyphs] that may be confused with others (in some common UI fonts), such as the Latin letter "o" and the Greek letter omicron "ο". Fonts make a difference: for example, the Hebrew character "ס" looks confusingly similar to "o" in some fonts (such as Arial Hebrew), but not in others."
Unicode is so unsafe that the Unicode folks also have an entire document discussing security considerations that programmers, system analysts, standards developers, and users should consider. They also provide specific recommendations to reduce the risk of problems.
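The security concern boils down to characters that render alike but are entirely different code points. Here's a minimal Python sketch that makes that visible for the three look-alikes quoted above. The `describe` helper is a made-up name for illustration; it only relies on the standard library's `unicodedata` module.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str, str]]:
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Latin "o", Greek omicron, and Hebrew samekh, written as escapes so the
# source itself cannot be confused.
for ch, code_point, name in describe("o\u03bf\u05e1"):
    print(ch, code_point, name)
```

Three near-identical glyphs, three different code points, which is why a string comparison (or a DNS lookup) treats them as unrelated.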
They've also developed a ranked restriction system which defines five levels of "safety" that developers should present to users (it'd be great if browser manufacturers incorporated this concept into a location bar configurable trust level):
ASCII-Only: All characters in each identifier must be ASCII
Highly Restrictive: All characters in each identifier must be from a single script, or allowed combinations
Moderately Restrictive: Allow Latin with other scripts except Cyrillic, Greek, Cherokee; otherwise, the same as Highly Restrictive
Minimally Restrictive: Allow arbitrary mixtures of scripts, such as Ωmega, Teχ, HλLF-LIFE, Toys-Я-Us; otherwise, the same as Moderately Restrictive
Unrestricted: Any valid identifiers, including characters outside of the Identifier Profile, such as I♥NY.org
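To get a feel for what the levels are checking, here's a deliberately rough Python sketch that sorts an identifier into three buckets. It is not the Unicode algorithm: Python's standard library doesn't expose the Script property, so this guesses a letter's script from the first word of its Unicode name, and it knows nothing about the "allowed combinations" the real levels permit. The function names and the bucket labels (`ascii-only`, `single-script`, `mixed-script`) are invented for this example.

```python
import unicodedata


def rough_scripts(identifier: str) -> set[str]:
    """Guess the scripts in use from the first word of each letter's name.

    Heuristic only: e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC".
    Non-letters (digits, punctuation, symbols) are ignored.
    """
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def rough_level(identifier: str) -> str:
    """Bucket an identifier; the labels are hypothetical, not official."""
    if identifier.isascii():
        return "ascii-only"
    if len(rough_scripts(identifier)) <= 1:
        return "single-script"
    return "mixed-script"


print(rough_level("paypal"))        # plain ASCII
print(rough_level("p\u0430ypal"))   # Latin with a Cyrillic "a" slipped in
print(rough_level("Te\u03c7"))      # Latin plus Greek chi
```

The second identifier is the interesting one: it looks like the first on screen, but the check flags it as mixing scripts.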
Paired with this list is a summary table of confusables, suitable for processing in your fav programming language.
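As a taste of what that processing might look like, here's a small Python sketch that parses lines shaped like the confusables data (`source ; target ; type # comment`, with code points in hex) and uses the mapping to compare two strings by their "skeleton". The three sample lines are a tiny hand-picked excerpt for illustration, not the full table, and `parse_confusables`, `skeleton`, and `looks_alike` are names made up for this example. The skeleton step (normalize to NFD, map each character, normalize again) is a simplified reading of the approach the Unicode security mechanisms document describes, so treat it as a sketch rather than a conforming implementation.

```python
import unicodedata

# Illustrative sample in the "source ; target ; type # comment" layout.
SAMPLE = """\
03BF ; 006F ; MA # GREEK SMALL LETTER OMICRON → LATIN SMALL LETTER O
0430 ; 0061 ; MA # CYRILLIC SMALL LETTER A → LATIN SMALL LETTER A
0440 ; 0070 ; MA # CYRILLIC SMALL LETTER ER → LATIN SMALL LETTER P
"""


def parse_confusables(text: str) -> dict[str, str]:
    """Build a {source character: target string} map from table lines."""
    table = {}
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()   # drop comments
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2 or not fields[0]:
            continue                            # blank or comment-only line
        source = chr(int(fields[0], 16))
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        table[source] = target
    return table


def skeleton(text: str, table: dict[str, str]) -> str:
    """Map every character to its look-alike prototype."""
    decomposed = unicodedata.normalize("NFD", text)
    mapped = "".join(table.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)


def looks_alike(a: str, b: str, table: dict[str, str]) -> bool:
    """Two strings are confusable if their skeletons match."""
    return skeleton(a, table) == skeleton(b, table)


table = parse_confusables(SAMPLE)
print(looks_alike("p\u0430ypal", "paypal", table))  # Cyrillic "a" vs. Latin
```

With the full table loaded in place of `SAMPLE`, the same comparison could be used to flag a newly registered name that collides with one you care about.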
Their checker tool is kind of fun to use, too, despite the website looking like it's being served up from a 1998 Sun Workstation.
If you're having trouble seeing some Unicode glyphs in the above pages, check out the next section.
Noto
Google dubs Noto "a typeface for the world". Essentially, Noto is a collection of high-quality fonts with multiple weights and widths in sans, serif, mono, and other styles. Google further swoons over the font by noting that the family's fonts "are perfect for harmonious, aesthetic, and typographically correct global communication, in more than 1,000 languages and over 150 writing systems."
"Noto" means "I write, I mark, I note" in Latin. The name is also short for "no tofu", as the project aims to eliminate 'tofu': blank rectangles shown when no font is available for your text.
Here's how it came to be:
You can spy these fonts with your little eye, and use that page to select a thinner distribution of Noto fonts (like this Latin script one for U.S. English) so as not to fill up your SSD (the entire family is YUGE).
Highpoint
"Highpointing" is the act of climbing to the highest geographic point in a region, meaning the scale is up to you. One can highpoint in one's neighborhood, town, county, state, etc.
If you're in the U.S. and enjoy purposeful adventures, then the U.S. Highpoint Guide might be for you. Just tap any state on the statebins choropleth (see section header) and you will be whisked away to the tallest peak. This is Maine's (and I've hiked up that one!).
You'll get directions and tons of information to help you plan your getaway ascents.
Note that Florida continues to be an embarrassment to the nation, with a paltry 345 feet of elevation at its highest point.
FIN
Happy Monday everyone! ☮