Don’t fall foul of homoglyph web domains

Tim Brown at Com Laude advises website owners to mind their p’s and q’s – and their m’s and ṃ’s…

“Homoglyph”. It sounds like something that Howard Carter might have found on the walls of King Tut’s tomb. However, in domain names, homoglyphs are far more concerning than anything in Egyptology. If you run a business, you will want to know why.

At one time, the domain name system was restricted to the basic Latin alphabet: the unaccented letters a to z, plus digits and hyphens. Then came internationalised domain names. The intention was laudable. The likes of L’Oréal and Lancôme Parfums could now have language-appropriate domains like <lancôme.company> and <loréal.company>. (Should one suggest it’s because they’re worth it?)

Except they didn’t register these. In 2019, they had to dispute both domains under the relevant dispute policy, the UDRP, ultimately winning a transfer from the original bad-faith registrant. Sadly, cybersquatting reaches even language-appropriate domains.

Homoglyphs open a whole new world of abuse. These are characters, often from other scripts, that look like familiar Latin letters. They can be used in internationalised domain names, and they are very hard to spot.

It works like this. Let’s say your primary domain is <acme.domain>. Domain names are unique, so there can only ever be one <acme.domain>. Bad actors have to change something in order to spoof it, such as adding ‘ltd’ to make <acme-ltd.domain>. This may be sufficiently different from your real domain to be spotted by the more alert Internet user, although it’s always good practice to have a watching system in place to ensure it is picked up.

Now imagine someone registers an internationalised domain that simply replaces the ‘a’ in your <acme.domain> with the Greek letter alpha. In a browser, the domain will look like this:

<αcme.domain>

Or maybe they use the Cyrillic letter em to replace your ‘m’, which will look like this:

<acмe.domain>

And can you guess the offending letter in this domain?

<acṃe.domain>

No, that is not a mark on your screen. It is a dot below the ‘m’, a diacritic used in some writing systems. Can the average Internet user spot that? Not likely.
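For readers who want to see what is really inside those three names, here is a minimal Python sketch. It lists every non-ASCII character in a label with its Unicode name and shows the ASCII ‘xn--’ form under which such a label is stored in the DNS. The function names are our own illustration, and the encoding step uses Python’s built-in punycode codec on its own; a real registration system would also apply the full IDNA normalisation and validation rules, which this sketch skips.

```python
import unicodedata


def non_ascii_characters(label):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in label
        if ord(ch) > 127
    ]


def to_ascii_label(label):
    """Encode one label to its 'xn--' form (simplified: no IDNA normalisation)."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


if __name__ == "__main__":
    # The three look-alike labels from the article, written as escapes
    # so the substituted character is unambiguous.
    lookalikes = [
        "\u03b1cme",   # Greek alpha in place of 'a'
        "ac\u043ce",   # Cyrillic em in place of 'm'
        "ac\u1e43e",   # Latin m with dot below in place of 'm'
    ]
    for label in lookalikes:
        print(label, "->", to_ascii_label(label))
        for ch, code_point, name in non_ascii_characters(label):
            print("   ", ch, code_point, name)
```

The ‘xn--’ form is often the quickest giveaway: if a link to what looks like your domain resolves to a name beginning with ‘xn--’, at least one character in it is not what it appears to be.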

What can be done? The good news is that some domain registries now attempt to block new registrations of mixed-script domains (names that combine characters from more than one script) at the source.
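To give a feel for what a mixed-script check involves, here is a rough Python sketch. It is not any registry’s actual code. Python’s standard library does not expose the Unicode script property, so this version assumes that the first word of each letter’s Unicode name (‘LATIN’, ‘GREEK’, ‘CYRILLIC’) is a good enough stand-in, which holds for the examples here but not for every character.

```python
import unicodedata


def scripts_in_label(label):
    """Approximate the set of scripts used by the letters in a label.

    Assumption: the first word of a letter's Unicode name identifies its
    script. Digits, hyphens and combining marks are ignored.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


def is_mixed_script(label):
    """Return True if the label's letters come from more than one script."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    for label in ["acme", "\u03b1cme", "ac\u043ce", "ac\u1e43e"]:
        print(label, sorted(scripts_in_label(label)), is_mixed_script(label))
```

Note the limitation: the Greek and Cyrillic substitutions are flagged, but the ‘m’ with a dot below is itself a Latin letter, so a mixed-script rule alone lets it through.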

Of course, how successful this is depends upon the registry and the quality of the homoglyph matching in its blocking code. Rather than leaving matters to the varied approaches of the registries, brand owners should sign up to a domain monitoring service that offers homoglyph matching (most on the market today provide this as standard).
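Homoglyph matching itself can be sketched in a few lines of Python. The idea is to reduce every name to a ‘skeleton’: strip diacritics, swap known look-alike characters for their Latin counterparts, then compare the result with the brand. The table below is a hypothetical, deliberately tiny one built around the examples in this article; a production service would draw on a far larger list, such as the confusables data that the Unicode Consortium publishes. The function names are ours, not those of any particular product.

```python
import unicodedata

# Hypothetical, illustrative look-alike table. Real services use much
# larger mappings covering many scripts.
LOOKALIKES = {
    "\u03b1": "a",  # Greek small letter alpha
    "\u0430": "a",  # Cyrillic small letter a
    "\u0441": "c",  # Cyrillic small letter es
    "\u043c": "m",  # Cyrillic small letter em
    "\u0435": "e",  # Cyrillic small letter ie
    "\u03bf": "o",  # Greek small letter omicron
    "\u043e": "o",  # Cyrillic small letter o
}


def skeleton(label):
    """Reduce a label to a plain form for look-alike comparison."""
    # Decompose accented letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFKD", label.lower())
    # Drop the combining marks (accents, dots below, and so on).
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Replace known look-alikes with their Latin counterparts.
    return "".join(LOOKALIKES.get(ch, ch) for ch in without_marks)


def is_homoglyph_of(candidate, brand):
    """True if candidate differs from brand but shares its skeleton."""
    return candidate != brand and skeleton(candidate) == skeleton(brand)


if __name__ == "__main__":
    brand = "acme"
    for candidate in ["acme", "acme-ltd", "\u03b1cme", "ac\u043ce", "ac\u1e43e"]:
        print(candidate, is_homoglyph_of(candidate, brand))
```

Unlike a mixed-script rule, this approach also catches the dotted ‘m’, because the diacritic is stripped before comparison. It is only as good as its look-alike table, which is why the quality of matching varies so much between providers.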

When we first built our own homoglyph-matching code for use within our systems, our software engineers tested it against the brand of a certain well-known bank. To their surprise, they found several very convincing homoglyph domains matching the bank’s primary name in third-party hands, sitting there waiting for deployment in a modern-day heist.

With a whole range to discover, perhaps homoglyphs are the genuine 21st-century Curse of the Pharaohs.


Tim Brown is Head of Brand Protection at Com Laude

Main image courtesy of iStockPhoto.com
