tyzbit
New member
Status: New idea

Since 2003, Punycode has made it possible to encode characters outside ASCII into domain names that still adhere to the ASCII-only rules for domain names. This lets users enter domain names containing characters from languages whose alphabets extend or replace ASCII and have them resolve as expected.
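To make the encoding concrete, here is a minimal sketch using Python's built-in `punycode` codec. The helper names `to_ascii_label` and `to_unicode_label` and the sample label `bücher` are only illustrative, and the sketch skips the normalization steps that real IDNA processing performs before encoding.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def to_ascii_label(label: str) -> str:
    """Return the ASCII (Punycode) form of a single domain label."""
    if label.isascii():
        return label  # plain ASCII labels are left untouched
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_unicode_label(label: str) -> str:
    """Return the Unicode form of a single domain label."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


print(to_ascii_label("bücher"))           # xn--bcher-kva
print(to_unicode_label("xn--bcher-kva"))  # bücher
```

The DNS only ever sees the `xn--` form; which of the two forms the address bar shows is purely a display decision.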

A major downside of Punycode and support for expanded character sets in domain names is the Internationalized Domain Name (IDN) homograph attack: https://en.wikipedia.org/wiki/IDN_homograph_attack#Homographs_in_internationalized_domain_names . This allows attackers to show a domain name that looks extremely similar to a better-known domain name in order to pilfer data. From Wikipedia:

For example, a regular user of exаmple.com may be lured to click on it unquestioningly as an apparently familiar link, unaware that the third letter is not the Latin character "a" but rather the Cyrillic character "а" and is thus an entirely different domain from the intended one. 

Firefox (as released with English as the default language) handles this by displaying all internationalized domain names in their Punycode-encoded form, so `I❤️.ws` is shown as `xn--i-7iq.ws`. This is a safe way to defend against IDN attacks, but the user experience is poor. Whether the user clicked a link or typed the URL directly, the address becomes a string of letters that has no meaningful interpretation for the user.

I propose that when every non-ASCII character in the domain comes from the subset of characters that do not have ANY ASCII lookalikes (for example, kanji or emoji), the domain is not shown in its Punycode-encoded form and is instead rendered with its actual Unicode characters.

In addition to a more pleasant experience for users of domains with non-ASCII characters, it would also improve the experience for domains with emoji in them, since a large portion of emoji are easily distinguishable from ASCII characters. If the domain has even one character that is visually similar to an ASCII character, Firefox should retain its current behavior and present the Punycode-encoded domain to the user.
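A rough sketch of the rule I have in mind, in Python. Everything here is hypothetical: `ASCII_LOOKALIKES` is a tiny hand-picked sample rather than a real list (a real implementation would need a maintained data set such as the Unicode confusables data), and `display_form` is just an illustrative name, not anything Firefox exposes.

```python
# Hypothetical, deliberately incomplete sample of non-ASCII code points
# that look like ASCII letters.
ASCII_LOOKALIKES = {
    "\u0430",  # Cyrillic small a, resembles "a"
    "\u0435",  # Cyrillic small ie, resembles "e"
    "\u043e",  # Cyrillic small o, resembles "o"
    "\u0261",  # Latin small script g, resembles "g"
}


def encode_label(label: str) -> str:
    """Punycode-encode one label; ASCII labels pass through unchanged."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def display_form(domain: str) -> str:
    """Pick the string to show in the address bar for a Unicode domain."""
    labels = domain.split(".")
    # A single lookalike anywhere in the domain forces the Punycode form.
    risky = any(ch in ASCII_LOOKALIKES for label in labels for ch in label)
    if not risky:
        return domain
    return ".".join(encode_label(label) for label in labels)


print(display_form("\u0261oogle.com"))  # xn--oogle-qmc.com
print(display_form("日本語.jp"))          # 日本語.jp
```

The check is all-or-nothing on purpose: one suspicious character anywhere in the domain falls back to the Punycode form for the whole domain.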

3 Comments
Status changed to: New idea
Jon
Community Manager

Thanks for submitting an idea to the Mozilla Connect community! Your idea is now open to votes (aka kudos) and comments.

Kappa
Making moves

Not a fan. As far as I know, Firefox does not show Punycode by default; that's a flag you have to enable manually. That is the main problem here, as it's an obvious attack vector.

However, if your suggestion were to be implemented:

  • You'd have to create/maintain an "official" set of characters which do have lookalikes in God knows what Unicode page (granted, these lists exist already)

  • Any character that is missing from this list (either because it was added to Unicode at a later point, or because it was missed initially) would then be even more dangerous, since users would be fooled into thinking they're safe when they're not

Alestrix
New member

This already seems to be implemented to some extent.

When I visit the German domain http://fahrschule-fahrvergnügen.de, the umlaut is shown in the address bar. However, if I go to ɡoogle.com (using a homoglyph of g as the first character), then it gets turned into xn--oogle-qmc.com.