Ironphoenix
New member
Status: New idea

Bad actors are using look-alike URLs in email, together with browsers' automatic encoding, to direct people to malicious websites. A safe example is https://connеct.mozilla.org , which is transformed to https://xn--connct-6of.mozilla.org/ , which doesn't exist. The Roman e is replaced with the Cyrillic е, which looks identical (side by side: eеeеeе). Browsers could reduce the risk of people getting caught by this by having a setting that disables this encoding by default, and by offering users the option of enabling it either once, when they click on a link with disallowed characters, or as a permanent setting.
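
For anyone who wants to reproduce the transformation, here is a minimal sketch using Python's built-in `idna` codec (which implements the older IDNA 2003 rules, so browsers may differ in edge cases). The variable names are mine, and the Cyrillic letter is written as an escape so it can't be mistaken for the Latin one.

```python
# Cyrillic small letter ie (U+0435), written as an escape so it cannot be
# confused with the Latin "e" (U+0065) in this listing.
lookalike_host = "conn\u0435ct.mozilla.org"

# The "idna" codec encodes each dot-separated label on its own; a label that
# contains non-ASCII characters is converted to Punycode with an "xn--" prefix.
ascii_host = lookalike_host.encode("idna").decode("ascii")

print(ascii_host)                           # the ASCII form that gets resolved
print(ascii_host == "connect.mozilla.org")  # False: it is a different host
```

The point is only that the two hostnames look the same on screen but are different names once encoded.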

Thanks, and best regards,

Mike B.

5 Comments
Status changed to: New idea
Jon
Community Manager

Thanks for submitting an idea to the Mozilla Connect community! Your idea is now open to votes (aka kudos) and comments.

luis123456789
Strollin' around

How about no?

It has already been discussed ad nauseam that showing non-Latin links as "weird" by default promotes a US-centric internet and turns internationalized domains and communities into second-class citizens, among other racist effects. English is but one language in the world, and it's not even close to a majority.

I'm not fully sure what a better UI for showing the links would be, considering that it's standard HTML for links to be decorated (i.e., an <a> element with an inner HTML description). Probably a tooltip with a basic WHOIS fetch, but that would require an external connection to a third party. As for the pages proper, this can be solved by Page Info: since the shield icon with page info is already shown for all pages, there's no reason not to place domain info there.

Ironphoenix
New member

How about thinking a bit more, then? I agree that (further) privileging anglophones is not the way to go, but there are ways to generalize the idea.

Flagging anomalous characters could be context-dependent, e.g. warning for a Roman e in a Cyrillic context (Привет and Привeт being indistinguishable as well: the second uses a Roman e instead of the Cyrillic е). An index of lookalikes and expected context would not be too hard to set up, I think. It would also be good to handle hidden diacritics, such as a combining dot above (U+0307) with an i.
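
A rough sketch of what such an index and context check might look like, in Python. Everything here is an assumption for illustration: the `LOOKALIKES` table is a tiny hand-made sample, the "script" of a character is approximated by the first word of its Unicode name, and the expected context is simply the most common script in the word.

```python
import unicodedata
from collections import Counter

# Hypothetical, deliberately tiny lookalike index: each character maps to its
# visual twin in the other script. A real index would be far larger.
LOOKALIKES = {
    "e": "\u0435", "\u0435": "e",   # Latin e / Cyrillic ie
    "a": "\u0430", "\u0430": "a",   # Latin a / Cyrillic a
    "o": "\u043e", "\u043e": "o",   # Latin o / Cyrillic o
}

def script_of(ch):
    """Approximate a character's script by the first word of its Unicode name."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"

def flag_anomalies(word):
    """Return (index, reason) pairs for suspicious characters in one word."""
    scripts = [script_of(c) for c in word if c.isalpha()]
    # Expected context: the script used by most letters in the word.
    expected = Counter(scripts).most_common(1)[0][0] if scripts else None
    flags = []
    for i, ch in enumerate(word):
        if unicodedata.combining(ch):
            # Combining marks such as U+0307 can hide on top of a base letter.
            flags.append((i, "combining mark"))
        elif ch in LOOKALIKES and script_of(ch) != expected:
            flags.append((i, "lookalike from " + script_of(ch)))
    return flags

print(flag_anomalies("\u041f\u0440\u0438\u0432e\u0442"))  # Latin e inside a Cyrillic word
print(flag_anomalies("i\u0307"))                          # i + combining dot above
```

Note that this flags every combining mark, including legitimate ones in decomposed text, so a real check would need to normalize first or be more selective.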

luis123456789
Strollin' around

In general, the idea of flagging anomalous characters is not bad. The problem is that the context is usually not determined by the browser's or the page's set language: if your browser is set to English and German, and you are reading Wikipedia in German and it quotes a section in French which includes a link with French characters, should that be marked anomalous? Most likely it shouldn't (unless you don't trust Wikipedia...).

Another option which I've seen discussed in the past is an opt-in to flagging links that have characters from mixed character sets regardless of context. This would in particular help with the case of hidden diacritics, because otherwise "untangling" the text into, e.g., "·i" would produce a web render that is wrong for reproducibility and standardization purposes.
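
To make the context-free variant concrete, here is a minimal Python sketch that looks only at the hostname itself. The function names are made up, the script is approximated from the Unicode character name, and treating combining marks as their own "set" is my assumption about how hidden diacritics would be caught.

```python
import unicodedata

def scripts_in(label):
    """Set of scripts in one hostname label, approximated via Unicode names."""
    found = set()
    for ch in label:
        if unicodedata.combining(ch):
            # Assumption: count combining marks as a separate character set,
            # so a base letter plus a hidden diacritic reads as "mixed".
            found.add("COMBINING")
        elif ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return found

def mixed_script_labels(hostname):
    """Return the labels of a hostname that mix more than one character set."""
    return [label for label in hostname.split(".") if len(scripts_in(label)) > 1]

print(mixed_script_labels("conn\u0435ct.mozilla.org"))  # one Cyrillic letter among Latin
print(mixed_script_labels("connect.mozilla.org"))       # nothing mixed
```

Because each label is checked on its own, a fully Cyrillic label under a Latin top-level domain is not flagged, and neither the page language nor the browser language is consulted.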

In this case it's important that the two constraints above are respected: the flagging has to be opt-in and it has to happen regardless of context. Otherwise we get into the problem of trying to mind-read the user and going into "did you just assume my language?".

The next closest approach would be to flag characters in the href of URLs that don't match the lang="" attribute of their inherited context. But this requires, of course, buy-in from page developers, since they'd have to properly announce content language at the level of, e.g., <div> or <span> elements.

And then we get to the UI part: what do we mean by "flagging"? We can't add any effect that could be imitated (or undone!) with CSS, such as bolding the links or changing their color, because that alters presentation, is not reproducible, and adds a point for fingerprinting. Any flagging would have to be done at the chrome level, which means either in the toolbars or in the link's contextual menu (in which case the warning is only accessible if the user context-clicks instead of action-clicks).

Ironphoenix
New member

I think the relevant context is extremely local: consider the adjacent characters as the relevant alphabet.

As for flagging, a popup warning when one action-clicks on the link would be my personal preference, at least as an opt-in, and maybe even as a default setting with an option in the pop-up to disable the feature. I agree that people could probably find a way to neutralize an in-page flag.

Visibly different characters are less of an issue: https://connéct.mozilla.org/ is at least noticeable to someone paying attention, just like https://conncet.mozilla.org/ . The é is at home amid the Latin alphabet, и obviously isn't (e.g. https://coииect.mozilla.org/ ), but something that looks like e is visually ambiguous.