Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for IDN (punycode) handles #7043

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

drash-course
Copy link

Currently, all @handle.tld are displayed in their ASCII form. This is fine for most domains, but some TLDs accept IDN domains and it'd be nice to support them fully. I found at least one other issue #6354 discussing this.

Of course, this opens the can of worms that are IDN homograph attacks.

I'm implementing a strategy similar to Firefox and Chrome : detect what "scripts" are being used in the IDN domain label and forbid mixing problematic scripts. In those cases, the punycode (ASCII) version of the label is shown instead. I've also added detection for the uppercase i, which can be mistaken for the lowercase l in many fonts. There are certainly more to add.

The security of this feature can't be taken lightly, and I hope this PR can create a discussion to progress this issue forward.

@mary-ext
Copy link
Contributor

mary-ext commented Dec 10, 2024

this is really cool, are you basing this off Chrome/Firefox's exact behavior or just the general idea of what they're doing? I wonder if they have test suites for this that we can copy over.

@drash-course
Copy link
Author

this is really cool, are you basing this off Chrome/Firefox's exact behavior or just the general idea of what they're doing?

Just the general idea.
I've read https://bugzilla.mozilla.org/show_bug.cgi?id=1332714#c17 to understand the attack, and what Firefox considers their responsibility to filter.

Arguably it's the responsibility of Registrars to make sure people are not registering domains that are confusable. But in practice, those domains can be registered. So the status quo is that it's up to browsers and apps to carefully sanitize IDN domains and minimize the confusion potential with ad-hoc filters.

@surfdude29
Copy link
Contributor

It might be a good idea to open up a discussion about this in the atproto repo:

https://github.com/bluesky-social/atproto/discussions

@drash-course
Copy link
Author

drash-course commented Dec 11, 2024

It might be a good idea to open up a discussion about this in the atproto repo:

https://github.com/bluesky-social/atproto/discussions

I'm reading the AT Proto Handle spec and it seems they already took a position on IDN handles.

Internationalized Domain Names ("IDN", or "punycode") are not directly relevant to the low-level handle syntax. In their encoded form, IDNs are already valid hostnames, and thus valid handles. Such handles must be stored and transmitted in encoded ASCII form. Handles that "look like" IDNs, but do not parse as valid IDNs, are valid handles, just as they are valid hostnames. Applications may, optionally, parse and display IDN handles as Unicode.

@surfdude29 Do you think ATProto needs to standardize the filters apps should use to avoid IDN homograph attacks? Maybe ATProto can standardize what the security goals are. On the other hand, standardizing a specific implementation sounds unwise when you want to be able to react to novel attacks with new code / checks.

Edit: The ATProto discussion is at bluesky-social/atproto#3227

@surfdude29
Copy link
Contributor

@drash-course I didn't know about that part of the ATProto handle spec tbh, so I may well have been over-cautious in suggesting it would be a good idea to discuss it over at the atproto repository as well.

However, I think you have raised some excellent points in your post there, and I hope that others who know a lot more about it than me (including Bluesky team members like @bnewbold) see it and are able to give feedback.

* main: (58 commits)
  Fix tests
  Layout tweaks (bluesky-social#7150)
  Trending (Beta) (bluesky-social#7144)
  Fix emoji picker position (bluesky-social#7146)
  Tweak Follow dialog Search placeholder (bluesky-social#7147)
  New progress guide - 10 follows (bluesky-social#7128)
  Pipe statsig events to logger (bluesky-social#7141)
  Fix notifications borders (bluesky-social#7140)
  Refetch empty feed on focus (bluesky-social#7139)
  Read storage on window.onstorage (bluesky-social#7137)
  [ELI5] Tweak wording on the signup screen (bluesky-social#7136)
  alf error screen (bluesky-social#7135)
  add safe area view to profile error screen (bluesky-social#7134)
  Adjust gates (bluesky-social#7132)
  disable automaticallAdjustsScrollIndicatorInsets (bluesky-social#7131)
  Bump more native deps (bluesky-social#7129)
  Update more Expo packages (bluesky-social#7127)
  feat: widen recent search profile link for mobile devices (bluesky-social#7119)
  Fix video uploads on native (bluesky-social#7126)
  Fix post time localization on Android (bluesky-social#6742)
  ...

# Conflicts:
#	src/view/com/profile/ProfileSubpageHeader.tsx
#	src/view/screens/ProfileList.tsx
@drash-course drash-course marked this pull request as draft December 18, 2024 17:52
@drash-course
Copy link
Author

I'm working on a more standard sanitization algo that follows https://wiki.mozilla.org/IDN_Display_Algorithm, and I'll add tests. I'll update this PR when it's ready, hopefully in a few days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants