Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: parse deep link correctly #573

Merged
merged 3 commits into from
Sep 11, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 54 additions & 0 deletions __tests__/ExpensiMark-HTML-test.js
Original file line number Diff line number Diff line change
Expand Up @@ -609,6 +609,60 @@ test('Test a url ending with a closing parentheses autolinks correctly', () => {
expect(parser.replace(testString)).toBe(resultString);
});

test('Test urls autolinks correctly', () => {
let testString = '[email protected] https://www.expensify.com\n' +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little hard to read and review, so I was thinking maybe we should add a test for each of these cases, but I think a better idea would be to make this an array with entries for each case, and I think even better would be an array of arrays, where the test string and the expected string are next to each other for easy comparison, readability, review, and updates.

If you agree, can you update accordingly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about dividing it into 3~4 test sections according to test string type, like Expensify/App#16762 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works, but I still arrays are better than string concats, and an array of array such that the test and expected strings are next to each other is better than both

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following implementation is acceptable ?

    const testCases = [
        {
            testString: '[email protected] https://www.expensify.com',
            resultString: '<a href="mailto:[email protected]">[email protected]</a> <a href="https://www.expensify.com" target="_blank" rel="noreferrer noopener">https://www.expensify.com</a>',
        },
        {
            testString: '[email protected]://www.expensify.com',
            resultString: '<a href="mailto:[email protected]">[email protected]</a>-<a href="https://www.expensify.com" target="_blank" rel="noreferrer noopener">https://www.expensify.com</a>',
        },
    ];

    testCases.forEach(testCase => {
        expect(parser.replace(testCase.testString)).toBe(testCase.resultString);
    });

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that looks better than what I had in mind :)

'[email protected]://www.expensify.com\n' +
'[email protected]/https://www.expensify.com\n' +
'[email protected]?https://www.expensify.com\n' +
'[email protected]>https://www.expensify.com\n' +
'https://staging.new.expensify.com/details/[email protected]\n' +
'staging.new.expensify.com/details\n\n' +
'https://www.expensify.com?name=test&[email protected]\n' +
'https://staging.new.expensify.com/[email protected]\n' +
'staging.new.expensify.com/[email protected]\n' +
'http://necolas.github.io/react-native-web/docs/?path=/docs/components-pressable--disabled\n' +
'-https://www.expensify.com /https://www.expensify.com @https://www.expensify.com\n' +
'expensify.com -expensify.com @expensify.com\n' +
'https//www.expensify.com\n' +
'//www.expensify.com?name=test&[email protected]\n' +
'//staging.new.expensify.com/[email protected]\n' +
'/[email protected]\n' +
'?name=test&[email protected]\n\n' +
'example.com/https://www.expensify.com\n' +
'[email protected] staging.new.expensify.com/[email protected]&redirectUrl=https://google.com\n' +
'[email protected] //staging.new.expensify.com/[email protected]&redirectUrl=https://google.com\n' +
'[email protected]://staging.new.expensify.com/[email protected]&redirectUrl=https://google.com\n' +
'[email protected]/https://example.com/[email protected][email protected]\n' +
'[email protected]/[email protected]/https://example.com/[email protected][email protected]';

let resultString = '<a href="mailto:[email protected]">[email protected]</a> <a href="https://www.expensify.com" target="_blank" rel="noreferrer noopener">https://www.expensify.com</a><br />' +
'<a href="mailto:[email protected]">[email protected]</a>-<a href="https://www.expensify.com" target="_blank" rel="noreferrer noopener">https://www.expensify.com</a><br />' +
'<a href="mailto:[email protected]">[email protected]</a>/<a href="https://www.expensify.com" target="_blank" rel="noreferrer noopener">https://www.expensify.com</a><br />' +
'<a href="mailto:[email protected]">[email protected]</a>?<a href="https://www.expensify.com" target="_blank" rel="noreferrer noopener">https://www.expensify.com</a><br />' +
'<a href="mailto:[email protected]">[email protected]</a>&gt;<a href="https://www.expensify.com" target="_blank" rel="noreferrer noopener">https://www.expensify.com</a><br />' +
'<a href="https://staging.new.expensify.com/details/[email protected]" target="_blank" rel="noreferrer noopener">https://staging.new.expensify.com/details/[email protected]</a><br />' +
'<a href="https://staging.new.expensify.com/details" target="_blank" rel="noreferrer noopener">staging.new.expensify.com/details</a><br /><br />'+
'<a href="https://www.expensify.com?name=test&amp;[email protected]" target="_blank" rel="noreferrer noopener">https://www.expensify.com?name=test&amp;[email protected]</a><br />' +
'<a href="https://staging.new.expensify.com/[email protected]" target="_blank" rel="noreferrer noopener">https://staging.new.expensify.com/[email protected]</a><br />' +
'<a href="https://staging.new.expensify.com/[email protected]" target="_blank" rel="noreferrer noopener">staging.new.expensify.com/[email protected]</a><br />' +
'<a href="http://necolas.github.io/react-native-web/docs/?path=/docs/components-pressable--disabled" target="_blank" rel="noreferrer noopener">http://necolas.github.io/react-native-web/docs/?path=/docs/components-pressable--disabled</a><br />' +
'-<a href="https://www.expensify.com" target="_blank" rel="noreferrer noopener">https://www.expensify.com</a> /<a href="https://www.expensify.com" target="_blank" rel="noreferrer noopener">https://www.expensify.com</a> @https://www.expensify.com<br />' +
'<a href="https://expensify.com" target="_blank" rel="noreferrer noopener">expensify.com</a> -<a href="https://expensify.com" target="_blank" rel="noreferrer noopener">expensify.com</a> @expensify.com<br />' +
'https//<a href="https://www.expensify.com" target="_blank" rel="noreferrer noopener">www.expensify.com</a><br />' +
'//<a href="https://www.expensify.com?name=test&amp;[email protected]" target="_blank" rel="noreferrer noopener">www.expensify.com?name=test&amp;[email protected]</a><br />' +
'//<a href="https://staging.new.expensify.com/[email protected]" target="_blank" rel="noreferrer noopener">staging.new.expensify.com/[email protected]</a><br />' +
'/details?login=<a href="mailto:[email protected]">[email protected]</a><br />' +
'?name=test&amp;email=<a href="mailto:[email protected]">[email protected]</a><br /><br />' +
'<a href="https://example.com/https://www.expensify.com" target="_blank" rel="noreferrer noopener">example.com/https://www.expensify.com</a><br />' +
'<a href="mailto:[email protected]">[email protected]</a> <a href="https://staging.new.expensify.com/[email protected]&amp;redirectUrl=https://google.com" target="_blank" rel="noreferrer noopener">staging.new.expensify.com/[email protected]&amp;redirectUrl=https://google.com</a><br />' +
'<a href="mailto:[email protected]">[email protected]</a> //<a href="https://staging.new.expensify.com/[email protected]&amp;redirectUrl=https://google.com" target="_blank" rel="noreferrer noopener">staging.new.expensify.com/[email protected]&amp;redirectUrl=https://google.com</a><br />' +
'<a href="mailto:[email protected]">[email protected]</a>-<a href="https://staging.new.expensify.com/[email protected]&amp;redirectUrl=https://google.com" target="_blank" rel="noreferrer noopener">https://staging.new.expensify.com/[email protected]&amp;redirectUrl=https://google.com</a><br />' +
'<a href="mailto:[email protected]\">[email protected]</a>/<a href="https://example.com/[email protected][email protected]" target="_blank" rel="noreferrer noopener">https://example.com/[email protected][email protected]</a><br />' +
'<a href="mailto:[email protected]">[email protected]</a>/<a href="mailto:[email protected]">[email protected]</a>/<a href="https://example.com/[email protected][email protected]" target="_blank" rel="noreferrer noopener">https://example.com/[email protected][email protected]</a>';

expect(parser.replace(testString)).toBe(resultString);
});

test('Test markdown style email link with various styles', () => {
const testString = 'Go to ~[Expensify]([email protected])~ '
+ '_[Expensify]([email protected])_ '
Expand Down
27 changes: 20 additions & 7 deletions lib/ExpensiMark.js
Original file line number Diff line number Diff line change
Expand Up @@ -446,12 +446,6 @@ export default class ExpensiMark {
let startIndex = 0;

while (match !== null) {
// we want to avoid matching email address domains
let abort = false;
if ((match.index !== 0) && (textToCheck[match.index - 1] === '@')) {
abort = true;
}

// we want to avoid matching ending ) unless it is a closing parenthesis for the URL
if (textToCheck[(match.index + match[2].length) - 1] === ')' && !match[2].includes('(')) {
match[0] = match[0].substr(0, match[0].length - 1);
Expand All @@ -476,9 +470,28 @@ export default class ExpensiMark {
}
replacedText = replacedText.concat(textToCheck.substr(startIndex, (match.index - startIndex)));

// we want to avoid matching email address domains
Copy link
Contributor

@cead22 cead22 Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// we want to avoid matching email address domains
// We want to avoid matching domains in email addresses so we don't render them as URLs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's from old PR, I just moved it to other line. but new comment is more readable.

let abort = false;
let shouldRetryByAtSign = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "by at sign" mean? are we doing another try by removing the @ sign, by adding the @ sign, or by testing against a new regex with/out @ sign?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also feeling that variable name is not reasonable.

I think isReparsedByAutolink is more readable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you think about new one?


if ((match.index !== 0) && (textToCheck[match.index - 1] === '@')) {
const domainRegex = new RegExp('^(([a-z-0-9]+\\.)+[a-z]{2,})(\\S*)', 'i');
const domainMatch = domainRegex.exec(match[2]);

// e.g. [email protected]/https://www.test.com
// If the matched string faces @ sign before it,
// We will retry to apply autolink rule to the string(e.g. /https://www.test.com) except for domain(e.g. expensify.com) after @ sign.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update this comment to make it clearer?

  • What is the e.g. an example of? shouldn't that be after another comment?
  • What does "faces @ sign before it"? I suggest "If the matches string has a leading @ sign" -- use "leading" or "trailing" depending on the case
  • It should say "why"

Copy link
Contributor Author

@Antasel Antasel Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the e.g. an example of? shouldn't that be after another comment?

yes, it's example of comment. It should be after comment normally, but I tended to point '/https://www.test.com' and '[email protected]' in comment details

What does "faces @ sign before it"? I suggest "If the matches string has a leading @ sign" -- use "leading" or "trailing" depending on the case

Your suggestion is good, I will change comment as you recommended. "if the matched string has a leading @ sign"

It should say "why"

By a Old PR, if matched string has leading @ sign, It simply aborts to apply autolink rule. But we have edge cases like [email protected]/https://www.test.com, where https://www.test.com should be parsed as url.
To explain above reason, I added a example of edge test string.

As a result, I will change comment as following:

// If the matched Url has a leading @ sign, The Url's domain can be domain of email address.
// At this case, The matched string should not be parsed as Url, 
 // but it can contain another Url which should be parsed. (e.g. [email protected]/https://www.test.com)
// So We will retry to apply autolink rule to the string(e.g. /https://www.test.com) except for domain(e.g. expensify.com) after @ sign.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Friendly reminder, your thoughts ?

if ((domainMatch !== null) && (domainMatch[3] !== '')) {
shouldRetryByAtSign = true;
replacedText = replacedText.concat(domainMatch[1] + this.replace(domainMatch[3], {filterRules: ['autolink']}));
} else {
abort = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to add a comment explaining why we're setting abort to true here? It's not immediately obvious to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

This comment is to say why we are setting abort to true

}
}

if (abort || match[1].includes('<pre>')) {
replacedText = replacedText.concat(textToCheck.substr(match.index, (match[0].length)));
} else {
} else if (!shouldRetryByAtSign) {
const urlRegex = new RegExp(`^${LOOSE_URL_REGEX}$|^${URL_REGEX}$`, 'i');

// `match[1]` contains the text inside the [] of the markdown e.g. [example](https://example.com)
Expand Down
Loading