Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blank spaces being accepted #83

Open
badosu opened this issue May 21, 2019 · 6 comments
Open

Blank spaces being accepted #83

badosu opened this issue May 21, 2019 · 6 comments

Comments

@badosu
Copy link

badosu commented May 21, 2019

After #74 got merged, the following happens:

pry> new_record.url
=> "http://this is not a URL"
pry> new_record.valid?
=> true

It was merged as a fix for #73. The issue complains that a certain url should be valid, but it contains an unescaped non-ASCII character (ç) which, per rfc3986, is invalid.

@minifast-winston
Copy link

Made a pull request (#85) to revert the URL encoding.

@minifast-winston
Copy link

It looks like this gem doesn't properly validate URL paths in some cases. For example, validating http://example.com/some/? doodads=ok against the default validator says the URL is valid, when it should be invalid because of the space.

It seems to be a combination of the presence of both the ? and characters in the path.

@minifast-winston
Copy link

It looks like this is behavior of Ruby's URI.parse(). http://ex ample.com throws an error, but http://example.com/? fun encodes to http://example.com/?%20fun.

It seems that either their is a problem with URI.parse(), or that its authors do not intend for it to validate querystrings - which seems odd.

@minifast-winston
Copy link

It looks like this is part of the implementation of Generic.parse() (https://github.com/ruby/ruby/blob/trunk/lib/uri/generic.rb#L75)

# At first, tries to create a new URI::Generic instance using
# URI::Generic::build. But, if exception URI::InvalidComponentError is raised,
# then it does URI::Escape.escape all URI components and tries again.

@minifast-winston
Copy link

It seems like relying on URI.parse() alone for validation isn't ideal since the authors have decided it should try to encode the URL in event of an exception. This means that invalid strings will pass validation.

@minifast-winston
Copy link

Added another commit in the pull request to handle the spaces in querystring issue. In addition to the usual checks, we now match the raw url value against the URI::regexp pattern for its scheme.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants