Everything after a # in a URL is the fragment. It is never sent to the server; the web browser handles it on its own (normally to scroll to a certain position on the page). Hence a harvester can only harvest a URL with the fragment stripped. That is why Pywb strips it and shows what it found in the index for the URL without the fragment.
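To illustrate, here is a minimal sketch using only Python's standard `urllib.parse` (it does not touch Pywb's own code). It splits the URL from this report into the part a server can see and the fragment that stays in the browser:

```python
from urllib.parse import urldefrag, urlsplit

requested = (
    "https://www.arcgis.com/apps/opsdashboard/index.html"
    "#/f94c3c90da5b4e9f9a0b19484dd4bb14"
)

# urldefrag separates the URL before '#' from the fragment after it.
url_without_fragment, fragment = urldefrag(requested)

# An HTTP request line carries only the path and (if present) the query,
# so the fragment never reaches the server or a harvester's index.
parts = urlsplit(requested)
request_target = parts.path + ("?" + parts.query if parts.query else "")

print(url_without_fragment)
print(fragment)
print(request_target)
```

The first printed value is the only form of the URL that an index lookup can be keyed on.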
But maybe Pywb could put the fragment back into the links, so that the browser acts on it when the capture loads. Or maybe that would be confusing.
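A rough sketch of that idea, assuming a hypothetical helper `replay_link` (this is not an existing Pywb function) and reusing the collection prefix and timestamp from this report: look up the capture without the fragment, then append the requested fragment to the generated link.

```python
from urllib.parse import urldefrag


def replay_link(prefix: str, timestamp: str, requested_url: str) -> str:
    """Hypothetical helper: build a replay link that keeps the requested fragment."""
    # The index only knows the URL without its fragment.
    base, fragment = urldefrag(requested_url)
    link = f"{prefix}{timestamp}/{base}"
    # Re-attach the fragment so the browser can act on it client-side.
    return f"{link}#{fragment}" if fragment else link


link = replay_link(
    "https://webarchive.nationalarchives.gov.uk/ukgwa/",
    "20200402132156",
    "https://www.arcgis.com/apps/opsdashboard/index.html"
    "#/f94c3c90da5b4e9f9a0b19484dd4bb14",
)
print(link)
```

Whether the replayed page then does anything useful with the fragment depends on the archived JavaScript, so this would only change the links, not guarantee the view.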
Describe the bug
PYWB seems to be stripping out part of the URL when a timeline page is requested. For example:
https://webarchive.nationalarchives.gov.uk/*/https://www.arcgis.com/apps/opsdashboard/index.html#/f94c3c90da5b4e9f9a0b19484dd4bb14
loads a timeline for
https://www.arcgis.com/apps/opsdashboard/index.html
Each instance shown is for index.html and not index.html#/f94c3c90da5b4e9f9a0b19484dd4bb14
Steps to reproduce the bug
https://webarchive.nationalarchives.gov.uk/*/https://www.arcgis.com/apps/opsdashboard/index.html#/f94c3c90da5b4e9f9a0b19484dd4bb14
https://webarchive.nationalarchives.gov.uk/ukgwa/20200402132156/https://www.arcgis.com/apps/opsdashboard/index.html starts to load. Note that the string after the # symbol has been stripped out.
The page does not load but redirects to this URL: https://webarchive.nationalarchives.gov.uk/ukgwa/20200328185042/https://www.arcgis.com/sharing/rest/oauth2/authorize?client_id=opsdashboard&display=default&response_type=token&expiration=20160&redirect_uri=https%3A%2F%2Fwww.arcgis.com%2Fapps%2Fopsdashboard%2FpostSignIn.html&locale=en-gb&state=%7B%22redirect%22%3A%22https%3A%2F%2Fwww.arcgis.com%2Fapps%2Fopsdashboard%2Findex.html%22%2C%22portalUrl%22%3A%22https%3A%2F%2Fwww.arcgis.com%2Fsharing%2Frest%2F%22%7D which displays as a blank page.
Expected behavior
I'd expect the timeline page to show the timeline for the URL as requested and allow visitors to view the capture history for this specific URL, rather than stripping out the final part of the URL.
Screenshots
Environment
• OS: Linux
• Browser: Any
• Version: PYWB 2.7