Also, when parsing links (a href) on a page, the default port 80 should be removed. If some links include port 80 and
others do not, playback breaks because the URLs cannot be matched.
The same goes for https and port 443.
This is an example of a url_norm value in Solr that still contains port 80:
url_norm:"http://train-aarhus.dk:80/visbillede.asp?fp=brandnewheavies.jpg"
In this case the URL comes from the ARC (not WARC) header:
ARC header:
http://train-aarhus.dk:80/visbillede.asp?fp=brandnewheavies.jpg 194.239.250.54 20001021042018 text/html 1699
HTTP/1.1 200 OK
Server: Microsoft-IIS/4.0
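
A minimal sketch of the normalization requested above, in Python for illustration only. It is not the project's actual normalizer: the function name strip_default_port and the DEFAULT_PORTS table are hypothetical. It drops the port only when it equals the default for the scheme (80 for http, 443 for https) and leaves every other URL untouched.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: only http and https defaults are handled.
DEFAULT_PORTS = {"http": 80, "https": 443}


def strip_default_port(url: str) -> str:
    """Return url without an explicit port when it is the scheme default."""
    parts = urlsplit(url)
    try:
        port = parts.port
    except ValueError:
        # Non-numeric or out-of-range port: leave the URL as it is.
        return url
    if port is None or DEFAULT_PORTS.get(parts.scheme) != port:
        return url
    # The netloc ends with ":<port>" here, so cut at the last colon.
    # This keeps any userinfo and bracketed IPv6 host intact.
    host_part = parts.netloc.rpartition(":")[0]
    return urlunsplit(parts._replace(netloc=host_part))


print(strip_default_port(
    "http://train-aarhus.dk:80/visbillede.asp?fp=brandnewheavies.jpg"
))
```

Applying the same function both to URLs read from ARC headers and to links extracted from a href attributes would give both sides the same form, which is what the lookup during playback needs.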