ArchiveBot job e3pq9nd3o10nud4gctgm2nnz0 for http://www.stevenholcomb.com/ (viewer) failed to recurse. It only grabbed the homepage, robots.txt, sitemap.xml, and two (broken) URLs in the sitemap.
I tested with a simpler command and was able to reproduce this with wpull --recursive --level inf --no-verbose --html-parser libxml2-lxml http://www.stevenholcomb.com/ on one of my pipelines with wpull 2.0.3. When using html5lib instead, it recurses correctly.
With commit ec24bba (PR #393), however, I'm unable to reproduce it on another machine (different Python version, libraries, etc.). So this may already be fixed, but it needs further investigation.
The server's sending UTF-16LE-encoded HTML (without advertising it in a header), which might play a role in this.
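To illustrate why the encoding could matter, here is a minimal Python sketch. It is not wpull's code and says nothing certain about how lxml or html5lib behave internally; it only shows that if UTF-16LE bytes are decoded with a single-byte fallback, the NUL bytes interleaved between characters hide every href from a naive link extractor. The sample markup, the regex, and the helper names (`extract_links`, `looks_like_utf16le`) are all hypothetical, and the absence of a BOM is an assumption.

```python
import re
from typing import List

# Hypothetical sample page; assumed to be served as UTF-16LE without a BOM
# and without a charset in the Content-Type header.
html = '<a href="/about.html">About</a>'
raw = html.encode("utf-16-le")

HREF = re.compile(r'href="([^"]+)"')


def extract_links(body: bytes, encoding: str) -> List[str]:
    """Decode the body with the given encoding and pull out href values."""
    text = body.decode(encoding, errors="replace")
    return HREF.findall(text)


def looks_like_utf16le(body: bytes) -> bool:
    """Rough heuristic: mostly-ASCII text in UTF-16LE has NUL in every odd byte."""
    sample = body[:512]
    if len(sample) < 2:
        return False
    odd_bytes = sample[1::2]
    return odd_bytes.count(0) / len(odd_bytes) > 0.8


# Decoding with a single-byte fallback leaves "h\x00r\x00e\x00f\x00...",
# so the pattern never matches and no links are found.
links_wrong = extract_links(raw, "latin-1")

# Decoding with the real encoding recovers the link.
links_right = extract_links(raw, "utf-16-le")

print(links_wrong, links_right, looks_like_utf16le(raw))
```

If something like this is what happens with the libxml2-lxml parser, it would explain why only the homepage and the sitemap URLs were fetched, but that remains a guess until someone checks the decoded document on the affected pipeline.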