Hi @ardhrubo and thanks for your interest in Crawlee. Could you send a code snippet that illustrates the problem you're facing? I have a hard time understanding how |
-
Hi, I am working with Crawlee and want to retrieve data from an external API without launching a browser.
I want to send a request to the external source, fetch the data, and then process it the way the Cheerio crawler would. With Crawlee’s BasicCrawler I can fetch the data, but I am running into an issue with the built-in URL enqueuing strategy.
BasicCrawler does not seem to offer the default enqueuing strategy that the Cheerio and Puppeteer crawlers have, where only links on the same hostname or its subdomains are followed.
This causes a problem when I process fetched URLs, because some belong to the same hostname and others to different subdomains.
Is there a way to make BasicCrawler crawl only the same hostname or subdomains, similar to the behavior of the Cheerio and Puppeteer crawlers?
Any advice or a workaround for this limitation would be very helpful.
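To make the behavior I am after more concrete, here is a minimal sketch of the filtering I would otherwise have to do by hand. It does not use any Crawlee API: `filterUrls`, `baseDomain`, and the `Strategy` labels are hypothetical names I made up for illustration, loosely modeled on the same-hostname and same-domain strategies of the other crawlers. The domain check is deliberately naive and is an assumption, not how Crawlee resolves domains.

```typescript
// Hypothetical helper, not part of Crawlee. Names and strategy labels are assumptions.
type Strategy = 'same-hostname' | 'same-domain';

// Naive assumption: the last two labels form the base domain.
// This is wrong for suffixes such as "co.uk"; a public-suffix list would be needed there.
function baseDomain(hostname: string): string {
    return hostname.split('.').slice(-2).join('.');
}

// Keep only the candidate URLs that match the base URL under the chosen strategy.
function filterUrls(candidates: string[], baseUrl: string, strategy: Strategy): string[] {
    const base = new URL(baseUrl);
    const kept: string[] = [];
    for (const candidate of candidates) {
        let url: URL;
        try {
            // Relative links are resolved against the base URL.
            url = new URL(candidate, base);
        } catch {
            continue; // Skip strings that cannot be parsed as URLs.
        }
        // Ignore non-HTTP schemes such as mailto: or javascript:.
        if (url.protocol !== 'http:' && url.protocol !== 'https:') continue;
        const matches = strategy === 'same-hostname'
            ? url.hostname === base.hostname
            : baseDomain(url.hostname) === baseDomain(base.hostname);
        if (matches) kept.push(url.href);
    }
    return kept;
}
```

As far as I understand, the filtered list could then be passed to `crawler.addRequests(...)` from inside the BasicCrawler `requestHandler`, but I would prefer a built-in option if one exists.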