Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support jamie oliver scraping (and fix the entire project) #44

Merged
merged 49 commits into from
May 8, 2021

Conversation

shaniqwa
Copy link
Contributor

@shaniqwa shaniqwa commented Apr 28, 2021

Hi there,
very cool project. I wanted to explore scraping and adding a new website, did Jamie Oliver as suggested here: ​#32

but when I opened a PR I have found out that many of the tests fail... for a good reason - there have been quite a few changes out there. Hopefully, things will work better now..!

In this PR:

  • update puppeteer to the latest version
  • added recipe description field
  • Done lot's of scraper fixing which fixed all tests
  • Added support for Jamie Oliver website

For some reason, I'm still getting 1 fail test on Travis. it failed to fetchDOMModel: response status 503

 177 passing (3m)
  1 failing
  1) closetCooking
       should fetch the expected recipe:
     Error: No recipe found on page
     at ClosetCookingScraper.defaultError (helpers/BaseScraper.js:7:164)
      at ClosetCookingScraper.fetchDOMModel (helpers/PuppeteerScraper.js:9:1619)
      at processTicksAndRejections (node:internal/process/task_queues:96:5)
      at async ClosetCookingScraper.fetchRecipe (helpers/BaseScraper.js:19:169)
      at async Context.<anonymous> (test/helpers/commonRecipeTest.js:15:26)

but on my local machine all test pass: 178 passing (4m)

would appreciate help getting this last one to pass as well :)
cheers

UPDATE: I have added a validation, skipping the test in case the server is not responding. let me know what you think

@shaniqwa shaniqwa changed the title support jamie oliver scraping support jamie oliver scraping (and fix the entire project) Apr 28, 2021
@jadkins89
Copy link
Owner

Hey there! Thanks for all this work in getting the package up to date, adding a new prop, and scrape. I'll take a look at it ASAP, hopefully by this weekend. I see you're still debugging, so let me know if you make any headway.

@shaniqwa
Copy link
Contributor Author

@jadkins89 Hi 🙋‍♀️ I still don't know why the request to closetCooking is getting 503 when coming from the Travis server, but I have added some workaround - to pass tests if the server is not responding. (should be skipped properly and not passed... I can continue working on it if you think this is a good direction).

@jadkins89
Copy link
Owner

@shaniqwa That solution will work for now. Certain websites seem to have occasional issues with being scraped from TravisCI. We may get some false positives but I don't know if there is a perfect solution here.

One thing I'd like removed before merging is the dependency to moment. It is only being used in one scrape, so the added bloat doesn't seem worth it.

Thank you so much your time and effort bringing the scrapes back up to par!

@jadkins89 jadkins89 self-requested a review May 2, 2021 19:48
@shaniqwa
Copy link
Contributor Author

shaniqwa commented May 3, 2021

@jadkins89 done :)

@jadkins89 jadkins89 closed this May 8, 2021
@jadkins89 jadkins89 reopened this May 8, 2021
@jadkins89 jadkins89 merged commit 07ebd24 into jadkins89:master May 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants