Integrating with selectorgadget #15

Open
Jagdeep1 opened this issue Mar 24, 2012 · 2 comments
@Jagdeep1

Hi,

I am trying to integrate SelectorGadget with the Scraper code. Could you give me some pointers on where to start? If the integration works out, I will give the user an option to select the XPath generated by SelectorGadget, because it builds the XPath using CSS elements as well.

Please help me out.

Regards
Jagdeep

@mnmldave
Owner

Cool, class predicates for xpath would be great. Just note that I'm really busy right now, so I won't be this responsive in the future and it will be a while until I can properly examine a pull request, especially if it uses any third-party code, as I will at least need to check the license.

Anyway, the automatic xpath is generated by this xpath() jQuery function. This is probably where you will want to start. If you keep all your submitted code within the anonymous function starting on line 145, then that would be perfect.

Important: To keep things working more or less like they do now, you must guarantee that every ancestor predicate selects only one node. For instance, consider this HTML:

<html>
  <body>
    <ul class="list">
      <li>...</li>
      <li>...</li>
    </ul>
    <ul class="list">
      <li>...</li>
      <li>...</li>
    </ul>
  </body>
</html>

If the user scraped <li> elements from the first list, Scraper would create an xpath such as //ul[1]/li (xpath positions start at 1), which correctly selects list items from the first <ul> only. If we selected //ul[@class="list"]/li instead, we'd select list items from both lists, which is not what the user indicated. So please make sure each segment of the xpath (except the last) selects only one element from the DOM.
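
To make the constraint concrete, here is a rough sketch of one way to pick a predicate for each ancestor. It is not Scraper's actual xpath() implementation: the function names (siblingsWithTag, ancestorSegment, uniqueAncestorXPath) are made up for illustration, and it assumes className is a plain string. It only uses a class predicate when that class is unique among same-tag siblings, and otherwise falls back to the 1-based position.

```javascript
// Hypothetical helpers, not part of Scraper. They work on DOM elements or on
// any objects exposing tagName, className, parentNode and children.

// All siblings of `node` (including itself) that share its tag name.
function siblingsWithTag(node) {
  var parent = node.parentNode;
  if (!parent || !parent.children) return [node];
  return Array.prototype.filter.call(parent.children, function (c) {
    return c.tagName === node.tagName;
  });
}

// Build one ancestor segment that matches exactly one child of its parent.
function ancestorSegment(node) {
  var tag = node.tagName.toLowerCase();
  var same = siblingsWithTag(node);
  if (same.length === 1) return tag; // already unique, no predicate needed
  var cls = node.className;
  if (cls) {
    var sharing = same.filter(function (s) { return s.className === cls; });
    // Assumes the class value contains no double quotes.
    if (sharing.length === 1) return tag + '[@class="' + cls + '"]';
  }
  // Class is missing or ambiguous: fall back to the position (1-based).
  return tag + '[' + (same.indexOf(node) + 1) + ']';
}

// Absolute path to `el`; only the last segment is allowed to match many nodes.
function uniqueAncestorXPath(el) {
  var segments = [el.tagName.toLowerCase()];
  for (var n = el.parentNode; n && n.tagName; n = n.parentNode) {
    segments.unshift(ancestorSegment(n));
  }
  return '/' + segments.join('/');
}
```

For the HTML above, calling uniqueAncestorXPath on an <li> in the first list falls back to the index for the <ul>, because both lists share class "list". A <ul> whose class is unique among its siblings would get the class predicate instead.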

I don't think an option needs to be presented to the user for this as it would further complicate the UI and class predicates are much more intuitive than the index.

I wrote scraper very hastily and did not write tests, but please place tests for this in https://github.com/mnmldave/scraper/blob/master/src/test/spec/jquery-xpath.spec.js for now. You can run these tests by pointing your browser at chrome-extension://the-id-of-your-extension/test/SpecRunner.html.

Thanks,
dh.

@Jagdeep1
Author

Thanks a lot. I'm working on this and will update you soon.
