Inner text of node? #101
Ah, never mind. But |
@Randshot
    <a href="http://example-com">Link Name</a>
creates the tree
    <a href="http://example-com">
      -text: Link Name
To get the text from it:
    myhtml_collection_t *nodes = myhtml_get_nodes_by_tag_id(tree, NULL, MyHTML_TAG_A, NULL);
    myhtml_node_text(myhtml_node_child(nodes->list[0]), NULL);
Or see the serialization functions (== innerText in JS):
    myhtml_serialization_tree_callback(a_node->child, callback, NULL);
    // or into a buffer
    mycore_string_raw_t str = {0};
    myhtml_serialization_tree_buffer(a_node->child, &str);
Or get all the text nodes at once:
    myhtml_collection_t *text_nodes = myhtml_get_nodes_by_tag_id(tree, NULL, MyHTML_TAG__TEXT, NULL);
    myhtml_node_text(text_nodes->list[0], NULL);
Use
P.S.: Yes, a C++ wrapper is needed. Who would do it?! |
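To illustrate why asking the <a> element itself for its text gives NULL, here is a minimal sketch of the tree shape described above. It does not use myhtml: `toy_node` and `toy_text_content` are hypothetical names made up for this example, and the struct is only an assumed, simplified stand-in for a DOM node. The point is that an element carries no character data of its own; the text lives in a child "-text" node, so it has to be read from the child (or collected from the whole subtree, roughly like textContent).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified DOM node. This is NOT myhtml's real structure. */
typedef struct toy_node {
    const char *tag;        /* element name, or "-text" for a text node */
    const char *text;       /* character data; NULL for element nodes */
    struct toy_node *child; /* first child */
    struct toy_node *next;  /* next sibling */
} toy_node;

/* Copy the text of `node` and all of its descendants into `out`, in document
 * order (textContent-like). Output is truncated to fit `cap` bytes and is
 * always NUL-terminated when cap > 0. Returns the number of bytes written,
 * not counting the terminating NUL. */
size_t toy_text_content(const toy_node *node, char *out, size_t cap)
{
    size_t used = 0;

    if (cap == 0)
        return 0;
    out[0] = '\0';
    if (node == NULL)
        return 0;

    /* A text node contributes its own data. */
    if (node->text != NULL) {
        size_t n = strlen(node->text);
        if (n > cap - 1)
            n = cap - 1;
        memcpy(out, node->text, n);
        used = n;
        out[used] = '\0';
    }

    /* Then append whatever the children contribute, left to right. */
    for (const toy_node *c = node->child; c != NULL; c = c->next)
        used += toy_text_content(c, out + used, cap - used);

    return used;
}
```

With a tree of one `a` element holding one "-text" child, `a.text` is NULL while `toy_text_content(&a, buf, sizeof buf)` fills `buf` with the child's text. That matches the advice above to go through the child node or a serialization call rather than the element itself.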
I have started working on one. |
Thanks! |
@lexborisov For example, I have a |
@Randshot
    node->next(); /* class node... */
    next() {
        node->next; /* get from C structure or myhtml_node_next(node) */
    } |
@Randshot any updates of your wrapper? |
@hbakhtiyor I haven't had any time for it lately. I will update you when I have some progress. |
Hi, |
Hi, |
dump.log |
Works fine:
    myhtml_parse(tree, MyENCODING_UTF_8, res.html, res.size);
    myhtml_collection_t *collection = myhtml_get_nodes_by_tag_id(tree, NULL, MyHTML_TAG_SCRIPT, NULL);
    for (size_t i = 0; i < collection->length; i++) {
        mycore_string_raw_t str = {0};
        /* a <script> element without a child has no text to serialize */
        if (collection->list[i]->child == NULL) {
            printf("Oh, God! This not work, I can't believe this is not working\n");
            exit(1);
        }
        myhtml_serialization_tree_buffer(collection->list[i]->child, &str);
        printf("%s\n", str.data);
        mycore_string_raw_destroy(&str, false);
    } |
and, we have no |
Thanks. Yes, I meant myhtml_node_child(), not get_child_node() (typo). At some point my parse_node() function handles TAG_SCRIPT, and that is where myhtml_node_child(node) returns NULL. |
This is maybe linked to
I just tried the parse_without_whitespace example, and I see that <script> is empty.
I confirm that this is because of MyHTML_TREE_PARSE_FLAGS_SKIP_WHITESPACE_TOKEN. Is a script a whitespace? |
I think there's a bug with the MyHTML_TREE_PARSE_FLAGS_SKIP_WHITESPACE_TOKEN flag. |
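For reference, a small sketch of what a "whitespace token" would mean if the flag is read literally: a text token made up only of HTML ASCII whitespace (tab, line feed, form feed, carriage return, space). This is an assumption based on the flag's name, not myhtml's actual implementation, and `is_whitespace_only` is a hypothetical helper written for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of myhtml.
 * Returns true if buf[0..len) contains only HTML ASCII whitespace:
 * tab, line feed, form feed, carriage return, or space.
 * An empty range counts as whitespace-only. */
bool is_whitespace_only(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        switch (buf[i]) {
        case '\t':
        case '\n':
        case '\f':
        case '\r':
        case ' ':
            break;          /* still whitespace, keep scanning */
        default:
            return false;   /* found a non-whitespace byte */
        }
    }
    return true;
}
```

Under that reading, a script body such as `var x = 1;` is not whitespace-only and would not be expected to be skipped, which is consistent with the suspicion above that the empty <script> is a bug rather than intended behavior.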
    myhtml_collection_t *text = myhtml_get_nodes_by_tag_id_in_scope(tree, NULL, classname_list->list[i]->child, MyHTML_TAG__TEXT, NULL);
    const char *title = myhtml_node_text(text->list[0], NULL); |
If you want a "true" analog of innerText (!= textContent), I have an example written according to the spec: https://github.com/Azq2/perl-html5-dom/blob/f57c11343a3c8ab77a5162083791560de7d6746b/DOM.xs#L282. If you want the simpler textContent, see https://github.com/Azq2/perl-html5-dom/blob/f57c11343a3c8ab77a5162083791560de7d6746b/DOM.xs#L252 |
Hi,
I am trying to get the inner text of a node.
I tried different means to get the 'Link Name' part, but I always get NULL back.