Cannot grab entire text of <td> node when it includes a <br> tag #193

Open
samueldaniel opened this issue Jul 8, 2024 · 0 comments

samueldaniel commented Jul 8, 2024

I am trying to parse a <table>:

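  // Prints the text of column `col_idx` in row `row_idx`, labeled with `metric_name`.
  void parse_metric(
      myhtml_tree_t* tree,
      myhtml_collection_t* rows,
      int row_idx,
      std::string metric_name,
      int col_idx) {
      myhtml_collection_t* cols = myhtml_get_nodes_by_tag_id_in_scope(
          tree, nullptr, rows->list[row_idx], MyHTML_TAG_TD, nullptr);
      // Valid indices are 0 .. length - 1, so the upper bound must be strict.
      if (cols && cols->list && col_idx >= 0 &&
          static_cast<size_t>(col_idx) < cols->length) {
          // Only the first child of the <td> is read here.
          myhtml_tree_node_t* text_node = myhtml_node_child(cols->list[col_idx]);
          if (text_node) {
              const char* text = myhtml_node_text(text_node, nullptr);
              if (text) {
                  printf("%s: %s\n", metric_name.c_str(), text);
              }
          }
      }
      // Release the collection allocated by the lookup above.
      if (cols) {
          myhtml_collection_destroy(cols);
      }
  }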

There is only one <table> in the whole tree, so I wrote a function that takes the rows of the table, the index of a row, and the index of the column I want from that row.

The <td> in question looks like this: <td >CLEAN TARE COMPLETE <br>( 116.0 mT )</td>

I am expecting the text variable to contain CLEAN TARE COMPLETE <br>( 116.0 mT ) or even just CLEAN TARE COMPLETE ( 116.0 mT ).

But all I'm getting is CLEAN TARE COMPLETE. How can I capture the text after the <br> tag?
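
A possible explanation, offered as a sketch and not a confirmed answer: an HTML parser normally turns this <td> into three children (a text node, a <br> element, and a second text node), so reading only the first child yields just CLEAN TARE COMPLETE. The code below illustrates walking every child and concatenating the text nodes. To keep it self-contained it uses a hypothetical stand-in `Node` struct and not MyHTML types; with MyHTML the same loop would presumably use myhtml_node_child, myhtml_node_next, myhtml_node_tag_id (compared against MyHTML_TAG__TEXT) and myhtml_node_text, which should be checked against the library headers.

```cpp
#include <string>

// Hypothetical stand-in for a parsed DOM node; this is NOT a MyHTML type.
struct Node {
    bool is_text = false;          // true for text nodes, false for elements
    std::string text;              // payload, only meaningful for text nodes
    Node* first_child = nullptr;   // plays the role of myhtml_node_child()
    Node* next_sibling = nullptr;  // plays the role of myhtml_node_next()
};

// Appends the text of every text node below `parent`, in document order.
void collect_text(const Node* parent, std::string& out) {
    for (const Node* n = parent->first_child; n != nullptr; n = n->next_sibling) {
        if (n->is_text) {
            out += n->text;
        } else {
            // Elements such as <br> carry no text themselves; descend in case
            // they contain text (for example <b>...</b>).
            collect_text(n, out);
        }
    }
}

// Builds the <td> from the question by hand: text, <br>, text.
std::string example_cell_text() {
    Node before, br, after, td;
    before.is_text = true;
    before.text = "CLEAN TARE COMPLETE ";
    after.is_text = true;
    after.text = "( 116.0 mT )";
    before.next_sibling = &br;
    br.next_sibling = &after;
    td.first_child = &before;

    std::string out;
    collect_text(&td, out);
    return out;
}
```

Calling example_cell_text() joins both text nodes and skips the <br>. Emitting a newline or space for <br> instead would be a small change in the else branch.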

@samueldaniel samueldaniel changed the title Cannot grab text entire of <td> node when it includes a <br> tag Cannot grab entire text of <td> node when it includes a <br> tag Jul 8, 2024