I’m trying to understand the specific markdown structure used by the Jina Reader API when converting HTML to markdown. For instance, I’ve observed the following mappings:
<h1> tags are mapped to ==========
<h2> tags are mapped to ------
Is this the standard markdown structure followed by the Jina Reader API? Additionally, I’ve noticed that the output can sometimes vary. Is this due to the use of a heuristic method or some other factor?
Thanks!
We use turndown for the HTML-to-Markdown conversion. Whether h1/h2 are rendered as ATX headings (# / ##) or as Setext underlines (=== / ---) is configurable in turndown through its headingStyle option, but we have not customized it and simply follow the default, which is the Setext style you observed.
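To make the difference between the two heading styles concrete, here is a minimal, dependency-free sketch. The `renderHeading` helper is hypothetical and is not turndown's own code; it only mimics how a Setext-style and an ATX-style converter would render the same heading, assuming the Setext underline is as long as the heading text.

```typescript
// Hypothetical helper, NOT part of turndown: it only illustrates the two styles.
// With turndown itself, the style is chosen through the `headingStyle` option,
// e.g. new TurndownService({ headingStyle: "atx" }).
type HeadingStyle = "setext" | "atx";

function renderHeading(text: string, level: number, style: HeadingStyle): string {
  // Setext only defines underline forms for levels 1 and 2;
  // deeper levels fall through to the ATX form.
  if (style === "setext" && level < 3) {
    const underline = (level === 1 ? "=" : "-").repeat(text.length);
    return `${text}\n${underline}`;
  }
  return `${"#".repeat(level)} ${text}`;
}

console.log(renderHeading("Title", 1, "setext")); // "Title" followed by a line of "="
console.log(renderHeading("Title", 1, "atx"));    // "# Title"
```

Both forms are valid Markdown and render to the same heading, so the choice mostly matters if downstream code parses the Markdown text itself.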
The default output sometimes changes because Reader automatically decides whether to apply readability, which performs some smart trimming of the page.
If readability apparently would not work for the page, we fall back to a rule-based approach, known as the markdown format.
If you prefer the markdown format, you can set the request header x-respond-with: markdown or x-return-format: markdown to stabilize the return format.
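As a rough sketch of pinning the format from client code: the snippet below only builds the request and does not send it. The `buildReaderRequest` helper and the `https://r.jina.ai/` URL prefix are assumptions that do not appear in this thread; only the header name and value come from the reply above.

```typescript
// Assumption: Reader is called by prefixing the target URL with this endpoint.
const READER_PREFIX = "https://r.jina.ai/";

interface ReaderRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: builds a request that asks for the rule-based
// markdown format on every call, instead of the automatic default.
function buildReaderRequest(targetUrl: string): ReaderRequest {
  return {
    url: READER_PREFIX + targetUrl,
    headers: { "x-respond-with": "markdown" },
  };
}

const readerRequest = buildReaderRequest("https://example.com");
console.log(readerRequest.url, readerRequest.headers);

// To actually send it (needs network access):
// const res = await fetch(readerRequest.url, { headers: readerRequest.headers });
// console.log(await res.text());
```

Setting the header on every call keeps the output format from depending on whether readability is judged suitable for a given page.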