-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How do we get this into Boost? #111
Comments
I would like to refactor the code to provide a more consistent set of interfaces, including a low-level interface like what you describe. It would also be nice to have smaller lookup tables and to reduce the overlap between the existing tables, but I'm not sure how much of that will be possible. Finally, I want to add string2float and string2double parsers based on similar principles - I have a design in my head but haven't had the time to implement it. Personally, I think header-only libraries are a mistake. I understand why they're attractive, but they seem undesirable from a build performance perspective. That said, if there's a way I can make it easier for you and @StephanTLavavej to construct header-only libraries in a way that allows you to easily pull updates from here, that works for me. It's more important to make it easy for you to use the code than to insist on such principles (but I had to say it anyway :-p). |
You have to read my weasel-words carefully: "...with the option of separate compilation" On paper header-only is bad I agree but when it comes to obtaining the most valuable currency in the universe (i.e. GitHub stars) a header-only library presents a lower barrier to entry for newbies especially if they are not experts at configuring their build system. For seasoned users or when the newbies manage to create a large program which starts to take too long to compile, they can define a macro such as Public-facing header file declares the necessary complete types: Functions use the macro Then at the bottom of the file we conditionally include the function definitions: Users who want the separate compilation simply define It really is all about choice, this brings the best of both worlds while sacrificing nothing (well, a little bit of extra effort to organize the headears in a particular way). |
The necessary changes for My derived implementation is header-only, despite the fact that we ship separately compiled DLLs/LIBs, partly because adding separately compiled code is surprisingly onerous for us (due to binary compatibility concerns; we can do it as we did for Boost.Math and Filesystem, but it's a big deal), but mostly so that the large static tables are an opt-in cost. The compiler throughput implications of the roughly half-megabyte of source code and static tables are significant, but my feeling is that this can be easily avoided by dragging Currently, the code that I've directly derived from Ryu (shipping in Note that my implementation has one potentially significant limitation - it handles only 32-bit float and 64-bit double, as MSVC doesn't have a wider
Due to how I'm rebasing massively invasive changes on top of your I guess I would start by asking: Vinnie, could we simplify this problem by simply layering Boost on top of There is also the possibly-C++20 fmt library coming, which will also be layered on top of charconv (somewhat invasively, AFAICT). Is there a reason for Boost to duplicate that work? |
I do not understand what is being asked. Are you suggesting that Boost should build an implementation on top of The purpose of duplicating the work in Boost is to make it universally available (all compilers and stdlibs that Boost targets). |
Yeah. With to_chars possibly being provided as a "bolt-on" standalone implementation. I'd like to know how many users are stuck on old toolsets or old Standard modes and can't easily upgrade, yet have access to the latest Boost. In MSVC we've done a lot of work to address the former (maintaining bincompat in the 2015/2017/2019 release series and updating far more frequently, so users are continuously on the latest release). |
At least one user, me. Beast only requires C++11. I need all the GitHub stars I can get, and requiring C++17 is for the most part a death sentence with respect to obtaining a large user-base, unless your library is so incredibly useful that people will jump through any hoop in order to use it (I'm thinking TensorFlow here). Most libraries are not like that so we have to support C++11 which is still quite popular according to this developer survey: and this developer survey: I will probably fork Beast to require a newer C++ some time in 2021 when the toolchains have settled down. 10 years is enough. |
Hi all, I'm in the same need as Vinnie. I just probably need it more. ( I'm working on a formatting library that I hope to propose to boost this year [1] ). Just a few days ago I started to study how that could be achieved, and then I saw this conversation. Afaics, a header-only version of ryu would not be feasible in C. The main reason is that it would be necessary to define all functions and all global variables as static, implying a copy in each translation unit during the build process. And that would be unacceptable given the size of those functions and especially of those lookup tables ryu has. In C++ that would be easily solved by defining those tables as static arrays inside inline functions. The linker would then ensure that there would be only one definition of each of them in the program. In, C however, inline functions don't work that way. [2] The other issue is that a header only C library would expose all names into the global scope. So this is how I was considering to do is to enable the current code to work in two modes: as a C compiled library and as a C++ header-only library. This implies adding some more I think the C++ header-only api should:
This seems possible. If no one is interested in doing that, I will give a try. Btw, I have a version of the [1] https://github.com/robhz786/stringify |
Some thoughts on above. I think, for the C++ version to be palatable it needs to support "separate compilation" as I described in my earlier reply: (#111 (comment)). This way the user has a choice between the convenience of header only, and the performance of separate compilation. The main challenge of a C++ port will be to keep the C and C++ versions in sync, as it is clear that there is considerable active development going on and no promise of code stability. For the tables, one way is to make a header file that contains only the table values and then it can be included inside a function:
With respect to the |
Perhaps the C++ non-header-only version could be just a thin wrapper over the C api. For example,
|
It would be weird to have symbols starting with BOOST_ checked in here, and @StephanTLavavej might be unhappy about that, too. That said, I could be convinced to maintain a list of symbols and a customizable renaming script if that helps. Using the same header for C and C++ and having it expose different APIs depending on context seems ... not ideal to me. I'd rather have separate headers if that's not too much of a problem. |
Agreed - the name I used was expository only. It could just as easy be prefixed with |
I take advantage of C++17 by leaving the tables at file scope and marking them as
It's perfectly fine for C++ code to include
I'm achieving this by continual rebasing, although that probably isn't something that other people want to do, and unfortunately the result can't be ingested by non-Standard-Library-implementers (as all names are ugly, I assume C++17, etc.). What are the other language ports doing - effectively perma-forking and not picking up upstream improvements?
Kinda - it would be weird but it wouldn't be very problematic. I mechanically rename everything to prepend double underscores, so treating names specially is a minor complication, but not major. The only names that I currently special-case are umul128 and shiftright128 since adding double underscores creates collisions and near-collisions with the intrinsic names. If BOOST names appeared, I would either rename it away (likely), or let the addition of double underscores prevent any conflicts. Renames to make the macro constants more collision-resistant by adding more RYU etc. would be welcome. (I additionally change them to be |
Hi all, created pull request to enable header-only mode. |
I have made tremendous progress on a port of ryu to C++, which can be either compiled as a separate static or dynamic library (default) or as header-only (by setting a macro). |
Cool! Do you have that in a place that we could look at? Also, are you planning to upstream any of your changes? |
Yes: https://github.com/vinniefalco/json/tree/develop/include/boost/json/detail/ryu This is in detail/ because it is not a public API. I could have done it with fewer diffs I think (e.g. keep using stdint.h instead of cstdint). Unfortunately I can't submit these changes to the upstream. Header-only C++ is fundamentally incompatible with C. |
Here's an example of the difference from d2fixed_full_table.h. Here's an original table declaration:
Here is the C++ declaration
In theory this could be accomplished with a macro although in C++ the table is accessed by calling a function, e.g.
Could this work? |
Yes, declaring macros for table definition and access sounds reasonable. |
We were discussing how to get this into Boost and an obvious solution came up, ask @StephanTLavavej ! We could use something header-only with the option of separate compilation. I was working on a port but I stopped because the changes to support this use case would be extensive and this project looks like it is still in development and might see considerably more commits that I would have to port to my C++-mangled header-only solution.
The routines which I ultimately need are
And of course an implementation of C++17's
to_chars
(so that users of older C++ versions have access to it).There are a number of Boost libraries that would benefit from this code such as
lexical_cast
and a few others.The text was updated successfully, but these errors were encountered: