-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interested in collaborating #48
Comments
Hi Jonathan, Examples: Numeric types: Date/time: Errors as strings: statrs error: The failing test: Contributions: Marcus |
I'm glad you found it helpful! To give you a bit more background about myself, I trade on the crypto markets, at relatively "high frequency" (the infrastructure is very young, it's milliseconds that matter still, not nanoseconds). Although I studied programming, I worked as a political reporter after school; my interest in trading came from an experience of using machine learning to forecast the outcomes of congressional votes, which we hoped to sell to hedge funds. (The forecasts were perfect, the sales less so -- it's still hard to trade an informational edge like that, we found). Anyway, that's all to say it's less that I know exactly what I want to use here than I am very excited by the wealth of knowledge the code contains and all that I can learn from it. For instance, I had never encountered the term "bump" before yesterday, which I gather means to update calculations incrementally on receiving market new data. It's also to point out that my reference point of latency-sensitive code that runs 24/7 (no "trading day") and must never crash is pretty different from running a daily analysis - so my performance concerns may be overzealous. I collect lots and lots of market data but I doubt it's the kind needed here. However, I found the functions in One major thing I noticed from my further review of the code was the extensive use of single-use reference count types ( On one previous occasion, I remember facing a similar problem that I solved by enabling the "rc" feature in serde (it's listed in the Cargo.toml. I believe it allows In general, it's far preferable to use a generic Can you explain what problems you faced that led you to these design choices? I have in mind several occasions where I ended up doing similar things simply because it seemed to be the only way to accomplish what I wanted the code to do. I gather that the intended use is to be able to load the underlying data from file, so serialization is important. Was the use of reference counting types primarily for convenience, or necessity (cyclic data structures, etc.)? On a related front, many of the reference counted types use I've forked the library and begun some work around the edges, I plan on submitting various pull requests here and there coming forward. First significant one will be to construct an example based on the monte carlo tests. Thanks for making this code public! The scope of the project is quite amazing and I'm very pleased I found it. |
Hello,
Earlier this evening I randomly came across your post on users.rust-lang.org from several months ago looking for feedback on this library. I wasn't previously aware of the project and was impressed by its breadth and the amount of work that's already gone into it. I run a trading operation on a rust codebase so this is definitely of major interest to me! I'd be interested in collaborating and wanted to send some initial thoughts from a couple hours with the code.
You mentioned in the post you were new to rust, so I included various recommendations from my own experience. Perhaps some of it will be obvious or things you already know - but I thought it couldn't hurt. Like you, I see rust as an incredible tool for this application, owing to the great performance, powerful type system, modern packaging system, etc. But there are definitely still bumps along the way during the learning curve.
In any event, here are various thoughts I jotted down while diving into the project for the first time:
The number one thing I wanted/expected to see are some examples. Examples are integrated into cargo (
cargo run --example pricing
) and I often start there when I'm looking at a project for the first time.On the other hand, there's plenty of tests. (Second step is often
git clone
thencargo test
for me). Perhaps some of the tests include code that would be suitable to use in examples. Do you have any recommendations on which ones to look at?One of the biggest issues I face is needing to use different number types in different situations and performance and ergonomics pitfalls associated with juggling all of them, converting from one to the other, etc. In many cases, correct precision is required and a decimal type is needed. (I have been using the rust wrapper to the decNumber lib for over a year, it seems by my lights to be the best choice at the moment). In many other cases, the performance hit from using decimal is very, very steep compared to the floating point primitives (f64/f32). Rust tends very much towards prudence when it comes to floating point, with a carefully designed api, but it still takes a lot of care to avoid blowups. A third option that sometimes presents itself is to use integer types as fixed-point, for instance when the precision of prices is static. This can be very fast and precise, but it's not always available. I saw that the library generally uses f64, I was wondering if you had further thoughts on how that has worked for you in this application.
I'm intrigued by the date functionality here; it might be worth pursuing it as a stand-alone crate. Currently the chrono crate has a pretty good handle on power/completeness/correctness and it works very well with serde for deserialization. However, the main chrono
DateTime<Utc>
type is somewhat heavyweight: it's ani32
for the date, twou32
s for the time, plus an offset enum (which I believe is zero-sized, however). I frequently use nanosecond timestamps stored in an i64/u64 for performance, and there's no good library that I've seen that focuses less on the kitchen sink, and more on the most performant possible way to store/operate on times.Regarding the
Error
struct defined incore::qm
:(Haven't been able to tell if this is widely used throughout the library or not). Error-handling has been a moving target in rust, with the std
Error
trait falling out of favor and several different crates trying to make error-handling more ergonomic have risen and fallen. Personally, I have come to prefer using Error enums defined on a per-crate or per-mod basis. In this case, it might look something like this:A lot of the time I keep it even simpler with "zero-size" enums with no data associated with each variant, but the variant name alone is enough to act on the error.
The advantages of localized, lightweight error types include:
In rust, matching on or otherwise manipulating/operating on types is very ergonomic and powerful, while matching/operating on strings is relatively painful, owing to the strict ownership semantics and String/str divergence.
If latency is a priority, as it often is in trading, incurring a heap allocation (for the
message
String) just to handle an error is somewhat wasteful.This is especially true because in idiomatic rust, program crashes are very rare and there's little need to gather as much information as possible for piecing together what went wrong after the fact. Your debugging time is more likely to be spent on the front-end getting the program to compile in the first place.
Regarding this comment/question:
The
Normal::new
function from the statrs crate returnsResult<T, StatsError>
. SinceBlack76::new
returnsResult<Self, qm::Error>
, you would only be able to use the question mark operator if aFrom<_>
impl exists to convert aStatsError
into aqm::Error
. You already have a long list of theseFrom<_>
impls incore::qm
, so perhaps you discovered this along the way.Incidentally, the
statrs::StatsError
is a great example of a robust, but lightweight error enum that covers the various types of errors that can come from the library without any allocations.&'static str
(string literals) used in many of the variants are very ergonomic for that purpose.In this case, however, handling the error isn't even necessary. The error can only arise if either argument is NaN or a negative value is passed as the
std_dev
argument, which is clearly is not the case since it's being passed float literals. In my judgement, this is a perfect use of.unwrap()
or.expect("statrs::Normal::new(0.0, 1.0)")
. I typically note why anunwrap
is safe with a quick comment.One test failed on my machine (using nightly-2018-12-02), which seems to be a floating point issue:
thread 'math::brent::tests::problem_of_the_day' panicked at 'result=192.80753261197134 expected=192.80752497643797', src/math/brent.rs:146:9
Hope these initial comments are helpful to you. I'm looking forward to delving further into this. I'd be interested to hear what the current status on the project is, and if you have any ideas for areas where collaboration would be especially useful.
Jonathan
The text was updated successfully, but these errors were encountered: