Add detection of main under more circumstances #863
It's really nice to have a detailed explanation of the regex change. The small caveat in your PR,
While it's indeed uncommon to write I wonder if it's possible to solve the problem of I just did a small benchmark with the following code,
const SINGLELINE_COMMENTS_RE = /\/\/.*$/gm;
const BLOCK_COMMENTS_RE = /\/\*.*?\*\//gs; // lazy, so separate block comments are not merged into one match
const url = "https://raw.githubusercontent.com/rust-lang/rust/master/library/alloc/src/collections/btree/map.rs";
fetch(url).then(res => res.text())
.then(text => {
const code = text;
console.log("length of code: ", code.length);
const startTime = performance.now();
const code_without_comments = code.replaceAll(SINGLELINE_COMMENTS_RE, "").replaceAll(BLOCK_COMMENTS_RE, "");
const endTime = performance.now();
console.log("It takes ", (endTime - startTime) / 1000, "seconds");
console.log("trimmed code length: ", code_without_comments.length);
});
and got this result (over repeated runs it usually takes less than 1 millisecond, which is barely perceptible to the user),
Without comments, detection of
const HAS_MAIN_FUNCTION_RE = /^(.*;)?\s*(pub\s+)?\s*(const\s+)?\s*(async\s+)?\s*fn\s+main\s*\(\s*\)/m;
We'd better just add a notice when
Current warning doesn't guide user to change mode to
BTW, you can add the regex tests to this file. |
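A minimal sketch of the strip-then-detect idea from the comment above, for anyone who wants to try it locally. The helper name `hasMainAfterStrippingComments` is hypothetical, and the block-comment pattern is written lazily here so that two separate comments are not merged into one match. Neither pattern understands string literals or nested block comments, so this is an approximation, not what the playground ships.

```javascript
// Remove `//` comments up to the end of each line.
const SINGLELINE_COMMENTS_RE = /\/\/.*$/gm;
// Remove `/* ... */` comments; lazy so each comment is matched on its own.
const BLOCK_COMMENTS_RE = /\/\*.*?\*\//gs;
// With comments gone, allow same-line code ending in `;` before `fn main()`.
const HAS_MAIN_FUNCTION_RE =
  /^(.*;)?\s*(pub\s+)?\s*(const\s+)?\s*(async\s+)?\s*fn\s+main\s*\(\s*\)/m;

// Hypothetical helper: strip comments first, then run the simpler detection regex.
function hasMainAfterStrippingComments(code) {
  const stripped = code
    .replaceAll(SINGLELINE_COMMENTS_RE, "")
    .replaceAll(BLOCK_COMMENTS_RE, "");
  return HAS_MAIN_FUNCTION_RE.test(stripped);
}
```

One known limitation of this ordering: a `//` inside a string literal or inside a block comment (for example a URL) is treated as the start of a line comment, so the rest of that line is dropped.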
Thank you for the response. I'm sorry but I'm not quite sure what action is being requested. |
Sorry for the long message. I just realized that detection is done in real time. My overall suggestion is that we add a notice guiding the user to change the execution mode manually when main() detection fails, and that any change to the regex be accompanied by tests. |
Thanks for explaining. I have suggested a change to the notice here, but I'm not sure where the tests should be added. If you can point me to them, I'd be happy to add them, as they are already documented in the comments here. |
It's certainly possible, but the cost of such a fix is unclear to me. The best answer I can think of would involve running the Rust parser itself (or something equivalent) and having it report if a function named

The current architecture of the playground doesn't have a great way of doing this. Right now, it would involve sending the code to the backend, running something, then getting a response. That's far too slow for something that should roughly correspond to every keystroke.

Another technique would be to compile something into Wasm and execute that in the browser. It would allow us to reuse some authoritative Rust parser without a network connection. However, it would bring in a large new dependency which might not work for all of our browsers. I don't think either of these directions is appropriate right now, but they are worth considering!
I'm not sure exactly which editor you are referring to here, but if you mean Ace / Monaco, this is another interesting idea I hadn't considered. Those libraries already have some established JS for parsing Rust code (with differing quality). If we could piggyback off of the work that they've already done, we could get the result "for free". However, it's not clear how that would work for the simple editor, or if Ace/Monaco have appropriate APIs to query what we want.
This seems like the best compromise at this point in time. A quick search found Regex to match a C-style multiline comment and there are a few JS libraries out there that claim to remove comments (although many have lots of open issues). However, a purpose-built JS (or just a file here) that removes Rust comments may be more appropriate. A benefit to this approach is that it would also apply to cases beyond just
/*
#[test]
*/
fn x() {}
|
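As a rough illustration of what a purpose-built comment remover could look like, here is a single-pass scanner. The name `stripRustComments` is hypothetical and nothing like it is confirmed to exist in the playground. It assumes only `//` comments, nested `/* */` comments, and ordinary `"..."` string literals with backslash escapes; raw strings (`r#"..."#`) and char literals are not handled in this sketch.

```javascript
// Hypothetical helper: strips Rust comments in one pass over the text.
function stripRustComments(code) {
  let out = "";
  let i = 0;
  while (i < code.length) {
    const two = code.slice(i, i + 2);
    if (two === "//") {
      // Line comment: skip up to (but keep) the newline.
      while (i < code.length && code[i] !== "\n") i++;
    } else if (two === "/*") {
      // Block comment: Rust allows nesting, so track the depth.
      let depth = 1;
      i += 2;
      while (i < code.length && depth > 0) {
        const pair = code.slice(i, i + 2);
        if (pair === "/*") { depth++; i += 2; }
        else if (pair === "*/") { depth--; i += 2; }
        else i++;
      }
      out += " "; // keep the tokens on either side separated
    } else if (code[i] === '"') {
      // String literal: copy verbatim so a `//` inside it survives.
      const start = i;
      i++;
      while (i < code.length && code[i] !== '"') {
        i += code[i] === "\\" ? 2 : 1;
      }
      i++; // step past the closing quote
      out += code.slice(start, i);
    } else {
      out += code[i];
      i++;
    }
  }
  return out;
}
```

Because the output no longer contains commented-out attributes or functions, the same stripped text could feed both the `main` check and any similar check such as the `#[test]` case above.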
All that being said, I'm OK with incremental improvements to the current scheme. Your unique test cases do need to be added to the test file (as mentioned above, those are in |
Ok thanks, will add them when I get a chance. |
Has anyone considered macro-expanded This is naive but: couldn't this logic be done with the compiler? Then you have an authoritative answer as to whether the crate has a main function or not. Trying to parse Rust's grammar with regexes probably isn't going to work in the long run; I'm sure we can come up with failing examples indefinitely. |
Please take the time to read previous comments. |
Gosh, I can't believe I missed that lol. I will note that just the lexing/parsing stage is very fast (especially if it wasn't over the network), |
TLDR
I don't think I will add the testing myself. Happy to help if someone else wants to take it to the finish line, but I won't be able to get it there myself.
Details
I've really wanted to set up the testing, but I haven't been able to make the time to install and become familiar with yarn. I've been reflecting on the things on my todo list and doing a bit of housekeeping. Realistically this isn't something I can do quickly, so it would have to come out of otherwise productive time that I'd need to allocate to it. While I really want to help out, I don't realistically think I will schedule the time to learn yarn over something else. If someone else wants to take what I've done and bring it to the finish line, I'm happy to help with that, but I don't think I will get around to doing it myself. |
The cases that are intended to be added by this commit are:

- Code with no comments but ends in a semicolon before the declaration of `main`. No comments was required to prevent `//` being included to comment out the line. This was done by not allowing `/` as part of the code before the `;`. While this is more restrictive than necessary, it didn't seem likely that the code before `main` would include `/` so didn't seem worthwhile to take the performance hit to make the check less restrictive. To prevent long regex scans going multiple lines I also disallowed newlines in the code by including `\n` and `\r` as disallowed characters. The whole point is code on the same line as `main` so this seemed fine and would help prevent unnecessarily matching all lines before main when it doesn't matter.
- Allow multiline comments within the arguments area of main that both start and end within the brackets. I'm not expecting a substantial performance cost from this one because this case is near the end of the matching.

I used the following test cases

```rust
fn main() { // Should work
    println!("Hello, world!");
}

//;fn main() { //Shouldn't work
    println!("Hello, world!");
}

use std; fn main(){ println!("Hello, world!"); } // Should work

const fn main() { // Should work
    println!("Hello, world!");
}

/* fn main() {} */ // Shouldn't work

fn main(/* comment */) { // Should work
    // snip
}
```

Co-authored-by: Jake Goulding <[email protected]>
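The two rules described in the commit message can be tried against its test cases with a small script. The pattern below is an assumption reconstructed from the prose (it is not the regex from the commit, which does not appear in this thread), and the names `CANDIDATE_MAIN_RE`, `cases`, and `results` are hypothetical.

```javascript
// Assumption: reconstructed from the description, not the commit's actual pattern.
//  - `[^\n\r\/]*;` allows same-line code ending in `;` but rejects `/`,
//    so `//;fn main()` does not match.
//  - `(\/\*[\s\S]*?\*\/)?` allows one block comment inside the parentheses.
const CANDIDATE_MAIN_RE =
  /^([^\n\r\/]*;)?\s*(pub\s+)?\s*(const\s+)?\s*(async\s+)?\s*fn\s+main\s*\(\s*(\/\*[\s\S]*?\*\/)?\s*\)/m;

// The test cases from the commit message, paired with the expected outcome.
const cases = [
  ['fn main() { // Should work\n    println!("Hello, world!");\n}', true],
  ['//;fn main() { //Shouldn\'t work\n    println!("Hello, world!");\n}', false],
  ['use std; fn main(){ println!("Hello, world!"); } // Should work', true],
  ['const fn main() { // Should work\n    println!("Hello, world!");\n}', true],
  ["/* fn main() {} */ // Shouldn't work", false],
  ['fn main(/* comment */) { // Should work\n    // snip\n}', true],
];

// Run every case and record whether the candidate regex agrees.
const results = cases.map(([code, expected]) => ({
  expected,
  actual: CANDIDATE_MAIN_RE.test(code),
}));
```

A regex of this shape still has blind spots: for instance, a `fn main()` that sits on its own line inside a multi-line block comment would be detected, which is part of why stripping comments first was suggested earlier in the thread.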
43cc728 to 39186ac
Thanks!
I'm sorry I wasn't able to bring this to completion. Thanks for the help. |
Foreword: I tried my best to not affect the runtime performance of the regex matching but I was only able to account for the cases I could understand / think of. I've listed those below and how I tried to mitigate them.
The cases of main detection that are intended to be added by this PR are:
- Code with no comments but ends in a semicolon before the declaration of `main`. No comments was required to prevent `//` being included to comment out the line. This was done by not allowing `/` as part of the code before the `;`. While this is more restrictive than necessary, it didn't seem likely that the code before `main` would include `/`, so it didn't seem worthwhile to take the performance hit to make the check less restrictive. To prevent long regex scans going multiple lines I also disallowed newlines in the code by including `\n` and `\r` as disallowed characters. (The whole point is code on the same line as `main`, so this seemed fine and would help prevent unnecessarily matching all lines before `main` when it doesn't matter.)
- Allow multiline comments within the arguments area of `main` that both start and end within the brackets. Not expecting a substantial performance cost from this one because this case is near the end of the matching.
I used the following test cases: