Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move all the signature param parsing logic out of perly.y into a helper API #22967

Merged
merged 2 commits into from
Feb 7, 2025

Conversation

leonerd
Copy link
Contributor

@leonerd leonerd commented Jan 31, 2025

This PR moves the code relating to building optrees to handle the params variables of signatured subroutines out of the code blocks embedded directly in perly.y and into helper functions, defined in toke.c.

As a direct benefit, there is less actual code in perly.y, as all of the behaviour is abstracted out to toke.c where editing is much easier. In addition, the parser is now abstracted away from the actual optree shape that signatures create. This will allow much more flexibility in future edits - such as OP_MULTIPARAM, no-snails, or adding named parameters.

The actual parser signature struct is opaque from the header files, defined and used exclusively by toke.c. This gives it much more robustness against future modifications. Any future abilities or extensions of behaviour should be provided by helper functions, not by exposing plain structs.

This change does remove the three sig_* fields out of the PL_parser structure, so there is a small chance that some CPAN modules might break, if they have very intimate interactions with the subsignature parser. Offhand, the only module I know of that would break is one of my own - XS::Parse::Sublike. Any such logic would be immediately broken by any other proposed changes mentioned above that this change unlocks as a gateway, so that would be expected anyway.

At present, none of these new functions are documented as API yet as they are not yet exposed as such. This will be considered later, once enough additional parsing infrastructure exists to even allow these functions to be usefully used by third-party module authors.

@leonerd leonerd force-pushed the perly_y-subsignature-parser branch from 68c56c3 to 00a408e Compare January 31, 2025 21:18
Copy link
Contributor

@iabyn iabyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ambivalent about moving the signature elements off the yystack and storing them in a different place. I can see why this is handy, but I have an instinctive dislike of special-case handling. But I guess I won't say no.

A few specific criticisms of the code. First, there are no code comments above each function giving a brief outline of what the function does. Especially as these functions do something unusual (i.e. adding ops not to the yystack or a partial optree, but to something hidden away in the parser struct.) Second, shouldn't these function be in op.c rather than toke.c? They're concerned with parsing, not lexing. Finally, there is a mixture of 'if(foo)' and 'if (foo)'. I loathe and despise former form, but whichever you pick, please at least make it consistent. A petty point, but one I am willing to die for.

@leonerd
Copy link
Contributor Author

leonerd commented Feb 3, 2025

I'm ambivalent about moving the signature elements off the yystack and storing them in a different place. I can see why this is handy, but I have an instinctive dislike of special-case handling. But I guess I won't say no.

Yeah; I agree it's not ideal and not the same as the other stuff in the parser, though I couldn't think of a better way around it. The entire point is that eventually, there'll be some code in in class.c somewhere around starting to parse a method body that has to inject the $self lexical, by calling one of these helper API functions. That'll be in a situation where it doesn't have access to, or can put results on, the parser stack. So I think it's kindof inevitable at that point.

I did try to think if there was some useful pointer value I could have these pieces all return, and collect up on the yystack just to eventually turn up in the finish function, but in the end that felt like it would involve inventing a lot of new things just for that purpose, so I decided this was better. At least a "there's nothing special here, it's NULL" is easy enough to understand.

A few specific criticisms of the code. First, there are no code comments above each function giving a brief outline of what the function does. Especially as these functions do something unusual (i.e. adding ops not to the yystack or a partial optree, but to something hidden away in the parser struct.)

Ahyes, indeed. I wasn't going to add the full =for apidoc structure yet but I could still add some developer-eyes freestyle commentary to explain what's going on.

Second, shouldn't these function be in op.c rather than toke.c? They're concerned with parsing, not lexing.

Mmm, yes that probably makes a lot more sense. I tend to think of op.c as more the low-level implementation of simply being an optree, rather than higher-level parts that assemble those. But it probably makes more sense there indeed.

Finally, there is a mixture of 'if(foo)' and 'if (foo)'. I loathe and despise former form, but whichever you pick, please at least make it consistent. A petty point, but one I am willing to die for.

Oops, that's just laziness on my part, being a mixture of code I wrote, and bits and pieces I moved out of perly.y. I shall tidy them up for consistency.

Thanks for comments

@leonerd leonerd force-pushed the perly_y-subsignature-parser branch from 00a408e to b4f97e3 Compare February 3, 2025 22:37
@leonerd
Copy link
Contributor Author

leonerd commented Feb 3, 2025

Latest forcepush has rebased on blead, and addressed the various of those later comments at least. I don't think there's much I can do about the first comment; the one about not keeping state on the yystack, without more of a redesign.
If anyone else has any thoughts there..?

op.c Outdated
assert((varop->op_private & OPpARGELEM_MASK) == OPpARGELEM_SV);
assert(varop->op_targ);

PADNAME *pn = PadnamelistARRAY(PL_comppad_name)[varop->op_targ];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pn is unused in non-DEBUGGING builds.

op.c:16286:18: warning: unused variable ‘pn’ [-Wunused-variable]
16286 |         PADNAME *pn = PadnamelistARRAY(PL_comppad_name)[varop->op_targ];
      |                  ^~

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, yeah. I wonder what's a better solution - move the entire expression inside the assert(), so it's quite long but no variable; or add a PERL_UNUSED_VAR() right above it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just folded it inline in the end.

op.c Outdated
((sigil == '@') ? OPpARGELEM_AV : OPpARGELEM_HV));
assert(varop->op_targ);

PADNAME *pn = PadnamelistARRAY(PL_comppad_name)[varop->op_targ];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one too:

op.c: In function ‘Perl_subsignature_append_slurpy’:
op.c:16369:18: warning: unused variable ‘pn’ [-Wunused-variable]
16369 |         PADNAME *pn = PadnamelistARRAY(PL_comppad_name)[varop->op_targ];
      |                  ^~

op.c Outdated
};

yy_parser_signature *
Perl_subsignature_parser_dup(pTHX_ const yy_parser_signature * const proto, CLONE_PARAMS * const param)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a test case where this can be called?

It doesn't appear to be called by make test (gcov results):

        -:16201:
        -:16202:yy_parser_signature *
    #####:16203:Perl_subsignature_parser_dup(pTHX_ const yy_parser_signature * const proto, CLONE_PARAMS * const param)
        -:16204:{
    #####:16205:    PERL_ARGS_ASSERT_SUBSIGNATURE_PARSER_DUP;
    #####:16206:    PERL_UNUSED_ARG(param);
        -:16207:
    #####:16208:    yy_parser_signature *signature;
    #####:16209:    Newxz(signature, 1, yy_parser_signature);
    #####:16210:    ptr_table_store(PL_ptr_table, proto, signature);
        -:16211:
    #####:16212:    signature->elems = proto->elems;
    #####:16213:    signature->optelems = proto->optelems;
    #####:16214:    signature->slurpy = proto->slurpy;
        -:16215:
        -:16216:    /* op pointers are shared between threads */
    #####:16217:    signature->elemops = proto->elemops;
        -:16218:
    #####:16219:    return signature;
        -:16220:}
        -:16221:
        -:16222:static void
      596:16223:destroy_subsignature_context(pTHX_ void *p)
        -:16224:{
      596:16225:    yy_parser_signature *signature = (yy_parser_signature *)p;
        -:16226:
      596:16227:    if(signature->elemops)
    #####:16228:        op_free(signature->elemops);

If it is called I suspect we'd get a double op free on elemops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way I could think to even trigger this is by starting a new thread in a BEGIN block during some code in the middle of a subroutine signature; a.la:

./perl -Ilib -Mthreads -E'sub f($x = do { BEGIN { threads->create(sub { say "Threaded" })->join } }) {}; say "OK"'
Threaded
OK

I'll see if I can turn that into a real test and exercise it more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having added a test, it turns out that I don't think it's necessary to clone the signature parser during thread clone, as the parser won't be necessary or active in the newly-cloned thread at all. This means we can get rid of subsignature_parser_dup() entirely.

@leonerd leonerd force-pushed the perly_y-subsignature-parser branch from b4f97e3 to 032f025 Compare February 5, 2025 17:13
@leonerd leonerd added the squash-before-merge Author must squash the commits down before merging to blead label Feb 5, 2025
…er API

Provide a subsignature_*() API

Added:
 * subsignature_start()
 * subsignature_append_slurpy()
 * subsignature_append_positional()
 * subsignature_finish()

Call these from code blocks in perly.y

Make the actual parser signature struct opaque, hidden in toke.c. This
gives it much more robustness against future modifications.
@leonerd leonerd force-pushed the perly_y-subsignature-parser branch from d12c59d to e467b54 Compare February 6, 2025 12:45
@leonerd
Copy link
Contributor Author

leonerd commented Feb 6, 2025

Now rebased and squashed, and added perldelta.

@leonerd leonerd removed the squash-before-merge Author must squash the commits down before merging to blead label Feb 7, 2025
@leonerd leonerd merged commit c6beb2b into Perl:blead Feb 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants