-
Notifications
You must be signed in to change notification settings - Fork 602
Gain control of macro namespace visibility #23885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: blead
Are you sure you want to change the base?
Conversation
Can you cite some examples of where "we" (Perl programmers in general, or, at a minimum, CPAN authors) have gotten into trouble with this? |
1f76cdb to
63bdd5f
Compare
|
CI tests on Windows are failing. One aspect of these failures is that where we are expecting an escaped backslash ( |
Prior to this commit, a commented-out prototype was created for all macros that we have argument and return type information for. This might be useful information for a reader of proto.h. This commits stops doing this for private macros. It makes a future commit slightly easier, and I'm unsure of the usefulness of this anyway. But I could be persuaded otherwise.
Where 'weird' is defined as meaning something where the normal rules don't apply, so something we generate is unlikely to be correct. This only affects one element, and it uses aTHX in a way that is incompatible with it being automated.
These keywords all need another word to indicate the parameter type. Previously only 'struct' was considered to have. This changed showed an error in one entry embed.fnc, which is also corrected in this commit.
These macros are not for external use, so don't need a Perl_ prefix
My Linux man page says the arguments to this function should be const, but the compiler refuses to compile it when so. Easiest to just cast them away; the function isn't going to change them.
This is required for the next few commits that start automatically creating long Perl_name functions for the elements in embed.fnc that are macros and don't already have them in the source. Only macros can take a parameter that has to be a literal string, so don't fit with the next few commits. This is the only case in embed.fnc like that, so I'm deferring dealing with it for now.
A function can't return something of that type, but this has always been a macro, so this hasn't been caught.
They were marked as core-only.
Future commits make the new location a better choice
This commit undefines all macros that are visible to XS code but shouldn't be. This stops macro namespace pollution by perl. It works by changing embed.h to have two modes, controlled by a #ifdef that is set by perl.h. perl.h now #includes embed.h twice. The first time works as it always has. The second sets the #ifdef, and causes embed.h to #undef the macros that shouldn't be visible. This call is just before perl.h returns to its includer, so that these macros have come and gone before the file that #included perl.h is affected by them. The list of macros is determined by the visibility given by the apidoc lines documenting them, and by painstaking experiments with our test suite. Those experiments, and some manual inspection, have produced three long lists of items beyond what the apidoc lines currently give. One list is for items that the re extension to Perl requires. A second list is for items that other Perl extensions require. The third list is for items that at least one module shipped with perl requires (or that I know something on CPAN requires) even though the items aren't marked as being visible. There are over 700 items on this list. And smoking cpan with this will add others. The experiments were done automatically, and I have not manually done much manual inspection. I have wanted this ability to happen for a long time; and now things have come together to enable it. This allows us to have a clear-cut boundary with CPAN. It means you can add macros that have internal-only use without having to worry about making them likely not to clash with user names. It shows precisely what our names are that are visible to CPAN, and we can change some of them to be less likely to clash.
63bdd5f to
3a90558
Compare
|
Getting control of our boundaries is a big deal. That's why we have EXPORT in perl code. In the past, we've run into trouble with people using interfaces that weren't supposed to be public, and then complaining if we changed them. The first big step to controlling that that I can think of is @xenu adding the ability to restrict visibility of non-public functions. I fairly recently extended that to functions that are visible to perl extensions, though I think there are bypasses to that if someone really wants to. This step extends this to macros. Again, a determined programmer can defeat these. But it's harder to do. But we have willy-nilly created symbols that infringe on the XS writer's name space. Writers have had to work around this at times. There are modules that people can load that try to undo this. This p.r. has us undoing much of that intrusion. A recent example of a minor issue is #23614 (comment) We as a project need to have a discussion about what sorts of symbols can intrude. I have always operated under the assumption that we could reserve symbols ending in an underscore for our use. I don't know where I got that idea; probably because others who preceded me had made that assumption. But this needs to be clarified., This p.r. brings out into the open for examination just what macros are visible to XS writers. |
It doesn't. Setting As a quick example it undefs SvSetSV_and(), which is used by the API SvSetSV() and SvSetMagicSV() macros. It also undefs many flag check macros and accessors that are API (even if not in embed.fnc) and are accessible as functions, eg My complaint in the above comment was that the |
What I meant is that this p.r. undefines that symbol, so it is not visible to the outside world. And if someone tried to use the macro that needs this symbol, it wouldn't compile. This would alert us to the problem. Whether or not this is a good enough symbol name is outside the scope of this p.r., so the discussion about that should be elsewhere, and yes it does need to be discussed and resolved. And if this p.r. were to be merged as-is, the API would be unusable. I'm sorry if it came across that that is what I expected to happen. The essence of this p.r., without the individual elements, is to gain control of what we expose to the XS writer's name space. I strongly believe that this is something we should do, and I have wanted to have it done for a long time. Maybe I should have labelled this as a POC. The code added here looks for symbols that are exposed to the XS namespace that we don't document as existing, and symbols whose names are definitely reserved for perl's use. It undefs the rest unless they are on one of the exception lists. The exception lists could be expanded to include everything it finds, causing nothing to be undefined., Then nothing would break, but new phantom symbols that get created would be caught and have to be resolved. This is how we've done similar things in the past, diag.t is to keep people from adding undocumented messages to new code, but has an exception list so that it could go into blead without breaking things. That list has been whittled down over time. And we can look through that list all in one place and see if there are symbols that need to be exposed as a consequence of some other macro using them, but whose names are such that they should be changed to avoid potential conflicts. And it means that I don't have to be so careful in creating symbols inside headers that are not going to be used outside; as they will get automatically undefined. |
|
OpTYPE_set() was moved to op.h and isn't guarded by PERL_CORE. I know @leonerd uses it, (example) but he also defines it if not present. So I think the intent was to make it public, which would require the There are many other names that seem to me at first glance shouldn't be undefined by default:
(up to "C") |
This p.r. undefines all macros that are visible to XS code but shouldn't be. This stops macro namespace pollution by perl.
It works by changing embed.h to have two modes, controlled by a #ifdef that is set by perl.h. perl.h now #includes embed.h twice. The first time works as it always has. The second sets the #ifdef, and causes embed.h to #undef the macros that shouldn't be visible. This call is just before perl.h returns to its includer, so that these macros have come and gone before the file that #included perl.h is affected by them.
The list of macros is determined by the visibility given by the apidoc lines documenting them, and by painstaking experiments with our test suite. Those experiments, and some manual inspection, have produced three long lists of items beyond what the apidoc lines currently give.
One list is for items that the re extension to Perl requires.
A second list is for items that other Perl extensions require.
The third list is for items that at least one module shipped with perl requires (or that I know something on CPAN requires) even though the items aren't marked as being visible. There are over 700 items on this list. And smoking cpan with this will add others.
The experiments were done automatically, and I have not manually done much manual inspection.
I have wanted this ability to happen for a long time; and now things have come together to enable it.
This allows us to have a clear-cut boundary with CPAN.
It means you can add macros that have internal-only use without having to worry about making them likely not to clash with user names.
It shows precisely what our names are that are visible to CPAN, and we can change some of them to be less likely to clash.