Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark receiver objects of an instance method #20384

Closed
wants to merge 1 commit into from

Conversation

babsingh
Copy link
Contributor

Marking the receiver object of an instance method ensures that it
is not collected by the garbage collector (GC) while the instance
method is executing.

If the receiver object is not marked, there is a risk that the GC
might collect it. In such cases, finalization mechanisms, such as
PhantomReferences and Cleaners, could indicate that the object has
been collected while it is still in use. This issue is resolved if
the receiver object is marked during the execution of its instance
method.

Fixes: ibmruntimes/Semeru-Runtimes#93

Marking the receiver object of an instance method ensures that it
is not collected by the garbage collector (GC) while the instance
method is executing.

If the receiver object is not marked, there is a risk that the GC
might collect it. In such cases, finalization mechanisms, such as
PhantomReferences and Cleaners, could indicate that the object has
been collected while it is still in use. This issue is resolved if
the receiver object is marked during the execution of its instance
method.

Fixes: ibmruntimes/Semeru-Runtimes#93

Signed-off-by: Babneet Singh <[email protected]>
@gacholio
Copy link
Contributor

I don't believe this is correct - it's legal for bytecodes to overwrite the receiver slot with an int value (which is handled correctly by the local mapper).

I suggest you create a test which stores -1 into local slot 0 and then calls System.gc() a couple of times. I strongly suspect this will crash with this change.

I believe the code you replaced is also incorrect - the scenario just doesn't come up very often.

@gacholio
Copy link
Contributor

The test should be run -Xint - I believe the JIT detects the receiver overwrite and simply abandons compilation of the method.

@gacholio
Copy link
Contributor

The "early" (actually on-time) collection of the receiver is completely valid. If there are any references to the object in any stack or reachable object it will not be collected. The issue we've seen in the past is NIO fetching a long memory address from the object as the last use of that object, then the object gets collected and the finalizer runs freeing the native memory before returning control to the method which fetched the address. Use of that address then crashes. This is why the keepAlive/reachabilityFence was added to such methods.

@gacholio
Copy link
Contributor

The issue asserts that the RI works (though this problem is likely so intermittent I'm not sure that's true). Try running the suggested test on the RI - unless they're explicitly copying the receiver into a hidden live slot, they should crash too.

This would be another potential solution for the issue, though adding another hidden slot to stack frames will slow us down a bit and we already have 2 or 3, so may be complex to implement.

@gacholio
Copy link
Contributor

Another option would be to update the local mapper to detect the overwrite. If one does not occur, keep the receiver alive. If an int value is written over it, proceed as we do today.

@gacholio
Copy link
Contributor

After a bit more thought, I think the best solution is to move your code to the top of the function - declare the receiver as "read as object" from the start, and let the mapper override that along the appropriate paths if an int overwrite occurs. We could do this for all of the object arguments if desired (this is what the debug mapper does, which is why debug mode works).

Assuming that the new test fails as I expect with the current change, I think it would be worth adding the test into the builds.

@babsingh
Copy link
Contributor Author

babsingh commented Oct 22, 2024

create a test which stores -1 into local slot 0 and then calls System.gc() a couple of times

Modified the provided test, and it doesn't crash.

+        int local = -1;
+        for (int i = 0; i < 100; i++) {
+            // Calling System.gc() in a loop is usually sufficient to trigger
+            // cleanup in a small program like this.
+            System.gc();
+            try {
+                Thread.sleep(1);
+            } catch (InterruptedException e) {}
+        }
         final var request = PingRequest.newBuilder()
             .setMessage(message);
         final var response = stub.ping(request.build());

The test should be run -Xint

It's run with -Xint. It doesn't crash, but consistently fails with an exception.

GrpcTest > test() FAILED
    io.grpc.StatusRuntimeException: UNAVAILABLE: Channel shutdownNow invoked
        at app//io.grpc.StatusRuntimeException.fillInStackTrace(StatusRuntimeException.java:68)
        at app//io.grpc.StatusRuntimeException.<init>(StatusRuntimeException.java:58)
        at app//io.grpc.StatusRuntimeException.<init>(StatusRuntimeException.java:50)
        at app//io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:268)
        at app//io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:249)
        at app//io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:167)
        at app//dev.prodist.foo.proto.FooGrpc$FooBlockingStub.ping(FooGrpc.java:160)
        at app//dev.prodist.GrpcClient.ping(GrpcClient.java:29)
        at app//dev.prodist.GrpcTest.test(GrpcTest.java:33)

Try running the suggested test on the RI

The RI passes consistently with the modified test. On the other hand, OpenJ9 fails consistently.

best solution is to move your code to the top of the function - declare the receiver as "read as object" from the start, and let the mapper override that along the appropriate paths if an int overwrite occurs.

Doesn't work, it seems like the mapper overwrites it with an int.

... which is why debug mode works

Might have applied my fix or nogc option previously, which would have made the debug mode work.

Is there a simple way to verify if our behavior is correct w.r.t the RI i.e. the object is being collected correctly?

@gacholio
Copy link
Contributor

I'd like to see a standalone test for this, which cannot be done solely with the java compiler. Probably the easiest way to do it would be to have a java method like:

public void test() {
   int i = 0;
   i = -1; // edit this one
   System.gc();
   System.gc();
}

and hexedit the class file so the second store is into slot 0 instead of slot 1.

@gacholio
Copy link
Contributor

Doesn't work, it seems like the mapper overwrites it with an int.

Did you do it after the initial zeriong of the resultArray?

@babsingh
Copy link
Contributor Author

Did you do it after the initial zeriong of the resultArray?

Yes, and before mapAllLocals is invoked.

+       /* Ensure that the receiver is marked for all instance methods. */
+       if (J9_ARE_NO_BITS_SET(romMethod->modifiers, J9AccStatic)) {
                *resultArrayBase |= 1;
        }

+       mapAllLocals(portLib, romMethod, (PARALLEL_TYPE *) scratch, pc, resultArrayBase);
+

@gacholio
Copy link
Contributor

Is that not the correct behaviour? Or are you saying it's declared int without the int write to the receiver?

@babsingh
Copy link
Contributor Author

babsingh commented Oct 22, 2024

... and hexedit the class file so the second store is into slot 0 instead of slot 1.

re #20384 (comment): Using hexedit, replaced istore_1 (3C) with istore_0 (3B) in the suggested test. Neither RI nor OpenJ9 crashed.

Is that not the correct behaviour? Or are you saying it's declared int without the int write to the receiver?

Yes, it is the correct behavior. Moving the fix before mapAllLocals does imply an int overwrite. But, what do we say to the user who is expecting us to match the RI? The object is unreachable; and the difference in behaviour is due to the non-deterministic nature of garbage collection?

@gacholio
Copy link
Contributor

If you have actually written -1 to the receiver and marked it as object, then a crash is the only "correct" behaviour - anything else means the test is not operating correctly.

@gacholio
Copy link
Contributor

You ran the test with this change in place?

@babsingh
Copy link
Contributor Author

I had tried the change where the fix was moved before mapAllLocals. With the initial proposed fix, there is a crash:

000000000001AA00: Object neither in heap nor stack-allocated in thread main
000000000001AA00:	O-Slot=00000000001310D0
000000000001AA00:	O-Slot value=00000000FFFFFFFF
000000000001AA00:	PC=00007FC6ECAB62FC
000000000001AA00:	framesWalked=2
000000000001AA00:	arg0EA=00000000001310D0
000000000001AA00:	walkSP=00000000001310B0
000000000001AA00:	literals=00000000001EA878
000000000001AA00:	jitInfo=0000000000000000
000000000001AA00:	method=00000000001EA878 (Test.test()V) (Interpreted)
000000000001AA00:	stack=000000000012A1C0-00000000001311B0
14:29:16.397 0x183900    j9mm.479    *   ** ASSERTION FAILED ** at /root/openj9-openjdk-jdk23/openj9/runtime/gc_glue_java/MarkingSchemeRootMarker.cpp:53: ((MM_StackSlotValidator(MM_StackSlotValidator::NOT_ON_HEAP, object, stackLocation, walkState).validate(_env)))
JVMDUMP039I Processing dump event "traceassert", detail "" at 2024/10/23 07:29:16 - please wait.
...

@gacholio
Copy link
Contributor

OK that's what I expected (and proves that the current change is not correct). I suggest you look at the debug mapper to see when/how it marks the object arguments. I maintain that the correct fix is to mark the receiver (and I recommend the other object arguments) as object before starting the mapper. I don't know why it didn't just work, so some debugging may be necessary.

@babsingh
Copy link
Contributor Author

I maintain that the correct fix is to mark the receiver (and I recommend the other object arguments) as object before starting the mapper

babsingh@c85a8ea implements the above suggestion. The object still gets collected early.

debug mapper / debug mode: try j9localmap_DebugLocalBitsForPC

babsingh@687a002 enables the debug mapper. With the debug mapper, the object stays alive and is no longer collected early.

j9localmap_LocalBitsForPC is more aggressive than j9localmap_DebugLocalBitsForPC in the way it analyzes and marks local variables as object references. Aggressiveness relates to determining whether a local object is being used at a specific point in the bytecode execution, and this helps optimize garbage collection and runtime performance. In contrast, j9localmap_DebugLocalBitsForPC adopts a conservative approach; it overestimates and plays safe by marking objects alive even if they are not being used.

Overall, both implementations work correctly. Even though j9localmap_LocalBitsForPC collects the object early, I don't see any future uses of the object in the application at the point of collection. So, we are not doing anything wrong by collecting the object early.

  • The user has a workaround where it defines the variable outside the scope of the test method to keep it alive: see link.
  • Currently, j9localmap_DebugLocalBitsForPC is only enabled with JVMTI. But, we can provide a JVM cmdline option to switch to j9localmap_DebugLocalBitsForPC. @tajila @gacholio Thoughts, will this be useful?
  • @gacholio Would you like the change that marks objects before starting the mapper to be pushed?

@gacholio
Copy link
Contributor

gacholio commented Nov 1, 2024

If the code at the start doesn't fix the issue, then there's no point pushing that.

Given that we don't want to modify the local mapper too much, I think the best solution might be to check for any stores back to slot 0 during the bytecode walk, and if there are none, do your existing code at the end.

We should still (in the future) consider doing this for all of the incoming arguments, as this really isn't special.

@gacholio
Copy link
Contributor

gacholio commented Nov 4, 2024

We may also want to consider simply always using the debug mapper. I doubt we get any real benefit from the aggressive collection of locals during a method's run. @tajila What do you think?

@tajila
Copy link
Contributor

tajila commented Nov 6, 2024

We may also want to consider simply always using the debug mapper. I doubt we get any real benefit from the aggressive collection of locals during a method's run. @tajila What do you think?

I think we can give this a try, if there is no noticeable difference in key benchmarks then we should make the switch.

@gacholio
Copy link
Contributor

gacholio commented Nov 6, 2024

@babsingh Please give this a try and get some measurements,

@babsingh
Copy link
Contributor Author

babsingh commented Nov 7, 2024

Given that we don't want to modify the local mapper too much, I think the best solution might be to check for any stores back to slot 0 during the bytecode walk, and if there are none, do your existing code at the end.

I implemented the above suggestion in babsingh@68ae493. @gacholio Can you confirm if it was correctly implemented? This re-introduces the earlier crash from #20384 (comment).

I think we can give this a try, if there is no noticeable difference in key benchmarks then we should make the switch.

Yes, I can give it a try. Which benchmarks are we targeting (e.g. Liberty startup, footprint, throughput)?

@tajila
Copy link
Contributor

tajila commented Nov 7, 2024

Yes, I can give it a try. Which benchmarks are we targeting (e.g. Liberty startup, footprint, throughput)?

Yup these ones.

@gacholio
Copy link
Contributor

gacholio commented Nov 7, 2024

Let's wait for the debug mapper results. I see no reason why any crash would be introduced by your change.

@babsingh
Copy link
Contributor Author

Opened #20569 to make OpenJ9's default mapper. I have run the following benchmarks: Liberty DT8/DT10 Startup & Footprint, JRuby and Nashorn. No performance difference is seen between j9localmap_DebugLocalBitsForPC and j9localmap_LocalBitsForPC.

@gacholio gacholio marked this pull request as draft November 12, 2024 01:41
@gacholio gacholio closed this Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Object is cleaned too soon during unit testing
3 participants