Resource leak with p4java not closing RpcStreamOutput #2

Open
brasky opened this issue Apr 21, 2023 · 5 comments

@brasky

brasky commented Apr 21, 2023

Hello! My company uses the jenkins p4-plugin https://github.com/jenkinsci/p4-plugin and has been experiencing an issue on Windows where files are locked by java.exe, which causes future builds on the same agent to fail. Here's an example from the jenkins pipeline:

image

I grabbed a memory dump and noticed FileDescriptors with no references, waiting for GC. If I force GC to run, the file locks are gone and the build works fine. The parent of the FileDescriptor was:

image

So I started looking at p4java and found multiple places where an RpcOutputStream is opened but never closed.
Here it's only closed while handling one exception, but nowhere else:

Here it's just never closed:

RpcOutputStream outStream = this.fileCommands.getTempOutputStream(cmdEnv);

This is by no means an exhaustive list, but I wanted to make an issue before continuing to see what you thought. For now we are forced to run GC at the beginning of every build which is definitely not ideal. Would love some help!
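For illustration only, here is a minimal sketch of the close-on-every-path pattern the report is asking for. It uses a plain `FileOutputStream` from the JDK; `TempStreamDemo`, `openTempOutputStream`, and `writeAndClose` are hypothetical names and are not p4java code. The assumption is that the leaked stream behaves like any file-backed output stream, which holds an OS file handle until it is closed or garbage collected.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class TempStreamDemo {
    // Hypothetical stand-in for a temp output stream factory; not p4java code.
    static FileOutputStream openTempOutputStream(File target) throws IOException {
        return new FileOutputStream(target);
    }

    // Writes the payload and releases the file handle on every path,
    // including the path where write() throws.
    static void writeAndClose(File target, byte[] payload) throws IOException {
        try (FileOutputStream out = openTempOutputStream(target)) {
            out.write(payload);
        } // out.close() runs here automatically
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("rpc-demo", ".tmp");
        writeAndClose(tmp, "hello".getBytes(StandardCharsets.UTF_8));
        // The handle is already released, so nothing in this process
        // still holds the file open when it is deleted.
        System.out.println("deleted = " + tmp.delete());
    }
}
```

With try-with-resources the handle is released deterministically instead of whenever the collector happens to run, so a later cleanup step does not depend on GC timing.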

@skumar7322
Contributor

Hi @brasky,
Apologies for the delayed response.

We acknowledge the existence of this resource leak and have prioritized its resolution in the upcoming release.

Please note that we do not handle issues on GitHub. If you consider this issue to be of high priority, kindly create a support ticket with Perforce.
https://www.perforce.com/support/request-support

Thanks

@brasky
Author

brasky commented Jan 24, 2024

@skumar7322 I made a ticket a few months after making this issue and have emailed about it maybe 3 or 4 times in the last year. Case # 00947568

I appreciate your response, thanks for looking into this.

@skumar7322
Contributor

Hi @brasky,
The communication with the server happens in multiple calls. We create the stream at the start and keep it open until the command succeeds.

The code block at ClientFunctionDispatcher.java:140 is an intermediate step, so the stream is not expected to be closed there. It is closed at the end of the call, i.e. at the end of the OneShotServerImpl.java::execMapCmdList() method.
For a file, the stream is closed by ClientFile.java::close().
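As a rough sketch of the lifetime described above (stream created at the start, written to by intermediate steps, closed once at the end of the command), the pattern might look like the following. All names here (`CommandLifetimeDemo`, `TrackedStream`, `dispatchChunk`, `runCommand`) are hypothetical and do not reflect p4java's actual classes; the point is only that the end-of-command owner needs a `finally` so the stream is also closed when an intermediate step fails.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.List;

public class CommandLifetimeDemo {
    // Hypothetical stream that records whether close() was called.
    static class TrackedStream extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Hypothetical intermediate step: writes one chunk and leaves the stream open.
    static void dispatchChunk(OutputStream out, byte[] chunk) throws IOException {
        if (chunk.length == 0) {
            throw new IOException("empty chunk");
        }
        out.write(chunk);
    }

    // Hypothetical end-of-command owner: closes the stream on success and on failure.
    static boolean runCommand(TrackedStream out, List<byte[]> chunks) {
        try {
            for (byte[] chunk : chunks) {
                dispatchChunk(out, chunk);
            }
            return true;
        } catch (IOException e) {
            return false;
        } finally {
            try {
                out.close();
            } catch (IOException ignored) {
                // Nothing useful to do if close itself fails in this sketch.
            }
        }
    }

    public static void main(String[] args) {
        TrackedStream out = new TrackedStream();
        boolean ok = runCommand(out, List.of(new byte[] {1}, new byte[0]));
        System.out.println("ok = " + ok + ", closed = " + out.closed);
    }
}
```

If the close only happens on the success path, any command that aborts midway leaves the handle to be reclaimed by the garbage collector.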

To investigate further please provide the following information:

  • Does this bug reproduce only after a failed build, or does the state of the previous build not matter?
  • How often does this bug occur?
  • We are unable to reproduce it. Could you provide steps?
  • Please provide the logs of the jenkins build.

Thanks
Sandeep

@skumar7322
Contributor

@brasky any update on this?

@brasky
Author

brasky commented May 23, 2024

@skumar7322 We mitigated this problem a year ago by running GC at the beginning of every build. As far as I can remember, the previous builds were mostly successful, but this issue would cause the next build to fail.

To hit this, we were running many builds in a row (every night we run many game builds of various configurations) from a depot that was probably around 100-200GB. Each build was cleaning the source by deleting the source folder and deleting the p4 workspace. We were running maybe 12 builds on roughly 6 jenkins agents, so each agent might have 1-3 jobs a night, back to back.

While deleting the files in the workspace, we'd hit the exception in the original post (file is open by something). I used a sysinternals tool called Handle to identify that java was the culprit. I got a java memory dump and saw that RpcOutputStream had the file handle open but that it didn't have a path to a GC root, so I ran GC and the file handle was cleaned up. We would hit this issue most nights.

There is a race condition between p4 syncs and garbage collection. My guess is that your environment has some configuration difference, whether in java, the size of your depot, or the OS of your jenkins agents; I'm not sure exactly which.

To set up a repro environment would take a decent amount of time which I haven't had lately...

Right now we are running:

  • Server 2019 jenkins agents (before we were using Windows 10 desktop OS for our jenkins agents I believe)
  • Amazon Corretto openjdk 17.0.10.7.1
  • p4 2023.2 using partitioned clients and parallel syncs with 12 threads
  • p4 stream size ~280GB, 600k files, 134k folders

I cannot give jenkins logs because we don't have logs going back over a year. The logs were basically what I sent. I did not see any logs from perforce that were of note.
