Major Performance issue even with new caching? #156

Open
kohlerdominik opened this issue Mar 2, 2022 · 7 comments
Labels: needs investigation (Work is needed to figure out the root cause of the problem.)

Comments

@kohlerdominik

Hi @cebe

Almost 2 years ago we had a rather big issue with the performance of parsing an API spec where all components were stored in another (the same) file. It took a couple of minutes to process. My colleague then proposed PR #65.

We have ended up using this fork for the last two years. This week I am updating dependencies, and seeing that you merged #67 made me realize that I can finally get rid of our fork.

However, running our test suite after the update was rather painful: instead of the ~1s it took to parse our specification with our fork, it now took ~30s. I spent a couple of hours analyzing, but I couldn't pin down the issue. Resolving references appears to be painfully slow; it seems to take 100-300ms per reference. Clearly, this was much faster in version 1.4, where our fork originates.


Therefore, I hope you can find a minute to look into it as well. I prepared a minimal test setup:

$ git clone https://github.com/lextira/openapi-parse-test
$ composer update

# now we have a composer install with our fork, see https://github.com/lextira/openapi-parse-test/blob/master/composer.json
$ php test.php
Start Reading...
Read took 1248 milliseconds!
Parsed file is 791837 characters long.

$ git checkout with-openapi-1.6
$ composer update

# now we are at the latest release of the official repository
$ php test.php
Start Reading...
Read took 37302 milliseconds!
Parsed file is 791837 characters long.

So with v1.6, reading the exact same file from the exact same URL takes almost 30 times longer than with v1.4 (and our custom cache implementation, which is very similar to the cache you have now implemented). For context, the specification used for the test is our official API documentation and can be found here: https://docs.lexgate.ch/api/
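For illustration, a minimal timing harness along these lines reproduces the measurement (this is only a sketch, not the actual test.php from the linked repo; the spec URL below is a placeholder, and the Reader/Writer calls assume the standard cebe\openapi API):

```php
<?php
// Sketch of a timing harness similar to test.php (the real script lives in the
// linked repo). The spec URL is a placeholder; adjust it to the actual file.
require __DIR__ . '/vendor/autoload.php';

use cebe\openapi\Reader;
use cebe\openapi\ReferenceContext;
use cebe\openapi\spec\OpenApi;
use cebe\openapi\Writer;

echo "Start Reading...\n";
$start = microtime(true);

// Parse the spec and resolve all $ref entries, including external ones.
$openapi = Reader::readFromYamlFile(
    'https://docs.lexgate.ch/api/openapi.yaml', // placeholder spec location
    OpenApi::class,
    ReferenceContext::RESOLVE_MODE_ALL
);

$elapsed = (int) round((microtime(true) - $start) * 1000);
echo "Read took {$elapsed} milliseconds!\n";

// Serialize the resolved spec to get a rough size, as in the output above.
echo 'Parsed file is ' . strlen(Writer::writeToJson($openapi)) . " characters long.\n";
```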

@cebe
Owner

cebe commented Mar 11, 2022

Thanks for the feedback. I also have performance issues on a big project of mine, but I haven't had time to look into this yet. The fact that it takes much less time on your fork is valuable info. I'll check that out.

@cebe added the "needs investigation" label on Mar 11, 2022
@kohlerdominik
Author

kohlerdominik commented Mar 23, 2022

Hi @cebe

Looks like the fix from @marcelthole makes things significantly faster here. Have you had a chance to look at it yet?

However, @marcelthole, your fix introduces caching for references. Could you also find out why resolving references is now so much slower than in 1.4?

@marcelthole
Contributor

Not exactly. Looking at the diff, one part is the huge number of ReferenceContext::reduceDots calls in the current version.

Here is the difference in the xdebug trace:
[screenshot: xdebug trace comparison]
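For illustration only (this is not the actual change from the fix): a helper like reduceDots is a pure path normalization, so repeated inputs can be memoized instead of being recomputed for every reference. The stand-in below shows the idea:

```php
<?php
// Simplified stand-in for a path normalizer like ReferenceContext::reduceDots;
// not the library's actual implementation.
function reduceDots(string $path): string
{
    $prefix = substr($path, 0, 1) === '/' ? '/' : '';
    $parts = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue; // skip empty and "current directory" segments
        }
        if ($segment === '..') {
            array_pop($parts); // go one level up
            continue;
        }
        $parts[] = $segment;
    }
    return $prefix . implode('/', $parts);
}

// Memoized wrapper: identical inputs are answered from a static cache instead
// of being re-normalized for every $ref lookup.
function reduceDotsCached(string $path): string
{
    static $cache = [];
    return $cache[$path] ??= reduceDots($path);
}
```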

@PatchRanger

PatchRanger commented Mar 24, 2023

I can confirm the performance issues.
I have studied the code and have a suggestion: would it be possible, in addition to the built-in ReferenceContextCache, to fall back to a user-provided cache? The results of parsing and resolving $ref would then be available between library calls, which addresses exactly what causes the performance problems, right?
I mean, it is not enough to have a $ref cache INSIDE one library call; the results need to be shared BETWEEN calls, and that is achieved by relying on a custom cache.
In my opinion, it should be possible to pass a user cache (for example, a PSR-6 CacheItemPoolInterface) to the new ReferenceContext. I suppose that if a custom cache were passed to each ReferenceContext constructor, it would solve the performance problems of parsing and resolving $ref BETWEEN calls. A rough sketch of the idea follows below.
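Since the library does not currently accept a PSR-6 pool, this sketch implements the between-call idea around the Reader calls rather than inside ReferenceContext (the function name, cache key, and pool are placeholders):

```php
<?php
// Sketch only: keep the fully resolved spec in a PSR-6 pool so that repeated
// library calls (even across requests/processes) skip parsing and $ref resolution.
use cebe\openapi\Reader;
use cebe\openapi\ReferenceContext;
use cebe\openapi\spec\OpenApi;
use cebe\openapi\Writer;
use Psr\Cache\CacheItemPoolInterface;

function loadResolvedSpec(CacheItemPoolInterface $pool, string $specFile): OpenApi
{
    $item = $pool->getItem('openapi_resolved_' . md5($specFile));

    if ($item->isHit()) {
        // Cache hit: rebuild the object tree from the already-resolved JSON.
        return Reader::readFromJson($item->get());
    }

    // Cache miss: parse and resolve all references once, then store the result.
    $openapi = Reader::readFromYamlFile($specFile, OpenApi::class, ReferenceContext::RESOLVE_MODE_ALL);
    $item->set(Writer::writeToJson($openapi));
    $pool->save($item);

    return $openapi;
}
```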

UPD: I found that in my case the performance problems were caused by frequent cache misses due to https://gitlab.com/yii2-extended/yii2-psr6-cache-bridge/-/issues/3 .
Therefore, please consider everything above as an abstract feature suggestion rather than a bug report.

@NickSdot

NickSdot commented Oct 16, 2023

Hi all.

I went through all the performance PRs and issues here, but none of them (1, 2) was fully right or fully solved the issue for me.

So I went ahead and created my own fork (partly based on other suggestions here). Parsing the huge Stripe OpenAPI spec is back to ~1 second.

Fork: master...NickSdot:cebe-php-openapi:improve-performance

An important fact: one major performance hit, in my case, is caused not solely by cebe/php-openapi but by a bug in symfony/yaml, which we depend on here. I am trying to get this fixed upstream. Sadly it is still not merged, so I currently override the class in question in my project.

Parsing the Stripe spec in fact ran for ages and eventually failed. The actual cause was rather hard to spot, because cebe/php-openapi (understandably) does not validate against this external misbehaviour. Hence, I want to especially highlight this change in my fork: b65633f

Maybe someone is interested in trying my fork to see whether it solves your issues as well. I am keen to PR this here or to any fork. Just let me know.

cc: @cebe, @kohlerdominik, @marcelthole, @Aribros

Cheers

@NickSdot

NickSdot commented Nov 2, 2023

Quick heads up that the Symfony bug I was referring to in my previous comment got merged: symfony/symfony#52408

@kohlerdominik
Author

I can verify that the suggested change from @NickSdot indeed seems to resolve the performance issue (with an additional small fix) 🎉
