
Do not automatically cache models in temp dirs #462

Merged 4 commits into main on Oct 31, 2023

Conversation

@helena-intel (Collaborator) commented Oct 25, 2023

Currently, CACHE_DIR for exported models is set to a temporary directory unique to the model export. This is not useful: the cache will never be used. It can also cause issues with large exported models on GPU (model loading crashes).

This PR disables automatic model caching if the model save directory is a temporary directory. This affects more than just exported models: automatic caching is now also disabled if users manually save models in /tmp (or the Windows equivalent), but I do not think that is an issue. Users can always enable model caching manually by setting CACHE_DIR.
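The behavior described above can be sketched as a small helper. This is a minimal illustration, not the actual optimum-intel code: the function name `default_ov_config` and the `model_cache` subdirectory name are assumptions for the example.

```python
from pathlib import Path
from tempfile import gettempdir

def default_ov_config(model_save_dir, ov_config):
    """Hypothetical helper: add a default CACHE_DIR unless the user already
    set one, or the model lives in a temporary directory (where the cache
    would never be reused)."""
    ov_config = dict(ov_config)  # work on a copy, not the caller's dict
    in_tempdir = str(model_save_dir).startswith(gettempdir())
    if "CACHE_DIR" not in ov_config and not in_tempdir:
        ov_config["CACHE_DIR"] = str(Path(model_save_dir) / "model_cache")
    return ov_config
```

A user-supplied CACHE_DIR always wins, and models saved anywhere under the system temp directory get no default cache.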

I also aligned the cache dir for diffusion models with other models, so now all cache dirs are in a specific _cache directory (previously for stable diffusion the cache dir was the model dir).

@HuggingFaceDocBuilderDev commented Oct 25, 2023

The documentation is not available anymore as the PR was closed or merged.

Path.is_relative_to() was added in Python 3.9

Speedup is small on the GitHub Actions runner hardware, so this test regularly fails even with a speedup threshold of only 1.1
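The first commit note explains why the PR checks the temp directory with a string prefix rather than `Path.is_relative_to()`. A quick sketch of the trade-off (the `tmpabc123` directory name is made up for the example):

```python
from pathlib import Path
from tempfile import gettempdir

tmp_model = Path(gettempdir()) / "tmpabc123" / "model"

# tmp_model.is_relative_to(gettempdir()) would be the cleaner test, but
# Path.is_relative_to only exists on Python 3.9+.
# The string-prefix check works on Python 3.8 as well:
is_temp = str(tmp_model).startswith(gettempdir())
print(is_temp)  # True
```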
@helena-intel (Collaborator, Author):

The tests failed at:
FAILED tests/openvino/test_modeling.py::OVModelForCausalLMIntegrationTest::test_compare_with_and_without_past_key_values - AssertionError: False is not true : With pkv latency: 336.185 ms, without pkv latency: 361.155 ms, speedup: 1.074
This test fails quite often, so I disabled it in this PR. I tested this earlier, when the test first started failing and we reduced the SPEEDUP threshold, and back then I did see a speedup when testing manually on an Ice Lake Xeon.

@echarlaix (Collaborator) left a comment:

thanks for the fix @helena-intel

@@ -339,11 +339,11 @@ def compile(self):
     if self.request is None:
         logger.info(f"Compiling the model to {self._device} ...")
         ov_config = {**self.ov_config}
-        if "CACHE_DIR" not in self.ov_config.keys():
+        # Set default CACHE_DIR only if it is not set.
+        if "CACHE_DIR" not in self.ov_config.keys() and not str(self.model_save_dir).startswith(gettempdir()):
Collaborator:

What do you think about:

Suggested change:
-        if "CACHE_DIR" not in self.ov_config.keys() and not str(self.model_save_dir).startswith(gettempdir()):
+        if "CACHE_DIR" not in self.ov_config.keys() and not isinstance(self.model_save_dir, TemporaryDirectory):

Collaborator (Author):

When I tested this, this evaluated to False, model_save_dir was a PosixPath, not a TemporaryDirectory.

Collaborator:

could you share an example so that I can try to reproduce it ?

Collaborator (Author):

from tempfile import TemporaryDirectory
from optimum.intel.openvino import OVModelForSequenceClassification

model = OVModelForSequenceClassification.from_pretrained("hf-internal-testing/tiny-random-distilbert", export=True)
print(type(model.model_save_dir))
print(isinstance(model.model_save_dir, TemporaryDirectory))

(you can also add these prints to modeling_base.py)
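The reason the suggested `isinstance` check evaluates to False can be shown without loading any model: a loader typically keeps the `TemporaryDirectory` object alive elsewhere and stores only the path it wraps. A self-contained sketch:

```python
from pathlib import Path
from tempfile import TemporaryDirectory

tmp = TemporaryDirectory()       # the loader keeps this object alive...
model_save_dir = Path(tmp.name)  # ...but typically stores only its path

print(isinstance(model_save_dir, TemporaryDirectory))  # False
print(isinstance(model_save_dir, Path))                # True
```

So the `model_save_dir` attribute is a `PosixPath` (or `WindowsPath`), never a `TemporaryDirectory`, which is why the string-prefix check against `gettempdir()` is used instead.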

@@ -466,7 +484,6 @@ class OVModelForCausalLMIntegrationTest(unittest.TestCase):
         "pegasus",
     )
     GENERATION_LENGTH = 100
-    SPEEDUP_CACHE = 1.1
Collaborator:

I would prefer we keep the test and set SPEEDUP_CACHE to 1, as inference when leveraging the pkv should not be slower than without.

Collaborator (Author):

I checked my logs for failed runs and this was honestly the first one:
[screenshot of the failed test run]
But I will put it back if you want.

Collaborator:

I see. Doesn't that mean there is a serious issue for model inference with use_cache=True?

Collaborator (Author):

On the GitHub Actions runner (Skylake with 2 cores), probably. But I don't expect that setup to be common for inference in production. On more recent Xeons I don't see this issue: use_cache is faster than without.

if "CACHE_DIR" in self.ov_config.keys()
else {**self.ov_config, "CACHE_DIR": str(encoder_cache_dir)}
)
ov_encoder_config = self.ov_config
Collaborator:

we don't want to modify self.ov_config by adding a cache path specific to the encoder / decoder component so I think we should create a copy that we can modify instead
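The copy-before-modify pattern the reviewer suggests can be shown in isolation. This is a sketch, not the actual optimum-intel code; the `PERFORMANCE_HINT` entry and the cache-directory strings are placeholder values for the example:

```python
ov_config = {"PERFORMANCE_HINT": "LATENCY"}  # shared config, must stay unmodified

# Anti-pattern the review flags: binding the same dict and mutating it
# would leak the encoder-specific CACHE_DIR into the shared ov_config:
#   ov_encoder_config = ov_config
#   ov_encoder_config["CACHE_DIR"] = "encoder_cache"

# Copy first, then specialize per component:
ov_encoder_config = {**ov_config, "CACHE_DIR": "encoder_cache"}
ov_decoder_config = {**ov_config, "CACHE_DIR": "decoder_cache"}

print("CACHE_DIR" in ov_config)  # False: the shared config is untouched
```

The `{**ov_config, ...}` spread creates a shallow copy, so each component gets its own cache path while the shared dict keeps only user-supplied options.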

Collaborator (Author):

Thanks, I will fix that!

@echarlaix echarlaix merged commit 5320512 into main Oct 31, 2023
@echarlaix echarlaix deleted the helena/openvino_nocache_temp branch October 31, 2023 16:57