TST: Potentially Skip 8bit bnb regression test if compute capability is too low (huggingface#1998)

* TST Potentially Skip 8bit bnb regression test

The 8bit bnb LoRA regression test results depend on the underlying
compute capability. The logits differ slightly across compute
capabilities (up to 0.5 absolute difference). Therefore, we now check
the compute capability for this test and skip it if it is too low. This
check may require updating if the hardware of the CI worker is updated.

Note that I have already invalidated the old regression artifacts and
created a new one.

* Fix pytest skip to work without CUDA

* Instead of skipping, add a comment to explain

After internal discussion, we think this is the most practical solution
for the time being.
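
For illustration, here is a minimal sketch of what such a compute-capability guard could look like. The test class name, the threshold, and the placement are assumptions for this example, not the committed code; as noted above, the final commit keeps the test running and only adds an explanatory comment.

```python
import unittest

import torch


class TestLora8bitRegression(unittest.TestCase):
    # Hypothetical test class used only to illustrate the guard; the real
    # regression tests live in tests/regression/test_regression.py.
    def test_lora_8bit(self):
        # The guard must not assume CUDA is present, otherwise the check
        # itself fails on CPU-only runners ("Fix pytest skip to work
        # without CUDA").
        if torch.cuda.is_available():
            major, minor = torch.cuda.get_device_capability()
            # (8, 0) is an illustrative threshold, not taken from the commit.
            if (major, minor) < (8, 0):
                self.skipTest(
                    "Skipping 8bit bnb regression test: compute capability "
                    f"{major}.{minor} is too low for reproducible logits"
                )
        ...  # the actual regression checks would follow here
```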
BenjaminBossan authored Aug 16, 2024
1 parent 4c3a76f commit 0222450
Showing 1 changed file with 12 additions and 0 deletions.
12 changes: 12 additions & 0 deletions tests/regression/test_regression.py
@@ -585,6 +585,9 @@ def load_base_model(self):
return model

def test_lora_8bit(self):
# Warning: bnb results can vary significantly depending on the GPU. Therefore, if there is a change in GPU used
# in the CI, the test can fail without any code change. In that case, delete the regression artifact and create
# a new one using the new GPU.
base_model = self.load_base_model()
config = LoraConfig(
r=8,
@@ -598,6 +601,9 @@ def test_adalora(self):
self.skipTest(
"Skipping AdaLora for now, getting TypeError: unsupported operand type(s) for +=: 'dict' and 'Tensor'"
)
# Warning: bnb results can vary significantly depending on the GPU. Therefore, if there is a change in GPU used
# in the CI, the test can fail without any code change. In that case, delete the regression artifact and create
# a new one using the new GPU.
base_model = self.load_base_model()
config = AdaLoraConfig(
init_r=6,
@@ -641,6 +647,9 @@ def load_base_model(self):
return model

def test_lora_4bit(self):
# Warning: bnb results can vary significantly depending on the GPU. Therefore, if there is a change in GPU used
# in the CI, the test can fail without any code change. In that case, delete the regression artifact and create
# a new one using the new GPU.
base_model = self.load_base_model()
config = LoraConfig(
r=8,
@@ -652,6 +661,9 @@ def test_lora_4bit(self):
def test_adalora(self):
# TODO
self.skipTest("Skipping AdaLora for now because of a bug, see #1113")
# Warning: bnb results can vary significantly depending on the GPU. Therefore, if there is a change in GPU used
# in the CI, the test can fail without any code change. In that case, delete the regression artifact and create
# a new one using the new GPU.
base_model = self.load_base_model()
config = AdaLoraConfig(
init_r=6,
