Commit

Parallelize remaining single threaded add and mul kernels.
james-choncholas committed Nov 10, 2024
1 parent caab041 commit 0c0ecac
Showing 3 changed files with 459 additions and 251 deletions.
37 changes: 22 additions & 15 deletions examples/benchmark.ipynb
@@ -16,9 +16,16 @@
"name": "stderr",
"output_type": "stream",
"text": [
"2024-10-29 21:41:36.488386: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
"2024-10-29 21:41:36.514318: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
"To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
"2024-11-10 01:15:42.366708: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
"2024-11-10 01:15:42.367179: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.\n",
"2024-11-10 01:15:42.369814: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.\n",
"2024-11-10 01:15:42.376276: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n",
"WARNING: All log messages before absl::InitializeLog() is called are written to STDERR\n",
"E0000 00:00:1731201342.387674 117259 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n",
"E0000 00:00:1731201342.391135 117259 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n",
"2024-11-10 01:15:42.402569: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
"To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
"2024-11-10 01:15:43.514528: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:152] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)\n"
]
},
{
@@ -64,7 +71,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"0.5060953950014664\n"
"0.5067476989970601\n"
]
}
],
@@ -85,7 +92,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"0.17610475800029235\n"
"0.1824721849989146\n"
]
}
],
@@ -106,7 +113,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"0.5579067959988606\n"
"0.5525883640002576\n"
]
}
],
@@ -127,7 +134,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"0.7779848270001821\n"
"0.18785935000050813\n"
]
}
],
@@ -148,7 +155,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"0.44140414300272823\n"
"0.19354859399754787\n"
]
}
],
@@ -169,13 +176,13 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1.415064643999358\n"
"0.17636438800036558\n"
]
}
],
"source": [
"def ct_ct_mul():\n",
" return enc_a * enc_a\n",
" return enc_a * 4\n",
"\n",
"time = min(timeit.Timer(ct_ct_mul).repeat(repeat=3, number=1))\n",
"print(time)"
@@ -190,7 +197,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"0.8980931189980765\n"
"0.7591958359989803\n"
]
}
],
@@ -211,7 +218,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"0.7085658289979619\n"
"0.7197426070015354\n"
]
}
],
@@ -232,7 +239,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"27.201639023998723\n"
"27.007230432998767\n"
]
}
],
@@ -253,7 +260,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"360.84974249600054\n"
"370.09557524000047\n"
]
}
],
@@ -274,7 +281,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"6.758062808999966\n"
"5.250123850997625\n"
]
}
],
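The notebook cells in this diff time each operation with `min(timeit.Timer(fn).repeat(repeat=3, number=1))` and print the result in seconds. As a rough reading aid, the sketch below wraps that idiom in a helper and turns three of the before/after outputs copied from the hunks above into speedup ratios. The names `best_of`, `timings`, and `speedup` are hypothetical and not part of the repository; the hunk at `-169,13` is left out because the benchmarked expression itself changed there, so its two timings do not measure the same operation.

```python
import timeit


def best_of(fn, repeat=3, number=1):
    """Fastest wall-clock time in seconds over `repeat` runs (hypothetical helper
    mirroring the min(timeit.Timer(...).repeat(...)) idiom in the notebook cells)."""
    return min(timeit.Timer(fn).repeat(repeat=repeat, number=number))


# (before, after) outputs in seconds, copied from the diff and keyed by hunk header.
timings = {
    "@@ -127,7 +134,7 @@": (0.7779848270001821, 0.18785935000050813),
    "@@ -148,7 +155,7 @@": (0.44140414300272823, 0.19354859399754787),
    "@@ -274,7 +281,7 @@": (6.758062808999966, 5.250123850997625),
}


def speedup(before, after):
    """Ratio of old time to new time; values above 1 mean the new run was faster."""
    return before / after


speedups = {hunk: speedup(before, after) for hunk, (before, after) in timings.items()}

for hunk, ratio in speedups.items():
    print(f"{hunk}: {ratio:.2f}x")
```

Each figure is a single best-of-three measurement from one machine, so small differences elsewhere in the diff (for example 0.506 s versus 0.507 s) are probably within run-to-run noise.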