Building TensorFlow
The instructions provided below specify the steps to build TensorFlow version 2.5.0 on Linux on IBM Z for the following distributions:
- Ubuntu (18.04, 20.04, 21.04)
- When following the steps below, please use a standard-permission user unless otherwise specified.
- A directory /<source_root>/ will be referred to in these instructions. This is a temporary writable directory that you can place anywhere you like.
If you want to build TensorFlow using manual steps, go to STEP 1.2.
Use the following commands to build TensorFlow using the build script. Please make sure you have wget installed.
wget -q https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Tensorflow/2.5.0/build_tensorflow.sh
# Build Tensorflow
bash build_tensorflow.sh [Provide -t option for executing build with tests]
If the build completes successfully, go to STEP 2. In case of error, check logs
for more details or go to STEP 1.2 to follow manual build steps.
export SOURCE_ROOT=/<source_root>/
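A minimal sketch of preparing the source root, assuming a location under the system temporary directory. The `tensorflow_build` directory name is hypothetical; any writable path works.

```shell
# Hypothetical location: any writable directory can serve as the source root.
SOURCE_ROOT="${SOURCE_ROOT:-${TMPDIR:-/tmp}/tensorflow_build}"

# Create the directory if needed and confirm the current user can write to it.
mkdir -p "$SOURCE_ROOT"
if [ -w "$SOURCE_ROOT" ]; then
  export SOURCE_ROOT
  echo "Using SOURCE_ROOT=$SOURCE_ROOT"
else
  echo "SOURCE_ROOT=$SOURCE_ROOT is not writable" >&2
fi
```

If SOURCE_ROOT is already set in the environment, the sketch keeps that value and only checks it.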
- Ubuntu 18.04
sudo apt-get update
sudo apt-get install sudo wget git unzip zip python3-dev python3-pip openjdk-11-jdk pkg-config libhdf5-dev libssl-dev libblas-dev liblapack-dev gfortran -y
sudo ldconfig
sudo pip3 install --upgrade pip
sudo pip3 install --no-cache-dir numpy==1.19.5 wheel scipy portpicker protobuf==3.13.0
sudo pip3 install keras_preprocessing --no-deps
- Ubuntu 20.04
sudo apt-get update
sudo apt-get install sudo wget git unzip zip python3-dev python3-pip openjdk-11-jdk pkg-config libhdf5-dev libssl-dev libblas-dev liblapack-dev gfortran -y
sudo ldconfig
sudo pip3 install --upgrade pip
sudo pip3 install --no-cache-dir numpy==1.19.5 wheel scipy==1.6.3 portpicker protobuf==3.13.0
sudo pip3 install keras_preprocessing --no-deps
- Ubuntu 21.04
sudo apt-get update
sudo apt-get install sudo wget git unzip zip python3-dev python3-pip openjdk-11-jdk pkg-config libhdf5-dev libssl-dev libblas-dev liblapack-dev gfortran -y
sudo ldconfig
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 60 --slave /usr/bin/g++ g++ /usr/bin/g++-7
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 40 --slave /usr/bin/g++ g++ /usr/bin/g++-10
sudo update-alternatives --auto gcc
sudo pip3 install --upgrade pip
sudo pip3 install --no-cache-dir numpy==1.19.5 wheel scipy==1.6.3 portpicker protobuf==3.13.0
sudo pip3 install keras_preprocessing --no-deps
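The per-release pip pins above differ only in the scipy requirement. The sketch below selects the pins from the Ubuntu release number; the function name `pip_specs_for_release` is hypothetical, and reading VERSION_ID from /etc/os-release assumes a standard Ubuntu installation.

```shell
# Print the pip requirement specifiers used in the steps above for a release.
# Argument: Ubuntu VERSION_ID, e.g. "18.04".
pip_specs_for_release() {
  case "$1" in
    18.04) scipy_spec="scipy" ;;
    20.04|21.04) scipy_spec="scipy==1.6.3" ;;
    *) echo "unsupported Ubuntu release: $1" >&2; return 1 ;;
  esac
  echo "numpy==1.19.5 wheel $scipy_spec portpicker protobuf==3.13.0"
}

# Read VERSION_ID from /etc/os-release when the file exists.
if [ -r /etc/os-release ]; then
  release="$(. /etc/os-release && echo "$VERSION_ID")"
  pip_specs_for_release "$release" || true
fi
```

The printed list could then be passed to the install step, for example: sudo pip3 install --no-cache-dir $(pip_specs_for_release 20.04)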
- Ensure
/usr/bin/python
points to Python3 to build TensorFlow in a Python3 environment
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 40
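One way to confirm the result is to ask the interpreter for its major version. This is a small sketch; the helper name `is_python3` is hypothetical.

```shell
# Return success when the given interpreter reports Python major version 3.
# Argument: interpreter name or path (defaults to "python").
is_python3() {
  interp="${1:-python}"
  command -v "$interp" >/dev/null 2>&1 || return 1
  major="$("$interp" -c 'import sys; print(sys.version_info[0])' 2>/dev/null)"
  [ "$major" = "3" ]
}

if is_python3 python; then
  echo "python resolves to Python 3"
else
  echo "python is missing or is not Python 3" >&2
fi
```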
- Install grpcio
export GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=True
sudo -E pip3 install grpcio
- Build Bazel v3.7.2 -- Instructions for building Bazel can be found here.
Note: The Bazel community does not yet officially support Ubuntu 20.04 and Ubuntu 21.04, but you can still follow the build instructions above to build Bazel on these releases. If you intend to use the Bazel build script on Ubuntu 20.04 or Ubuntu 21.04, edit line 211 to change "ubuntu-18.04" into "ubuntu-18.04" | "ubuntu-20.04" | "ubuntu-21.04".
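Before moving on, it may help to confirm that the Bazel on the PATH is the expected release. The sketch below assumes that `bazel --version` prints a string such as "bazel 3.7.2", and that a source-built binary may append a suffix such as "- (@non-git)". The helper name `is_expected_bazel` is hypothetical.

```shell
# Return success when a `bazel --version` style string reports release 3.7.2.
# A source-built Bazel may append a suffix, e.g. "bazel 3.7.2- (@non-git)".
is_expected_bazel() {
  case "$1" in
    "bazel 3.7.2"|"bazel 3.7.2-"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query Bazel when it is actually installed.
if command -v bazel >/dev/null 2>&1; then
  if is_expected_bazel "$(bazel --version)"; then
    echo "Bazel 3.7.2 found"
  else
    echo "Unexpected Bazel version: $(bazel --version)" >&2
  fi
fi
```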
-
Download source code
cd $SOURCE_ROOT
git clone https://github.com/linux-on-ibm-z/tensorflow.git
cd tensorflow
git checkout v2.5.0-s390x
-
Configure
./configure
You have bazel 3.7.2- (@non-git) installed.
Please specify the location of python. [Default is /usr/bin/python3]:
Found possible Python library paths:
  /usr/lib/python3/dist-packages
  /usr/local/lib/python3.6/dist-packages
Please input the desired Python library path to use. Default is [/usr/lib/python3/dist-packages]
/usr/lib/python3/dist-packages
Do you wish to build TensorFlow with ROCm support? [y/N]: N
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: N
No CUDA support will be enabled for TensorFlow.
Do you wish to download a fresh release of clang? (Experimental) [y/N]: N
Clang will not be downloaded.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: N
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
  --config=mkl             # Build with MKL support.
  --config=mkl_aarch64     # Build with oneDNN support for Aarch64.
  --config=monolithic      # Config for mostly static monolithic build.
  --config=ngraph          # Build with Intel nGraph support.
  --config=numa            # Build with NUMA support.
  --config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
  --config=v2              # Build TensorFlow 2.x instead of 1.x.
Preconfigured Bazel build configs to DISABLE default on features:
  --config=noaws           # Disable AWS S3 filesystem support.
  --config=nogcp           # Disable GCP support.
  --config=nohdfs          # Disable HDFS support.
  --config=nonccl          # Disable NVIDIA NCCL support.
Configuration finished
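The same answers can likely be supplied without prompts. The sketch below assumes that the configure script of this TensorFlow release honors these environment variables. The values mirror the interactive session above and should be adjusted to the local Python installation.

```shell
# Answers matching the interactive session above, supplied via environment
# variables (assumption: this release's configure script reads them
# instead of prompting).
export PYTHON_BIN_PATH=/usr/bin/python3
export PYTHON_LIB_PATH=/usr/lib/python3/dist-packages
export TF_NEED_ROCM=0
export TF_NEED_CUDA=0
export TF_DOWNLOAD_CLANG=0
export CC_OPT_FLAGS="-march=native -Wno-sign-compare"
export TF_SET_ANDROID_WORKSPACE=0

# Run configure only when invoked from the TensorFlow source directory.
if [ -x ./configure ]; then
  ./configure
else
  echo "Run this from \$SOURCE_ROOT/tensorflow" >&2
fi
```

If a prompt still appears, the corresponding variable is probably not recognized by this release and the question can be answered by hand as shown in the session.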
-
Build TensorFlow
bazel build //tensorflow/tools/pip_package:build_pip_package
Note: The TensorFlow build is a resource-intensive operation. If the build continues to fail, try increasing the swap space and reducing the number of concurrent jobs by specifying --jobs=n in the build command above, where n is the number of concurrent jobs.
Note: Building TensorFlow from source can use a lot of RAM. If your system is memory-constrained, limit Bazel's RAM usage with: --local_ram_resources=2048.
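The two limits can be combined in one build command. The sketch below only prints the command rather than running it. The helper name `limited_build_cmd` and the policy of using half of the available CPUs are assumptions for illustration, not recommendations from the TensorFlow or Bazel projects.

```shell
# Print a build command limited to the given number of jobs and RAM (in MB).
# Arguments: <jobs> <ram_mb>
limited_build_cmd() {
  echo "bazel build --jobs=$1 --local_ram_resources=$2 //tensorflow/tools/pip_package:build_pip_package"
}

# Hypothetical policy: use half of the available CPUs, but at least one job.
cpus="$(nproc 2>/dev/null || echo 2)"
build_jobs=$(( cpus / 2 ))
[ "$build_jobs" -ge 1 ] || build_jobs=1

# Show the resulting command with the 2048 MB limit mentioned in the note.
limited_build_cmd "$build_jobs" 2048
```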
cd $SOURCE_ROOT/tensorflow
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_wheel
sudo ln -s /usr/include/locale.h /usr/include/xlocale.h   # On Ubuntu 18.04 only
sudo pip3 install /tmp/tensorflow_wheel/tensorflow-2.5.0-cp*-linux_s390x.whl
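Because the install command relies on a glob, an old wheel left in /tmp/tensorflow_wheel could match as well. The sketch below resolves the glob first and fails unless exactly one file matches. The helper name `find_tf_wheel` is hypothetical.

```shell
# Print the path of the single TensorFlow 2.5.0 s390x wheel in a directory.
# Fails when zero or several wheels match, so a stale build is not installed.
find_tf_wheel() {
  count=0
  match=""
  for candidate in "$1"/tensorflow-2.5.0-cp*-linux_s390x.whl; do
    # An unmatched glob stays literal; skip it.
    [ -e "$candidate" ] || continue
    count=$(( count + 1 ))
    match="$candidate"
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one wheel in $1, found $count" >&2
    return 1
  fi
  echo "$match"
}
```

It could be used as: sudo pip3 install "$(find_tf_wheel /tmp/tensorflow_wheel)"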
- Run TensorFlow from command Line and check the installed version
$ cd $SOURCE_ROOT
$ python -c "import tensorflow as tf; print(tf.__version__)"
2.5.0
$ /usr/bin/python3
>>> import tensorflow as tf
>>> tf.add(1, 2).numpy()
3
>>> hello = tf.constant('Hello, TensorFlow!')
>>> hello.numpy()
b'Hello, TensorFlow!'
>>>
-
Run complete testsuite
cd $SOURCE_ROOT/tensorflow
bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test --test_tag_filters=-gpu,-benchmark-test,-v1only,-no_oss,-oss_serial -k --test_timeout 300,450,1200,3600 --build_tests_only --test_output=errors -- //tensorflow/... -//tensorflow/compiler/... -//tensorflow/lite/... -//tensorflow/core/platform/cloud/...
Note: //tensorflow/core/platform/cloud is skipped due to BoringSSL; refer to #14039 for details.
-
Run individual test
bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test //tensorflow/<module_name>:<testcase_name>
For example,
bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test //tensorflow/python/kernel_tests:topk_op_test
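The label pattern above can be wrapped in a small helper that assembles the command from a module path and a test case name. The function names `tf_test_label` and `tf_test_cmd` are hypothetical, and the sketch prints the command rather than running it.

```shell
# Build a Bazel test label from a module path (relative to //tensorflow)
# and a test case name.
tf_test_label() {
  echo "//tensorflow/$1:$2"
}

# Print the command for a single test using the JVM limits shown above.
tf_test_cmd() {
  echo "bazel --host_jvm_args=\"-Xms1024m\" --host_jvm_args=\"-Xmx2048m\" test $(tf_test_label "$1" "$2")"
}

# Reproduces the topk_op_test example from the text.
tf_test_cmd python/kernel_tests topk_op_test
```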
Note:
1. The following tests fail on both s390x and x86:
//tensorflow/python/distribute:checkpointing_test_tpu
//tensorflow/python/distribute:custom_training_loop_gradient_test_tpu
//tensorflow/python/distribute:metrics_v1_test_tpu
//tensorflow/python/distribute:moving_averages_test_tpu
//tensorflow/python/distribute:strategy_combinations_test_tpu
//tensorflow/python/distribute:tf_function_test_tpu
//tensorflow/python/distribute/integration_test:saved_model_test_tpu
//tensorflow/python/eager:remote_cloud_tpu_pod_test
//tensorflow/python/eager:remote_cloud_tpu_test
//tensorflow/python/keras/distribute:checkpointing_test_tpu
//tensorflow/python/keras/distribute:custom_training_loop_metrics_test_tpu
//tensorflow/python/keras/distribute:custom_training_loop_optimizer_test_tpu
//tensorflow/python/keras/distribute:keras_metrics_test_tpu
//tensorflow/python/keras/distribute:keras_models_test_tpu
//tensorflow/python/keras/layers/preprocessing:discretization_distribution_test_tpu
//tensorflow/python/keras/layers/preprocessing:hashing_distribution_test_tpu
//tensorflow/python/tpu:async_checkpoint_test
//tensorflow/python/distribute:strategy_common_test_tpu
//tensorflow/python/distribute:strategy_gather_test_tpu
//tensorflow/python/keras/distribute:keras_stateful_lstm_model_correctness_test_tpu
//tensorflow/python/keras/distribute:keras_utils_test_tpu
//tensorflow/python/tpu:tpu_embedding_v2_correctness_test
//tensorflow/python/tpu:tpu_embedding_v2_test
//tensorflow/python/keras/distribute:keras_save_load_test_tpu
//tensorflow/python/keras/distribute:saved_model_mixed_api_test_tpu
//tensorflow/python/keras/distribute:saved_model_save_load_test_tpu
//tensorflow/python/distribute:input_lib_type_spec_test_tpu
//tensorflow/python/keras/distribute:keras_embedding_model_correctness_test_tpu
//tensorflow/python/distribute:input_lib_test_tpu
//tensorflow/python/keras/distribute:ctl_correctness_test_tpu
//tensorflow/python/keras/distribute:keras_image_model_correctness_test_tpu
2. The following tests fail due to the lack of certain CPU operations in the SystemZ LLVM backend:
//tensorflow/python/keras/optimizer_v2:adam_test
//tensorflow/core/kernels/mlir_generated:abs_cpu_f16_f16_gen_test
//tensorflow/core/kernels/mlir_generated:abs_cpu_f32_f32_gen_test
//tensorflow/core/kernels/mlir_generated:abs_cpu_f64_f64_gen_test
//tensorflow/core/kernels/mlir_generated:abs_cpu_i16_i16_gen_test
//tensorflow/core/kernels/mlir_generated:abs_cpu_i32_i32_gen_test
//tensorflow/core/kernels/mlir_generated:abs_cpu_i64_i64_gen_test
//tensorflow/core/kernels/mlir_generated:abs_cpu_i8_i8_gen_test
//tensorflow/core/kernels/mlir_generated:add_v2_cpu_f16_f16_gen_test
//tensorflow/core/kernels/mlir_generated:add_v2_cpu_f32_f32_gen_test
//tensorflow/core/kernels/mlir_generated:add_v2_cpu_f64_f64_gen_test
//tensorflow/core/kernels/mlir_generated:add_v2_cpu_i32_i32_gen_test
//tensorflow/core/kernels/mlir_generated:add_v2_cpu_i64_i64_gen_test
//tensorflow/core/kernels/mlir_generated:cos_cpu_f16_f16_gen_test
//tensorflow/core/kernels/mlir_generated:cos_cpu_f32_f32_gen_test
//tensorflow/core/kernels/mlir_generated:cos_cpu_f64_f64_gen_test
//tensorflow/core/kernels/mlir_generated:rsqrt_cpu_f16_f16_gen_test
//tensorflow/core/kernels/mlir_generated:rsqrt_cpu_f32_f32_gen_test
//tensorflow/core/kernels/mlir_generated:rsqrt_cpu_f64_f64_gen_test
//tensorflow/core/kernels/mlir_generated:sin_cpu_f16_f16_gen_test
//tensorflow/core/kernels/mlir_generated:sin_cpu_f32_f32_gen_test
//tensorflow/core/kernels/mlir_generated:sin_cpu_f64_f64_gen_test
//tensorflow/core/kernels/mlir_generated:sqrt_cpu_f16_f16_gen_test
//tensorflow/core/kernels/mlir_generated:sqrt_cpu_f32_f32_gen_test
//tensorflow/core/kernels/mlir_generated:sqrt_cpu_f64_f64_gen_test
//tensorflow/core/kernels/mlir_generated:square_cpu_c128_c128_gen_test
//tensorflow/core/kernels/mlir_generated:square_cpu_c64_c64_gen_test
//tensorflow/core/kernels/mlir_generated:square_cpu_f16_f16_gen_test
//tensorflow/core/kernels/mlir_generated:square_cpu_f32_f32_gen_test
//tensorflow/core/kernels/mlir_generated:square_cpu_f64_f64_gen_test
//tensorflow/core/kernels/mlir_generated:square_cpu_i32_i32_gen_test
//tensorflow/core/kernels/mlir_generated:square_cpu_i64_i64_gen_test
//tensorflow/core/kernels/mlir_generated:tan_cpu_f16_f16_gen_test
//tensorflow/core/kernels/mlir_generated:tan_cpu_f32_f32_gen_test
//tensorflow/core/kernels/mlir_generated:tan_cpu_f64_f64_gen_test
3. The test cases listed below expect ICU encoding data to be present in big-endian format. If needed, this data can be generated manually on s390x so that the test cases pass:
//tensorflow/python/kernel_tests:unicode_decode_op_test
//tensorflow/python/kernel_tests:unicode_transcode_op_test
4. Test case //tensorflow/tools/docs:tf_doctest fails on s390x due to a precision issue in a third-party function implementation. The failure can be safely ignored.
5. Test case //tensorflow/python/kernel_tests/linalg/linear_operator_circulant_test may fail due to a tolerance threshold issue. It passes after applying the following patch:
diff --git a/tensorflow/python/kernel_tests/linalg/linear_operator_circulant_test.py b/tensorflow/python/kernel_tests/linalg/linear_operator_circulant_test.py
index 1d3313d6504..302c4ff57e3 100644
--- a/tensorflow/python/kernel_tests/linalg/linear_operator_circulant_test.py
+++ b/tensorflow/python/kernel_tests/linalg/linear_operator_circulant_test.py
@@ -627,7 +627,7 @@ class LinearOperatorCirculant2DTestNonHermitianSpectrum(
           [matrix_tensor, matrix_t, imag_matrix])

       np.testing.assert_allclose(0, imag_matrix, atol=1e-6)
-      self.assertAllClose(matrix, matrix_transpose, atol=0)
+      self.assertAllClose(matrix, matrix_transpose, atol=1e-6)

   def test_real_spectrum_gives_self_adjoint_operator(self):
     with self.cached_session():
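The known failures listed in the notes above can be kept out of a full test run with negative target patterns, the same mechanism the complete testsuite command uses for //tensorflow/compiler/... and the other excluded packages. The sketch below turns a list of labels into such patterns. The helper name `exclusion_patterns` is hypothetical.

```shell
# Read Bazel labels (one per line) on stdin and print them as negative
# target patterns on one line, skipping blank lines and "#" comments.
exclusion_patterns() {
  out=""
  while IFS= read -r label; do
    case "$label" in
      ""|"#"*) continue ;;
    esac
    out="$out -$label"
  done
  # Drop the leading space added before the first pattern.
  echo "${out# }"
}

# Two labels taken from the notes above, as an illustration.
printf '%s\n' \
  "//tensorflow/python/keras/optimizer_v2:adam_test" \
  "//tensorflow/tools/docs:tf_doctest" | exclusion_patterns
```

The printed patterns would be appended after the existing exclusions, that is, after the `--` separator in the complete testsuite command.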
The information provided in this article is accurate at the time of writing, but ongoing development in the open-source projects involved may make the information incorrect or obsolete. Please open an issue or contact us on IBM Z Community if you have any questions or feedback.