Skip to content

Building TensorFlow

aborkar-ibm edited this page Nov 12, 2019 · 44 revisions

Building TensorFlow

The instructions provided below specify the steps to build TensorFlow version 1.15.0 on Linux on IBM Z for the following distributions:

  • Ubuntu (16.04, 18.04, 19.04)

General Notes:

  • When following the steps below please use a standard permission user unless otherwise specified.
  • A directory /<source_root>/ will be referred to in these instructions, this is a temporary writable directory anywhere you'd like to place it.

Step 1: Building and Installing TensorFlow v1.15.0

1.1) Build using script

If you want to build Tensorflow using manual steps, go to STEP 1.2.

Use the following commands to build Tensorflow using the build script. Please make sure you have wget installed.

wget -q https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Tensorflow/1.15.0/build_tensorflow.sh

# Build Tensorflow
bash build_tensorflow.sh    [Provide -t option for executing build with tests]

If the build completes successfully, go to STEP 2. In case of error, check logs for more details or go to STEP 1.2 to follow manual build steps.

1.2) Install the dependencies

export SOURCE_ROOT=/<source_root>/
  • Ubuntu 16.04
sudo apt-get update  
sudo apt-get install -y pkg-config zip g++ zlib1g-dev unzip git vim tar wget automake autoconf libtool make curl maven python3-pip python3-virtualenv python3-numpy swig python3-dev libcurl3-dev python3-mock python3-scipy bzip2 libhdf5-dev patch git patch libssl-dev  
sudo apt-get install --no-install-recommends python3-sklearn  
sudo pip3 install numpy==1.16.2 future wheel backports.weakref portpicker futures==2.2.0 enum34 keras_preprocessing keras_applications h5py tensorflow_estimator==1.15.1

#Create symlink python from python3
sudo ln -s /usr/bin/python3 /usr/bin/python
  • (For Ubuntu 16.04) Download and install AdoptOpenJDK (OpenJDK 11 with HotSpot) from here
export JAVA_HOME=/<path to JDK>/
export PATH=$JAVA_HOME/bin:$PATH
  • Ubuntu (18.04, 19.04)
 sudo apt-get update  
 sudo apt-get install -y pkg-config zip g++ zlib1g-dev unzip git vim tar wget automake autoconf libtool make curl maven openjdk-11-jdk python3-pip python3-virtualenv python3-numpy swig python3-dev libcurl3-dev python3-mock python3-scipy bzip2 libhdf5-dev patch git patch libssl-dev  
 sudo apt-get install --no-install-recommends python3-sklearn  
 sudo pip3 install numpy==1.16.2 future wheel backports.weakref portpicker futures enum34 keras_preprocessing keras_applications h5py tensorflow_estimator==1.15.1
 
 #Create symlink python from python3
 sudo ln -s /usr/bin/python3 /usr/bin/python

Note: Make sure python -V shows 3.x version.

  • Install grpcio
 export GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=True
 sudo -E pip3 install grpcio==1.24.3  # Ubuntu 16.04
 sudo -E pip3 install grpcio          # Ubuntu 18.04
 sudo -E GRPC_PYTHON_LDFLAGS="" pip3 install grpcio # Ubuntu 19.04

1.3) Build Bazel

  • Download Bazel

     cd $SOURCE_ROOT   
     mkdir bazel && cd bazel  
     wget https://github.com/bazelbuild/bazel/releases/download/0.26.1/bazel-0.26.1-dist.zip
     unzip bazel-0.26.1-dist.zip 
     chmod -R +w .
  • Build Bazel

    env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh
    export PATH=$PATH:$SOURCE_ROOT/bazel/output/  

    Note: While building Bazel, if build fails with an error java.lang.OutOfMemoryError: Java heap space, apply below patch and rebuild Bazel.

    • Create a patch file $SOURCE_ROOT/bazel/scripts/bootstrap/patch_compile.diff with the following contents:

      --- compile.sh.orig     2019-11-01 03:10:36.253437981 -0400
      +++ compile.sh  2019-11-01 03:10:59.793427971 -0400
      @@ -127,7 +127,7 @@
      	# Useful if your system chooses too small of a max heap for javac.
      	# We intentionally rely on shell word splitting to allow multiple
      	# additional arguments to be passed to javac.
      -  run "${JAVAC}" -classpath "${classpath}" -sourcepath "${sourcepath}" \
      +  run "${JAVAC}" -J-Xms1g -J-Xmx1g -classpath "${classpath}" -sourcepath "${sourcepath}" \
        	 -d "${output}/classes" -source "$JAVA_VERSION" -target "$JAVA_VERSION" \
       	 -encoding UTF-8 ${BAZEL_JAVAC_OPTS} "@${paramfile}"
    • Apply the patch file

      cd $SOURCE_ROOT/bazel/scripts/bootstrap/
      patch --ignore-whitespace compile.sh < patch_compile.diff 
    • Rebuild

      cd $SOURCE_ROOT/bazel/
      env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh
      export PATH=$PATH:$SOURCE_ROOT/bazel/output/

1.4) Build TensorFlow

  • Download source code

    cd $SOURCE_ROOT
    git clone https://github.com/linux-on-ibm-z/tensorflow.git
    cd tensorflow
    git checkout v1.15.0-s390x
  • Configure

    ./configure
    Extracting Bazel installation...
    WARNING: An illegal reflective access operation has occurred
    WARNING: Illegal reflective access by com.google.protobuf.UnsafeUtil to field java.nio.Buffer.address
    WARNING: Please consider reporting this to the maintainers of com.google.protobuf.UnsafeUtil
    WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
    WARNING: All illegal access operations will be denied in a future release
    WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
    You have bazel 0.26.1- (@non-git) installed.
    Please specify the location of python. [Default is /usr/bin/python]:
    
    
    Found possible Python library paths:
      /usr/lib/python3/dist-packages
      /usr/local/lib/python3.7/dist-packages
    Please input the desired Python library path to use.  Default is [/usr/lib/python3/dist-packages]
    
    Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
    No XLA JIT support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
    No OpenCL SYCL support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with ROCm support? [y/N]: N
    No ROCm support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with CUDA support? [y/N]: N
    No CUDA support will be enabled for TensorFlow.
    
    Do you wish to download a fresh release of clang? (Experimental) [y/N]: N
    Clang will not be downloaded.
    
    Do you wish to build TensorFlow with MPI support? [y/N]: N
    No MPI support will be enabled for TensorFlow.
    
    Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
    
    
    Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: N
    Not configuring the WORKSPACE for Android builds.
    
    Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
            --config=mkl            # Build with MKL support.
            --config=monolithic     # Config for mostly static monolithic build.
            --config=gdr            # Build with GDR support.
            --config=verbs          # Build with libverbs support.
            --config=ngraph         # Build with Intel nGraph support.
            --config=numa           # Build with NUMA support.
            --config=dynamic_kernels        # (Experimental) Build kernels into separate shared objects.
            --config=v2             # Build TensorFlow 2.x instead of 1.x.
    Preconfigured Bazel build configs to DISABLE default on features:
            --config=noaws          # Disable AWS S3 filesystem support.
            --config=nogcp          # Disable GCP support.
            --config=nohdfs         # Disable HDFS support.
            --config=noignite       # Disable Apache Ignite support.
            --config=nokafka        # Disable Apache Kafka support.
            --config=nonccl         # Disable NVIDIA NCCL support.
    Configuration finished
  • Build Tensorflow

    bazel --host_jvm_args="-Xms512m" --host_jvm_args="-Xmx1024m" build --define=tensorflow_mkldnn_contraction_kernel=0 //tensorflow/tools/pip_package:build_pip_package

1.5) Build and install TensorFlow wheel

cd $SOURCE_ROOT/tensorflow
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_wheel
sudo pip3 install /tmp/tensorflow_wheel/tensorflow-1.15.0-cp*-linux_s390x.whl

Step 2: Verify TensorFlow (Optional)

  • Run TensorFlow from command Line

     $ cd $SOURCE_ROOT
     $ /usr/bin/python3
      >>> import tensorflow as tf
      >>> tf.enable_eager_execution()
      >>> tf.add(1, 2).numpy()
      3
      >>> hello = tf.constant('Hello, TensorFlow!')
      >>> hello.numpy()
      'Hello, TensorFlow!'
      >>>  

Step 3: Execute Test Suite (Optional)

  • Run complete testsuite

    cd $SOURCE_ROOT/tensorflow
    bazel --host_jvm_args="-Xms512m" --host_jvm_args="-Xmx1024m" test --define=tensorflow_mkldnn_contraction_kernel=0 --host_javabase="@local_jdk//:jdk" --test_tag_filters=-gpu,-benchmark-test -k   --test_timeout 300,450,1200,3600 --build_tests_only --test_output=errors -- //tensorflow/... -//tensorflow/compiler/... -//tensorflow/lite/... -//tensorflow/core/platform/cloud/... -//tensorflow/java/...

    Note: Skipping some test modules due to an issue related to boringssl : #error Unknown target CPU #14039 as well as an issue related to java : Building Java resource jar failed #19770.

    Note: In case of test fails with '_NamespacePath' object has no attribute 'sort' error, upgrade setuptools dependency with below command and rerun the test:

    sudo pip3 install --upgrade setuptools
    
  • Run individual test

    cd $SOURCE_ROOT/tensorflow
    
    bazel --host_jvm_args="-Xms512m" --host_jvm_args="-Xmx1024m" test --define=tensorflow_mkldnn_contraction_kernel=0  --test_timeout 300,450,1200,3600 --host_javabase="@local_jdk//:jdk" //tensorflow/<module_name>:<testcase_name>

    For example,

    cd $SOURCE_ROOT/tensorflow
    
    bazel --host_jvm_args="-Xms512m" --host_jvm_args="-Xmx1024m" test --define=tensorflow_mkldnn_contraction_kernel=0  --test_timeout 300,450,1200,3600 --host_javabase="@local_jdk//:jdk" //tensorflow/python/kernel_tests:topk_op_test
    • Below tests fail on Z which belong to contrib folder and are removed from master:
      //tensorflow/contrib/compiler:xla_test
      //tensorflow/contrib/distributions:independent_test
      //tensorflow/contrib/distributions:wishart_test(for Ubuntu 19.04 only)
      //tensorflow/contrib/layers:layers_test
      //tensorflow/contrib/metrics:metric_ops_test
      /tensorflow/contrib/optimizer_v2:optimizer_v2_test(for Ubuntu 16.04 only)

    • Below tests fail on Z and investigation is in progress:
      //tensorflow/python/kernel_tests:decode_raw_op_test
      //tensorflow/python/kernel_tests:unicode_decode_op_test
      //tensorflow/python/kernel_tests:unicode_transcode_op_test
      //tensorflow/python:cluster_test
      //tensorflow/python:cost_analyzer_test

    • Below tests are failing on s390x and those are equivalent to Intel:
      //tensorflow/python/compiler/xla:xla_test
      //tensorflow/python/debug:dist_session_debug_grpc_test
      //tensorflow/python/distribute:values_test
      //tensorflow/python/eager:backprop_test(for Ubuntu 19.04 only)
      //tensorflow/python/eager:def_function_xla_jit_test
      //tensorflow/python/eager:def_function_xla_test_cpu
      //tensorflow/python/keras:callbacks_test
      //tensorflow/python/ops/parallel_for:xla_control_flow_ops_test
      //tensorflow/python/tpu:tpu_test
      //tensorflow/python/training/tracking:util_xla_test_cpu
      //tensorflow/go:test
      //tensorflow/tools/api/tests:api_compatibility_test(looks flaky)
      //tensorflow/python/autograph/pyct/static_analysis:activity_py3_test(for Ubuntu 16.04 only)
      //tensorflow/python/distribute:distribute_lib_test(for Ubuntu 16.04 only)
      //tensorflow/python/keras/optimizer_v2:optimizer_v2_test(for Ubuntu 16.04 only)
      //tensorflow/python/keras:base_layer_test(for Ubuntu 16.04 only)

References:

https://www.tensorflow.org/
https://github.com/tensorflow/tensorflow
http://bazel.io/

Clone this wiki locally