Setting up the system
The standalone version does not require a tensorflow environment. We wrote a simple tensor wrapper and buffer wrapper to mimic the behavior of tensorflow. To access the standalone version, first download our code:
$ git clone https://github.com/miglopst/cs263_spring2018.git
The code is located in the folder:
cs263_spring2018/tensorflow/tensorgc/
To compile the code, type make in the above folder. We can run two tests. To run the linear allocation test, open main.cc and comment out:
std::cout << "===start random initialization ===" << std::endl;
random_initialization_test();
std::cout << "===end random initialization ===" << std::endl;
To run the random allocation test, open main.cc and comment out:
std::cout << "===start linear initialization ===" << std::endl;
linear_initialization_test();
std::cout << "===end linear initialization ===" << std::endl;
To run the code, use:
$ ./tracing > output.log
To output debug information, use:
$ export DEBUG_FLAG=X
X indicates which part of TensorGC to debug: X=0 is main.cc; X=1 is tensor.cc; X=2 is buffer.cc; X=3 is roottracer.cc; X=4 is buftracer.cc.
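The flag-to-file mapping above can be captured in a small launcher. This is a minimal sketch, assuming the ./tracing binary has already been built with make in the current directory. The names DEBUG_TARGETS, debug_env, and run_tracing are hypothetical helpers for illustration and are not part of TensorGC.

```python
import os
import subprocess

# DEBUG_FLAG values and the source file each one enables debugging for,
# as listed in the text above.
DEBUG_TARGETS = {
    0: "main.cc",
    1: "tensor.cc",
    2: "buffer.cc",
    3: "roottracer.cc",
    4: "buftracer.cc",
}


def debug_env(flag, base_env=None):
    """Return a copy of the environment with DEBUG_FLAG set to `flag`."""
    if flag not in DEBUG_TARGETS:
        raise ValueError(f"DEBUG_FLAG must be one of {sorted(DEBUG_TARGETS)}")
    env = dict(os.environ if base_env is None else base_env)
    env["DEBUG_FLAG"] = str(flag)
    return env


def run_tracing(flag, binary="./tracing", log_path="output.log"):
    """Run the standalone binary with DEBUG_FLAG set; stdout goes to log_path."""
    with open(log_path, "w") as log:
        return subprocess.run([binary], env=debug_env(flag), stdout=log).returncode
```

Under these assumptions, run_tracing(3) should behave like export DEBUG_FLAG=3 followed by ./tracing > output.log, with debugging enabled for roottracer.cc.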
To run the integrated TensorGC code, you need to download and configure our docker image, then download and build our github code. You also need a GPU with a compute capability supported by tensorflow. Please check tensorflow's website for more details.
We use a docker image for ubuntu 16.04 and CUDA 9. To get our docker image, run:
$ docker pull gupeng/tensorflow1-7
You can check the downloaded image name by:
$ docker images
You can check container name by:
$ docker container ps
Create a working directory in your current OS and download our TensorGC code (integrated with tensorflow):
$ git clone https://github.com/miglopst/cs263_spring2018.git
Then start the docker container (note that the working directory in your current OS should already contain the tensorflow code):
$ nvidia-docker run -it -v [working directory in your current OS]:[target directory in the docker container] [image name]
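To make the placeholders concrete, the sketch below assembles the same nvidia-docker invocation from its three parts. The function name nvidia_docker_command and the two directory paths are hypothetical and chosen only for illustration. The image name is the one pulled earlier.

```python
import shlex


def nvidia_docker_command(host_dir, container_dir, image):
    """Build the argument list for the nvidia-docker invocation shown above."""
    return [
        "nvidia-docker", "run", "-it",
        "-v", f"{host_dir}:{container_dir}",  # mount host_dir at container_dir
        image,
    ]


# Hypothetical host and container paths; substitute your own.
cmd = nvidia_docker_command("/home/user/work", "/workspace", "gupeng/tensorflow1-7")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run directly.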
Once the container is running, we build tensorflow with TensorGC. Go to the root directory of the downloaded github code (the target directory in the docker container) and configure tensorflow using:
$ ./configure
During configuration, we disable all optional settings except CUDA support. After configuration, we can build tensorflow using bazel:
$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
We can also build a debug version which has more debugging information:
$ bazel build --config=cuda --compilation_mode=dbg --strip=never //tensorflow/tools/pip_package:build_pip_package
Then we build the package:
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
Before installing the tensorflow package, we should uninstall the old version:
$ pip uninstall tensorflow
Finally, we install the tensorflow package:
$ pip install /tmp/tensorflow_pkg/[THE NEWEST BUILT TENSORFLOW PACKAGE]
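One way to find the newest built package is to pick the most recently modified wheel in the output directory. The following is a minimal sketch, assuming build_pip_package writes .whl files into /tmp/tensorflow_pkg. The helper name newest_wheel is hypothetical.

```python
import glob
import os


def newest_wheel(pkg_dir="/tmp/tensorflow_pkg"):
    """Return the most recently modified .whl file in pkg_dir, or None if empty."""
    wheels = glob.glob(os.path.join(pkg_dir, "*.whl"))
    return max(wheels, key=os.path.getmtime) if wheels else None
```

The returned path can then be passed to pip install in place of the bracketed placeholder.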
We provide four benchmarks:
- LeNet on MNIST, exploring the batch size:
python tf_example/tuturials/mnist/mnist_deep.py --batch_size [128/256/512/1024]
- 4-layer DNN on MNIST, exploring the batch size:
python tf_example/tuturials/mnist/mnist_deep.py --batch_size [128/256/512/1024]
- resnet on MNIST:
python tf_example/tuturials/mnist_ensemble/train.py --model_name [resnet]
- vggnet on MNIST:
python tf_example/tuturials/mnist_ensemble/train.py --model_name [vggnet]
- To try a different GC threshold, you currently have to modify the source code. Set the GC threshold on line 679 of tf_core/framework/tensor.cc:
BufTracer<TensorBuffer> TensorBuffer::buf_tracer = BufTracer<TensorBuffer>(1000*1024*1024);
Currently the argument is left blank, so the default value from the BufTracer constructor is used (see tf_core/tensor_gc/buf_tracer.h). After modifying the value, you need to rebuild the project before running.
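The batch-size sweep from the benchmark list above can be scripted rather than typed by hand. This is a minimal sketch that only builds the command lines. The helper name batch_sweep_commands is hypothetical, and the script path is copied from the benchmark list as written.

```python
import sys

# Batch sizes listed for the MNIST benchmarks above.
BATCH_SIZES = [128, 256, 512, 1024]


def batch_sweep_commands(script="tf_example/tuturials/mnist/mnist_deep.py",
                         python=sys.executable):
    """Return one command (as an argument list) per batch size."""
    return [[python, script, "--batch_size", str(b)] for b in BATCH_SIZES]
```

Each returned list can be run with subprocess.run(cmd, check=True). Because the GC threshold is compiled in, a sweep over thresholds would still need one rebuild per value.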
We provide two ways to gather debugging information. The first uses std::cout to print information to stdout, and the second uses LOG(ERROR) to print information to stderr.
To collect debugging information, run an example in tf_example/tutorials/mnist:
python mnist_deep.py > log.txt 2>err.txt
Here log.txt contains the stdout output and err.txt contains the stderr output.
- To process log.txt, run profile_log.py.
- To process err.txt, run profile_err.py.
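The stdout/stderr split used above can also be done from Python, which may be convenient when collecting logs for several runs. This is a minimal sketch using only the standard library. The helper name run_and_split is hypothetical.

```python
import subprocess
import sys


def run_and_split(cmd, out_path="log.txt", err_path="err.txt"):
    """Run cmd, sending stdout to out_path and stderr to err_path."""
    with open(out_path, "w") as out, open(err_path, "w") as err:
        return subprocess.run(cmd, stdout=out, stderr=err).returncode
```

For example, run_and_split([sys.executable, "mnist_deep.py"]) should produce the same two files as python mnist_deep.py > log.txt 2>err.txt.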