This repository was archived by the owner on Apr 28, 2023. It is now read-only.

Commit b72b7f1

Redo the build system
Given the recent developments in the PyTorch world and the complexity now involved in building ATen, it has become counter-productive to build third-party dependencies from scratch. Instead, we use conda to get the dependencies for ATen, Caffe2, Halide, Tapir and PyTorch. ISL is built from source from the CMakeLists.txt. Google libraries are built and linked statically from a third-party repository that I created. They are not expected to change, possibly ever.

Static linking of third-party libraries is becoming the de facto standard in the conda world. After discussions with @soumith and @Yangqing it seems reasonable for us to follow the same path: avoid dynamic linking of system libraries. Note that in the upcoming incarnation of PyTorch ***even the nvidia libraries will be statically linked***. This will avoid extremely annoying issues of library conflicts and mismatches between various components. For us it means that in the near future we will be able to rely on a fully packaged ATen and Caffe2 that will not bring shared libraries into the conda environment.

For now, however, this most recent mode is not yet functional and we still build a caffe2 conda package on our own. Additionally, when installing PyTorch we need to manually run `conda remove cudatoolkit --force` to avoid picking up multiple cuda libraries from various places. There is no good foolproof solution other than our dependencies statically linking the world.

On the positive side this commit improves the state of our world as follows:
1. Single unified build for dev and deployment to python packages
2. build.sh only does one thing (it is still useful to perform some sanity checks but it only ends up calling cmake for tc)
3. Simplified everything (build, docker files, conda packages, dependencies management)
4. Anyone should be able to build the TC master easily and use it from C++ or python
5. Build instructions are simplified from 4 TL;DR pages to a single short one
6. Building from scratch is significantly faster
7. Docker and CI become much simpler and faster
8. No more caches in build.sh

The tradeoffs are:
1. we don't bother with the gcc pre/post 5 ABI issues, we go for GCC5
2. we only officially support Ubuntu16.04 for now with GCC5.4 and Cuda9.0/Cudnn7.1 (other distributions may work too but the compiler/cuda versions are set in stone for now)
3. core team has to use conda

Also, redo Docker + Contbuild

CircleCI builds without cuda even though it does pull the cuda dependencies for pytorch and caffe2. This can be fixed in the future once we have the unified pytorch/caffe2 build system and conda packages made available to us. CircleCI does not have GPUs and cannot **download them from the ether**; as a consequence we can't test the python installation here (our python bindings require cuda atm) and we only run CPU tests.

The Jenkins build has been reduced to running on a single image. The process is simplified by no longer building a docker image, saving it and downloading it again. Instead we just pull an image from dockerhub, update the conda dependencies, then compile and run tests.
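The "multiple cuda libraries from various places" problem mentioned above can be checked for with a few lines of shell. The following is a minimal sketch, not something this commit contains: `lib_dirs` and `has_duplicate_lib` are hypothetical helper names, and the paths in the commented example call are assumptions about a typical conda plus system CUDA setup.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of this commit.

# List the distinct directories under the given roots that contain a
# library whose file name matches the glob pattern in $1.
lib_dirs() {
  local pattern="$1"
  shift
  # Errors for missing roots are silenced so optional locations can be passed.
  find "$@" -name "$pattern" -exec dirname {} \; 2>/dev/null | sort -u
}

# Succeeds when the library shows up in more than one directory, which is
# the situation that can lead to mismatched copies being loaded.
has_duplicate_lib() {
  local count
  count=$(lib_dirs "$@" | wc -l | tr -d ' ')
  [ "$count" -gt 1 ]
}

# Example call (paths are assumptions, adjust to the local setup):
# has_duplicate_lib 'libcudart.so*' "${CONDA_PREFIX}/lib" /usr/local/cuda/lib64 \
#   && echo "more than one CUDA runtime found"
```

If the check reports duplicates inside the conda environment, the manual `conda remove cudatoolkit --force` step described above is the remedy this commit message suggests.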
1 parent 34542a5 commit b72b7f1

94 files changed, +1176 -4059 lines changed

.circleci/config.yml

Lines changed: 14 additions & 208 deletions
@@ -1,247 +1,53 @@
 version: 2
 jobs:
-  "build-1404":
+  "build-1604":
     working_directory: ~/TensorComprehensions
     resource_class: xlarge
     docker:
-      - image: tensorcomprehensions/linux-trusty-gcc4.9-cuda8-cudnn7-py3-conda:1
+      - image: tensorcomprehensions/tc-cuda9.0-cudnn7.1-ubuntu16.04-devel

     steps:
       - checkout
       - run:
-          name: check_formatting
-          command: |
-            cd ~/TensorComprehensions
-            CLANG=/usr/local/clang+llvm-tapir5.0/bin/clang-format ./check_format.sh
-
-      - run:
-          name: submodules
-          command: |
-            git submodule sync
-            git submodule update --init --recursive
-
-      - restore_cache:
-          keys:
-            - v2-caffe2-{{ checksum ".git/modules/third-party/caffe2/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1404
-
-      - restore_cache:
-          keys:
-            - v1-aten-{{ checksum ".git/modules/third-party/pytorch/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1404
-
-      - restore_cache:
-          keys:
-            - v1-isl-{{ checksum ".git/modules/third-party/islpp/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1404
-
-      - restore_cache:
-          keys:
-            - v1-halide-{{ checksum ".git/modules/third-party/halide/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1404
-
-      - run:
-          name: build
-          command: |
-            cd ~/TensorComprehensions
-            export TC_DIR=$(pwd)
-            VERBOSE=1 USE_CONTBUILD_CACHE=1 CORES=16 CMAKE_VERSION="cmake" ATEN_NO_CUDA=0 CLANG_PREFIX="`/usr/local/clang+llvm-tapir5.0/bin/llvm-config --prefix`" ./build.sh --all
-
-      - save_cache:
-          key: v2-caffe2-{{ checksum ".git/modules/third-party/caffe2/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1404
-          paths:
-            - third-party-install/bin/convert_caffe_image_db
-            - third-party-install/bin/convert_db
-            - third-party-install/bin/db_throughput
-            - third-party-install/bin/make_cifar_db
-            - third-party-install/bin/make_mnist_db
-            - third-party-install/bin/predictor_verifier
-            - third-party-install/bin/print_registered_core_operators
-            - third-party-install/bin/run_plan
-            - third-party-install/bin/speed_benchmark
-            - third-party-install/bin/split_db
-            - third-party-install/bin/inspect_gpus
-            - third-party-install/bin/print_core_object_sizes
-            - third-party-install/bin/tutorial_blob
-            - third-party-install/caffe
-            - third-party-install/caffe2
-            - third-party-install/include/caffe
-            - third-party-install/include/caffe2
-            - third-party-install/lib/libcaffe2.so
-            - third-party-install/lib/libcaffe2_gpu.so
-            - third-party/caffe2/build_host_protoc
-
-      - save_cache:
-          key: v1-aten-{{ checksum ".git/modules/third-party/pytorch/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1404
-          paths:
-            - third-party-install/share/ATen
-            - third-party-install/include/ATen
-            - third-party-install/include/TH
-            - third-party-install/include/THC
-            - third-party-install/include/THCS
-            - third-party-install/include/THCUNN
-            - third-party-install/include/THNN
-            - third-party-install/include/THS
-            - third-party-install/include/cpuinfo.h
-            - third-party-install/lib/libATen.so
-            - third-party/pytorch/aten/build/src/ATen/test/
-
-      - save_cache:
-          key: v1-isl-{{ checksum ".git/modules/third-party/islpp/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1404
-          paths:
-            - third-party-install/include/isl
-            - third-party-install/lib/libisl.so
-            - third-party-install/lib/libisl-static.a
-            - third-party/islpp/build/isl_test
-            - third-party/islpp/build/isl_test_int
-
-      - save_cache:
-          key: v1-halide-{{ checksum ".git/modules/third-party/halide/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1404
-          paths:
-            - third-party-install/include/HalideBuffer.h
-            - third-party-install/include/Halide.h
-            - third-party-install/include/HalideRuntimeCuda.h
-            - third-party-install/include/HalideRuntime.h
-            - third-party-install/include/HalideRuntimeHexagonHost.h
-            - third-party-install/include/HalideRuntimeMetal.h
-            - third-party-install/include/HalideRuntimeOpenCL.h
-            - third-party-install/include/HalideRuntimeOpenGLCompute.h
-            - third-party-install/include/HalideRuntimeOpenGL.h
-            - third-party-install/include/HalideRuntimeQurt.h
-            - third-party-install/lib/libHalide.so
-            - third-party-install/lib/libHalide.a
-
-      - run:
-          name: test_isl
+          name: conda_tapir_halide
           command: |
-            cd ~/TensorComprehensions
-            LD_PRELOAD=$(pwd)/third-party-install/lib/libisl.so ./third-party/islpp/build/isl_test
-            LD_PRELOAD=$(pwd)/third-party-install/lib/libisl.so ./third-party/islpp/build/isl_test_int
-
-      - run:
-          name: test_cpu
-          command: |
-            cd ~/TensorComprehensions
-            ./test_cpu.sh
+            . /opt/conda/anaconda/bin/activate
+            source activate tc_build
+            conda install -y -c nicolasvasilache llvm-tapir50 halide

-  "build-1604":
-    working_directory: ~/TensorComprehensions
-    resource_class: xlarge
-    docker:
-      - image: tensorcomprehensions/linux-xenial-gcc5-cuda9-cudnn7-py3:1
-
-    steps:
-      - checkout
       - run:
           name: check_formatting
           command: |
+            . /opt/conda/anaconda/bin/activate
+            source activate tc_build
             cd ~/TensorComprehensions
-            CLANG=/usr/local/clang+llvm-tapir5.0/bin/clang-format ./check_format.sh
+            CLANG=${CONDA_PREFIX}/bin/clang-format ./check_format.sh

       - run:
           name: submodules
           command: |
             git submodule sync
             git submodule update --init --recursive

-      - restore_cache:
-          keys:
-            - v2-caffe2-{{ checksum ".git/modules/third-party/caffe2/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1604
-
-      - restore_cache:
-          keys:
-            - v1-aten-{{ checksum ".git/modules/third-party/pytorch/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1604
-
-      - restore_cache:
-          keys:
-            - v1-isl-{{ checksum ".git/modules/third-party/islpp/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1604
-
-      - restore_cache:
-          keys:
-            - v1-halide-{{ checksum ".git/modules/third-party/halide/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1604
-
       - run:
           name: build
           command: |
+            . /opt/conda/anaconda/bin/activate
+            source activate tc_build
             cd ~/TensorComprehensions
             export TC_DIR=$(pwd)
-            VERBOSE=1 USE_CONTBUILD_CACHE=1 CORES=16 CMAKE_VERSION="cmake" ATEN_NO_CUDA=0 CLANG_PREFIX="`/usr/local/clang+llvm-tapir5.0/bin/llvm-config --prefix`" BUILD_TYPE=Release ./build.sh --all
-
-      - save_cache:
-          key: v2-caffe2-{{ checksum ".git/modules/third-party/caffe2/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1604
-          paths:
-            - third-party-install/bin/convert_caffe_image_db
-            - third-party-install/bin/convert_db
-            - third-party-install/bin/db_throughput
-            - third-party-install/bin/make_cifar_db
-            - third-party-install/bin/make_mnist_db
-            - third-party-install/bin/predictor_verifier
-            - third-party-install/bin/print_registered_core_operators
-            - third-party-install/bin/run_plan
-            - third-party-install/bin/speed_benchmark
-            - third-party-install/bin/split_db
-            - third-party-install/bin/inspect_gpus
-            - third-party-install/bin/print_core_object_sizes
-            - third-party-install/bin/tutorial_blob
-            - third-party-install/caffe
-            - third-party-install/caffe2
-            - third-party-install/include/caffe
-            - third-party-install/include/caffe2
-            - third-party-install/lib/libcaffe2.so
-            - third-party-install/lib/libcaffe2_gpu.so
-
-      - save_cache:
-          key: v1-aten-{{ checksum ".git/modules/third-party/pytorch/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1604
-          paths:
-            - third-party-install/share/ATen
-            - third-party-install/include/ATen
-            - third-party-install/include/TH
-            - third-party-install/include/THC
-            - third-party-install/include/THCS
-            - third-party-install/include/THCUNN
-            - third-party-install/include/THNN
-            - third-party-install/include/THS
-            - third-party-install/include/cpuinfo.h
-            - third-party-install/lib/libATen.so
-            - third-party/pytorch/aten/build/src/ATen/test/
-
-      - save_cache:
-          key: v1-isl-{{ checksum ".git/modules/third-party/islpp/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1604
-          paths:
-            - third-party-install/include/isl
-            - third-party-install/lib/libisl.so
-            - third-party-install/lib/libisl-static.a
-            - third-party/islpp/build/isl_test
-            - third-party/islpp/build/isl_test_int
-
-      - save_cache:
-          key: v1-halide-{{ checksum ".git/modules/third-party/halide/HEAD" }}-{{ checksum "build.sh" }}-{{ checksum ".circleci/config.yml" }}-{{ arch }}-1604
-          paths:
-            - third-party-install/include/HalideBuffer.h
-            - third-party-install/include/Halide.h
-            - third-party-install/include/HalideRuntimeCuda.h
-            - third-party-install/include/HalideRuntime.h
-            - third-party-install/include/HalideRuntimeHexagonHost.h
-            - third-party-install/include/HalideRuntimeMetal.h
-            - third-party-install/include/HalideRuntimeOpenCL.h
-            - third-party-install/include/HalideRuntimeOpenGLCompute.h
-            - third-party-install/include/HalideRuntimeOpenGL.h
-            - third-party-install/include/HalideRuntimeQurt.h
-            - third-party-install/lib/libHalide.so
-            - third-party-install/lib/libHalide.a
-
-      - run:
-          name: test_isl
-          command: |
-            cd ~/TensorComprehensions
-            LD_PRELOAD=$(pwd)/third-party-install/lib/libisl.so ./third-party/islpp/build/isl_test
-            LD_PRELOAD=$(pwd)/third-party-install/lib/libisl.so ./third-party/islpp/build/isl_test_int
+            VERBOSE=1 WITH_CUDA=OFF CLANG_PREFIX="`${CONDA_PREFIX}/bin/llvm-config --prefix`" BUILD_TYPE=Release ./build.sh

       - run:
           name: test_cpu
           command: |
+            . /opt/conda/anaconda/bin/activate
+            source activate tc_build
             cd ~/TensorComprehensions
             ./test_cpu.sh

 workflows:
   version: 2
   build:
     jobs:
-      - "build-1404"
       - "build-1604"
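The new config repeats `. /opt/conda/anaconda/bin/activate` and `source activate tc_build` at the top of every `run` step. A likely reason, assuming CircleCI's usual behaviour of starting a fresh shell for each step, is that an activated environment does not carry over from one step to the next. The sketch below imitates that locally: each `bash -c` stands in for one step, and `TC_ENV` is a made-up variable used only for illustration.

```shell
#!/usr/bin/env bash
# Each `bash -c` below stands in for one CI `run` step (an analogy, not CircleCI itself).
unset TC_ENV  # make sure nothing leaks in from the calling shell

# "Step 1" exports a variable, much like activating an environment would.
step1=$(bash -c 'export TC_ENV=tc_build; echo "step 1 sees ${TC_ENV:-nothing}"')

# "Step 2" is a separate process and starts without that variable.
step2=$(bash -c 'echo "step 2 sees ${TC_ENV:-nothing}"')

echo "$step1"   # step 1 sees tc_build
echo "$step2"   # step 2 sees nothing
```

This is why a step that relies on `${CONDA_PREFIX}`, such as `check_formatting` or `build`, has to activate the environment itself.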

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -18,7 +18,9 @@ conda
 tensor_comprehensions.egg-info/
 tensor_comprehensions/version.py
 tensor_comprehensions/*.proto
+tensor_comprehensions/*_pb2.py
 slurm-*
 examples/results*
 *.pyc
 test_python/tc_test/*
+install

.gitmodules

Lines changed: 3 additions & 20 deletions
@@ -7,31 +7,14 @@
 	path = third-party/cub
 	url = https://github.com/nicolasvasilache/cub.git
 	branch = nvrtc-cub
-[submodule "third-party/halide"]
-	path = third-party/halide
-	url = https://github.com/halide/Halide.git
-
-# Temporary hack
-[submodule "third-party/caffe2"]
-	path = third-party/caffe2
-	url = https://github.com/skimo-openhub/caffe2.git

 # Mainstream branches, we don't modify those
-[submodule "third-party/gflags"]
-	path = third-party/gflags
-	url = https://github.com/gflags/gflags
-[submodule "third-party/glog"]
-	path = third-party/glog
-	url = https://github.com/google/glog.git
 [submodule "third-party/dlpack"]
 	path = third-party/dlpack
 	url = https://github.com/dmlc/dlpack.git
-[submodule "third-party/googletest"]
-	path = third-party/googletest
-	url = https://github.com/google/googletest.git
 [submodule "third-party/pybind11"]
 	path = third-party/pybind11
 	url = https://github.com/pybind/pybind11.git
-[submodule "third-party/pytorch"]
-	path = third-party/pytorch
-	url = https://github.com/pytorch/pytorch.git
+[submodule "third-party/googlelibraries"]
+	path = third-party/googlelibraries
+	url = https://github.com/nicolasvasilache/googlelibraries.git
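One way to confirm where a submodule entry points is to let git parse the file, since `.gitmodules` uses git's own config syntax. The sketch below is only an illustration: it copies the new `googlelibraries` entry into a scratch file so it can run outside the repository, whereas in a real checkout the `-f` argument would simply be `.gitmodules`.

```shell
#!/usr/bin/env bash
# Write a copy of the new entry to a scratch file so the example is self-contained.
gitmodules=$(mktemp)
cat > "$gitmodules" <<'EOF'
[submodule "third-party/googlelibraries"]
    path = third-party/googlelibraries
    url = https://github.com/nicolasvasilache/googlelibraries.git
EOF

# `git config -f <file>` reads any file written in git's config format.
# The key is <section>.<subsection>.<name>.
url=$(git config -f "$gitmodules" --get 'submodule.third-party/googlelibraries.url')
echo "$url"

rm -f "$gitmodules"
```

After the submodule list changes like this, `git submodule sync` followed by `git submodule update --init --recursive`, as the CI scripts in this commit do, brings an existing checkout in line.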

.jenkins/build.sh

Lines changed: 18 additions & 41 deletions
@@ -17,10 +17,6 @@ set -ex

 source /etc/lsb-release

-# condition: if 14.04 and conda, conda install pytorch and build
-# condition: if 16.04 and conda, conda install pytorch and build
-# condition: if any and non-conda, simply build TC from scratch
-
 # note: printf is used instead of echo to avoid backslash
 # processing and to properly handle values that begin with a '-'.
 echo "ENTERED_USER_LAND"
@@ -55,42 +51,23 @@ declare -f -t trap_add

 trap_add cleanup EXIT

-if which ccache > /dev/null; then
-  # Report ccache stats for easier debugging
-  ccache --zero-stats
-  ccache --show-stats
-  function ccache_epilogue() {
-    ccache --show-stats
-  }
-  trap_add ccache_epilogue EXIT
-fi
+# Check we indeed have GPUs and list them in the log file
+nvidia-smi

-if [[ "$DISTRIB_RELEASE" == 14.04 ]]; then
-  if [[ $(conda --version | wc -c) -ne 0 ]]; then
-    echo "Building TC in conda env"
-    conda create -y --name tc-env python=3.6
-    source activate tc-env
-    conda install -y pyyaml mkl-include
-    conda install -yc conda-forge pytest
-    conda install -y pytorch -c pytorch
-    WITH_PYTHON_C2=OFF CORES=$(nproc) CLANG_PREFIX=/usr/local/clang+llvm-tapir5.0 BUILD_TYPE=Release ./build.sh --all
-  else
-    echo "Building TC in non-conda env"
-    WITH_PYTHON_C2=OFF CORES=$(nproc) CLANG_PREFIX=/usr/local/clang+llvm-tapir5.0 BUILD_TYPE=Release ./build.sh --all
-  fi
-fi
+# Just install missing conda dependencies, build and run tests
+cd /var/lib/jenkins/workspace
+. /opt/conda/anaconda/bin/activate
+git submodule update --init --recursive

-if [[ "$DISTRIB_RELEASE" == 16.04 ]]; then
-  if [[ $(conda --version | wc -c) -ne 0 ]]; then
-    echo "Building TC in conda env"
-    conda create -y --name tc-env python=3.6
-    source activate tc-env
-    conda install -y pyyaml mkl-include
-    conda install -yc conda-forge pytest
-    conda install -y pytorch cuda90 -c pytorch
-    WITH_PYTHON_C2=OFF CORES=$(nproc) CLANG_PREFIX=/usr/local/clang+llvm-tapir5.0 BUILD_TYPE=Release ./build.sh --all
-  else
-    echo "Building TC in non-conda env"
-    WITH_PYTHON_C2=OFF CORES=$(nproc) CLANG_PREFIX=/usr/local/clang+llvm-tapir5.0 BUILD_TYPE=Release ./build.sh --all
-  fi
-fi
+source activate tc_build
+conda install -y -c nicolasvasilache llvm-tapir50 halide
+conda install -y -c conda-forge eigen
+conda install -y -c nicolasvasilache caffe2
+
+WITH_CAFFE2=ON CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda CLANG_PREFIX=$(${CONDA_PREFIX}/bin/llvm-config --prefix) BUILD_TYPE=Release ./build.sh
+
+python setup.py install
+./test_python/run_test.sh
+
+FILTER_OUT=MLP_model ./test.sh
+./build/tc/benchmarks/MLP_model --gtest_filter=-*2LUT*
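Both CI scripts configure the build by placing `NAME=value` pairs in front of `./build.sh`. The body of `build.sh` is not shown here, so the following is only a sketch of how a script can consume settings passed that way. `print_cmake_command` is a hypothetical stand-in, the `-DWITH_CUDA` and `-DWITH_CAFFE2` option names are assumptions, and the defaults are guesses inferred from which CI job sets which variable.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for build.sh: it only prints the cmake command it would run.
print_cmake_command() {
  # ${VAR:-default} falls back to the default when the caller did not set VAR.
  # The defaults below are assumptions, not values taken from the real script.
  local with_cuda="${WITH_CUDA:-ON}"
  local with_caffe2="${WITH_CAFFE2:-OFF}"
  local build_type="${BUILD_TYPE:-Debug}"
  # -DWITH_CUDA and -DWITH_CAFFE2 are assumed option names; CMAKE_BUILD_TYPE is standard CMake.
  echo "cmake -DWITH_CUDA=${with_cuda} -DWITH_CAFFE2=${with_caffe2} -DCMAKE_BUILD_TYPE=${build_type} .."
}

# Same calling convention as the CI scripts: settings precede the command.
# The subshells keep the assignments from leaking into the rest of the script.
(WITH_CUDA=OFF BUILD_TYPE=Release print_cmake_command)    # CircleCI-style, CPU only
(WITH_CAFFE2=ON BUILD_TYPE=Release print_cmake_command)   # Jenkins-style, with Caffe2
```

Passing settings this way keeps them scoped to a single invocation, so the CPU-only CircleCI job and the GPU Jenkins job can share one script without editing it.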

.jenkins/run_test.sh

Lines changed: 0 additions & 20 deletions
This file was deleted.
