OmniSciDB Documentation
- Install the dependencies mentioned in the Dependencies page.
- Download and install the package.
- Read up about Usage and Configuration.
- Initialize servers using a wrapper or manually.
- Load sample data. Sample sets can be found in the Data Set Library.
- Perform a sample query.
- Downloads and Installation Instructions
- Building
- Testing
- Using
- Working With Data
- Code Style
- Developing With Specific Programming Languages
- APIs, Interfaces & Projects
- Resources
- Contributing
- Need Help?
- License
OmniSci provides pre-built binaries for Linux for stable releases of the project:
| Distro | Package type | CPU/GPU | Download Link | Installation Guide |
|---|---|---|---|---|
| CentOS | RPM | CPU | https://releases.omnisci.com/os/yum/stable/cpu | https://www.omnisci.com/docs/latest/4_centos7-yum-cpu-os-recipe.html |
| CentOS | RPM | GPU | https://releases.omnisci.com/os/yum/stable/cuda | https://www.omnisci.com/docs/latest/4_centos7-yum-gpu-os-recipe.html |
| Ubuntu | DEB | CPU | deb https://releases.omnisci.com/os/apt/ stable cpu | https://www.omnisci.com/docs/latest/4_ubuntu-apt-cpu-os-recipe.html |
| Ubuntu | DEB | GPU | deb https://releases.omnisci.com/os/apt/ stable cuda | https://www.omnisci.com/docs/latest/4_ubuntu-apt-gpu-os-recipe.html |
| * | tarball | CPU | https://releases.omnisci.com/os/tar/omnisci-os-latest-Linux-x86_64-cpu.tar.gz | |
| * | tarball | GPU | https://releases.omnisci.com/os/tar/omnisci-os-latest-Linux-x86_64-cuda.tar.gz | |
- Installing OmniSci GPU Open Source for Ubuntu Using Apt
- Installing OmniSci GPU Open Source for Ubuntu With a Tarball
- Installing OmniSci GPU Open Source for CentOS Using Yum
- Installing OmniSci GPU Open Source for CentOS With a Tarball
Learn more by visiting the Setup page.
If this is your first time building OmniSciDB, check out the Quickstart above or visit the Setup page for a setup checklist and additional resources. Additionally, there is a curated list of known issues & limitations of this project.
Reminder: Before you begin building, install the appropriate dependencies.
Once you have completed the installation guide listed in the Downloads and Installation Instructions, you are ready to build.
OmniSciDB uses CMake for its build system.
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=debug ..
make -j 4
The following cmake/ccmake options can enable/disable different features:
- -DCMAKE_BUILD_TYPE=release - Build type and compiler options to use. Options are Debug, Release, RelWithDebInfo, MinSizeRel, and unset.
- -DENABLE_ASAN=off - Enable address sanitizer. Default is off.
- -DENABLE_AWS_S3=on - Enable AWS S3 support, if available. Default is on.
- -DENABLE_CALCITE_DELETE_PATH=on - Enable Calcite Delete Path. Default is on.
- -DENABLE_CALCITE_UPDATE_PATH=on - Enable Calcite Update Path. Default is on.
- -DENABLE_CUDA=off - Disable CUDA. Default is on.
- -DENABLE_CUDA_KERNEL_DEBUG=off - Enable debugging symbols for CUDA kernels. Will dramatically reduce kernel performance. Default is off.
- -DENABLE_DECODERS_BOUNDS_CHECKING=off - Enable bounds checking for column decoding. Default is off.
- -DENABLE_FOLLY=on - Use Folly. Default is on.
- -DENABLE_IWYU=off - Enable include-what-you-use. Default is off.
- -DENABLE_JIT_DEBUG=off - Enable debugging symbols for the JIT. Default is off.
- -DENABLE_PROFILER=off - Enable google perftools. Default is off.
- -DENABLE_STANDALONE_CALCITE=off - Require standalone Calcite server. Default is off.
- -DENABLE_TESTS=on - Build unit tests. Default is on.
- -DENABLE_TSAN=off - Enable thread sanitizer. Default is off.
- -DENABLE_CODE_COVERAGE=off - Enable code coverage symbols (clang only). Default is off.
- -DENABLE_JAVA_REMOTE_DEBUG=on - Enable Java Remote Debug. Default is off.
- -DMAPD_DOCS_DOWNLOAD=on - Download the latest master build of the documentation / docs.mapd.com. Default is off. Note: this is a >50MB download.
- -DPREFER_STATIC_LIBS=off - Static link dependencies, if available. Default is off.
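Several of these options can be passed to a single configure step. The sketch below assembles a CPU-only release configuration from flags in the list above and prints the resulting command instead of running it. The particular combination is an illustrative assumption, not a recommended configuration.

```shell
# Flags taken from the option list above; this combination is only an example.
cmake_flags="-DCMAKE_BUILD_TYPE=release -DENABLE_CUDA=off -DENABLE_TESTS=on"

# Build the configure command. The trailing ".." assumes it will be run from
# a build directory directly below the top level of the repository.
configure_cmd="cmake ${cmake_flags} .."

# Print the command for inspection; paste it into the build directory to run it.
echo "$configure_cmd"
```

Printing the command first makes it easy to check the flag spelling before starting a long build.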
OmniSciDB uses Google Test as its main testing framework. Tests reside under the Tests directory.
The sanity_tests target runs the most common tests. If using Makefiles to build, the tests may be run using:
make sanity_tests
AddressSanitizer can be activated by setting the ENABLE_ASAN CMake flag in a fresh build directory. At this time CUDA must also be disabled. In an empty build directory run CMake and compile:
mkdir build && cd build
cmake -DENABLE_ASAN=on -DENABLE_CUDA=off ..
make -j 4
Finally run the tests:
export ASAN_OPTIONS=alloc_dealloc_mismatch=0:handle_segv=0
make sanity_tests
Additional tests and test scripts can be found in the Tests folder.
ThreadSanitizer can be activated by setting the ENABLE_TSAN CMake flag in a fresh build directory. At this time CUDA must also be disabled. In an empty build directory run CMake and compile:
mkdir build && cd build
cmake -DENABLE_TSAN=on -DENABLE_CUDA=off ..
make -j 4
We use a TSAN suppressions file to ignore warnings in third party libraries. Apply the suppressions file by adding it to your TSAN_OPTIONS environment variable:
export TSAN_OPTIONS="suppressions=/path/to/mapd/config/tsan.suppressions"
Finally run the tests:
make sanity_tests
OmniSciDB uses CPack to generate packages for distribution. Packages generated on CentOS with static linking enabled can be used on most other recent Linux distributions.
To generate packages on CentOS (assuming starting from top level of the omniscidb repository):
mkdir build-package && cd build-package
cmake -DPREFER_STATIC_LIBS=on -DCMAKE_BUILD_TYPE=release ..
make -j 4
cpack -G TGZ
The first command creates a fresh build directory, to ensure there is nothing left over from a previous build.
The second command configures the build to prefer linking to the dependencies' static libraries instead of the (default) shared libraries, and to build using CMake's release configuration (enables compiler optimizations). Linking to the static versions of the libraries reduces the number of dependencies that must be installed on target systems.
The last command generates a .tar.gz package. The TGZ can be replaced with, for example, RPM or DEB to generate a .rpm or .deb, respectively.
The startomnisci wrapper script may be used to start OmniSciDB in a testing environment. This script performs the following tasks:
- initializes the data storage directory via initdb, if required
- starts the main OmniSciDB server, omnisci_server
- offers to download and import a sample dataset, using the insert_sample_data script
Assuming you are in the build directory, and it is a subdirectory of the omniscidb repository, startomnisci may be run by:
../startomnisci
It is assumed that the following commands are run from inside the build directory.
Initialize the data storage directory. This command only needs to be run once.
mkdir data && ./bin/initdb data
Start the OmniSciDB server:
./bin/omnisci_server
You can now start using the database. The omnisql utility may be used to interact with the database from the command line:
./bin/omnisql -p HyperInteractive
where HyperInteractive is the default password. If no user is provided, the default user admin is assumed.
Now that you're ready to start working with data, you can insert a sample dataset by running the insert_sample_data script in a new terminal:
../insert_sample_data
Additional data sets can be found in the Data Set Library.
There are several ways to load data into OmniSci:
- COPY FROM
- SQLImporter
- StreamInsert
- Importing AWS S3 Files
- KafkaImporter
- StreamImporter
- HDFS with Sqoop
If you need a complete list and more detailed instructions, please visit the OmniSci documentation.
COPY TO
COPY ( <SELECT statement> ) TO '<file path>' [WITH (<property> = value, ...)];
<file path> must be a path on the server. This command exports the results of any SELECT statement to the file. There is a special mode when <file path> is empty. In that case, the server automatically creates a file in /omnisci_export named after the client session id, with the suffix .txt.
The OmniSci documentation has a table of the properties available in the optional WITH clause, along with more detailed instructions.
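As a concrete instance of the COPY TO syntax, the sketch below builds one export statement and prints it. The table name flights_2008_10k, the output path, and the header property are assumptions made for illustration; check the OmniSci documentation for the properties your version supports.

```shell
# Assumed for illustration: a table named flights_2008_10k exists, and the
# server process can write to /tmp. Neither is guaranteed by this README.
export_sql="COPY (SELECT * FROM flights_2008_10k LIMIT 100) TO '/tmp/flights_sample.csv' WITH (header = 'true');"

# Print the statement so it can be reviewed before it is sent to the server.
echo "$export_sql"

# To run it against a local server from the build directory (not executed here):
#   echo "$export_sql" | ./bin/omnisql -p HyperInteractive
```

The path in the statement is resolved on the server, not on the machine where omnisql runs.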
To measure and report the performance of queries or data imports, you can use the MapD Benchmark Script. These Python scripts leverage pymapd to query OmniSciDB.
Learn more about how to add a logger to OmniSciDB by using instructions in Logger.md.
Contributed code should compile without generating warnings by recent compilers on most Linux distributions. Changes to the code should follow the C++ Core Guidelines.
A .clang-format style configuration, based on the Chromium style guide, is provided at the top level of the repository. Please format your code using a recent version (8.0+ preferred) of ClangFormat before submitting.
To use:
clang-format -i File.cpp
A .clang-tidy configuration is provided at the top level of the repository. Please lint your code using a recent version (6.0+ preferred) of clang-tidy before submitting.
clang-tidy requires all generated files to exist before running. The easiest way to accomplish this is to simply run a full build before running clang-tidy. A build target which runs clang-tidy is provided. To use:
make clang-tidy
Note: clang-tidy may make invalid or overly verbose changes to the source code. It is recommended to first commit your changes, then run clang-tidy and review its recommended changes before amending them to your commit.
Note: the clang-tidy target uses the run-clang-tidy.py script provided with LLVM, which may depend on PyYAML. The target also depends on jq, which is used to filter portions of the compile_commands.json file.
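The commit-first workflow recommended in the note above can be sketched as a small shell function. The function name tidy_and_review is hypothetical and not part of the repository. It assumes it is called from the build directory after a full build, and that the build directory sits inside the git working tree.

```shell
# Hypothetical helper (not part of the repository): run clang-tidy on a clean
# tree so that its edits can be reviewed separately from your own changes.
tidy_and_review() {
  # Stop if there are unstaged or staged changes that are not yet committed.
  if ! git diff --quiet || ! git diff --cached --quiet; then
    echo "Commit your changes before running clang-tidy." >&2
    return 1
  fi

  # Run the provided build target; it may rewrite source files.
  make clang-tidy || return 1

  # Show what clang-tidy changed so the edits can be reviewed.
  git diff
}
```

After reviewing the diff, the changes you agree with can be folded into the previous commit, for example with git commit -a --amend.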
Overview:
For more resources, please visit the primary resources for Java.
- omnisql: omnisql is the client-side SQL console that displays query results for SQL statements you submit to the OmniSciDB Server.
- Thrift: OmniSciDB uses Apache Thrift to generate client-side interfaces. Interface definitions can be found in the $OMNISCI_PATH/omnisci.thrift file.
- JDBC: OmniSciDB supports Java Database Connectivity (JDBC) connections.
- ODBC: Open Database Connectivity (ODBC) is a standard application programming interface (API) for accessing database management systems (DBMS). OmniSciDB supports ODBC connections.
- Vega: Vega is a powerful, data-driven tool for specifying visualizations. The JSON Vega specification describes your data source and visualization properties.
- RJDBC: OmniSciDB supports the R language using RJDBC. RJDBC is a package implementing DBI in R on the basis of JDBC. This allows the use of any database management system (DBMS) in R through the JDBC interface.
In addition to OmniSciDB, there are other open source projects that connect to OmniSciDB from specific programming languages, such as JavaScript, Python and Julia.
- MapD connector: A JavaScript library for connecting to an OmniSci GPU database and running queries.
- PyMapD: Python client for OmniSci GPU-accelerated SQL engine and analytics platform.
- OmniSci.jl: Julia client for OmniSci GPU-accelerated SQL engine and analytics platform.
Developers are encouraged to contribute to this open source project. Check out the Contributing page to learn more.
If you have questions and would like to connect with the maintainers of this open source project, please visit the official online forum and community, or access additional resources on our Community page.
This project is licensed under the Apache License, Version 2.0.
The repository includes a number of third party packages provided under separate licenses. Details about these packages and their respective licenses are at ThirdParty/licenses/index.md.
Copyright © 2018 OmniSci, Inc.