apache / kudu
Mirror of Apache Kudu
Repository Overview (README excerpt)
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

= Kudu Developer Documentation

== Building and installing Kudu

Follow the steps in the
https://kudu.apache.org/docs/installation.html#build_from_source[documentation]
to build and install Kudu from source.

=== Building Kudu out of tree

A single Kudu source tree may be used for multiple builds, each with its
own build directory. Build directories may be placed anywhere in the
filesystem with the exception of the root directory of the source tree. The
Kudu build is invoked with a working directory of the build directory
itself, so you must ensure it exists (i.e. create it with _mkdir -p_). It's
recommended to place all build directories within the _build_ subdirectory;
_build/latest_ will be symlinked to the most recently created one.

The rest of this document assumes the build directory is _build/debug_
under the root directory of the Kudu source tree.

=== Automatic rebuilding of dependencies

The script that builds the thirdparty dependencies is invoked by `cmake`, so
new thirdparty dependencies added by other developers will be downloaded and
built automatically in subsequent builds if necessary.

To disable the automatic invocation of this script, set the
`NO_REBUILD_THIRDPARTY` environment variable:

[source,bash]
----
$ cd build/debug
$ NO_REBUILD_THIRDPARTY=1 cmake ../..
----

This can be particularly useful when running a tool across two commits
which may have different dependencies.

=== Building Kudu itself

[source,bash]
----
# Add thirdparty/installed/common/bin (under the root of the Kudu source tree)
# to your $PATH before other parts of $PATH that may contain cmake, such as
# /usr/bin.
# For example: "export PATH=$HOME/git/kudu/thirdparty/installed/common/bin:$PATH"
# if using bash.
$ mkdir -p build/debug
$ cd build/debug
$ cmake ../..
$ make -j8  # or whatever level of parallelism your machine can handle
----

The build artifacts, including the test binaries, will be stored in
_build/debug/bin/_.

To omit the Kudu unit tests during the build, add -DNO_TESTS=1 to the
invocation of `cmake`. For example:

[source,bash]
----
$ cd build/debug
$ cmake -DNO_TESTS=1 ../..
----

== Running unit/functional tests

To run the Kudu unit tests, you can use the `ctest` command from within the
_build/debug_ directory:

[source,bash]
----
$ cd build/debug
$ ctest -j8
----

This command will report any tests that failed, and the test logs will be
written to _build/debug/test-logs_.

Individual tests can be run by directly invoking the test binaries in
_build/debug/bin_. Since Kudu uses the Google C++ Test Framework (gtest),
specific test cases can be run with gtest flags:

[source,bash]
----
# List all the tests within a test binary, then run a single test
$ build/debug/bin/tablet-test --gtest_list_tests
$ build/debug/bin/tablet-test --gtest_filter=TestTablet/9.TestFlush
----

gtest also allows more complex filtering patterns. See the upstream
documentation for more details.

=== Running tests with the clang AddressSanitizer enabled

AddressSanitizer (ASAN) is a clang feature which can detect many types of
memory errors. The Jenkins setup for Kudu runs these tests automatically on a
regular basis, but if you make large changes it can be a good idea to run
them locally before pushing.
To do so, you'll need to build using clang:

[source,bash]
----
$ mkdir -p build/asan
$ cd build/asan
$ CC=../../thirdparty/clang-toolchain/bin/clang \
  CXX=../../thirdparty/clang-toolchain/bin/clang++ \
  ../../thirdparty/installed/common/bin/cmake \
  -DKUDU_USE_ASAN=1 ../..
$ make -j8
$ ctest -j8
----

The tests will run significantly slower than without ASAN enabled, and if any
memory error occurs, the test that triggered it will fail. You can then use a
command like:

[source,bash]
----
$ cd build/asan
$ ctest -R failing-test
----

to run just the failed test.

NOTE: For more information on AddressSanitizer, please see the
https://clang.llvm.org/docs/AddressSanitizer.html[ASAN web page].

=== Running tests with the clang Undefined Behavior Sanitizer (UBSAN) enabled

Similar to the above, you can use a special set of clang flags to enable the
Undefined Behavior Sanitizer. This will generate errors on certain pieces of
code which may not themselves crash but rely on behavior which isn't defined
by the C++ standard (and thus are likely bugs). To enable UBSAN, follow the
same directions as for ASAN above, but pass the corresponding UBSAN flag to
the `cmake` invocation.

In order to get a stack trace from UBSAN, you can use gdb on the failing test,
and set a breakpoint as follows:

----
(gdb) b __ubsan::Diag::~Diag
----

Then, when the breakpoint fires, gather a backtrace as usual using the `bt`
command.

=== Running tests with ThreadSanitizer enabled

ThreadSanitizer (TSAN) is a feature of recent Clang and GCC compilers which can
detect improperly synchronized access to data along with many other threading
bugs. To enable TSAN, pass `-DKUDU_USE_TSAN=1` to the `cmake` invocation,
recompile, and run tests. For example:

[source,bash]
----
$ mkdir -p build/tsan
$ cd build/tsan
$ CC=../../thirdparty/clang-toolchain/bin/clang \
  CXX=../../thirdparty/clang-toolchain/bin/clang++ \
  ../../thirdparty/installed/common/bin/cmake \
  -DKUDU_USE_TSAN=1 ../..
$ make -j8
$ ctest -j8
----

TSAN may truncate a few lines of the stack trace when reporting where the error
is. This can be bewildering. It's documented for TSANv1 here:
https://code.google.com/p/data-race-test/wiki/ThreadSanitizerAlgo…
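
The ASAN and TSAN builds above follow the same sequence (create a build
directory, configure with the thirdparty clang toolchain, build, run the tests)
and differ only in the directory name and the cmake flag. As a rough sketch of
that pattern, the shell functions below print the commands for a given
sanitizer without running them. The names `kudu_sanitizer_flag` and
`kudu_sanitizer_plan` are hypothetical and not part of Kudu; only the two flags
spelled out in this document are covered.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kudu): map a sanitizer name to the cmake
# flag used in the examples above. Only ASAN and TSAN are handled, because
# those are the only flags shown explicitly in this document.
kudu_sanitizer_flag() {
  case "$1" in
    asan) echo "-DKUDU_USE_ASAN=1" ;;
    tsan) echo "-DKUDU_USE_TSAN=1" ;;
    *)    echo "unsupported sanitizer: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print, without executing anything, the commands for a
# sanitizer build in build/<name>, relative to the root of the source tree.
kudu_sanitizer_plan() {
  flag=$(kudu_sanitizer_flag "$1") || return 1
  cat <<EOF
mkdir -p build/$1
cd build/$1
CC=../../thirdparty/clang-toolchain/bin/clang CXX=../../thirdparty/clang-toolchain/bin/clang++ ../../thirdparty/installed/common/bin/cmake $flag ../..
make -j8
ctest -j8
EOF
}

# Example: show the plan for a TSAN build.
kudu_sanitizer_plan tsan
```

Printing the commands instead of running them keeps the sketch free of side
effects; the output can be reviewed and then pasted into a shell at the root of
the source tree.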