Skip to content

Commit

Permalink
[native] Add support for env vars providing property values
Browse files Browse the repository at this point in the history
This allows a user to configure the value of an environment
variable to be used as value for a configuration property.
For example, the configuration file may contain keys that
would be hard coded. The values can now be replaced by
environment variables that are read at startup time and
the keys don't require hard coding in the configuration file.

Co-authored-by: ashkrisk <[email protected]>
  • Loading branch information
czentgr and ashkrisk committed Nov 26, 2024
1 parent b823353 commit 9fc8fbb
Show file tree
Hide file tree
Showing 3 changed files with 123 additions and 46 deletions.
78 changes: 53 additions & 25 deletions presto-docs/src/main/sphinx/presto_cpp/properties.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,9 @@ Presto C++ Configuration Properties

This section describes Presto C++ configuration properties.

The following is not a complete list of all configuration properties,
and does not include any connector-specific catalog configuration properties
or session properties.
The following is not a complete list of all configuration properties,
and does not include any connector-specific catalog configuration properties
or session properties.

For information on catalog configuration properties, see :doc:`Connectors </connector/>`.

Expand All @@ -20,9 +20,9 @@ For information on Presto C++ session properties, see :doc:`properties-session`.
Coordinator Properties
----------------------

Set the following configuration properties for the Presto coordinator exactly
as they are shown in this code block to enable the Presto coordinator's use of
Presto C++ workers.
Set the following configuration properties for the Presto coordinator exactly
as they are shown in this code block to enable the Presto coordinator's use of
Presto C++ workers.

.. code-block:: none
Expand All @@ -32,17 +32,17 @@ Presto C++ workers.
use-alternative-function-signatures=true
experimental.table-writer-merge-operator-enabled=false
These Presto coordinator configuration properties are described here, in
alphabetical order.
These Presto coordinator configuration properties are described here, in
alphabetical order.

``driver.cancel-tasks-with-stuck-operators-threshold-ms``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* **Type:** ``string``
* **Default value:** ``240000`` (40 minutes)

Cancels any task when at least one operator has been stuck for at
Cancels any task when at least one operator has been stuck for at
least the time specified by this threshold.

Set this property to ``0`` to disable canceling.

``experimental.table-writer-merge-operator-enabled``
Expand All @@ -51,16 +51,16 @@ alphabetical order.
* **Type:** ``boolean``
* **Default value:** ``true``

Merge TableWriter output before sending to TableFinishOperator. This property must be set to
``false``.
Merge TableWriter output before sending to TableFinishOperator. This property must be set to
``false``.

``native-execution-enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``boolean``
* **Default value:** ``false``

This property is required when running Presto C++ workers because of
This property is required when running Presto C++ workers because of
underlying differences in behavior from Java workers.

``optimizer.optimize-hash-generation``
Expand All @@ -69,9 +69,9 @@ alphabetical order.
* **Type:** ``boolean``
* **Default value:** ``true``

Set this property to ``false`` when running Presto C++ workers.
Velox does not support optimized hash generation, instead using a HashTable
with adaptive runtime optimizations that does not use extra hash fields.
Set this property to ``false`` when running Presto C++ workers.
Velox does not support optimized hash generation, instead using a HashTable
with adaptive runtime optimizations that does not use extra hash fields.

``regex-library``
^^^^^^^^^^^^^^^^^
Expand All @@ -88,17 +88,17 @@ alphabetical order.
* **Type:** ``boolean``
* **Default value:** ``false``

Some aggregation functions use generic intermediate types which are
not compatible with Velox aggregation function intermediate types. One
example function is ``approx_distinct``, whose intermediate type is
``VARBINARY``.
This property provides function signatures for built-in aggregation
Some aggregation functions use generic intermediate types which are
not compatible with Velox aggregation function intermediate types. One
example function is ``approx_distinct``, whose intermediate type is
``VARBINARY``.
This property provides function signatures for built-in aggregation
functions which are compatible with Velox.

Worker Properties
-----------------

The configuration properties of Presto C++ workers are described here, in alphabetical order.
The configuration properties of Presto C++ workers are described here, in alphabetical order.

``async-cache-persistence-interval``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Expand Down Expand Up @@ -302,8 +302,8 @@ The configuration properties of Presto C++ workers are described here, in alphab
Memory Checker Properties
-------------------------

The LinuxMemoryChecker extends from PeriodicMemoryChecker and is used for Linux systems only.
The LinuxMemoryChecker can be enabled by setting the CMake flag ``PRESTO_MEMORY_CHECKER_TYPE=LINUX_MEMORY_CHECKER``.
The LinuxMemoryChecker extends from PeriodicMemoryChecker and is used for Linux systems only.
The LinuxMemoryChecker can be enabled by setting the CMake flag ``PRESTO_MEMORY_CHECKER_TYPE=LINUX_MEMORY_CHECKER``.
The following properties for PeriodicMemoryChecker are as follows:

``system-mem-pushback-enabled``
Expand All @@ -322,7 +322,7 @@ server is under low memory pressure.
* **Default value:** ``55``

Specifies the system memory limit that triggers the memory pushback or heap dump if
the server memory usage is beyond this limit. A value of zero means no limit is set.
the server memory usage is beyond this limit. A value of zero means no limit is set.
This only applies if ``system-mem-pushback-enabled`` is ``true``.

``system-mem-shrink-gb``
Expand All @@ -333,3 +333,31 @@ This only applies if ``system-mem-pushback-enabled`` is ``true``.

Specifies the amount of memory to shrink when the memory pushback is
triggered. This only applies if ``system-mem-pushback-enabled`` is ``true``.

Environment variables as values for worker properties
-----------------------------------------------------

This section applies to worker configurations in the config.properies file
and catalog property files only.

The value in a key-value par may reference an environment variable by using
a leading `$` followed by enclosing the environment variable name in brackets (`{}`).

```
key=${ENV_VAR_NAME}
```

The environment variable name must match exactly with the defined variable.

This allows a worker to read sensitive data e.g. for access keys from an
environment variable rather than having the actual value hard coded in a comnfiguration
file on disk enhancing the security stance of deployments.

For example, consider the hive connector's `hive.s3.aws-access-key` property.
This is sensitive data and can be stored in an environment variable, for example,
`AWS_S3_ACCESS_KEY` which is set to the actual access key value. Then the following can be
put into the catalog properties to read the value from the environment variable:

```
hive.s3.aws-access-key=${AWS_S3_ACCESS_KEY}
```
29 changes: 28 additions & 1 deletion presto-native-execution/presto_cpp/main/common/ConfigReader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,31 @@
#include "velox/common/config/Config.h"

namespace facebook::presto::util {

namespace {
// Replaces strings of the form "${VAR}"
// with the value of the environment variable "VAR" (if it exists).
// Does nothing if the input doesn't look like "${...}".
std::optional<std::string> extractValueIfEnvironmentVariable(
std::string_view str) {
std::optional<std::string> value{std::nullopt};
if (str.size() >= 3 && str.substr(0, 2) == "${" &&
str[str.size() - 1] == '}') {
auto env_name = std::string(str.substr(2, str.size() - 3));

char* envval = std::getenv(env_name.c_str());
if (envval != nullptr) {
VELOX_CHECK_NE(
strlen(envval),
0,
"The environment variable cannot be an empty string.");
value = std::string(envval);
}
}
return value;
}
} // namespace

std::unordered_map<std::string, std::string> readConfig(
const std::string& filePath) {
// https://teradata.github.io/presto/docs/141t/configuration/configuration.html
Expand All @@ -45,7 +70,9 @@ std::unordered_map<std::string, std::string> readConfig(
const auto name = line.substr(0, delimiterPos);
VELOX_CHECK(!name.empty(), "property pair '{}' has empty key", line);
const auto value = line.substr(delimiterPos + 1);
properties.emplace(name, value);
const auto adjustedValue = extractValueIfEnvironmentVariable(value);
adjustedValue.has_value() ? properties.emplace(name, adjustedValue.value())
: properties.emplace(name, value);
}

return properties;
Expand Down
62 changes: 42 additions & 20 deletions presto-native-execution/presto_cpp/main/common/tests/ConfigTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -33,15 +33,15 @@ class ConfigTest : public testing::Test {
if (tempDirectoryPath == nullptr) {
throw std::logic_error("Cannot open temp directory");
}
configFilePath = tempDirectoryPath;
configFilePath += "/config.properties";
configFilePath_ = tempDirectoryPath;
configFilePath_ += "/config.properties";
}

void writeDefaultConfigFile(bool isMutable) {
auto fileSystem = filesystems::getFileSystem(configFilePath, nullptr);
auto sysConfigFile = fileSystem->openFileForWrite(configFilePath);
auto fileSystem = filesystems::getFileSystem(configFilePath_, nullptr);
auto sysConfigFile = fileSystem->openFileForWrite(configFilePath_);
sysConfigFile->append(
fmt::format("{}={}\n", SystemConfig::kPrestoVersion, prestoVersion));
fmt::format("{}={}\n", SystemConfig::kPrestoVersion, kPrestoVersion_));
sysConfigFile->append(
fmt::format("{}=11kB\n", SystemConfig::kQueryMaxMemoryPerNode));
if (isMutable) {
Expand All @@ -52,8 +52,8 @@ class ConfigTest : public testing::Test {
}

void writeConfigFile(const std::string& config) {
auto fileSystem = filesystems::getFileSystem(configFilePath, nullptr);
auto sysConfigFile = fileSystem->openFileForWrite(configFilePath);
auto fileSystem = filesystems::getFileSystem(configFilePath_, nullptr);
auto sysConfigFile = fileSystem->openFileForWrite(configFilePath_);
sysConfigFile->append(config);
sysConfigFile->close();
}
Expand All @@ -65,40 +65,41 @@ class ConfigTest : public testing::Test {
std::make_unique<config::ConfigBase>(std::move(properties)));
}

std::string configFilePath;
const std::string prestoVersion{"SystemConfigTest1"};
const std::string prestoVersion2{"SystemConfigTest2"};
std::string configFilePath_;
const std::string kPrestoVersion_{"SystemConfigTest1"};
const std::string kPrestoVersion2_{"SystemConfigTest2"};
};

TEST_F(ConfigTest, defaultSystemConfig) {
setUpConfigFilePath();
writeDefaultConfigFile(false);
auto systemConfig = SystemConfig::instance();
systemConfig->initialize(configFilePath);
systemConfig->initialize(configFilePath_);

ASSERT_FALSE(systemConfig->mutableConfig());
ASSERT_EQ(prestoVersion, systemConfig->prestoVersion());
ASSERT_EQ(kPrestoVersion_, systemConfig->prestoVersion());
ASSERT_EQ(11 << 10, systemConfig->queryMaxMemoryPerNode());
ASSERT_THROW(
systemConfig->setValue(
std::string(SystemConfig::kPrestoVersion), prestoVersion2),
std::string(SystemConfig::kPrestoVersion), kPrestoVersion2_),
VeloxException);
}

TEST_F(ConfigTest, mutableSystemConfig) {
setUpConfigFilePath();
writeDefaultConfigFile(true);
auto systemConfig = SystemConfig::instance();
systemConfig->initialize(configFilePath);
systemConfig->initialize(configFilePath_);

ASSERT_TRUE(systemConfig->mutableConfig());
ASSERT_EQ(prestoVersion, systemConfig->prestoVersion());
ASSERT_EQ(kPrestoVersion_, systemConfig->prestoVersion());
ASSERT_EQ(
prestoVersion,
kPrestoVersion_,
systemConfig
->setValue(std::string(SystemConfig::kPrestoVersion), prestoVersion2)
->setValue(
std::string(SystemConfig::kPrestoVersion), kPrestoVersion2_)
.value());
ASSERT_EQ(prestoVersion2, systemConfig->prestoVersion());
ASSERT_EQ(kPrestoVersion2_, systemConfig->prestoVersion());
ASSERT_EQ(
"11kB",
systemConfig
Expand Down Expand Up @@ -220,7 +221,7 @@ TEST_F(ConfigTest, parseValid) {
"key1= value with space\n"
"key2=value=with=key=word\n"
"emptyvaluekey=");
auto configMap = presto::util::readConfig(configFilePath);
auto configMap = presto::util::readConfig(configFilePath_);
ASSERT_EQ(configMap.size(), 6);

std::unordered_map<std::string, std::string> expected{
Expand All @@ -240,7 +241,7 @@ TEST_F(ConfigTest, parseInvalid) {
setUpConfigFilePath();
writeConfigFile(fileContent);
VELOX_ASSERT_THROW(
presto::util::readConfig(configFilePath), expectedErrorMsg);
presto::util::readConfig(configFilePath_), expectedErrorMsg);
};
testInvalid(
"noequalsign\n", "No '=' sign found for property pair 'noequalsign'");
Expand All @@ -255,4 +256,25 @@ TEST_F(ConfigTest, optionalNodeId) {
EXPECT_EQ(nodeId, config.nodeId());
}

TEST_F(ConfigTest, readConfigEnvVarTest) {
setUpConfigFilePath();
std::string ENV_VAR = "PRESTO_READ_CONFIG_TEST_VAR";

writeConfigFile(
fmt::format("{}={}\n", "plain-text", "plain-text-value") +
fmt::format("{}=${{{}}}\n", "env-var", ENV_VAR) +
fmt::format("{}=${{{}\n", "env-var2", ENV_VAR) +
fmt::format("{}={}}}\n", "env-var3", ENV_VAR) +
fmt::format("{}=${{}}\n", "no-env-var"));

setenv(ENV_VAR.c_str(), "env-var-value", 1);
auto properties = presto::util::readConfig(configFilePath_);
ASSERT_EQ(properties["plain-text"], "plain-text-value");
ASSERT_EQ(properties["env-var"], "env-var-value");
ASSERT_EQ(properties["env-var2"], "${PRESTO_READ_CONFIG_TEST_VAR");
ASSERT_EQ(properties["env-var3"], "PRESTO_READ_CONFIG_TEST_VAR}");
ASSERT_EQ(properties["no-env-var"], "${}");
unsetenv(ENV_VAR.c_str());
}

} // namespace facebook::presto::test

0 comments on commit 9fc8fbb

Please sign in to comment.