Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[native] Add support for env vars providing property values #23880

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

czentgr
Copy link
Contributor

@czentgr czentgr commented Oct 24, 2024

This allows a user to configure the value of an environment variable to be used as value for a configuration property. For example, the configuration file may contain keys that would be hard coded. The values can now be replaced by environment variables that are read at startup time and the keys don't require hard coding in the configuration file.

Description

This change allows reading of a config value that references an environment variable that contains the value as opposed to providing the value directly in the config file.

Motivation and Context

This can be used if certain values should not be hard coded, e.g. keys in configuration files.

Impact

User can provide config values using environment variables and refer to the environment variable name as the value in the configuration file.

Test Plan

Added unit test.

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

Prestissimo
* Add utilizing environment variables as stand in for configuration values. The environment variable value is retrieved and used for the configuration option. :pr:`23880`

@czentgr
Copy link
Contributor Author

czentgr commented Oct 24, 2024

TODo add documentation in the PrestoC++ section.

@czentgr czentgr force-pushed the cz_read_env_var_config_values branch from 4a0e096 to 18373b8 Compare October 24, 2024 02:46
@czentgr czentgr marked this pull request as ready for review October 31, 2024 20:49
@czentgr czentgr requested a review from a team as a code owner October 31, 2024 20:49
@czentgr
Copy link
Contributor Author

czentgr commented Nov 8, 2024

@majetideepak FYI

std::optional<std::string> value{std::nullopt};
if (str.size() >= 3 && str.substr(0, 2) == "${" &&
str[str.size() - 1] == '}') {
auto env_name = std::string(str.substr(2, str.size() - 3));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@majetideepak we discussed using string_view here but we need a null terminated char* so we need to construct something from the substring string_view to pass to getenv.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying. Maybe add a comment for this to ensure someone else does not make this change.

@@ -333,3 +333,31 @@ This only applies if ``system-mem-pushback-enabled`` is ``true``.

Specifies the amount of memory to shrink when the memory pushback is
triggered. This only applies if ``system-mem-pushback-enabled`` is ``true``.

Environment variables as values for worker properties
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Title should be camel case.

The environment variable name must match exactly with the defined variable.

This allows a worker to read sensitive data e.g. for access keys from an
environment variable rather than having the actual value hard coded in a comnfiguration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: configuration


This allows a worker to read sensitive data e.g. for access keys from an
environment variable rather than having the actual value hard coded in a comnfiguration
file on disk enhancing the security stance of deployments.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove stance

file on disk enhancing the security stance of deployments.

For example, consider the hive connector's `hive.s3.aws-access-key` property.
This is sensitive data and can be stored in an environment variable, for example,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May also add to this the real usage of how these key values can be stored encrypted on disk and are decrypted during deployment as passed as environment variables.

std::optional<std::string> extractValueIfEnvironmentVariable(
std::string_view str) {
std::optional<std::string> value{std::nullopt};
if (str.size() >= 3 && str.substr(0, 2) == "${" &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: str.size() > 3. Let's skip empty ${} as well.

std::string ENV_VAR = "PRESTO_READ_CONFIG_TEST_VAR";

writeConfigFile(
fmt::format("{}={}\n", "plain-text", "plain-text-value") +
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use constants for "plain-text" and other keys to be consistent when reading them below.

fmt::format("{}={}}}\n", "env-var3", ENV_VAR) +
fmt::format("{}=${{}}\n", "no-env-var"));

setenv(ENV_VAR.c_str(), "env-var-value", 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test for empty value. It should be enough to check that it does not introduce a crash.

std::optional<std::string> value{std::nullopt};
if (str.size() >= 3 && str.substr(0, 2) == "${" &&
str[str.size() - 1] == '}') {
auto env_name = std::string(str.substr(2, str.size() - 3));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying. Maybe add a comment for this to ensure someone else does not make this change.

str[str.size() - 1] == '}') {
auto env_name = std::string(str.substr(2, str.size() - 3));

char* envval = std::getenv(env_name.c_str());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const char*

@@ -45,7 +70,9 @@ std::unordered_map<std::string, std::string> readConfig(
const auto name = line.substr(0, delimiterPos);
VELOX_CHECK(!name.empty(), "property pair '{}' has empty key", line);
const auto value = line.substr(delimiterPos + 1);
properties.emplace(name, value);
const auto adjustedValue = extractValueIfEnvironmentVariable(value);
adjustedValue.has_value() ? properties.emplace(name, adjustedValue.value())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During our sync, I misunderstood that we are redundantly updating non-env values.
How about we use void extractValueIfEnvironmentVariable(std::string& value)?
We can then use

extractValueIfEnvironmentVariable(value);
properties.emplace(name, value);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We used to copy it around if the value was a real value and not envvar.
This approach also works and might be a bit cleaner.

@czentgr czentgr force-pushed the cz_read_env_var_config_values branch 2 times, most recently from cfed7b5 to aa87ffd Compare November 27, 2024 00:36
// Does nothing if the input doesn't look like "${...}".
void extractValueIfEnvironmentVariable(std::string& value) {
if (value.size() > 3 && value.substr(0, 2) == "${" && value.back() == '}') {
auto env_name = value.substr(2, value.size() - 3);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

envName

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duh...

if (value.size() > 3 && value.substr(0, 2) == "${" && value.back() == '}') {
auto env_name = value.substr(2, value.size() - 3);

const char* envval = std::getenv(env_name.c_str());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

envVal

This allows a user to configure the value of an environment
variable to be used as value for a configuration property.
For example, the configuration file may contain keys that
would be hard coded. The values can now be replaced by
environment variables that are read at startup time and
the keys don't require hard coding in the configuration file.

Co-authored-by: ashkrisk <[email protected]>
@czentgr czentgr force-pushed the cz_read_env_var_config_values branch from aa87ffd to fbeee81 Compare November 27, 2024 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants