Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Culture directive parsing #951

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

fredrikhr
Copy link
Contributor

@fredrikhr fredrikhr commented Jun 28, 2020

TL;DR

  • Adds a new extension method UseCultureDirective() to CommandLineBuilder enabling to control CurrentCulture and CurrentUICulture for the invocation pipeline from command-line directives.
    These directives are: culture, uiculture, invariantculture, invariantuiculture.

Details

This introduces the following four culture directives:

  • culture
    The directive value specifies a new value for static CultureInfo.CurrentCulture. The value must be a culture name recognised by the CLR.
    Expample: [culture:de-DE] will set CultureInfo.CurrentCulture to the German culture that is specific to Germany.
  • uiculture
    The directive value specifies a new value for static CultureInfo.CurrentUICulture. The value must be a culture name recognised by the CLR.
    Expample: [uiculture:de-DE] will set CultureInfo.CurrentUICulture to the German culture that is specific to Germany.
  • invariantculture
    If present in the parse result directives CultureInfo.CurrentCulture is set to CultureInfo.InvariantCulture.
  • invariantuiculture
    If present in the parse result directives CultureInfo.CurrentUICulture is set to CultureInfo.InvariantCulture.

Because CurrentCulture is a more general concept, setting CurrentCulture will also always implicitly set CurrentUICulture. This avoids the need to specify both culture and uiculture when the culture to set is the same. On the other hand, setting CurrentUICulture leaves CurrentCulture unmodified.

Added to UseDefaults()

Since the usage of culture specific code and parsing is quite common, I propose that we should add the culture-directives to the UseDefaults() extension method on the CommandLineBuilder as well.

Behaviour

If invariantculture and invariantuiculture directives have values specified, these values are ignored. culture and uiculture directives are expected to have the culture name as their value. The value is trimmed and then passed directly into CultureInfo.GetCultureInfo(string name). If the culture name is not recognised by the CLR, it will throw a an appropiate exception.

If multiple culture and uiculture directives are specified each specified culture is applied in order, which means that the last specified culture is the one that will be applied for the rest of the pipeline.

Interaction with Argument parsing

The culture-directives parsing is implemented as an Invocation pipeline middleware. That means that the culture is applied before a command handler is invoked. Therefore this can be used to perform culture-specific argument value conversion (e.g. respecting thousands-separating characters and decimal separators specific to a certain culture), since model binding is applied when the command handler is invoked.

Isolation of Culture

As implemented here, the CurrentCulture of the execution context that is running the invocation pipeline is modified directly. We could implement the middleware in such a way that the culture change was isolated into its own execution context by doing some trickery with Tasks, it might also be worth considering in the future.

Currently this means, that after the invocation pipeline has completed executing, the CurrentCulture will remain as the value set when a culture-directive was parsed. Arguably that is not an issue, since the program usually exits after the invocation pipeline completes.

@KathleenDollard
Copy link
Contributor

Cool!

So the scenario is to override the machine culture while you run the app? It looks like this parallels this work that was done in the CLI a while back: dotnet/cli#7021 That work was primarily intended for testing. What other scenarios do you see for this?

I'd like this to align with the .NET CLI patterns for getting locale if possible for consistency. There we used an environment variable. I don't think it is enough by itself, but I wonder if we want a way to hook this up to an environment variable? But that would need to be application specific because I doubt any user would want "the subset of command line apps that happen to use System.CommandLine to use a different locale than the machine locale".

It's possible that we ignore the CLI approach as a) directives aren't available there at present and b) it's geared to a scripted testing scenario. Thoughts on that?

Kathleen

@fredrikhr
Copy link
Contributor Author

fredrikhr commented Jun 29, 2020

@KathleenDollard My idea is more in the direction of changing how an application outputs stuff. If you set the culture that will affect all of number parsing, number outputs, DateTime formats, etc. And yes, obviously this would normally be an OS-setting, but being able to affect this on a per-application basis is nice. GNU-like programs usually deal with this using the LC_* environment variables.

In scripting purposes the other way round is probably more interesting: The ability to enforce invariant culture regardless of environment variables.

In combination, imagine an application that can be used both for human readable as well as machine readable purposes: How can you easily control that the program is culture-specific or invariant.

In my personal opinion I don't think this will be as useful in the .NET CLI as I cannot really imagine a situation where we'd want the .NET CLI or the build process to be culture-specific. Imagine MSBuild failing because it cannot parse an integer because some American got the date wrong? (Sorry as someone from Europe I had to make the obligatory Date-Format-jibe 😄)

But for a specific example: Imagine an application that accepts a single argument typed Argument<double> and then adds 1 and prints the result.

$ addone [invariantculture] 1.234
2.234
$ addone [culture:de-DE] 1.234
1.235,00

In the example above it is assumed that the argument parsing uses the culture-aware double.Parse and prints using double.ToString("N") (for normal human readable output)

@jonsequitur
Copy link
Contributor

I'm in favor of adding a convention to support specifying culture info. I think this is a really nice idea.

Given there's already a convention or two for doing this with environment variables, I wonder if we should instead implement that.

Separately, a fairly common scenario for command line tools is to support options that let you set environment variables.

We might compose these two approaches by generalizing the environment variable approach using directives, e.g.:

> myapp [env:culture=de-DE] 

What do you think?

@fredrikhr
Copy link
Contributor Author

@jonsequitur Hmm, I absolutely think that we should also add an environment-variables directive! That would be a great bonus! I'll get right on it.

However, I am unsure whether setting culture through environment variable is a good idea or not, and I don't really get a good answer when I debate this with myself. 😂

If we want to do this my suggestion would be to change the culture-directive middleware I propose here to do:

  1. Chek for the well-known environment variable (Naming suggestions?)
  2. Parse culture-directives

That way culture-directives passed on the command-line would still take precendence over environment variables.

We would also need to assure that our defaults would process environment variable directives before the culture-directives are applies (so that specifying environment variables affecting culture would be applicable).

So for naming the well-known culture-affecting variables:

  • DOTNET_CULTURE and DOTNET_CULTUREUI?
  • The culture name of the invariant culture is "" (empty string), which is really hard to pass as an environment variable.
    Also add DOTNET_USE_INVARIANTCULTURE and DOTNET_USE_INVARIANTUICULTURE?

What about failure when parsing the culture name? I considered directives to be so explicit that a failure in recognising the culture name (i.e. passing does-not-exist as a culture) causes the BCL to throw an execption that effectively terminates the invocation pipeline. My gut-feeling tells me that environment variable parsing should be best effort instead.

@jonsequitur
Copy link
Contributor

@fredrikhr
Copy link
Contributor Author

I have added the environment-variable parsing directive as its own proposal in #965.

Possibly related: https://github.com/dotnet/runtime/blob/master/docs/design/features/globalization-invariant-mode.md

@jonsequitur That is an interesting point, setting the variable DOTNET_SYSTEM_GLOBALIZATION_INVARIANT will enforce invariant. But as I explain in #965 we will not be able to change the behaviour of an already booted runtime with an env-directive middleware.

We can however use the name DOTNET_SYSTEM_GLOBALIZATION_INVARIANT as an inspiration for well-known environment variable names that can be added to our culture-directive parsing behaviour. From this maybe we should consider the following environment variable names:

@fredrikhr
Copy link
Contributor Author

@jonsequitur I merged master into this branch after #965 got accepted.

So now, with environment variable parsing out of the way, what do you think about my choice for well-known environment variable names for culture directives?

@jonsequitur
Copy link
Contributor

@KathleenDollard

@fredrikhr
Copy link
Contributor Author

639624f shows an interesting challenge for culture awareness: The HelpBuilder is run AFTER the invocation pipeline has completed, and therefore the execution context has been restored to the previous culture settings and thus the Help Builder is not aware of cultures specified by [culture] (or a corresponding [env]) directives.

@fredrikhr
Copy link
Contributor Author

The issue described in the comment above was solved by capturing the ExecutionContext when running the culture-directive handler. The captured context is stored in a new internal property on InvocationContext. Finally, when Apply is called on the IInvocationResult instance on InvocationContext when the invocation is complete, the implementation now calls ExecutionContext.Run (if an execution context is available) to restore the previously captured execution context.

This will cause the HelpResult and any other IInvocationResult types that we introduce later to run Apply in the execution context that was set when the invocation pipeline was originally processed.

Any captured execution context is disposed when the Invocation context is disposed. When a middleware attempts to override a previously captured execution context, the previous one is disposed when the new one is set.

@jonsequitur
Copy link
Contributor

The current design leaves us with two ways to accomplish the same thing within a command line invocation. For example, to set the culture to "nb-NO", I could do either of:

> myapp [culture:nb-NO]
> myapp [env:DOTNET_SYSTEM_GLOBALIZATION_CULTURE=nb-NO]

My impression is that this will make it harder to learn how to use. It also means that we have more complex code in two ways:

  • Two different coexisting implementations (i.e. one sets the culture based on the directive, and the other sets the culture based on an environment variable) and people need to reason about how these interact and what is the precedence, and
  • Additional invocations in the middleware pipeline to check for a larger number of directive names.

So I would propose removing the dedicated directives and saying that in order to set the culture via a directive, people should use the [env] directive.

@fredrikhr
Copy link
Contributor Author

@jonsequitur

So I would propose removing the dedicated directives and saying that in order to set the culture via a directive, people should use the [env] directive.

On this I actually strongly disagree. Using the same rationale we should then also remove [debug] in favor of something like [env:DOTNET_SYSTEM_COMMANDLINE_DEBUG=1] and the same for rendering, etc.

The only reason I agree that we should allow recognition of the DOTNET_SYSTEM_GLOBALIZATION_CULTURE environment variable is because the .NET CLR Host does.

I feel very strongly that setting culture should be primarily be settable using the culture directive. And no point would I ever suggest to anyone that they should set some magic variable to achieve something. [culture:de-DE] is much more obvious and straight-forward than [env:DOTNET_SYSTEM_GLOBALIZATION_CULTURE=de-DE].

When I see [culture] in the command-line, I immediately have an expectation as to how the runtime will handle it. When I see [env:DOTNET_SYSTEM_GLOBALIZATION_CULTURE=] I only know that some magic environment variable will be set, but what does DOTNET_SYSTEM_GLOBALIZATION_CULTURE mean?

So if two ways of doing this is an issue, I'd actually rather vote for removing the recognition of environment variables in favor of [culture].

The soft compromise here would be to only recognize DOTNET_SYSTEM_GLOBALIZATION_CULTURE (because of the special meaning in .NET CLR Host) and not introduce the recognition for the other variables as I suggested above.

@fredrikhr
Copy link
Contributor Author

fredrikhr commented Aug 6, 2020

Update: because of the work in #1013 I discovered that Localization features (e.g. in Microsoft.Extensions.Localization) usually use CurrentUICulture to decide what culture to use for localization. In order avoid situations where you need to specify both [culture:de-DE] and [uiculture:de-DE] to localize the entire application to German culture, I decided that all handling that sets CurrentCulture (e.g. [culture], [invariantculture]) will also implicitly set CurrentUICulture. Additional [uiculture] or [invariantuiculture] can override this behaviour.

Hence:

  • [culture:de-DE] will set both CurrentCulture and CurrentUICulture to German culture.
  • [culture:en-US] [uiculture:de-DE] will use US-English number and date formatting, but display UI in German.
    i.e. CurrentCulture will be set to en-US and CurrentUICulture will be set to de-DE.

@jonsequitur
Copy link
Contributor

So if two ways of doing this is an issue, I'd actually rather vote for removing the recognition of environment variables in favor of [culture].

The environment variable approach has the advantage of not necessarily being an implementation detail of the parser, and of flowing to downstream processes in a well-understood manner. It's a protocol more than a feature. Other programs can participate in this protocol without needing a directive implementation. Directives on the other hand are specific to the app, which is why [debug] makes sense as a directive.

We've had similar conversations around rendering and there's general agreement that a directive, while useful, is also too specific to this technology and should ideally be a protocol that apps can participate in regardless of platform.

@fredrikhr
Copy link
Contributor Author

As @jonsequitur pointed out here #1013 (comment) capturing the ExecutionContext as described in the comment above feels very heavy handed. We could of course simply add a light container on InvocationContext that stores stuff like the Curlture, or if we imagine that InvocationContext never will get more state than the culture, we could just replace the internal ExecutionContext property on InvocationContext with two CultureInfo properties (UI and regular Culture).

On the other hand, this is really what the ExecutionContext is meant for... I have an idea: What if we moved the capturing of the execution context to the end of the middleware (i.e. after the invocation) and only captured the context if there was an InvocationResult that was set (and would trigger an Apply later). That would mitigate the performance cost to cases only where Help or Error messages are being written out, otherwise applications would remain unaffected.

@fredrikhr
Copy link
Contributor Author

@jonsequitur

The environment variable approach has the advantage of not necessarily being an implementation detail of the parser, and of flowing to downstream processes in a well-understood manner. It's a protocol more than a feature. Other programs can participate in this protocol without needing a directive implementation.

I understand the reasoning. I still strongly feel that we should support both environment variables (for the reasons you describe) and specific directives (for ease of use and readability).

However, I am not the one who will need to maintain this functionality and support that in the future. I can understand the reasoning from a maintainance perspective, but in that case I'd like you to make an executive decision instructing me to either remove the directives or leave both parts of the feature as is.

If you choose that we remove the culture directives, community additions for System.CommandLine (for example one I create in my own repo) could still do this in a separate package if really needed.

@brettfo brettfo changed the base branch from master to main August 19, 2020 19:53
@jonsequitur
Copy link
Contributor

Related: #1125

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants