Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Locale argument conversion not setting language and country properly #3141

Open
2 tasks
scordio opened this issue Jan 31, 2023 · 16 comments · May be fixed by #3410
Open
2 tasks

Locale argument conversion not setting language and country properly #3141

scordio opened this issue Jan 31, 2023 · 16 comments · May be fixed by #3410

Comments

@scordio
Copy link
Contributor

scordio commented Jan 31, 2023

I am trying to write a parameterized test for a service that uses a resource bundle under the hood, loading the resource bundle with ResourceBundle.getBundle(String, Locale).

The test gets several Locale values in a @CsvSource annotation:

The documentation does not have an example for a country-based locale so I initially tried with:

@CsvSource({
  "en_FR, value1",
  "en_GB, value2",
  "fr_FR, value3",
  "fr_MC, value4",
  "it_IT, value5",
  ...
})
void test(Locale locale, String expected) {
  // call service using the resource bundle

  // compare result with `expected`
}

All worked fine on Windows (local environment) but failed on Linux (CI).

After some digging, on Windows everything works because it's a case-insensitive file system, so for example the en_FR value is converted to a en_fr language-only locale (i.e., without country) and it successfully matches the file ending with en_FR because of case insensitivity.

Looking at the code, I think the problem is here:

which uses the Locale(String) constructor. The constructor Javadoc also explains the different behavior based on case sensitivity:

This constructor normalizes the language value to lowercase.

Looking at the test cases:

assertConverts("en", Locale.class, Locale.ENGLISH);
assertConverts("en_us", Locale.class, new Locale(Locale.US.toString()));

Probably the second one is invalid. I would have expected that to be:

assertConverts("en_us", Locale.class, Locale.US);

which does not pass with the current implementation.

Steps to reproduce

As the documentation does not cover conversion of locales with country, I am not sure what would be the "right" test case to demonstrate the issue.

Assuming the IETF BCP 47 language tag format would be the right format for input values, this test fails already at the first assertion:

  @ParameterizedTest
  @ValueSource(strings = { "en-US" })
  void test(Locale locale) {
    assertEquals("en", locale.getLanguage());
    assertEquals("US", locale.getCountry());
    assertEquals(Locale.US, locale);
  }

Workaround

Currently, I'm using a custom ArgumentConverter which delegates the conversion to Locale::forLanguageTag.

Context

  • Used versions (Jupiter/Vintage/Platform): 5.8.2
  • Build Tool/IDE: Intellij IDEA / Maven

Deliverables

  • DefaultArgumentConverter properly converts IETF BCP 47 strings
  • The documentation has an example using the IETF BCP 47 language tag format
@scordio
Copy link
Contributor Author

scordio commented Jan 31, 2023

I propose to change the implementation at:

replacing Locale::new with Locale::forLanguageTag.

Happy to work on it if accepted.

@marcphilipp
Copy link
Member

@scordio IIUC that could potentially break existing tests that (wrongly or not) rely on the Locale(String) constructor being called, right?

@scordio
Copy link
Contributor Author

scordio commented Jan 31, 2023

@marcphilipp yes, but only if they specify a country and/or variant in the string. They would be broken like the second test case you currently have in the codebase.

However, such tests would have today a "wrong" Locale instance like the one in the reproducer and I'm not sure if they exist in reality. Up to you to judge if fixing this deserves a breaking change.

Tests that are specifying only a "plain" language (i.e., without country, variant, etc.) are safe from my point of view, which might be the most common use case.

@scordio
Copy link
Contributor Author

scordio commented Feb 2, 2023

Looking at Java 19, the Locale(String) constructor has been deprecated together with the other constructors and Locale::forLanguageTag is one of the suggested ways to obtain a Locale object.

@scordio
Copy link
Contributor Author

scordio commented Feb 4, 2023

In case you prefer to keep the current behavior thus avoiding a breaking change, an alternative could be an opt-in argument converter like @JavaTimeConversionPattern.

A few ideas about naming:

  • @LanguageTag
  • @LanguageTagConversion
  • @LanguageTagLocale

The implicit conversion could then be changed in the next major version of JUnit.

@Bukama
Copy link
Contributor

Bukama commented Apr 16, 2023

Looking at Java 19, the Locale(String) constructor has been deprecated together with the other constructors and Locale::forLanguageTag is one of the suggested ways to obtain a Locale object.

We (JUnit Pioneer) also changed our implementation as the old constructor is deprecated, see JUnit Pioneer issue and JUnit Pioneer PR. Why do I note this here? Because some of the builder methods validates the input against a valid pattern - which the old constructor did not. So you might have the same discussion as we if you see this as a breaking change or not.

@scordio
Copy link
Contributor Author

scordio commented Apr 18, 2023

Thanks for the pointers, @Bukama!

As far as I can see, JUnit Pioneer favored the builder only when the fine-grained parameters are specified, otherwise Locale::forLanguageTag is used.

Do you see any issue with using Locale::forLanguageTag to solve the issue with the JUnit argument conversion?

@Bukama
Copy link
Contributor

Bukama commented Apr 21, 2023

Do you see any issue with using Locale::forLanguageTag to solve the issue with the JUnit argument conversion?

I would not expect any issues as this method does not do a syntax check (as far as I understand the docs)

@scordio
Copy link
Contributor Author

scordio commented Apr 22, 2023

In case you prefer to keep the current behavior thus avoiding a breaking change, an alternative could be an opt-in argument converter like @JavaTimeConversionPattern.

A few ideas about naming:

* `@LanguageTag`

* `@LanguageTagConversion`

* `@LanguageTagLocale`

The implicit conversion could then be changed in the next major version of JUnit.

As mentioned in #2702 (comment), a custom argument converter could be a candidate for JUnit Pioneer.

@marcphilipp
Copy link
Member

Team decision: Since the Locale constructor is not deprecated for removal, there's no urgency to change the existing behavior. Since switching from Locale::new to Locale::forLanguageTag would likely break existing tests, we would need to make it configurable one way or another. For now, we'll wait for additional interest from the community.

@sbrannen
Copy link
Member

During the team discussion (resulting in the above team decision), I mentioned that Spring Framework dealt with this issue by introducing "lenient" Locale parsing in order to support both legacy locale formats as well as BCP 47 formats.

@scordio
Copy link
Contributor Author

scordio commented May 12, 2023

Hi @marcphilipp and @sbrannen, I was going to submit a feature request to JUnit Pioneer for a custom argument converter, but stopped because I'm guessing that having the behavior configurable in JUnit would probably make the Pioneer converter obsolete.

I understand you're waiting for additional interest. I am definitely interested in having a solution, one way or another, but I also understand a single person's need doesn't count as a community interest 😅

I'm happy to help with the changes, but some initial design would of course be required.

In case you're open to discussing the solution further, here is my proposal: as I can hardly imagine test suites where you want to have a combination of the old and new conversion behavior for Locale, I would introduce a configuration parameter in the style of junit.jupiter.tempdir.scope, perhaps junit.jupiter.params.arguments.conversion.locale=default|lenient.

In case you don't want to invest more in this right now, that's totally fine and understood!

@marcphilipp
Copy link
Member

@scordio Wouldn't it rather be "ISO 639" vs. "BCP 47"?

@scordio
Copy link
Contributor Author

scordio commented May 13, 2023

@marcphilipp I guess you're referring to the values of the configuration property, right?

If the change would be the one I mentioned at #3141 (comment), yes, "ISO 639" vs. "BCP 47" would make more sense as property values.

In the previous comment, I proposed default vs. lenient following the example @sbrannen described at #3141 (comment) but probably it's not needed to have a mode supporting both at the same time (I actually mentioned the same in my previous comment... fighting with myself, it seems 🤦).

Just as a last comment, I was reading again the Javadoc of Spring's StringUtils.parseLocale(). Overall, Spring supports three variants:

  1. Locale's toString() format (e.g., "en", "en_US")
  2. Locale's toString() format with spaces as separators (e.g., "en US")
  3. BCP 47 (e.g. "en-US")

I tested how the two solutions would perform with such use cases:

Locale::getLanguage Locale::getCountry Locale::toString
new Locale("en") en en
Locale.forLanguageTag("en") en en
new Locale("en_US") en_us en_us
Locale.forLanguageTag("en_US")
new Locale("en-US") en-us en-us
Locale.forLanguageTag("en-US") en US en_US
new Locale("en US") en us en us
Locale.forLanguageTag("en US")

Neither of the underscore and space use cases are adequately supported by either new Locale(String) or Locale.forLanguageTag(String). However, I don't know if JUnit should support these use cases at all.

My proposal would be to not support them as they are not backed by any standard.

Thoughts?

@scordio
Copy link
Contributor Author

scordio commented Jun 1, 2023

@marcphilipp if you don't have objections, I'll compose a PR in the next few days for this topic.

@marcphilipp
Copy link
Member

SGTM. Sorry for the delayed response. I was out for a few days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants