Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scoverage in Pants OSS Proposal #8026

Closed
sammy-1234 opened this issue Jul 8, 2019 · 3 comments
Closed

Scoverage in Pants OSS Proposal #8026

sammy-1234 opened this issue Jul 8, 2019 · 3 comments

Comments

@sammy-1234
Copy link
Contributor

sammy-1234 commented Jul 8, 2019

Problem

The current code coverage framework offered by Pants for JVM targets (via Cobertura or Jacoco) works well for Java targets but doesn’t provide accurate results when using code coverage for Scala targets. Some of the common issues faced while building code coverage for scala targets include Case classes and traits being incorrectly measured or being missed altogether.

Solution

Adding Scoverage

Current issues with Scoverage

Instrumentation of binaries done during compile time instead of runtime
Unlike Jacoco and Cobertura, Scoverage instruments the target binaries for generating coverage during compile time instead of runtime. These instruments are then stored at a location specified by the dataDir provided by the user. Consequently, this demands the user to specify the dataDir for each target at compile time. Furthermore, since compilation of targets may run in a different/unique sandbox and tests may run in another different/unique sandbox, this compile time instrumentation of targets also breaks remoting as discussed below as discussed below.

No support for remoting compilation #265
Since scoverage does not put the created dataDir into the classpath, it does not allow for remoting. It would be beneficial to add an option in OSS Scoverage to include the dataDir in the classpath. This would provide more flexibility in systems that do not guarantee a local filesystem. Classpaths are well-known and would provide a better and more stable future extensibility for the scalac-scoverage-plugin.

Resolving issues with Scoverage

This PR attempts at resolving the remoting issue with scoverage. As specified in the comments in #265, we store the instrumented binaries at the output classpath specified by the compiler and at runtime, the measurements directory is specified by an environment variable SCOVERAGE_MEASUREMENT_PATH where the coverage measurements are stored.

Until the suggested changes are merged into OSS scalac-scoverage-plugin, the issue can be resolved by making the necessary changes into a Twitter forked scalac-scoverage-plugin and updating Pants to point to the forked scoverage repo rather than the upstream one. Once the changes are committed into OSS scoverage, pants can again point to the upstream version.

Generating scoverage instruments and measurements in Pants

Instrumentation can be done with the help of a new subsystem scala_coverage_platform.py that overrides scalac_plugins and scalac_plugin_args in scala targets when scoverage is enabled.

As recommended by @stuhood below,
Scoverage would generates measurements in the same way as Jacoco or Cobertura, i.e by modifying run_modifications method inside scoverage.py under coverage directory. Scoverage sets the System property scoverage_measurement_path to the output_dir where the measurements will be stored.

Generating scoverage reports in Pants

Once the instruments and measurements have been generated, we can pass those to ReportGenerator.scala to generate the scoverage reports. This too can be done by modifying report method in scoverage.py (similar to how Jacoco and Cobertura generate reports).

Running scoverage with pants

If scoverage is sucsessfuly installed, you can get scala coverage reports by running the following command:

./pants --scoverage-enable-scoverage=True test <test-target>

@stuhood
Copy link
Member

stuhood commented Jul 10, 2019

Thanks @sammy-1234! This is a great overview.

A few comments:

In addition to that, the subsystem will provide for two more options - reportDir and dataDirPrefix.

I'd suggest looking at what cobertura and jacoco do here: my understanding is that rather than having separate settings per coverage subsystem (Cobertura and Jacoco are the existing subsystem names), there are global settings in

class CodeCoverage(Subsystem):
"""Manages setup and construction of JVM code coverage engines.
"""
options_scope = 'coverage'
@classmethod
def subsystem_dependencies(cls):
return super().subsystem_dependencies() + (Cobertura.Factory, Jacoco.Factory)
# TODO(jtrobec): move these to subsystem scope after deprecating
@staticmethod
def register_junit_options(register, register_jvm_tool):
register('--coverage', type=bool, fingerprint=True, help='Collect code coverage data.')
register('--coverage-processor', advanced=True, fingerprint=True,
choices=['cobertura', 'jacoco'], default=None,
help="Which coverage processor to use if --coverage is enabled. If this option is "
"unset but coverage is enabled implicitly or explicitly, defaults to 'cobertura'."
"If this option is explicitly set, implies --coverage.")
# We need to fingerprint this even though it nominally UI-only affecting option since the
# presence of this option alone can implicitly flag on `--coverage`.
register('--coverage-open', type=bool, fingerprint=True,
help='Open the generated HTML coverage report in a browser. Implies --coverage.')
register('--coverage-jvm-options', advanced=True, type=list, fingerprint=True,
help='JVM flags to be added when running the coverage processor. For example: '
'{flag}=-Xmx4g {flag}=-Xms2g'.format(flag='--coverage-jvm-options'))
register('--coverage-force', advanced=True, type=bool,
help='Attempt to run the reporting phase of coverage even if tests failed '
'(defaults to False, as otherwise the coverage results would be unreliable).')
, with per-implementation settings on those subsystems.

The reportDir would probably not be a per-coverage-implementation setting: it looks like the CoverageEngine interface has a report method that takes the output directory to use. For example:

def report(self, output_dir, execution_failed_exception=None):
if execution_failed_exception:
self._settings.log.warn('Test failed: {}'.format(execution_failed_exception))
if self._coverage_force:
self._settings.log.warn('Generating report even though tests failed, because the'
'coverage-force flag is set.')
else:
return
report_dir = os.path.join(output_dir, 'coverage', 'reports')

dataDirPrefix will be the root directory where all the instrumentation and measurements generated by scoverage will go.

This cannot be a user-configurable setting. The CoverageEngine implementation for Scoverage will need to set this per run of junit, which I think means implementing the def run_modifications method so that it adds the environment variable.

@sammy-1234
Copy link
Contributor Author

sammy-1234 commented Jul 10, 2019

Thanks @stuhood for this! It helps greatly. I'll look into it and update the issue accordingly.

@nsaechao
Copy link
Contributor

Until the suggested changes are merged into OSS scalac-scoverage-plugin, the issue can be resolved by making the necessary changes into a Twitter forked scalac-scoverage-plugin and updating Pants to point to the forked scoverage repo rather than the upstream one. Once the changes are committed into OSS scoverage, pants can again point to the upstream version.

It should be noted that the releases for OSS scalac-scoverage-plugin is not frequent (~2 years between releases, with the current release happening ~a month ago). There is potential risk that the patch is never merged with OSS or having to wait significantly a long time for the merge to be released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants