Update integrated test baseline storage #14

Merged on May 7, 2024 (46 commits)
Changes from 5 commits

Commits (46)
ff14ade
Adding python scripts to manage baseline io
cssherman Mar 15, 2024
d8b0e14
Wiring in new baseline management tools
cssherman Mar 15, 2024
f3d2860
Fixing typo in geos_ats parser
cssherman Mar 15, 2024
93fcf9c
Fixing package import bug
cssherman Mar 15, 2024
dccdc28
Updating geos_ats input args, various baseline method updates
cssherman Mar 18, 2024
6e4ce12
Updating geos ats command line parsing
cssherman Mar 21, 2024
bb68813
Adding yaml path to ats environment setup
cssherman Mar 29, 2024
d9b4c11
Splitting baseline archive packing, upload
cssherman Mar 29, 2024
97f063e
Fixing bug in baseline packing
cssherman Apr 2, 2024
80ae85c
Adding https baseline fetch option
cssherman Apr 2, 2024
0bacdce
Adding options to work with baseline cache files
cssherman Apr 2, 2024
7325ae0
Fixing baseline archive names
cssherman Apr 2, 2024
0b0591c
Fixing baseline archive structure, copying log files
cssherman Apr 2, 2024
a7facad
Fixing blob download name
cssherman Apr 2, 2024
8a17acc
Handling empty directories for baseline management
cssherman Apr 10, 2024
18aa1d9
Allowing integrated tests to run without baselines
cssherman Apr 10, 2024
02ee8ab
Adding baseline management error check
cssherman Apr 10, 2024
89ff130
Adding additional logging to baseline management code
cssherman Apr 10, 2024
6e9ae4e
Adding additional logging to baseline management code
cssherman Apr 10, 2024
6c1a49d
Fixing log copying bug
cssherman Apr 11, 2024
b35d7f8
Removing test messages from geos_ats
cssherman Apr 11, 2024
49af05e
Adding simple log check script
cssherman Apr 11, 2024
fead5c9
Updating log checker for geos ats
cssherman Apr 11, 2024
2ceda4f
Fixing log checker script
cssherman Apr 11, 2024
683f82b
Fixing log check script
cssherman Apr 12, 2024
b7da058
Attempting to use an anonymous gcp client for baseline fetch
cssherman Apr 12, 2024
070fc12
Fixing log check, allowing baselines to be packed to various folders
cssherman Apr 12, 2024
cbed575
Fixing geos ats blob name
cssherman Apr 12, 2024
37bcbfb
Adding whitelist for geos ats log check script
cssherman Apr 15, 2024
f752546
Adding yaml input option to geos_ats log checker
cssherman Apr 15, 2024
a4b40a7
Using relative file paths for geos_ats html logs
cssherman Apr 15, 2024
63d9624
Updating ats html table format
cssherman Apr 16, 2024
f154225
Updating html report style
cssherman Apr 17, 2024
107bebc
Removing auto page refresh from geos ats report
cssherman Apr 17, 2024
3e74652
Updating html report style
cssherman Apr 19, 2024
0f0e48e
Updating html report style
cssherman Apr 19, 2024
8669589
Adding additional assets to html report
cssherman Apr 19, 2024
64a5f4c
Fixing report label
cssherman Apr 20, 2024
a33c9d5
Fixing lightbox settings
cssherman Apr 20, 2024
a2d4194
Grouping lightbox captions
cssherman Apr 20, 2024
ab0bc6b
Adding baseline history file
cssherman Apr 22, 2024
97f7d41
Fixing the baseline log path
cssherman Apr 22, 2024
8d7c3dc
Separating logs from archives by default
cssherman Apr 29, 2024
cad1f59
Fixing parsing of geos ats options
cssherman Apr 30, 2024
218af63
Skipping baseline management for some test actions
cssherman Apr 30, 2024
b82a757
Adding an additional prerequisite to geos_ats
cssherman May 1, 2024
185 changes: 185 additions & 0 deletions geos_ats_package/geos_ats/baseline_io.py
@@ -0,0 +1,185 @@
import os
import logging
import tempfile
import shutil
import yaml
import time
from google.cloud import storage

logger = logging.getLogger( 'geos_ats' )


def download_baselines( bucket_name: str,
                        blob_name: str,
                        baseline_path: str,
                        force_redownload: bool = False,
                        ok_delete_old_baselines: bool = False ):
    """
    Download and unpack baselines from GCP to the local machine

    Args:
        bucket_name (str): Name of the GCP bucket
        blob_name (str): Name of the baseline blob
        baseline_path (str): Path to unpack the baselines
        force_redownload (bool): Force re-download baseline files
        ok_delete_old_baselines (bool): Automatically delete old baseline files if present
    """
    # Setup
    baseline_path = os.path.abspath( os.path.expanduser( baseline_path ) )
    status_path = os.path.join( baseline_path, '.blob_name' )

    # Check to see if the baselines are already downloaded
    logger.info( 'Checking for existing baseline files...' )
    if os.path.isdir( baseline_path ):
        logger.info( f'Target baseline directory already exists: {baseline_path}' )
Review comment (Contributor):

Disputable, but maybe

    if not os.path.isdir( baseline_path ):
        os.makedirs( os.path.dirname( baseline_path ), exist_ok=True )
    else:
        logger.info( f'Target baseline directory already exists: {baseline_path}' )
        ...

would remove the one-line else at the very end (which comes as a surprise)?
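
For illustration only (this sketch is not part of the PR's diff, and the function name is hypothetical), the inverted check suggested above would let the body end without the trailing one-line else:

    import os
    import logging

    logger = logging.getLogger( 'geos_ats' )


    def download_baselines_sketch( baseline_path: str ):
        # Hypothetical restructuring: handle the missing-directory case first,
        # so the "directory already exists" branch carries the cleanup logic.
        baseline_path = os.path.abspath( os.path.expanduser( baseline_path ) )
        if not os.path.isdir( baseline_path ):
            os.makedirs( os.path.dirname( baseline_path ), exist_ok=True )
        else:
            logger.info( f'Target baseline directory already exists: {baseline_path}' )
            # ... existing-baseline checks, the user prompt, and shutil.rmtree go here ...
        # ... the download/unpack logic then follows for both branches ...
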

        if os.path.isfile( status_path ):
            last_blob_name = open( status_path, 'r' ).read()
            if ( blob_name == last_blob_name ) and not force_redownload:
                logger.info( f'Target baselines are already downloaded: {blob_name}' )
                logger.info( 'To re-download these files, run with the force_redownload option' )
                return

        if not ok_delete_old_baselines:
            for ii in range( 10 ):
                print( f'Existing baseline files found: {baseline_path}' )
                user_input = input( 'Delete old baselines? [y/n]' )
                user_input = user_input.strip().lower()
                if user_input in [ "y", "yes" ]:
                    logger.debug( 'User chose to delete old baselines' )
                    break
                elif user_input in [ "n", "no" ]:
                    logger.debug( 'User chose to keep old baselines' )
                    logger.warning( 'Running with out of date baseline files' )
                    return
                else:
                    print( f'Unrecognized option: {user_input}' )
                    raise Exception( 'Failed to parse user options for old baselines' )

        logger.info( 'Deleting old baselines...' )
        shutil.rmtree( baseline_path )
cssherman marked this conversation as resolved.

    else:
        os.makedirs( os.path.dirname( baseline_path ), exist_ok=True )

    # Download new baselines
    try:
        logger.info( 'Downloading baselines...' )
        tmpdir = tempfile.TemporaryDirectory()
        archive_name = os.path.join( tmpdir.name, 'baselines.tar.gz' )

        # Download from GCP
        client = storage.Client( use_auth_w_custom_endpoint=False )
        bucket = client.bucket( bucket_name )
        blob = bucket.blob( blob_name )
        blob.download_to_filename( archive_name )

        # Unpack new baselines
        logger.info( 'Unpacking baselines...' )
        shutil.unpack_archive( archive_name, baseline_path, format='gztar' )
        logger.info( 'Finished fetching baselines!' )

    except Exception as e:
        logger.error( 'Failed to fetch baseline files' )
        logger.error( str( e ) )


def upload_baselines( bucket_name: str, blob_name: str, baseline_path: str ):
    """
    Pack and upload baselines to GCP

    Args:
        bucket_name (str): Name of the GCP bucket
        blob_name (str): Name of the baseline blob
        baseline_path (str): Path to unpack the baselines
    """
    # Setup
    baseline_path = os.path.abspath( os.path.expanduser( baseline_path ) )
    status_path = os.path.join( baseline_path, '.blob_name' )

    # Check to see if the baselines are already downloaded
    logger.info( 'Checking for existing baseline files...' )
    if not os.path.isdir( baseline_path ):
        logger.error( f'Could not find target baselines: {baseline_path}' )
        raise FileNotFoundError( 'Could not find target baseline files' )

    # Check for old blob name files and over-write if necessary
    last_blob_name = ''
    if os.path.isfile( status_path ):
        last_blob_name = open( status_path, 'r' ).read()
    with open( status_path, 'w' ) as f:
        f.write( blob_name )

    try:
        logger.info( 'Archiving baseline files...' )
        tmpdir = tempfile.TemporaryDirectory()
        archive_name = os.path.join( tmpdir.name, 'baselines.tar.gz' )
        shutil.make_archive( archive_name, format='gztar', base_dir=baseline_path )

        # Upload to gcp
        logger.info( 'Uploading baseline files...' )
        client = storage.Client()
        bucket = client.bucket( bucket_name )
        blob = bucket.blob( blob_name )
        blob.upload_from_filename( archive_name, if_generation_match=0 )
        logger.info( 'Finished uploading baselines!' )

    except Exception as e:
        logger.error( 'Failed to upload baselines!' )
        logger.error( str( e ) )

        # Reset the local blob name
        with open( status_path, 'w' ) as f:
            f.write( last_blob_name )


def manage_baselines( options ):
    """
    Manage the integrated test baselines
    """
    # Check for integrated test yaml file
    test_yaml = ''
    if options.yaml:
        test_yaml = options.yaml
    else:
        test_yaml = os.path.join( options.geos_bin_dir, '..', '..', '.integrated_tests.yaml' )

    if not os.path.isfile( test_yaml ):
        raise Exception( f'Could not find the integrated test yaml file: {test_yaml}' )

    test_options = {}
    with open( test_yaml ) as f:
        test_options = yaml.safe_load( f )

    baseline_options = test_options.get( 'baselines', {} )
    for k in [ 'bucket', 'baseline' ]:
        if k not in baseline_options:
            raise Exception(
                f'Required information (baselines/{k}) missing from integrated test yaml file: {test_yaml}' )

    # Manage baselines
    if options.action == 'upload_baselines':
        if os.path.isdir( options.baselineDir ):
            epoch = int( time.time() )
            upload_name = f'integrated_test_baseline_{epoch}.tar.gz'
            upload_baselines( baseline_options[ 'bucket' ], upload_name, options.baselineDir )

            # Update the test config file
            baseline_options[ 'baseline' ] = upload_name
            with open( test_yaml, 'w' ) as f:
                yaml.dump( baseline_options, f )
            quit()
        else:
            raise Exception( f'Could not find the requested baselines to upload: {options.baselineDir}' )

    download_baselines( baseline_options[ 'bucket' ],
                        baseline_options[ 'baseline' ],
                        options.baselineDir,
                        force_redownload=options.update_baselines,
                        ok_delete_old_baselines=options.delete_old_baselines )

    # Cleanup
    if not os.path.isdir( options.baselineDir ):
        raise Exception( f'Could not find the specified baseline directory: {options.baselineDir}' )

    if options.action == 'download_baselines':
        quit()
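
As a usage sketch (an editorial illustration, not part of the PR): the YAML layout below is inferred from the bucket/baseline keys that manage_baselines requires, and the bucket name, blob name, and paths are placeholders.

    import yaml
    from geos_ats import baseline_io

    # Hypothetical .integrated_tests.yaml contents: a 'baselines' section holding
    # the GCP bucket name and the blob (archive) to fetch
    example_config = {
        'baselines': {
            'bucket': 'example-geos-baselines-bucket',
            'baseline': 'integrated_test_baseline_1714000000.tar.gz'
        }
    }
    with open( '.integrated_tests.yaml', 'w' ) as f:
        yaml.dump( example_config, f )

    # Fetch and unpack the archive named in the config (download errors are logged, not raised)
    baseline_io.download_baselines( example_config[ 'baselines' ][ 'bucket' ],
                                    example_config[ 'baselines' ][ 'baseline' ],
                                    'integrated_test_baselines',
                                    force_redownload=False,
                                    ok_delete_old_baselines=True )

In the PR itself these calls are driven by manage_baselines( options ), which reads the YAML, uploads an epoch-stamped archive for the upload_baselines action, or downloads the listed baseline otherwise.
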
48 changes: 24 additions & 24 deletions geos_ats_package/geos_ats/command_line_parsers.py
@@ -1,8 +1,5 @@
import logging
import argparse
import os
import shutil
from pydoc import locate

action_options = {
"run": "execute the test cases that previously did not pass.",
@@ -17,6 +14,8 @@
"rebaseline": "rebaseline the testcases from a previous run.",
"rebaselinefailed": "rebaseline only failed testcases from a previous run.",
"report": "generate a text or html report, see config for the reporting options.",
"upload_baselines": "Upload baselines to bucket",
"download_baselines": "Download baselines from bucket",
}

check_options = {
@@ -46,6 +45,20 @@ def build_command_line_parser():

parser.add_argument( "-b", "--baselineDir", type=str, help="Root baseline directory" )

parser.add_argument( "-y", "--yaml", type=str, help="Path to YAML config file", default='' )

parser.add_argument( "-d",
"--delete-old-baselines",
action="store_true",
default=False,
help="Automatically delete old baselines" )

parser.add_argument( "-u",
"--update-baselines",
action="store_true",
default=False,
help="Force baseline file update" )

action_names = ','.join( action_options.keys() )
parser.add_argument( "-a", "--action", type=str, default="run", help=f"Test actions options ({action_names})" )

@@ -59,14 +72,6 @@
default="info",
help=f"Log verbosity options ({verbosity_names})" )

parser.add_argument( "-d",
"--detail",
action="store_true",
default=False,
help="Show detailed action/check options" )

parser.add_argument( "-i", "--info", action="store_true", default=False, help="Info on various topics" )

parser.add_argument( "-r",
"--restartCheckOverrides",
nargs='+',
@@ -115,17 +120,20 @@ def parse_command_line_arguments( args ):
# Check action, check, verbosity items
check = options.check
if check not in check_options:
print(
f"Selected check option ({check}) not recognized. Try running with --help/--details for more information" )
print( f"Selected check option ({check}) not recognized" )
exit_flag = True

action = options.action
if action not in action_options:
print(
f"Selected action option ({action}) not recognized. Try running with --help/--details for more information"
)
print( f"Selected action option ({action}) not recognized" )
exit_flag = True

if exit_flag:
for option_type, details in zip( [ 'action', 'check' ], [ action_options, check_options ] ):
print( f'\nAvailable {option_type} options:' )
Review comment (Contributor), suggested change (original line, then suggestion):

    for option_type, details in zip( [ 'action', 'check' ], [ action_options, check_options ] ):
    for option_type, details in ('action', action_options), ('check', check_options ):

Reply (Collaborator): does that work? I am not so sure. I think you need to make it a list of tuples.

Reply (Contributor):

    >>> for i, j in (0, 1), (2, 3):
    ...     print(i, j)
    ...
    0 1
    2 3

I guess there's some auto conversion to an Iterable on the fly.

Reply (Collaborator, author): @CusiniM I was surprised by this behavior myself when @TotoGaz suggested it.

for k, v in details.items():
print( f' {k}: {v}' )

verbose = options.verbose
if verbose not in verbose_options:
print( f"Selected verbose option ({verbose}) not recognized" )
@@ -138,14 +146,6 @@
if not options.baselineDir:
options.baselineDir = options.workingDir

# Print detailed information
if options.detail:
for option_type, details in zip( [ 'action', 'check' ], [ action_options, check_options ] ):
print( f'\nAvailable {option_type} options:' )
for k, v in details.items():
print( f' {k}: {v}' )
exit_flag = True

if exit_flag:
quit()

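
A small, standalone sketch (not the project's parser, which also defines options not shown in this diff) of how the new baseline flags parse with argparse:

    import argparse

    # Minimal mirror of the options added in this PR, for illustration only
    parser = argparse.ArgumentParser()
    parser.add_argument( "-b", "--baselineDir", type=str, help="Root baseline directory" )
    parser.add_argument( "-y", "--yaml", type=str, help="Path to YAML config file", default='' )
    parser.add_argument( "-d", "--delete-old-baselines", action="store_true", default=False,
                         help="Automatically delete old baselines" )
    parser.add_argument( "-u", "--update-baselines", action="store_true", default=False,
                         help="Force baseline file update" )
    parser.add_argument( "-a", "--action", type=str, default="run", help="Test action" )

    options = parser.parse_args( [ '-a', 'download_baselines', '-b', './baselines', '-d', '-u' ] )
    # argparse maps the dashed names to underscored attributes, matching what
    # baseline_io.manage_baselines reads (delete_old_baselines, update_baselines)
    print( options.action, options.baselineDir, options.delete_old_baselines, options.update_baselines )

Reusing -d here appears to be why the old -d/--detail and -i/--info flags were dropped from the parser in this PR.
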
3 changes: 2 additions & 1 deletion geos_ats_package/geos_ats/main.py
@@ -5,7 +5,7 @@
import subprocess
import time
import logging
from geos_ats import command_line_parsers
from geos_ats import command_line_parsers, baseline_io

test_actions = ( "run", "rerun", "check", "continue" )
report_actions = ( "run", "rerun", "report", "continue" )
@@ -292,6 +292,7 @@ def main():
os.chdir( ats_root_dir )
os.makedirs( options.workingDir, exist_ok=True )
create_log_directory( options )
baseline_io.manage_baselines( options )

# Check the test configuration
from geos_ats import configuration_record
2 changes: 2 additions & 0 deletions geos_ats_package/setup.cfg
@@ -17,6 +17,8 @@ install_requires =
numpy
lxml
tabulate
pyyaml
google-cloud-storage
TotoGaz marked this conversation as resolved.
ats @ https://github.com/LLNL/ATS/archive/refs/tags/7.0.105.tar.gz
python_requires = >=3.7
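
The new google-cloud-storage dependency supports the download/upload code in baseline_io.py. As a hedged aside (not code from this PR, placeholder names throughout): one way to read a public bucket without credentials, in the spirit of the "anonymous gcp client" commit above, is the library's anonymous client.

    from google.cloud import storage

    # Anonymous, read-only client for a publicly readable bucket (placeholder names)
    client = storage.Client.create_anonymous_client()
    bucket = client.bucket( 'example-public-baselines-bucket' )
    blob = bucket.blob( 'integrated_test_baseline_example.tar.gz' )
    blob.download_to_filename( 'baselines.tar.gz' )

The PR's download path instead constructs storage.Client( use_auth_w_custom_endpoint=False ), as shown in baseline_io.py above.
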
