From 825d5a08f7fc1e32408d6281bd965794675132da Mon Sep 17 00:00:00 2001 From: Marina Gourtovaia Date: Fri, 25 Oct 2024 11:24:02 +0100 Subject: [PATCH] Updated documentation --- Changes | 25 ++++ lib/npg_qc/autoqc/checks/review.pm | 208 ++++++++++++++++------------- 2 files changed, 141 insertions(+), 92 deletions(-) diff --git a/Changes b/Changes index 48f6a14d..3ed8684c 100644 --- a/Changes +++ b/Changes @@ -1,5 +1,30 @@ LIST OF CHANGES FOR NPG-QC PACKAGE + - npg_qc::autoqc::qc_store - changed loading of ORM classes into memory + to on demand at run time. + - Added lane-level assessment to RoboQC: + - Extended npg_qc::Schema::Result::Review class to enable saving of + sequencing (lane-level) QC outcomes to database + - Extended npg_qc::autoqc::check::review to enable lane-level assessment + and documented the new feature. + - Made redundant 'qc_type' key of the 'robo' configuration. + - Fixed a bug in creating a list of class names in + npg_qc::autoqc::check::review. + A list of names for result classes to be loaded was inferred + from all criteria in the 'robo' section under consideration. + Until now this did not present a problem since even when in + the study RoboQC section of the configuration file a number + of different criteria sections were present, it was possible + to load all of the AutoQC results for any given entity. + The bug is revealed when lane-level assessment is added to + the 'robo' section under the 'default' configuration. This + type of assessment might require presence of lane-level autoqc + results for checks that are never performed on sample level, + 'tag_metrics' check being the most obvious example. + The fix is to infer the list of class names from the criteria, + which are going to be evaluated for the entity, rather than + from a collection of criteria from the whole 'robo' section. + release 73.0.0 (2024-10-24) - Removed from the repository unused test data. 
- npg_qc::autoqc::check::review: diff --git a/lib/npg_qc/autoqc/checks/review.pm b/lib/npg_qc/autoqc/checks/review.pm index 66245caf..4d29c2a0 100644 --- a/lib/npg_qc/autoqc/checks/review.pm +++ b/lib/npg_qc/autoqc/checks/review.pm @@ -59,84 +59,80 @@ npg_qc::autoqc::checks::review =head2 Overview -This checks evaluates the results of other autoqc checks -against a predefined set of criteria. +This check evaluates the results of other autoqc checks against a predefined +set of criteria. -If data product acceptance criteria are defined, it is possible to -introduce a degree of automation into the manual QC process. To -provide interoperability with the API supporting the manual QC process, -the outcome of the evaluation, which is performed by this check, is -recorded not only as a simple undefined, pass or fail as in other autoqc -checks, but also as one of valid manual or user QC outcomes. +If data product acceptance criteria are defined, it is possible to introduce +a degree of automation into the QC process. To provide interoperability with +the API supporting the manual QC process, the outcome of the evaluation is +recorded not only as a simple C<undefined>, C<pass> or C<fail> as in other +autoqc checks, but also as one of the valid manual lane- or library-level QC +outcomes. =head2 Types of criteria -The robo section of the product configuration file sits either -within the configuration for a particular study or in the default -section, or in both locations. A study-specific RoboQC definition +The C<robo> section of the product configuration file sits either within +the configuration for a particular study or in the C<default> section, or +in both locations. For a given entity a study-specific RoboQC definition takes precedence over the default one. -Evaluation criteria for samples vary depending on the sequencing -instrument type, library type, sample type, etc. There might be a -need to exclude some samples from RoboQC. 
The criteria key of the -robo configuration points to an array of criteria objects. Each of -the criteria contains two further keys, one for acceptance and one -for applicability criteria. The acceptance criteria are evaluated -if either the applicability criteria have been satisfied or no -applicability criteria are defined. - -The applicability criteria for each criteria object should be -set in such a way that the order of evaluation of the criteria -array does not matter. If applicability criteria in all of the -criteria objects are not satisfied, no QC outcome is assigned -and the pass attribute of the review result object remains unset. - -The product can satisfy applicability criteria in at most one -criteria object. If none of the study-specific applicability -criteria are satisfied, the review check does not proceed even if -the product might satisfy one of the default applicability criteria. - -=head2 QC outcomes - -A valid Manual QC outcome is one of the values from the library -qc outcomes dictionary (mqc_library_outcome_dict table of the -npg_qc database), i.e. one of 'Accepted', 'Rejected' or 'Undecided' -outcomes. If the final_qc_outcome flag of this class' instance is -set to true, the outcome is also marked as 'Final', otherwise it is -marked as 'Preliminary' (examples: 'Accepted Final', -'Rejected Preliminary'). By default the final_qc_outcome flag is -false and the produced outcomes are preliminary. - -The type of QC outcome can be configured within the Robo QC -section of product configuration. The default type is library -Manual QC. +Evaluation criteria for samples vary depending on the sequencing instrument +and flowcell type, library type, sample reference, etc. Each of the +C<criteria> keys of the C<robo> configuration contains two further keys, one +for acceptance and one for applicability criteria. + +The applicability criteria for each criteria object should be set in such +a way that the order of evaluation of the criteria array does not matter. 
+If the applicability criteria are not satisfied in any of the criteria objects, +the evaluation is not performed, no QC outcome is assigned and the C<pass> +attribute of the review result object remains unset. + +The product can satisfy applicability criteria in at most one criteria object. +If none of the study-specific applicability criteria are satisfied, the review +check does not proceed even if the product satisfies one of the applicability +criteria in the C<default> section. + +=head2 QC outcomes values + +A valid manual QC outcome is one of the values from the library +qc outcomes dictionary (C<mqc_library_outcome_dict> table of the +QC database), i.e. one of 'Accepted', 'Rejected' or 'Undecided' +outcomes. If the C<final_qc_outcome> flag of the review check object +is set to a C<true> value, the outcome is also marked as 'Final', +otherwise it is marked as 'Preliminary' (examples: C<Accepted Final>, +C<Rejected Preliminary>). By default the C<final_qc_outcome> flag is +C<false> and the produced outcomes are preliminary. + +=head2 QC outcomes types + +The behaviour mimics the 'manual' QC process via the SeqQC UI. For lane entities +'sequencing' QC outcomes are generated. For samples 'library' QC outcomes +are generated. =head2 Rules for assignment of the QC outcome The rules below apply to a single criteria object. The 'Accepted' outcome is assigned if the outcome of evaluation is -true, the 'Rejected' outcome is assigned otherwise. +a C<true> value, the 'Rejected' outcome is assigned otherwise. =head2 Retrieval of autoqc results to be evaluated It is possible to invoke this check on any entity. At run time an attempt is made to retrieve autoqc results for this entity (product), -which are relevant to the RoboQC for this product. If this attempt -fails, the execute method of the check will exit with an error. A +which are relevant to RoboQC for this product. If this attempt +fails, the C<execute> method of the check will exit with an error. 
A failure to retrieve the autoqc results might be for one of three -reasons: (1) either the entity is not an end product (example: a pool) -and no such results exist or (2) it is a product, but the autoqc results -have not been computed yet, or (3) they have, but their file system -location (if that's where we are looking) is different from expected -(ie given by the qc_in attribute). - -The autoqc results are retrieved either from the file system (use_db -attribute should be set to false, which is default) or from a database -(use_db attribute should be set to true). npg_qc::autoqc::qc_store class -is used to retrieve results. In contrast to the default behaviour of the -npg_qc::autoqc::qc_store class, if the database retrieval is enabled, no -fall back to a search on a file system is performed. +reasons: (1) the autoqc results have not been computed yet, or (2) the +file system location is different from the one given by the C<qc_in> +attribute. + +The autoqc results are retrieved either from the file system (C<use_db> +attribute should be set to C<false>, which is the default) or, if C<use_db> is +set to a C<true> value, from a database. C<npg_qc::autoqc::qc_store> +class is used to retrieve results. In contrast to the default behaviour +of the C<npg_qc::autoqc::qc_store> class, if no results are retrieved from +the database, a search on a file system is not attempted. =head2 Record of the evaluation criteria @@ -190,9 +186,9 @@ all criteria in the array are equally essential. Therefore, a conjunction =head2 use_db -A boolean read-only attribute, false by default. -If set to false, autoqc results are loaded from the qc_in path. -If set to true, they are loaded from the database. +A boolean read-only attribute, C<false> by default. +If set to C<false>, autoqc results are loaded from the C<qc_in> path. +If set to C<true>, the results are loaded from the database. =cut @@ -201,6 +197,7 @@ has 'use_db' => ( is => 'ro', ); +# The DBIx schema object for the database with autoqc results. 
has '_qc_schema' => ( isa => 'Maybe[npg_qc::Schema]', is => 'ro', @@ -220,9 +217,10 @@ sub _build__qc_schema { =head2 final_qc_outcome -A boolean read-only attribute, false by default. -If set to false, the result of the evaluation is saved as a -preliminary manual QC outcome. If set to true, the result of the +A boolean read-only attribute, C<false> by default. + +If set to C<false>, the result of the evaluation is saved as a +preliminary manual QC outcome. If set to C<true>, the result of the evaluation is saved as a final manual QC outcome. =cut @@ -234,22 +232,30 @@ has 'final_qc_outcome' => ( =head2 conf_path -An attribute, an absolute path of the directory with -the pipeline's configuration files. Inherited from -npg_tracking::util::pipeline_config +An attribute, an absolute path of the directory with the pipeline's +configuration files. + +Inherited from C<npg_tracking::util::pipeline_config>. =head2 conf_file_path -A method. Returns the path of the product configuration file. -Inherited from npg_tracking::util::pipeline_config +Method returning the path of the product configuration file. + +Inherited from C<npg_tracking::util::pipeline_config>. =head2 runfolder_path -The runfolder path, an optional attribute. In case of complex products -(multi-component compositions) is only relevant if all components belong -to the same sequencing run. This attribute is used to retrieve information -from RunInfo.xml and {r,R}unParameters.xml files. Some 'robo' configuration -might not require information of this nature, thus the attribute is optional. +The runfolder path, an optional attribute. + +In case of complex products (multi-component compositions) it is only relevant +if all components belong to the same sequencing run. This attribute is used to +retrieve information from C<RunInfo.xml> and C<{r,R}unParameters.xml> files. + +Some C<robo> configurations do not require information of this nature, thus the +attribute is optional. In practice the attribute has to be set to allow for +correct evaluation of applicability of all criteria of a production RoboQC +configuration. 
+ If the information from the above-mentioned files is required, but the access to the staging run folder is not available, the check cannot be run. @@ -265,7 +271,7 @@ has 'runfolder_path' => ( =head2 BUILD A method that is run before returning the new object instance to the caller. -Errors if any attributes of the object are are in conflict. +Errors if any attributes of the object are in conflict. =cut @@ -281,8 +287,9 @@ sub BUILD { =head2 can_run -Returns true if the check can be run, ie a valid RoboQC configuration exists -and one of the applicability criteria is satisfied for this product. +Returns a C<true> value if the check can be run, i.e. a valid RoboQC +configuration exists and one of the applicability criteria is satisfied for +this product. =cut @@ -320,14 +327,13 @@ sub can_run { =head2 execute -Returns early if the can_run method returns a false value. +Returns early without an error if the C<can_run> method returns a C<false> value. -An assessment of applicability of running RoboQC on this entity is performed -next, an early return is possible after that. If RoboQC is applicable, a full -evaluation of autoqc results for this product is performed. If autoqc results -that are necessary to perform the evaluation are not available or there is -some other problem with evaluation, an error is raised if the final_qc_outcome -flag is set to true, otherwise it is captured and and logged as a comment. +If C<can_run> returns a C<true> value, a full evaluation of autoqc results +for this product is performed. If autoqc results that are necessary to perform +the evaluation are not available or there is some other problem with evaluation, +an error is raised if the C<final_qc_outcome> flag is set to C<true>, otherwise +the error is captured and logged as a comment. No QC outcome is assigned if the evaluation has not had a chance to run to a successful completion. @@ -365,8 +371,8 @@ sub execute { =head2 evaluate Method implementing the top level evaluation algorithm. 
Returns the outcome -of the evaluation as 0 for a fail or 1 for a pass. Saves the outcomes of -evaluation of individual expressions in the evaluation_results attribute of +of the evaluation as C<0> for a fail or C<1> for a pass. Saves the outcomes of +evaluation of individual expressions in the C<evaluation_results> attribute of the result object. =cut @@ -384,8 +390,25 @@ sub evaluate { Returns a hash reference representing the QC outcome. - my $m_outcome = $r->generate_qc_outcome('mqc'); - + my $qc_outcome = $r->generate_qc_outcome(); + +The returned hash has three keys: + +=over + +=item C records the current time and is interpreted downstream + as the time the QC decision was made. + +=item C with the value of C<'robo_qc'>, is interpreted downstream as + the user who performed 'manual' QC. + +=item C or C for the outcome of the evaluation. + Examples of values: C, C. If lane-level + evaluation is performed, the key is C, otherwise it is + C. + +=back + =cut sub generate_qc_outcome { @@ -407,7 +430,7 @@ sub generate_qc_outcome { =head2 lims -st::api::lims object corresponding to this object's rpt_list +C<st::api::lims> object corresponding to this object's C<rpt_list> attribute. =cut @@ -424,7 +447,8 @@ sub _build_lims { =head2 runfolder -npg_tracking::illumina::runfolder object +C<npg_tracking::illumina::runfolder> object for C attribute +value. =cut