CAL-920 Ignore Page rows on ingest to Californica (#847)

* test
* working but title problem
* removed temp type insertion
* sending to Dev for testing
* fixed for rubocop
* added _not ingested_ to the Import Status Report
* reformatted for rubocop
* updated to fix test
* added back row_status code
* fixed rubocop
* added additional logic for _not-ingested_ message

Co-authored-by: darrowcoucla <[email protected]>
Co-authored-by: JenDiamond <[email protected]>
UCLALibrary · Sep 1, 2020 · e5a020b
1 parent 4740a4a

Showing 4 changed files with 19 additions and 8 deletions.
diff --git a/.rubocop.yml b/.rubocop.yml
@@ -61,6 +61,7 @@ Metrics/CyclomaticComplexity:
     - app/models/concerns/discoverable.rb
     - app/uploaders/csv_manifest_validator.rb
     - app/indexers/year_parser.rb
+    - app/jobs/csv_row_import_job.rb
 
 Metrics/LineLength:
   Exclude:
@@ -110,6 +111,7 @@ Metrics/PerceivedComplexity:
     - app/uploaders/csv_manifest_validator.rb
     - app/controllers/application_controller.rb
     - app/indexers/year_parser.rb
+    - app/jobs/csv_row_import_job.rb
 
 RSpec/AnyInstance:
   Enabled: false

diff --git a/README.md b/README.md
@@ -156,9 +156,10 @@ bundle exec rake californica:ingest:csv
 
 ### Adding works to a collection
 
-You can add a `Parent ARK` column to the CSV file.  For each row of the CSV, add the ARK for the collection that work should belong to.  The importer will find the `Collection` record with the matching ARK.  If the `Collection` record doesn't exist yet, the importer will create a new `Collection` using that ARK.
+You can add a `Parent ARK` column to the CSV file. For each row of the CSV, add the ARK for the collection that work should belong to. The importer will find the `Collection` record with the matching ARK. If the `Collection` record doesn't exist yet, the importer will create a new `Collection` using that ARK.
 
 ## Read-only mode
+
 Californica has a read-only mode, which can be enabled via the `Settings` menu on the admin dashboard, and is useful for making consistent backups or migrating data.
 
 Ideally, a user should log in as an admin, enable read-only mode, and keep the window open so they can then disable it again. If the window accidentally gets closed, and the system is stuck in read-only mode, open a rails console and fix the problem like this:
@@ -186,6 +187,8 @@ Californica is available under the [Apache License Version 2.0](./LICENSE).
 ---
 
 ### flaky tests
-##### `bundle exec rspec spec --seed 43408`  
-+ rspec `./spec/system/import_from_csv_spec.rb:16` # Importing records from a CSV file logged in as an admin user starts the import
-+ rspec `./spec/system/new_collection_spec.rb:21` # Create a new collection logged in as an admin user successfully creates a new collection with an ark based identifier
+
+##### `bundle exec rspec spec --seed 43408`
+
+- rspec `./spec/system/import_from_csv_spec.rb:16` # Importing records from a CSV file logged in as an admin user starts the import
+- rspec `./spec/system/new_collection_spec.rb:21` # Create a new collection logged in as an admin user successfully creates a new collection with an ark based identifier
diff --git a/app/jobs/csv_row_import_job.rb b/app/jobs/csv_row_import_job.rb
@@ -10,8 +10,8 @@ def perform(row_id:)
     @row = CsvRow.find(@row_id)
     @row.ingest_record_start_time = Time.current
 
-    @row.status = 'in progress'
     @metadata = JSON.parse(@row.metadata)
+    @row.status = @metadata["Object Type"].include?("Page") ? 'not ingested' : 'in progress'
     @metadata = @metadata.merge(row_id: @row_id)
     @csv_import = CsvImport.find(@row.csv_import_id)
     import_file_path = @csv_import.import_file_path
@@ -22,12 +22,18 @@ def perform(row_id:)
                         else
                           actor_record_importer
                         end
-    selected_importer.import(record: record)
+
+    selected_importer.import(record: record) unless @metadata["Object Type"].include?("Page")
     @row.status = if ['Page', 'ChildWork'].include?(record.mapper.object_type)
-                    "complete"
+                    if @metadata["Object Type"].include?("Page")
+                      "not ingested"
+                    else
+                      "complete"
+                    end
                   else
                     "pending finalization"
                   end
+
     end_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
     @row.ingest_record_end_time = Time.current
 

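Review note on `app/jobs/csv_row_import_job.rb`: the hunks above test `@metadata["Object Type"].include?("Page")` in three places (the initial status, the `unless` guard on the import, and the nested final status). The following is a minimal sketch of the same decision gathered into two helpers. The names `page_row?` and `final_row_status` are hypothetical and do not appear in the commit, and the `to_s` guard for a missing "Object Type" value is an added assumption, not behavior the commit provides.

```ruby
# Hypothetical helpers (not part of the commit) that mirror the status
# logic in the diff above.

# True when the CSV row's "Object Type" mentions "Page".
# Assumption: `to_s` is added here so a missing value yields false
# instead of raising NoMethodError on nil.
def page_row?(metadata)
  metadata["Object Type"].to_s.include?("Page")
end

# Final row status, following the nested conditional in the diff:
# Page/ChildWork mapper types end as "not ingested" (Page rows) or
# "complete"; every other type waits for finalization.
def final_row_status(metadata, mapper_object_type)
  if %w[Page ChildWork].include?(mapper_object_type)
    page_row?(metadata) ? "not ingested" : "complete"
  else
    "pending finalization"
  end
end
```

With helpers like these, the import guard could read `selected_importer.import(record: record) unless page_row?(@metadata)`, which keeps the Page check in one place.
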
diff --git a/app/uploaders/csv_manifest_validator.rb b/app/uploaders/csv_manifest_validator.rb
@@ -161,7 +161,7 @@ def validate_records
         field_label, types_that_require = REQUIRED_VALUES[j]
         next this_row_errors << "Rows missing required value for \"#{REQUIRED_VALUES[j][0]}\".  Your spreadsheet must have this value." if field_label == 'Title' && row[column_number].blank?
         next this_row_errors << "Rows missing required value for \"#{REQUIRED_VALUES[j][0]}\".  Your spreadsheet must have this value." if field_label == 'Item ARK' && row[column_number].blank?
-        next this_row_errors << "Rows missing required value for \"#{REQUIRED_VALUES[j][0]}\".  Your spreadsheet must have this value." if field_label == 'IIIF Manifest URL' && row[column_number].blank?
+        next this_row_errors << "Rows missing required value for \"#{REQUIRED_VALUES[j][0]}\".  Your spreadsheet must have this value." if field_label == 'IIIF Manifest URL' && !object_type.include?("Page") && row[column_number].blank?
         next unless types_that_require.include?(object_type)
         next unless row[column_number].blank?
         this_row_warnings << if field_label == 'Rights.copyrightStatus'