CAL-920 Ignore Page rows on ingest to Californica (#847)
* test

* working but title problem

* removed temp type insertion

* sending to Dev for testing

* fixed for rubocop

* added _not ingested_ to the Import Status Report

* reformatted for rubocop

* updated to fix test

* added back row_status code

* fixed rubocop

* added additional logic for _not-ingested_ message

Co-authored-by: darrowcoucla <[email protected]>
Co-authored-by: JenDiamond <[email protected]>
3 people authored Sep 1, 2020
1 parent 4740a4a commit e5a020b
Showing 4 changed files with 19 additions and 8 deletions.
2 changes: 2 additions & 0 deletions .rubocop.yml
@@ -61,6 +61,7 @@ Metrics/CyclomaticComplexity:
 - app/models/concerns/discoverable.rb
 - app/uploaders/csv_manifest_validator.rb
 - app/indexers/year_parser.rb
+- app/jobs/csv_row_import_job.rb

 Metrics/LineLength:
 Exclude:
@@ -110,6 +111,7 @@ Metrics/PerceivedComplexity:
 - app/uploaders/csv_manifest_validator.rb
 - app/controllers/application_controller.rb
 - app/indexers/year_parser.rb
+- app/jobs/csv_row_import_job.rb

 RSpec/AnyInstance:
 Enabled: false
11 changes: 7 additions & 4 deletions README.md
@@ -156,9 +156,10 @@ bundle exec rake californica:ingest:csv

 ### Adding works to a collection

-You can add a `Parent ARK` column to the CSV file. For each row of the CSV, add the ARK for the collection that work should belong to. The importer will find the `Collection` record with the matching ARK. If the `Collection` record doesn't exist yet, the importer will create a new `Collection` using that ARK.
+You can add a `Parent ARK` column to the CSV file. For each row of the CSV, add the ARK for the collection that work should belong to. The importer will find the `Collection` record with the matching ARK. If the `Collection` record doesn't exist yet, the importer will create a new `Collection` using that ARK.

 ## Read-only mode
+
 Californica has a read-only mode, which can be enabled via the `Settings` menu on the admin dashboard, and is useful for making consistent backups or migrating data.

 Ideally, a user should log in as an admin, enable read-only mode, and keep the window open so they can then disable it again. If the window accidentally gets closed, and the system is stuck in read-only mode, open a rails console and fix the problem like this:
@@ -186,6 +187,8 @@ Californica is available under the [Apache License Version 2.0](./LICENSE).
 ---

 ### flaky tests
-##### `bundle exec rspec spec --seed 43408`
-+ rspec `./spec/system/import_from_csv_spec.rb:16` # Importing records from a CSV file logged in as an admin user starts the import
-+ rspec `./spec/system/new_collection_spec.rb:21` # Create a new collection logged in as an admin user successfully creates a new collection with an ark based identifier
+
+##### `bundle exec rspec spec --seed 43408`
+
+- rspec `./spec/system/import_from_csv_spec.rb:16` # Importing records from a CSV file logged in as an admin user starts the import
+- rspec `./spec/system/new_collection_spec.rb:21` # Create a new collection logged in as an admin user successfully creates a new collection with an ark based identifier
12 changes: 9 additions & 3 deletions app/jobs/csv_row_import_job.rb
@@ -10,8 +10,8 @@ def perform(row_id:)
 @row = CsvRow.find(@row_id)
 @row.ingest_record_start_time = Time.current

-@row.status = 'in progress'
 @metadata = JSON.parse(@row.metadata)
+@row.status = @metadata["Object Type"].include?("Page") ? 'not ingested' : 'in progress'
 @metadata = @metadata.merge(row_id: @row_id)
 @csv_import = CsvImport.find(@row.csv_import_id)
 import_file_path = @csv_import.import_file_path
@@ -22,12 +22,18 @@ def perform(row_id:)
 else
 actor_record_importer
 end
-selected_importer.import(record: record)
+
+selected_importer.import(record: record) unless @metadata["Object Type"].include?("Page")
 @row.status = if ['Page', 'ChildWork'].include?(record.mapper.object_type)
-"complete"
+if @metadata["Object Type"].include?("Page")
+"not ingested"
+else
+"complete"
+end
 else
 "pending finalization"
 end
+
 end_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
 @row.ingest_record_end_time = Time.current

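The final-status branch in the `csv_row_import_job.rb` diff above can be read in isolation as the sketch below. This is a hypothetical, standalone restatement and not code from the repository: `row_status_for` and its keyword arguments are names invented here, with `metadata_object_type` standing in for `@metadata["Object Type"]` and `mapper_object_type` for `record.mapper.object_type`.

```ruby
# Hypothetical helper (not in the repository) restating how
# CsvRowImportJob#perform picks the row's final status after this commit.
def row_status_for(metadata_object_type:, mapper_object_type:)
  if ['Page', 'ChildWork'].include?(mapper_object_type)
    # Page rows are skipped by the importer, so they are reported
    # as "not ingested" instead of "complete".
    metadata_object_type.include?('Page') ? 'not ingested' : 'complete'
  else
    # Other object types still wait for a later finalization step.
    'pending finalization'
  end
end

puts row_status_for(metadata_object_type: 'Page', mapper_object_type: 'Page')
puts row_status_for(metadata_object_type: 'ChildWork', mapper_object_type: 'ChildWork')
puts row_status_for(metadata_object_type: 'Work', mapper_object_type: 'Work')
```

Like the diff, the sketch assumes the `Object Type` value is always present; calling `include?` on `nil` would raise `NoMethodError`.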
2 changes: 1 addition & 1 deletion app/uploaders/csv_manifest_validator.rb
@@ -161,7 +161,7 @@ def validate_records
 field_label, types_that_require = REQUIRED_VALUES[j]
 next this_row_errors << "Rows missing required value for \"#{REQUIRED_VALUES[j][0]}\". Your spreadsheet must have this value." if field_label == 'Title' && row[column_number].blank?
 next this_row_errors << "Rows missing required value for \"#{REQUIRED_VALUES[j][0]}\". Your spreadsheet must have this value." if field_label == 'Item ARK' && row[column_number].blank?
-next this_row_errors << "Rows missing required value for \"#{REQUIRED_VALUES[j][0]}\". Your spreadsheet must have this value." if field_label == 'IIIF Manifest URL' && row[column_number].blank?
+next this_row_errors << "Rows missing required value for \"#{REQUIRED_VALUES[j][0]}\". Your spreadsheet must have this value." if field_label == 'IIIF Manifest URL' && !object_type.include?("Page") && row[column_number].blank?
 next unless types_that_require.include?(object_type)
 next unless row[column_number].blank?
 this_row_warnings << if field_label == 'Rights.copyrightStatus'

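The one-line change in `csv_manifest_validator.rb` relaxes the `IIIF Manifest URL` requirement for Page rows. A minimal sketch of that guard follows, assuming plain Ruby without Rails: `manifest_url_error?` is a hypothetical name, and the `nil`/empty check is a rough stand-in for ActiveSupport's `blank?`, which the real validator uses.

```ruby
# Hypothetical predicate (not in the repository) mirroring the changed guard:
# a blank IIIF Manifest URL is an error only when the row is not a Page.
def manifest_url_error?(object_type, value)
  # Plain-Ruby approximation of ActiveSupport's blank? for strings and nil.
  blank = value.nil? || value.strip.empty?
  !object_type.include?('Page') && blank
end

puts manifest_url_error?('Page', nil)   # Page rows may omit the URL
puts manifest_url_error?('Work', '')    # other rows may not
```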