-
Notifications
You must be signed in to change notification settings - Fork 19
Mapping columns to attributes
The main way in which an importer processes the contents of a spreadsheet is by mapping columns in it to attributes in the data model. Here we take a closer look at how to declare this mapping in an importer.
Importers are subclasses of ActiveImporter::Base
. This base class provides a DSL for declaring several features of the importing process. One of the most important methods in this DSL is the column
method. It receives a column header name, and an attribute name, and this establishes a mapping between the values in that spreadsheet column, and the corresponding attribute in the data model.
class EmployeeImporter < ActiveImporter::Base
imports Employee
column 'First name', :first_name
column 'Last name', :last_name
end
As you can see in the example above, this importer expects to receive a spreadsheet with columns named 'First name' and 'Last name'. It will import all rows in such a spreadsheet, into the Employee
data model, storing the values in the column 'First name' into the :first_name
attribute of the model, and the values in the column 'Last name' into the :last_name
attribute, creating one new model for each row processed successfully.
We often need to process incoming data that has to be pre-processed before storing it in the corresponding attribute. Perhaps we need to trim excess whitespace off a string value, parse a string value converting it into some internal representation, etc. This can be achieved by providing a block to the column method, in which we process the incoming value and return the actual value to be stored in the corresponding attribute.
class EmployeeImporter < ActiveImporter::Base
imports Employee
column('First name', :first_name) do |first_name|
first_name.capitalize
end
column 'Last name', :last_name, &:capitalize
end
You can see in the example above how this works. The block of code receives as argument the incoming value (the value coming from the spreadsheet in the corresponding column). Inside the block we process this value and return the actual value we want to store in our database model. In this case we're instructing the importer to capitalize each name (in case you didn't notice, both attributes are being processed in the same way, we're just using in the second example a more concise Ruby way to specify the transformation block).
Most commonly, we might need to use an incoming value to search a corresponding record from another data model, and establish the association with the model being created. Blocks like the ones shown above can help us do that:
class EmployeeImporter < ActiveRecord::Base
imports Employee
column 'Department', :department do |department_name|
Department.find_by(name: department_name)
end
# ...
end
You can see how in this example, we're using the incoming value (the department name) to look-up in the Department
data model for a department with that name. In this case, we're setting department to nil
if no department is found, but we could take different approaches, such as raising an error if no such department is found, or even dynamically creating new departments when the name does not match any existing one. This all depends on what you want for your applications.
Sometimes, while processing a column's incoming value, you need to check on values from other columns. It turns out that you have full access to the entire spreadsheet row inside these blocks of code. The importer exposes it via the row
method, which is a hash with the keys being the column headers, and the values being the actual values for each column in the current row.
Suppose we're processing a spreadsheet of employees, and it comes with columns for first and last name. But our data model has a single column called :full_name
, which should contain the concatenation of the two other names.
class EmployeeImporter < ActiveImporter::Base
imports Employee
column('First name', :full_name do |first_name|
last_name = row['Last name']
[first_name, last_name].compact.join(' ')
end
end
Note how we're mapping the 'First name' column to the :full_name
attribute, but inside the block we're using the row
hash to access the last name and include it in the actual value that will be stored in the model.
Importer column declarations may omit the attribute and the block of code. This is useful when you have a column that will be processed inside a custom processing block, but you still need to inform the importer that this is a required column, so it can be used when scanning the spreadsheet to find the header row.
class EmployeeImporter < ActiveImporter::Base
imports Employee
column 'Tags'
on :row_processing do
tags = row['Tags'].split(',').map(&:strip).map(&:downcase)
model.manager = !!tags.delete('manager')
model.tag_ids = tags.map { |tag| Tag.where(name: tag).first_or_create }
end
end
The importer above will not automatically assign the value from the 'Tags' column to any attribute, but will still consider this column name when searching for the header row.
Columns can be declared as optional. This lets the importer know that the spreadsheet may or may not include that column, and only processes it if present. The column name will not be used to match the row header.
class EmployeeImporter < ActiveRecord::Base
imports Employee
column 'Department', :department, optional: true do |department_name|
Department.find_by(name: department_name)
end
# ...
end
The above importer will process spreadsheets as usual, but it will also be able to consume a spreadsheet not containing the 'Department' column. In that case, the block of code will never be invoked. It is important to note that if the :department
attribute is required, then you still need to be sure it is assigned some value. Otherwise all rows will fail to be imported.
If logic inside these blocks ever gets too complex, you may need to do some refactoring. In that case remember you can use helper methods to DRY-up your importer code.
© 2014 Continuum