This repository has been archived by the owner on Mar 30, 2021. It is now read-only.

Commit

Merge pull request #37 from alexconlin/master
Fix for handling duplicate records when loading in Redshift.
rmahfoud committed Mar 12, 2015
2 parents 1951dd6 + 4359c79 commit 7c2417b
Showing 1 changed file with 1 addition and 2 deletions.
@@ -119,7 +119,6 @@ public List<String> emit(final UnmodifiableBuffer<String> buffer) throws IOExcep
LOG.info("All the files in this set were already copied to Redshift.");
// All of these files were already written
rollbackAndCloseConnection(conn);
-records.clear();
return Collections.emptyList();
}

@@ -128,7 +127,7 @@ public List<String> emit(final UnmodifiableBuffer<String> buffer) throws IOExcep
}
// Write manifest file to Amazon S3
try {
-writeManifestToS3(manifestFileName, records);
+writeManifestToS3(manifestFileName, deduplicatedRecords);
} catch (Exception e) {
LOG.error("Error writing file " + manifestFileName + " to S3. Failing this emit attempt.", e);
return buffer.getRecords();
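
The effect of passing the deduplicated list to the manifest writer can be illustrated with a minimal, self-contained sketch. The class `ManifestDedupSketch`, the `deduplicate` helper, and the in-memory set of already-copied files are assumptions made for illustration only; they are not the emitter's actual lookup, which is not shown in this diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ManifestDedupSketch {

    // Hypothetical helper: keep only files that were not already copied,
    // and drop repeats within the same batch.
    public static List<String> deduplicate(List<String> records, Set<String> alreadyCopied) {
        Set<String> seen = new HashSet<>(alreadyCopied);
        List<String> fresh = new ArrayList<>();
        for (String file : records) {
            // Set.add returns false when the file was already seen.
            if (seen.add(file)) {
                fresh.add(file);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("file-a", "file-b", "file-b", "file-c");
        Set<String> alreadyCopied = new HashSet<>(Arrays.asList("file-a"));

        List<String> deduplicatedRecords = deduplicate(records, alreadyCopied);

        // A manifest built from `records` would list file-a again and file-b twice;
        // one built from `deduplicatedRecords` lists each new file once.
        System.out.println("records:             " + records);
        System.out.println("deduplicatedRecords: " + deduplicatedRecords);
    }
}
```

In this sketch, a manifest built from `records` would reload `file-a`, while one built from `deduplicatedRecords` contains only `file-b` and `file-c`.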