REL-878227 long text fields encoding configuration (#18)
annasojkapal authored Oct 19, 2023
1 parent 2b294c2 commit 1779090
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion README.md
@@ -1139,7 +1139,7 @@ List of samples:
## Import Job Settings

### Encoding
- For improved performance when dealing with fileshare data on ADLS, we highly recommend using extracted text or other long text files encoded in UTF-16. By doing so, you can avoid the need for conversion to the correct encoding, leading to significant time savings in your document and image workflows.
+ For best performance, we highly recommend using UTF-16 encoding for any single long text field (including Extracted text). Other encodings are still supported, but will be converted to UTF-16 which will add delay to document or image import process.

For the document workflow, set **FieldMapping.Encoding** to UTF-16. Similarly, for the image workflow, configure **ImageSettings.ExtractedTextEncoding** as UTF-16. With these settings in place, the conversion overhead is eliminated, and your files will be copied directly in the unicode encoding, resulting in faster processing times.

@@ -1169,6 +1169,7 @@ For the document workflow, set **FieldMapping.Encoding** to UTF-16. Similarly, f
.WithoutFieldsMapped()
.WithoutFolders();

+ If your mapping contains more than one long text field, you should use UTF-16. No other encodings are supported in this case.
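
As an illustration of preparing files ahead of an import, here is a minimal sketch that re-encodes a long text file to UTF-16 before it is referenced from a load file. It is not part of the SDK: the helper name `reencode_to_utf16`, the assumption that the source files are UTF-8, and the choice of little-endian byte order with a byte order mark are all hypothetical and should be checked against your own data and import settings.

```python
import codecs
from pathlib import Path


def reencode_to_utf16(src, dst, source_encoding="utf-8-sig"):
    """Re-encode one text file as UTF-16 (little-endian, with BOM).

    Assumptions (not from the README): the source is UTF-8, and the
    importer accepts UTF-16 LE with a byte order mark.
    """
    # Work on raw bytes so that line endings are preserved exactly.
    raw = Path(src).read_bytes()
    # "utf-8-sig" also strips a leading UTF-8 BOM if one is present.
    text = raw.decode(source_encoding)
    # Write the BOM explicitly, followed by the UTF-16 LE payload.
    Path(dst).write_bytes(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))
```

Running a conversion like this once, before the import job is created, moves the encoding cost out of the import itself, which is the saving the recommendation above describes.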

### FileSizeColumnIndex
Another valuable setting that can enhance performance is the **FieldMapping.FileSizeColumnIndex**. By configuring this setting, the need for additional file size calculations can be eliminated. The file sizes will be automatically extracted from the load file, streamlining the process and saving valuable processing time.