
The PDF Writer Object

galkahana edited this page Aug 15, 2011 · 9 revisions

The PDFWriter object is the entry point to the PDF Library. Users create a PDFWriter object instance and through it perform either higher level commands, or access the lower level objects for complementary functionality.
The PDFWriter object header file can be accessed here

Basic Usage

Using the PDFWriter object is simple. A template for creating a PDF file can be seen in the following code:

PDFWriter pdfWriter;	
// Open file for PDF writing
pdfWriter.StartPDF("C:\\Basic.PDF" , ePDFVersion13);
// Write pages and contents
.....
// Finalize PDF. write final objects and close file for writing
pdfWriter.EndPDF();

The code first creates a PDFWriter object instance. It then calls the StartPDF command, which opens the target PDF file for writing and starts the PDF with the header. At this point the user may add pages, graphics and whatever else the PDF content should be. Finally, the EndPDF command finalizes the PDF, which is then ready for use.

PDFWriter methods

The PDFWriter object has various high level methods for writing the PDF. It also provides access to the lower levels of the PDF writing model, to allow for extensibility and advanced functionality. The following sections provide a short summary of the PDFWriter commands.

Page Content Context methods

To create a PDF page, create an instance of the PDFPage object with a simple new PDFPage() call. The page content can now be set. To manipulate the content of the page use the relevant PDFWriter methods:

  1. PageContentContext* StartPageContentContext(PDFPage* inPage)
    StartPageContentContext creates a content context for a page. The context will eventually manifest itself as one or more content streams for the page. As input it receives a page, and it returns a PageContentContext object which can then be used for writing content commands.
  2. EStatusCode PausePageContentContext(PageContentContext* inPageContext)
    PausePageContentContext is to be used whenever the writing of page content has to be temporarily paused in order to write a different object. This is very useful when a user who writes a page wants to pause the stream in order to write a Form or Image XObject, or a font definition, to be used in that page. Simply pause the content using PausePageContentContext and, after you have finished writing the intermediate object, continue writing to the stream using the context commands. Using any of the context commands will restart the context writing.
  3. EStatusCode EndPageContentContext(PageContentContext* inPageContext)
    EndPageContentContext is to be used when the page content is done. It finalizes the writing context and releases all relevant objects. After calling this command, the writing of the page object itself can commence.

Page Writing Methods

After the content writing is finished you can write the page. Use one of the following:

  1. EStatusCode WritePage(PDFPage* inPage)
    Write a page object. The page object is not released by the command, and may now be reused.
  2. EStatusCode WritePageAndRelease(PDFPage* inPage)
    Write a page object and release it. The pointer may no longer be used.

For extensibility purposes you might be interested in the PDF Object IDs of the written pages. In that case, use one of these two methods instead:

  1. EStatusCodeAndObjectIDType WritePageAndReturnPageID(PDFPage* inPage)
  2. EStatusCodeAndObjectIDType WritePageReleaseAndReturnPageID(PDFPage* inPage)
    Both methods return an object of type EStatusCodeAndObjectIDType, whose .first member is the method success/failure code and whose .second member is the page PDF Object ID (in case of success).
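
As a minimal sketch of how such a pair result might be consumed: the stand-in typedefs below are assumptions that mirror the description above (a status code in .first, an object ID in .second) so that the snippet compiles on its own. The real definitions come from the library headers, and PageIDOrZero is a hypothetical helper, not a library function.

```cpp
#include <cassert>
#include <utility>

// Stand-ins for the library types, declared here only so the sketch compiles
// by itself. The enumerator names and the unsigned long ID are assumptions.
enum EStatusCode { eSuccess, eFailure };
typedef unsigned long ObjectIDType;
typedef std::pair<EStatusCode, ObjectIDType> EStatusCodeAndObjectIDType;

// Hypothetical helper: returns the page object ID on success, or 0 on failure.
// 0 serves as "no ID" because PDF indirect object numbers start at 1.
ObjectIDType PageIDOrZero(const EStatusCodeAndObjectIDType& inResult)
{
    if (inResult.first != eSuccess)
        return 0; // .second is only meaningful in case of success
    return inResult.second;
}
```

With the real library, the argument would be the value returned by WritePageAndReturnPageID or WritePageReleaseAndReturnPageID. The point of the sketch is simply that .second should not be read before .first has been checked.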

Form XObject Writing Methods

PDFWriter allows the creation of form XObjects. Form XObjects function much like pages, having their own content context.

To create a form XObject use:
PDFFormXObject* StartFormXObject(const PDFRectangle& inBoundingBox,const double* inMatrix = NULL)
The returned form XObject may be used for any of the following:

  • accessing the form content stream, whose usage is much like that of a page content stream, except that it cannot be paused.
  • accessing the XObject ID, which is required in order to reference the XObject from pages.
  • accessing the form resources dictionary, which is used for adding object references required by the content stream.

This method has an overload that accepts an ObjectIDType, which allows the user to set a pre-created object ID for the form. This practice is normally used in forward referencing scenarios.

To finalize and write a form XObject use:
EStatusCode EndFormXObjectAndRelease(PDFFormXObject* inFormXObject)
This method writes the form XObject and releases it.
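
The optional inMatrix parameter of StartFormXObject can be illustrated with a small sketch. It assumes that the pointer refers to six doubles in the standard PDF matrix order [a b c d e f], which is how the PDF specification defines transformation matrices. ApplyMatrix is a hypothetical helper written for this illustration and is not part of the library.

```cpp
#include <cassert>
#include <utility>

// Applies a PDF-style matrix [a b c d e f] to the point (inX, inY):
//   x' = a*x + c*y + e
//   y' = b*x + d*y + f
std::pair<double, double> ApplyMatrix(const double inMatrix[6], double inX, double inY)
{
    double x = inMatrix[0] * inX + inMatrix[2] * inY + inMatrix[4];
    double y = inMatrix[1] * inX + inMatrix[3] * inY + inMatrix[5];
    return std::make_pair(x, y);
}
```

For example, under this assumption the array {0.5, 0, 0, 0.5, 100, 200} scales the form content to half size and then moves it by (100, 200), so the form-space point (10, 20) ends up at (105, 210).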

Image creating methods

Image usage is handled via the CreateImageXObjectFromJPGFile, CreateFormXObjectFromJPGFile and CreateFormXObjectFromTIFFFile commands (as well as custom stream related overloads). They are explained in detail in Images Support.

PDF embedding methods

You can embed pages of another PDF in this PDF in one of two forms: as complete pages or as parts of an existing page. The former method is used when you just want to append pages to the PDF. The latter method is used when you want to use the original PDF pages as components for pages in the written PDF. When embedding pages as components you can choose between embedding them as Form XObjects, which allows you to reuse them, and merging their content with a page, which fits a one-time usage. The following methods exist for PDF embedding:

  1. EStatusCodeAndObjectIDTypeList CreateFormXObjectsFromPDF( const string& inPDFFilePath, const PDFPageRange& inPageRange, EPDFPageBox inPageBoxToUseAsFormBox, const double* inTransformationMatrix = NULL, const ObjectIDTypeList& inCopyAdditionalObjects = ObjectIDTypeList())
    Create multiple form XObjects, one for each page of the input PDF. The return result is a pair of a status code and a list of object IDs. The IDs are those of the created XObjects. You can use them to place the form XObjects in pages. As with any reusable form XObject, they can be placed in multiple locations of a page, or in multiple different pages.
    inPDFFilePath is the path to the PDF file to embed.
    inPageRange is a structure of type PDFPageRange defining which pages to use. Its mType member may be either eRangeTypeAll (embed all pages) or eRangeTypeSpecific (embed specific pages). For eRangeTypeSpecific, use the mSpecificRanges member, which is a list of page ranges in the PDF to embed (pairs of inclusive ranges, e.g. (1,3) would embed pages 1, 2 and 3).
    inPageBoxToUseAsFormBox defines which box of the pages to use as the box for the form. A PDF page may have various types of boxes, and one should be used for the form. Usually ePDFPageBoxMediaBox, matching the media box, would be a good choice.
    inTransformationMatrix is an array of 6 doubles that defines the transformation matrix for the form.
    inCopyAdditionalObjects is a list of object IDs to copy from the source PDF into the written file. This is useful for extensibility, as it allows copying more than just the pages themselves.
  2. EStatusCodeAndObjectIDTypeList AppendPDFPagesFromPDF( const string& inPDFFilePath, const PDFPageRange& inPageRange, const ObjectIDTypeList& inCopyAdditionalObjects = ObjectIDTypeList())
    Append the input PDF pages as pages of the written PDF. The result is a status and a list of object IDs, matching the object IDs of the appended pages.
    inPDFFilePath is the path to the PDF file to embed.
    inPageRange is a structure of type PDFPageRange defining which pages to use.
    inCopyAdditionalObjects is a list of object IDs to copy from the source PDF into the written file. This is useful for extensibility, as it allows copying more than just the pages themselves.
  3. EStatusCode MergePDFPagesToPage(PDFPage* inPage, const string& inPDFFilePath, const PDFPageRange& inPageRange, const ObjectIDTypeList& inCopyAdditionalObjects = ObjectIDTypeList())
    Merge the content of one or more pages of an input PDF with an existing page in the written PDF. The merged content becomes an integral part of the content of the page. This is similar to creating a Form XObject from a page and placing it, but is more fitting when you do not wish to reuse the content of the input pages. In such a case it is the more straightforward approach, as it does not require the creating and placing code that Form XObjects need.
    inPage is the target page, to which the content should be merged.
    inPDFFilePath is the path to the PDF file to embed.
    inPageRange is a structure of type PDFPageRange defining which pages to use. Note that merging content from multiple pages using this method alone will probably require the usage of events to write positioning code between the placement of the multiple pages. Prefer using a copying context in such a case, as provided by the CreatePDFCopyingContext method.
    inCopyAdditionalObjects is a list of object IDs to copy from the source PDF into the written file. This is useful for extensibility, as it allows copying more than just the pages themselves.
  4. PDFDocumentCopyingContext* CreatePDFCopyingContext(const string& inPDFFilePath)
    CreatePDFCopyingContext creates an object of type PDFDocumentCopyingContext, which provides an alternative, more powerful method of embedding content from another PDF. A context is created by calling CreatePDFCopyingContext with the input PDF file path. Then, using the context commands, you can add PDF pages as either pages or XObjects, or merge them directly into an existing page, as well as add any other non-page related information. When done, you can destroy the context object. The benefit of using this method over the regular methods for embedding PDF is that you can combine embedding of pages from multiple source PDFs (using multiple contexts), as well as use multiple methods of embedding with the same PDF efficiently.
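
The inclusive page range pairs used by the inPageRange parameter can be sketched as follows. PageRangeSketch is a stand-in that only mirrors the member and enumerator names mentioned above, and its exact types are assumptions. ExpandSpecificRanges is a hypothetical helper written for this illustration, and neither is the library's own definition.

```cpp
#include <cassert>
#include <list>
#include <utility>
#include <vector>

// Stand-in for the page range structure. The member and enumerator names
// follow the description above; the concrete types are assumptions.
struct PageRangeSketch
{
    enum ERangeType { eRangeTypeAll, eRangeTypeSpecific };
    ERangeType mType;
    std::list<std::pair<unsigned long, unsigned long> > mSpecificRanges;
    PageRangeSketch() : mType(eRangeTypeAll) {}
};

// Hypothetical helper: lists every page number selected by the specific
// ranges, with both ends of each pair included. Returns an empty list for
// eRangeTypeAll, since "all pages" needs no enumeration.
std::vector<unsigned long> ExpandSpecificRanges(const PageRangeSketch& inRange)
{
    std::vector<unsigned long> pages;
    if (inRange.mType != PageRangeSketch::eRangeTypeSpecific)
        return pages;
    std::list<std::pair<unsigned long, unsigned long> >::const_iterator it;
    for (it = inRange.mSpecificRanges.begin(); it != inRange.mSpecificRanges.end(); ++it)
        for (unsigned long page = it->first; page <= it->second; ++page)
            pages.push_back(page);
    return pages;
}
```

A range holding the pairs (1,3) and (5,5) expands to 1, 2, 3 and 5. Whether the library counts pages from 0 or from 1 is not settled by this sketch and should be checked against the library itself.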

Note that there are also methods for using custom streams, and not just files.

Read more about PDF embedding in PDF Embedding.

Session end/start methods

You may want to stop using this instance of the PDFWriter object, and continue later. In this case, you should call the PDFWriter Shutdown method with this instance as the final command (then you can destroy the object), and call ContinuePDF with the new instance (before calling any other method). The methods are defined as follows:

  1. EStatusCode Shutdown(const string& inStateFilePath)
    Stops the writing of the current instance. The method creates a file, in the input path, that contains the state information. The file is used later by ContinuePDF to allow the new instance of PDFWriter to continue from the same point.
  2. EStatusCode ContinuePDF(const string& inOutputFilePath, const string& inStateFilePath, const LogConfiguration& inLogConfiguration = LogConfiguration::DefaultLogConfiguration)
    Continues writing a PDF whose writing was stopped. A new instance should call this method before any other. inOutputFilePath is the PDF output file path, and inStateFilePath is the file where the state information was saved (see Shutdown). Using the inLogConfiguration parameter you can also determine the log file writing configuration.

Extensibility, Lower level objects

To allow the user to use advanced features of the library, or to extend the library capabilities, the PDFWriter has methods for accessing lower level objects.

The following methods provide access to lower level objects:

  1. DocumentContext& GetDocumentContext()
    GetDocumentContext retrieves the DocumentContext object used by PDFWriter. The DocumentContext object is responsible for writing complex objects such as pages, images and forms. It is described in detail in The DocumentContext Object.
  2. ObjectsContext& GetObjectsContext()
    GetObjectsContext retrieves the ObjectsContext object used by PDFWriter. The ObjectsContext object is responsible for writing all lower level objects, including primitives and streams. It also holds the IndirectObjectsReferenceRegistry, which is in charge of allocating IDs for new objects. The ObjectsContext object is very useful for anyone extending the library who wishes to add new object types not supported by the higher levels, or to extend existing ones. You can find more details on the ObjectsContext in The ObjectsContext Object.
  3. OutputFile& GetOutputFile()
    If you need direct access to the output file, use the GetOutputFile command. It provides the handle to the output file. For now, the header file of OutputFile is the source for details.