By Ben Kempe

Advanced PDF templating using XDocReport with JodConverter

PDF generation from templates is a common requirement for many software applications. Sometimes, simple fillable PDF Forms are a sufficient solution and many PDF libraries support this sort of templating.

As part of our software platform, Elements, we generate reports for all patients using a variety of report types, while giving our clients the ability to fully customize the design. To meet these business requirements, we needed a solution that generates PDFs from arbitrary advanced templates using a complex domain model. The output needed to be as close as possible to the original design and give us full control over the page, while keeping development time to a minimum.

An overview of the PDF templating landscape

Low-level generation

These solutions give full control over the generated PDF output, but recreating the template in the native library API takes up a lot of development time.

Higher-level generation using proprietary authoring solutions

These solutions emphasize BI report generation over PDF generation. Their limitations make them unsuitable for more advanced designs, and they are cumbersome to use unless the template in question was created with the included authoring software.

HTML Conversion

wkhtmltopdf is a powerful solution for generating PDFs from HTML templates, but lacks common print features like multi-column support. Also, creating a correctly positioned HTML template for print output will take up a considerable amount of development time.

Print format templating

OpenOffice ODT files provide an accurate page representation, so the format can be leveraged for generating print-ready PDF output. XDocReport (XML Document reporting) is a library which supports rendering OpenOffice ODT template files using the Velocity or Freemarker engines. The resulting ODT output can then be rendered into a PDF file using JodConverter (Java OpenDocument Converter), which manages LibreOffice processes and access to the OpenOffice API. We’re using the well-maintained JodConverter fork by Simon Braconnier here.

Implementing a templating service using XDocReport and JodConverter

Install LibreOffice locally.

Dependencies

The following dependencies (here as Maven pom.xml) are required to use XDocReport with Freemarker and JodConverter.

<properties>
    <xdocreport.version>2.0.1</xdocreport.version>
</properties>
<dependencies>
    <!-- requires LibreOffice to be installed on the instance -->
    <dependency>
        <groupId>org.jodconverter</groupId>
        <artifactId>jodconverter-local</artifactId>
        <version>4.1.0</version>
    </dependency>
    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>fr.opensagres.xdocreport.template.freemarker</artifactId>
        <version>${xdocreport.version}</version>
    </dependency>
    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>fr.opensagres.xdocreport.document.odt</artifactId>
        <version>${xdocreport.version}</version>
    </dependency>
    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>fr.opensagres.xdocreport.converter.odt.odfdom</artifactId>
        <version>${xdocreport.version}</version>
    </dependency>
</dependencies>

You can also use the jodconverter-spring-boot-starter dependency to further simplify the setup in Spring applications.

Code snippet

The (Scala) code below is sufficient to generate the conversionOutput PDF from the templateInputStream ODT file while providing all template placeholder values in a context map:

val templateEngineKind = TemplateEngineKind.Freemarker
val fieldsMeta = new FieldsMetadata(templateEngineKind)
// for html fields, add: fieldsMeta.addFieldAsTextStyling(htmlField, SyntaxKind.Html)

// declared outside the try block so that it is still in scope in finally
val interpolationOutput = File.createTempFile("interpolationOutput", ".odt")
try {
  // step 1: fill the ODT template with the values from the context map
  XDocReport.generateReport(templateInputStream, templateEngineKind.name, fieldsMeta, context, interpolationOutput)
  // step 2: let LibreOffice convert the interpolated ODT into a PDF
  JodConverter
      .convert(interpolationOutput)
      .as(DefaultDocumentFormatRegistry.ODT)
      .to(conversionOutput)
      .as(DefaultDocumentFormatRegistry.PDF)
      .execute()
} finally {
  // always remove the intermediate ODT, even if rendering or conversion failed
  FileUtils.deleteQuietly(interpolationOutput)
}

Usage

Template Syntax Quick Start for Freemarker
  • Defaults: ${myPlaceholder!someDefault}, or ${myPlaceholder!} to render an empty string when the value is missing or null
  • Nested structures: ${parent.child}
  • Lists: [#list items as i] ${i.someAttribute} [/#list] where the [#list] directives should be stored in an Input Field to avoid problems with line breaks, white spaces, etc.
  • Table row repetitions: Use the XDocReport tags @before-row[#list someRows as someRow] and @after-row[/#list] in Input Fields within the first table cell. Then use the placeholder ${someRow.someCellAttribute} in all necessary table cells. For more details, please see the XDocReport Wiki.
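
The placeholders above are resolved against the context map that is passed to XDocReport. Here is a minimal sketch of such a context. The object name ReportContext and the keys (myPlaceholder, parent, items) are hypothetical and simply mirror the syntax examples. The sketch builds Java collections directly, on the assumption that Freemarker's default object wrapping handles java.util maps and lists but not Scala collections.

```scala
import java.util.{ArrayList => JArrayList, HashMap => JHashMap, Map => JMap}

object ReportContext {
  // One list element; exposes ${i.someAttribute} inside a [#list items as i] block.
  def item(attribute: String): JMap[String, AnyRef] = {
    val m = new JHashMap[String, AnyRef]()
    m.put("someAttribute", attribute)
    m
  }

  // Builds the whole (hypothetical) context from Java collections only.
  def build(): JMap[String, AnyRef] = {
    val parent = new JHashMap[String, AnyRef]()
    parent.put("child", "nested value") // reachable as ${parent.child}

    val items = new JArrayList[JMap[String, AnyRef]]()
    items.add(item("first"))
    items.add(item("second"))

    val context = new JHashMap[String, AnyRef]()
    context.put("myPlaceholder", "Hello") // reachable as ${myPlaceholder}
    context.put("parent", parent)
    context.put("items", items)
    context
  }
}
```

Keys that are left out of the map are the ones that need a default in the template, for example ${myPlaceholder!}.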

Example

Given an ODT template and an example placeholder context, XDocReport and JodConverter produce the rendered PDF.

Creating templates from a sample PDF

Here is a simple way to get started by creating an ODT from any sample PDF:

  • Use Adobe Acrobat to export the PDF to Word 97-2003
  • Use Microsoft Word to save the resulting doc file again as Word 97-2004 (doc)
  • Load the doc file in LibreOffice and save it as ODT
  • Clean up: Delete any “Heading … Char” styles that exist in Styles And Formatting > Character Styles
  • Clean up: You will likely have to go to Styles and Formatting > Paragraph Styles > Text Body > right-click Modify and set the correct font name
  • Replace dynamic text with ${myPlaceholder} placeholders.

Troubleshooting

You can fix complex template issues by unzipping the ODT file, editing the content.xml and then updating the ODT.

unzip template.odt -d template_edit && cd template_edit

# edit content.xml, then freshen the existing entry in the archive

zip -f ../template.odt content.xml
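
For automated checks it can be handy to read content.xml without unpacking the archive by hand. The following is a minimal sketch that uses only the JDK's zip support. The object name OdtContent is hypothetical and not part of XDocReport or JodConverter.

```scala
import java.io.File
import java.util.zip.ZipFile
import scala.io.Source

object OdtContent {
  /** Returns the text of content.xml, or None if the archive has no such entry. */
  def read(odt: File): Option[String] = {
    val zip = new ZipFile(odt)
    try {
      // getEntry returns null for a missing entry, hence the Option wrapper
      Option(zip.getEntry("content.xml")).map { entry =>
        val in = zip.getInputStream(entry)
        try Source.fromInputStream(in, "UTF-8").mkString
        finally in.close()
      }
    } finally zip.close()
  }
}
```

Calling OdtContent.read(new File("template.odt")) yields the XML as a string, which can then be inspected for the problems described in this section.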

Split placeholders

Make sure you do not split placeholders (${myPlaceholder}) across ODT tags. For example, the content.xml should not contain invalid templating markup like this:

Incorrect

<text:p>${myPla<text:span>ceholder}</text:span></text:p>


Correct

<text:p>${myPlaceholder}</text:p>
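
Split placeholders are hard to spot in a large content.xml, so a small check can help. The sketch below is a heuristic: it assumes that any < between ${ and the closing brace is an ODT tag, which holds as long as literal < characters inside expressions are XML-escaped. The object name SplitPlaceholderCheck is hypothetical.

```scala
object SplitPlaceholderCheck {
  // "${", then a tag opener "<" before the closing "}"
  private val Split = """\$\{[^}<]*<[^}]*\}""".r

  /** Returns every placeholder in the XML that is interrupted by a tag. */
  def findSplit(contentXml: String): List[String] =
    Split.findAllIn(contentXml).toList
}
```

Applied to the incorrect markup above, findSplit returns the single match ${myPla<text:span>ceholder}. For the correct markup it returns an empty list.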

Style problems with input fields

If you see an exception like

org.jodconverter.office.OfficeException: could not load document: some_template.pdf-in2412284939948309622.tmp

the template file is most likely incorrect. LibreOffice is known to fail in this way when a text:text-input (which is needed for handling lists) is not wrapped in a text:span.

Incorrect

<text:p>
  <text:text-input>[#list items as item]</text:text-input>
</text:p>


Correct

<text:p>
  <text:span>
     <text:text-input>[#list items as item]</text:text-input>
  </text:span>
</text:p>
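
A check along these lines can flag input fields that lack the wrapping span before LibreOffice rejects the document. This is a minimal sketch using the JDK's DOM parser. The object name InputFieldCheck is hypothetical, and the check only looks at the direct parent of each input field.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource

object InputFieldCheck {
  /** Counts text:text-input elements whose direct parent is not a text:span. */
  def unwrappedInputs(xml: String): Int = {
    // The default factory is not namespace-aware, so prefixed names are matched literally.
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val doc = builder.parse(new InputSource(new StringReader(xml)))
    val inputs = doc.getElementsByTagName("text:text-input")
    (0 until inputs.getLength).count { i =>
      inputs.item(i).getParentNode.getNodeName != "text:span"
    }
  }
}
```

For the incorrect markup above the count is 1, for the correct markup it is 0.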

Conditionals and well-formed XML

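When using Freemarker directives that may omit certain ODT tags (e.g. an if directive), it’s important to ensure that the resulting XML output is well-formed for all branches. In practice, this means that the opening and closing directives should sit at the same nesting level, so that the skipped region contains only complete elements.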

For example, the following valid template

<text:p>
  <text:span>
     <text:span>
         <text:text-input>[#if myCondition]</text:text-input>
         Some Text
     </text:span>
     <text:text-input>[/#if]</text:text-input>
  </text:span>
</text:p>

would produce this invalid output if myCondition evaluates to false:

<text:p>
  <text:span>
     <text:span>
         <text:text-input></text:text-input>
  </text:span>
</text:p>
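
One way to catch this early is to render the template once per branch and verify that each result still parses. The sketch below covers only the parsing step and relies on the JDK's XML parser. The object name WellFormedCheck is hypothetical.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.{InputSource, SAXException}
import org.xml.sax.helpers.DefaultHandler

object WellFormedCheck {
  /** Returns true if the given XML parses without a fatal error. */
  def isWellFormed(xml: String): Boolean = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    // DefaultHandler rethrows fatal errors, which replaces the parser's default error reporting
    builder.setErrorHandler(new DefaultHandler)
    try {
      builder.parse(new InputSource(new StringReader(xml)))
      true
    } catch {
      case _: SAXException => false // mismatched or unclosed tags end up here
    }
  }
}
```

The template above passes this check, while the output for a false myCondition fails it because the inner text:span is never closed.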
