Limitations

The following limitations apply to the various parser types. Please note that they may change in newer versions.

i-net PDFC Comparison

The following elements cannot be compared, or can be compared only to a limited extent.

  • Diagrams are drawings in PDF and thus will be compared as a composition of lines and text
  • In technical drawings, each line is compared separately; parts and components cannot be identified
  • Proprietary graphical content like complex equations (e.g. MathML), special renderers (e.g. SVG embedding, WordArt) or data mappings (e.g. diagrams, maps) has no standardized representation in PDF. The PDF generator converts such content to drawings and sometimes custom fonts. The comparison of i-net PDFC is limited to the PDF elements created by the PDF generator.
    • Example 1 - A text formatted as 'WordArt' will be converted to a drawing when exporting to PDF. The comparison will compare the shapes of this drawing but cannot reconstruct the original text, which is lost in the PDF conversion.
    • Example 2 - A math equation uses special symbols. Such symbols can be exported as custom fonts or drawings. In the case of fonts, it is up to the PDF generator to provide a proper mapping to Unicode characters, which i-net PDFC can then use for the comparison.
  • Layers and optional content are supported, but layer visibility cannot be selected by the user and is determined by each layer's default visibility setting
  • Line breaks between words - PDF has no line break marker or paragraphs; the comparison makes no assumption about whether a line break is intentional or the result of a paragraph margin; use 'strict' comparison to find all line break differences (automatic and user-defined)
  • Changes to the page margin / indentation - PDF has no margin or indent markup; the PDF 'bbox' values are optional and not used for clipping; as a result, a different horizontal location of text will not be reported as a difference
  • Changes to the glyphs of fonts with the same name - if the fonts in the compared files have the same name, the comparison assumes that their visual appearance is equal; the visual appearance of each single character will not be compared
  • Recognizing fonts as equal if the names cannot be matched - if the compared fonts have different names, this will be reported as a difference even if the visual appearance is equal; this applies especially to embedded subtypes
  • Moving text or other objects to a different position in terms of content - PDF has no entities like text boxes, so a moved object (table, text box) results in major differences for all contained atomic elements
  • Images: changes in image effects - Only the actual visual appearance is compared
  • Images: uncompressed vs. highly compressed - JPEG compression can lead to extensive artifacts, which will be recognized as differences when compared to uncompressed formats
  • Embedded content does not contribute to the printable content of the document and thus will be ignored for the comparison. This applies, for instance, to embedded files, videos or sound effects.
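
The line break limitation listed above can be illustrated with a small sketch. This is not i-net PDFC's implementation or API: the function names (tokens_lenient, tokens_strict, differs) and the strict flag are hypothetical and only mimic, on plain text in Python, the idea of a lenient comparison versus a 'strict' one that also reports line break differences.

```python
def tokens_lenient(text: str) -> list[str]:
    """Split on any whitespace, so a line break counts the same as a space."""
    return text.split()


def tokens_strict(text: str) -> list[str]:
    """Keep each line break as an explicit token so it takes part in the comparison."""
    result: list[str] = []
    for index, line in enumerate(text.splitlines()):
        if index > 0:
            result.append("\n")  # marker for the break before this line
        result.extend(line.split())
    return result


def differs(first: str, second: str, strict: bool = False) -> bool:
    """Return True if the two texts differ under the chosen (hypothetical) mode."""
    tokenize = tokens_strict if strict else tokens_lenient
    return tokenize(first) != tokenize(second)


# Same words, but the line wraps at a different position.
first = "The quick brown fox\njumps over the lazy dog"
second = "The quick brown\nfox jumps over the lazy dog"

print(differs(first, second))               # False: the re-wrap is ignored
print(differs(first, second, strict=True))  # True: the moved line break is reported
```

In the lenient mode the re-wrapped text is treated as unchanged, because the sketch cannot tell an intentional break from an automatic one, which mirrors the situation in PDF where no line break marker exists. Only the strict mode reports the moved break.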