Comparison Profile

Compared Types

Text comparison

Ignore invisible elements

The purpose of this filter is to ignore meaningless elements generated by certain PDF renderers, e.g. text outside of the visible area (or page) or transparent table borders. The filter is designed to efficiently remove:

  • transparent text
  • tiny shapes which are not visible at 100% scale
  • transparent or white filled shapes
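
The three rules above can be sketched as a simple predicate. This is only an illustration, not the i-net PDFC implementation: the `Element` class, its fields, and the `MIN_VISIBLE_SIZE` threshold are assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical element model for illustration; not the i-net PDFC data structures.
@dataclass
class Element:
    kind: str                      # "text" or "shape"
    width: float                   # size in points at 100% scale
    height: float
    fill_rgb: tuple = (0, 0, 0)    # fill color as (r, g, b)
    alpha: float = 1.0             # 0.0 = fully transparent, 1.0 = opaque

# Assumed cutoff in points; the real threshold is not stated in this document.
MIN_VISIBLE_SIZE = 0.1
WHITE = (255, 255, 255)

def is_invisible(e: Element) -> bool:
    """Return True if the element matches one of the three filter rules."""
    if e.alpha == 0.0:
        return True                # transparent text or transparent shape
    if e.kind == "shape":
        if max(e.width, e.height) < MIN_VISIBLE_SIZE:
            return True            # too small to be seen at 100% scale
        if e.fill_rgb == WHITE:
            return True            # white-filled shape
    return False

def filter_visible(elements):
    """Keep only the elements that would take part in the comparison."""
    return [e for e in elements if not is_invisible(e)]
```

Note that this sketch drops every white-filled shape regardless of what lies beneath it; deciding whether a white shape actually hides something requires the clipping calculation.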

Clipping calculation

Several document formats, such as PDF, are actually vector graphics formats. These documents contain commands that tell the viewer application what to draw and where. The commands are not mutually exclusive and may produce overlapping or nonsensical drawing operations, such as text hidden behind an opaque shape or a white line on a white background. Recognizing such scenarios, where certain graphical elements are hidden or clipped, requires calculating the actual visibility of each element. Because this computation affects performance, it has to be activated explicitly with the switch COMPUTE_CLIPPING.

With this feature active, i-net PDFC checks for every element whether it is occluded, whether it can be merged with similar elements (for shapes and images), whether it is clipped in some way, and whether it has the same color as its background. As a result, only the visible part of each element is compared, and an element is ignored entirely if it has no impact on the visual appearance of the document.
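
To make the visibility checks more concrete, here is a minimal sketch using axis-aligned rectangles. It is an assumption-laden simplification, not the i-net PDFC algorithm: the `Rect` class and the `visible_part` function are hypothetical, merging of similar elements is left out, and only full occlusion by a single opaque rectangle is detected.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical geometry helper for illustration only.
@dataclass(frozen=True)
class Rect:
    x0: float
    y0: float
    x1: float
    y1: float

    def intersect(self, other: "Rect") -> Optional["Rect"]:
        """Return the overlapping area, or None if the rectangles do not overlap."""
        x0, y0 = max(self.x0, other.x0), max(self.y0, other.y0)
        x1, y1 = min(self.x1, other.x1), min(self.y1, other.y1)
        if x0 >= x1 or y0 >= y1:
            return None
        return Rect(x0, y0, x1, y1)

    def contains(self, other: "Rect") -> bool:
        """Return True if other lies completely inside this rectangle."""
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and self.x1 >= other.x1 and self.y1 >= other.y1)

def visible_part(bounds, color, clip, background, occluders):
    """Return the visible rectangle of an element, or None if it has no visual effect.

    occluders is assumed to hold the bounds of opaque shapes drawn on top of the element.
    """
    if color == background:
        return None                      # same color as its background
    clipped = bounds.intersect(clip)
    if clipped is None:
        return None                      # completely outside the clip area
    if any(o.contains(clipped) for o in occluders):
        return None                      # fully hidden behind an opaque shape
    return clipped                       # only this part would be compared
```

For example, an element that extends past the left page edge is reduced to the part inside the page, while a white rectangle on a white background yields `None` and would be ignored.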

Property
Property Name       Description
FILTERS             Add INVISIBLEELEMENTS to the comma-separated list to enable the filter. Potentially invisible elements, such as white or transparent lines, are then not compared.
COMPUTE_CLIPPING    Enables the calculation of the actual visibility of each element in the document. This calculation can be computationally expensive, so the default value is false.
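
As a rough illustration of how the two properties combine, the following sketch edits a plain dictionary of settings. It does not use the i-net PDFC API: the `enable_filter` helper and the dictionary are hypothetical, and `SOME_OTHER_FILTER` is a placeholder rather than a real filter name. Only the property names FILTERS and COMPUTE_CLIPPING and the value INVISIBLEELEMENTS come from the table above.

```python
def enable_filter(settings: dict, name: str) -> dict:
    """Return a copy of settings with name added to the comma-separated FILTERS list."""
    updated = dict(settings)
    current = [f.strip() for f in updated.get("FILTERS", "").split(",") if f.strip()]
    if name not in current:
        current.append(name)          # avoid duplicate entries
    updated["FILTERS"] = ",".join(current)
    return updated

# Placeholder starting point; COMPUTE_CLIPPING defaults to false.
settings = {"FILTERS": "SOME_OTHER_FILTER", "COMPUTE_CLIPPING": "false"}

settings = enable_filter(settings, "INVISIBLEELEMENTS")
settings["COMPUTE_CLIPPING"] = "true"  # opt in to the more expensive visibility calculation
```

How these values are actually passed to i-net PDFC depends on the configuration mechanism in use and is not shown here.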