This filter option controls how the layout detection of i-net PDFC treats PDF files. By default, i-net PDFC tries to detect the layout of the document pages to some extent in order to reconstruct the reading order of the content. This is required because most PDF generators - especially PDF printers - write the text in an arbitrary order. The order could, for instance, reflect the time at which the elements were added in a text processing application.
However, if the generator creates the entities in the PDF document paragraph by paragraph or text block by text block, the original order can be a reasonable alternative to keep the context of the text. This requires both compared documents to be created by the same application or PDF generator.
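The difference between the two orders can be illustrated with a small sketch. It is a toy model, not the layout detection that i-net PDFC actually performs: the `TextRun` type, the coordinates and the simple sort by position are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRun:
    """Hypothetical text run: a piece of text with its position on the page."""
    text: str
    x: float  # left edge, in points
    y: float  # distance from the top of the page, in points


def stream_order(runs):
    """Keep the order in which the generator wrote the runs into the file."""
    return [run.text for run in runs]


def layout_order(runs):
    """Naive reading order: top to bottom, then left to right."""
    return [run.text for run in sorted(runs, key=lambda run: (run.y, run.x))]


# An imagined PDF printer that emitted the heading last.
runs = [
    TextRun("First line of the body.", 72, 120),
    TextRun("Second line of the body.", 72, 134),
    TextRun("Heading", 72, 90),
]

print(stream_order(runs))  # order as stored in the file
print(layout_order(runs))  # order reconstructed from the positions
```

In this example the stored order ends with the heading, while the position-based order starts with it. If both compared documents come from the same generator and that generator writes whole paragraphs in sequence, the stored order already keeps related text together.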
The layout detection mechanisms of i-net PDFC are suited for the common cases of paged documents and layouts. But there are ambiguous layouts that may not be detected correctly. For instance with text boxes:
This text is arranged in the primary text part of the document and a text box. The problem here is that i-net PDFC cannot detect the text box and its anchor in the primary text. As a result, the whole content will be regarded as one paragraph with unusual spacing.
Most PDF generators of text processing applications will export the primary text first and then the text box.
If that's the case, the parser component of i-net PDFC will first encounter the text of the red box and then the text of the blue box when reading the document. With the 'Compare original PDF text order' option active, the layout detection for text runs will be skipped and the text will be compared red first, then blue. So the text box content will be intact.
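The text box case can be sketched in the same spirit. The snippet assumes a generator that writes the primary text first and the text box afterwards, with the box placed beside the primary text. All names and coordinates are made up, and the position sort only stands in for a layout detection that cannot tell the two blocks apart; it is not the algorithm used by i-net PDFC.

```python
# Each run is (text, x, y); y grows downwards. All values are illustrative.
primary_text = [
    ("Primary line 1", 72, 100),
    ("Primary line 2", 72, 114),
    ("Primary line 3", 72, 128),
]
text_box = [
    ("Box line 1", 300, 107),
    ("Box line 2", 300, 121),
]

# Assumed generator behaviour: primary text first, then the text box.
stream = primary_text + text_box


def texts(runs):
    """Extract only the text of each run."""
    return [text for text, _x, _y in runs]


def by_position(runs):
    """Order runs top to bottom, then left to right, ignoring block borders."""
    return sorted(runs, key=lambda run: (run[2], run[1]))


original_order = texts(stream)
position_order = texts(by_position(stream))

print(original_order)  # the text box content stays together
print(position_order)  # the box lines end up between the primary lines
```

With the original order, the two box lines follow the primary text as one block. With the position-based order, they are interleaved with the primary lines, which mirrors the "one paragraph with unusual spacing" effect described above.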
This option interferes with all filters that detect layout components such as tables or columns. Depending on the order of the content, these filters may not work as expected.
With this option, the logical structure of the PDF document is used as the basis for the comparison. This structure is used, for instance, for the reading order in accessible documents. So it is the optimal grouping for the content - if available.
The logical structure is an optional feature of PDF documents. Furthermore, it has to be explicitly generated per document. That's why it is missing in most cases. In that case, the filter option will have no effect.
You can check whether the filter is effective by activating the filter visibility. If effective, the filter will mark up all recognized tables, drawings and paragraphs.
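Outside of i-net PDFC, the presence of a logical structure can also be checked on the file itself. In the PDF format, the structure tree is referenced by the `/StructTreeRoot` entry of the document catalog, and tagged documents additionally set `/Marked` to true in the `/MarkInfo` dictionary. The following minimal sketch works on a plain dictionary that stands in for the catalog. How the catalog is read from a real file depends on the PDF library in use and is not shown. The function names are hypothetical, and this is not necessarily how i-net PDFC decides whether the option is effective.

```python
def has_logical_structure(catalog):
    """Return True if the catalog references a structure tree."""
    return catalog.get("/StructTreeRoot") is not None


def is_marked_as_tagged(catalog):
    """Return True if the catalog declares the document as tagged."""
    mark_info = catalog.get("/MarkInfo") or {}
    return bool(mark_info.get("/Marked", False))


# Stand-ins for the catalogs of two documents (illustrative values only).
tagged_catalog = {
    "/Type": "/Catalog",
    "/StructTreeRoot": {"/Type": "/StructTreeRoot"},
    "/MarkInfo": {"/Marked": True},
}
untagged_catalog = {"/Type": "/Catalog"}

for name, catalog in [("tagged", tagged_catalog), ("untagged", untagged_catalog)]:
    print(name, has_logical_structure(catalog), is_marked_as_tagged(catalog))
```

For a document like the second one, the filter option would be expected to have no effect, because there is no structure to base the comparison on.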