PDF Reflow – Convert PDF to Text

The PDF Reflow feature lets you extract plain text from a PDF page and view it like a simple TXT file, without left/right scrolling and with the font size of your choice.
First, transfer a file to GoodReader (see: How To Import Files & Folders), then tap a file name to open it.
PDF Reflow offers many advantages for reading on an iPad, but it is especially useful on small iPhone screens.

Unlike with TXT files, you don't have to choose the correct text encoding to view reflowed text; all necessary text encoding is chosen internally in this case. All other parameters that you usually adjust for reading TXT files in Application Settings apply to this mode.
Use all reading techniques that you usually use for reading TXT files, including Autoscroll.

PDF Reflow is done on a page-by-page basis for performance reasons, so you will only see the text from the current PDF page in Reflow mode. However, all techniques for turning PDF pages apply to Reflow mode - you can turn reflowed pages by swiping, by tapping or by using the Turn Page buttons. Please note that when you turn a page in Reflow mode, the corresponding page in the original PDF mode is also turned, so the two viewing modes are always in sync page-wise.

If Autoscroll is on when you turn a page, autoscrolling will resume after a 3-second pause - GoodReader lets you catch up with the first few lines of text.

You can quickly go back to the original PDF page by pressing the back button in the navigation menu. For your convenience, we have reserved the same zone of the screen for the same purpose when the navigation menu is off. Just tap where the back button is supposed to be, and you'll get back to the original PDF page.
Please note:

  • a scanned PDF page is an image, and does not include actual text, so there's no actual text to extract with PDF Reflow. However, modern, sophisticated PDF creation software may process the file with OCR (optical character recognition).
    In such cases, reflowing may be possible.
  • text extracted from a PDF page doesn't necessarily have the same grouping order as you visually see it on a page. Text lines may be mixed up. GoodReader extracts the text as it is encoded inside the PDF file, and it's up to the PDF creator to encode text paragraphs in the correct order, which doesn't always happen.
  • PDF Reflow is an experimental feature. The correct extraction of text is not always possible. The PDF format allows omitting information that would be needed to extract the encoded text. So there are many PDF files that you can read in graphic mode, but extracting text from them may produce unexpected results. For example, the PDF format allows the exact page coordinates of every single character to be specified. As a result, many PDF files do not include whitespace or line-break characters, making it very hard to determine word-breaks and line-breaks. We have implemented a sophisticated heuristic algorithm in GoodReader that makes guesses about word-breaks and line-breaks depending on letter positioning on a page. Although we did a vast amount of testing and we're proud to say that GoodReader handles most of the cases well, there's still a chance of breaking words and lines incorrectly.
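
To illustrate why guessing word-breaks from letter positions is error-prone, here is a minimal Python sketch of the general idea. It is not GoodReader's actual algorithm, which is not documented here. The `Glyph` type, the `gap_ratio` and `line_tolerance` parameters and their default values are all assumptions made for this example, and the glyphs are assumed to arrive already in reading order.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str     # the character itself
    x: float      # left edge of the character on the page
    y: float      # vertical position of the character's baseline
    width: float  # horizontal space the character occupies


def glyphs_to_text(glyphs, gap_ratio=0.3, line_tolerance=2.0):
    """Rebuild text from positioned glyphs, guessing word and line breaks.

    gap_ratio and line_tolerance are illustrative values, not tuned ones.
    """
    if not glyphs:
        return ""
    out = [glyphs[0].char]
    for prev, cur in zip(glyphs, glyphs[1:]):
        if abs(cur.y - prev.y) > line_tolerance:
            # The baseline moved noticeably: assume a new visual line.
            out.append("\n")
        else:
            # Same line: a wide horizontal gap is taken as a word-break.
            gap = cur.x - (prev.x + prev.width)
            if gap > gap_ratio * prev.width:
                out.append(" ")
        out.append(cur.char)
    return "".join(out)


def demo_glyphs():
    """Two lines of positioned characters with no stored spaces."""
    first = [Glyph("H", 0, 0, 5), Glyph("i", 5, 0, 5),
             Glyph("y", 13, 0, 5), Glyph("o", 18, 0, 5), Glyph("u", 23, 0, 5)]
    second = [Glyph("o", 0, 20, 5), Glyph("k", 5, 20, 5)]
    return first + second


if __name__ == "__main__":
    print(glyphs_to_text(demo_glyphs()))
```

With tightly or loosely spaced fonts, a fixed threshold like this splits or merges words incorrectly, which is the kind of unexpected result described above.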
There are a few options in Application Settings that help break lines correctly depending on a text formatting style:

  • Double-break per paragraph. Inserts two line-breaks between groups of lines that are distant from each other (considered as paragraphs). Treats a group of close lines as continuous text. Useful for book-like or article-like formatting. The Larger line spacing switch helps to determine which lines are "close" to each other, and which are not. Many Asian texts require the Larger line spacing option to be on.
  • Single break per paragraph. The same as the previous setting, but inserts a single line-break between paragraphs.
  • Break every line. Inserts a line-break at the end of every visual line. Useful with tables, where lines are so close to each other that they look like a single paragraph, but every line should be kept separate.
  • No line-breaks. Treats all text on a page as a continuous text stream. Use it if all other options produce an undesirable result.
After changing a line-breaking option in Application Settings you have to close the PDF Reflow view if it was open and reflow the text again.
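
The four line-breaking options above can be sketched in Python as follows. This is only an illustration of the behavior described in the list, not GoodReader's code. The `BreakMode` names, the `reflow` function, the `(text, gap)` input format and the two threshold values are assumptions made for the example. Here `gap` is the vertical distance from the previous line, measured in line heights.

```python
from enum import Enum


class BreakMode(Enum):
    DOUBLE_PER_PARAGRAPH = "double"
    SINGLE_PER_PARAGRAPH = "single"
    EVERY_LINE = "every_line"
    NONE = "none"


def reflow(lines, mode, larger_line_spacing=False):
    """Join visual lines into text according to a line-breaking mode.

    lines is a list of (text, gap) pairs in top-to-bottom order.
    The thresholds 1.5 and 2.5 are hypothetical values for this sketch.
    """
    if not lines:
        return ""
    # With larger line spacing on, lines must be further apart
    # before they count as separate paragraphs.
    threshold = 2.5 if larger_line_spacing else 1.5
    out = [lines[0][0]]
    for text, gap in lines[1:]:
        if mode is BreakMode.NONE:
            sep = " "
        elif mode is BreakMode.EVERY_LINE:
            sep = "\n"
        elif gap > threshold:
            # Distant lines: treat as a paragraph boundary.
            sep = "\n\n" if mode is BreakMode.DOUBLE_PER_PARAGRAPH else "\n"
        else:
            # Close lines: part of the same paragraph.
            sep = " "
        out.append(sep + text)
    return "".join(out)


SAMPLE = [("First line", 0.0), ("continues here.", 1.2), ("Second paragraph.", 2.0)]

if __name__ == "__main__":
    for mode in BreakMode:
        print(mode.name)
        print(reflow(SAMPLE, mode))
        print("---")
```

In this sketch, turning on `larger_line_spacing` makes the third sample line join the first paragraph, because a gap of 2.0 line heights no longer exceeds the threshold.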

Notice for right-to-left readers (Hebrew, Arabic, and others). Some PDF files with right-to-left fonts, instead of encoding text as they should (from right to left), actually contain text stored in left-to-right (reversed) order. GoodReader extracts text in the order in which it appears in the PDF file, which makes it look backward in Reflow mode. We're still working on this issue. Please keep in mind that this problem is created by the PDF creation software, which doesn't store text inside the PDF in the correct order.
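
If you have already copied such backward text out of a file that you know is affected, a naive workaround can be sketched in Python. This is not something GoodReader does; the function names are hypothetical. The sketch simply reverses each line that consists mostly of right-to-left letters. It assumes plain letters only, so lines that mix in digits, Latin words or combining marks (such as Hebrew vowel points) will not be restored correctly.

```python
import unicodedata


def is_mostly_rtl(line):
    """Return True if right-to-left letters outnumber left-to-right ones."""
    rtl = ltr = 0
    for ch in line:
        cls = unicodedata.bidirectional(ch)
        if cls in ("R", "AL"):   # Hebrew-style and Arabic-style letters
            rtl += 1
        elif cls == "L":         # Latin and other left-to-right letters
            ltr += 1
    return rtl > ltr


def unreverse_rtl_lines(text):
    """Reverse every predominantly right-to-left line of the text.

    Only meant for text known to be stored in reversed order.
    """
    fixed = []
    for line in text.split("\n"):
        fixed.append(line[::-1] if is_mostly_rtl(line) else line)
    return "\n".join(fixed)


if __name__ == "__main__":
    # The Hebrew word "shalom" stored backward, written with escapes
    # so that the character order in this source is unambiguous.
    stored = "\u05dd\u05d5\u05dc\u05e9"
    print(unreverse_rtl_lines(stored))
```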