1. Exports content as HTML navigation and web pages so that the extracted pages can be viewed together with their extracted text.
pdftohtml.exe -nofonts d:\source.pdf d:\outputDir
pdftohtml emits one HTML file per page of the source PDF.
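Because the tool writes one HTML file per page, it can be handy to list what landed in the output directory after a run. The following is a minimal sketch; `list_html_files` is a hypothetical helper name, and it makes no assumption about how pdftohtml names its files beyond the `.html` extension.

```python
from pathlib import Path
from typing import List


def list_html_files(html_dir: str) -> List[str]:
    """Return the names of the .html files in html_dir, sorted alphabetically."""
    # Only the top level of the directory is scanned; other outputs are ignored.
    return sorted(p.name for p in Path(html_dir).glob("*.html"))
```

For example, `list_html_files(r"d:\outputDir")` would return the HTML file names produced by the command above.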
# PDFToHTML - Help
Source: https://www.xpdfreader.com/download.html
```
pdftohtml version 4.03 [www.xpdfreader.com]
Copyright 1996-2021 Glyph & Cog, LLC
Usage: pdftohtml [options] <PDF-file> <html-dir>
-f <int> : first page to convert
-l <int> : last page to convert
-z <number> : initial zoom level (1.0 means 72dpi)
-r <int> : resolution, in DPI (default is 150)
-nofonts : do not extract embedded fonts
-skipinvisible : do not draw invisible text
-allinvisible : treat all text as invisible
-opw <string> : owner password (for encrypted files)
-upw <string> : user password (for encrypted files)
-q : don't print any messages or errors
-cfg <string> : configuration file to use in place of .xpdfrc
-v : print copyright and version info
-h : print usage information
-help : print usage information
--help : print usage information
-? : print usage information
```
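The options above can be combined from a script. The sketch below assembles an argument list using only flags that appear in the help text (`-f`, `-l`, `-r`, `-nofonts`, `-q`). The function names `build_pdftohtml_args` and `convert` are illustrative, and the code assumes the `pdftohtml` executable is on the `PATH` unless another path is passed as `exe`.

```python
import subprocess
from typing import List, Optional


def build_pdftohtml_args(
    pdf_path: str,
    html_dir: str,
    first_page: Optional[int] = None,
    last_page: Optional[int] = None,
    resolution: Optional[int] = None,
    no_fonts: bool = False,
    quiet: bool = False,
    exe: str = "pdftohtml",  # assumption: the binary is reachable under this name
) -> List[str]:
    """Assemble a pdftohtml command line from flags listed in the help text."""
    args = [exe]
    if first_page is not None:
        args += ["-f", str(first_page)]
    if last_page is not None:
        args += ["-l", str(last_page)]
    if resolution is not None:
        args += ["-r", str(resolution)]
    if no_fonts:
        args.append("-nofonts")
    if quiet:
        args.append("-q")
    # Per the usage line: options first, then the PDF file, then the HTML directory.
    args += [pdf_path, html_dir]
    return args


def convert(pdf_path: str, html_dir: str, **options) -> int:
    """Run pdftohtml and return its exit code."""
    return subprocess.run(build_pdftohtml_args(pdf_path, html_dir, **options)).returncode
```

Calling `convert(r"d:\source.pdf", r"d:\outputDir", first_page=1, last_page=1, no_fonts=True)` would convert only the first page and skip font extraction. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.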