PDF File Text Data Extractor. Extracts text data from PDF files.

pdfDX intelligently scans one or more PDF files and extracts the text data into easily readable, scannable text in a format closely resembling the original document. Combined with programming, macros, or regular expressions, the output allows easier and more precise searching, data mining, and data scraping from most PDF documents.

The pdfDX Command Line mode allows batch formatting of as many PDF files as desired, including all the PDF files in a folder. It also allows pdfDX to be run from shell scripts and from custom computer programs.

File not totally formatted or not totally readable
pdfDX is not able to format all PDF files. The PDF file specification is graphic-oriented, not text-oriented: text is treated as pictures or images, not letters. Most text can be deciphered, but some cannot.

Text file output doesn't resemble original PDF file
pdfDX uses extensive, sophisticated functions to replicate the original formatting as closely as possible. The primary difficulty is converting variable-width fonts of different sizes to a standard-sized fixed-width font.

Limitations:
* Converts only the first three pages
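
As an illustration of the regular-expression searching mentioned in the description, here is a minimal Python sketch that searches already-extracted text. It does not call pdfDX itself. The function names `find_matches` and `search_folder`, the sample invoice text, and the idea that the extracted output is saved as `.txt` files in one folder are all assumptions made for this example.

```python
import re
from pathlib import Path


def find_matches(text, pattern):
    """Return (line_number, matched_text) for every regex match in text."""
    regex = re.compile(pattern)
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in regex.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


def search_folder(folder, pattern):
    """Search every .txt file in folder; map each file name to its matches.

    Assumes the extracted text was saved as .txt files (hypothetical layout).
    """
    results = {}
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        hits = find_matches(text, pattern)
        if hits:
            results[path.name] = hits
    return results


if __name__ == "__main__":
    # Made-up sample standing in for text extracted from a PDF invoice.
    sample = "Invoice No: INV-00123   Date: 06/08/2012\nTotal due:  $1,250.00\n"
    print(find_matches(sample, r"INV-\d+"))
    print(find_matches(sample, r"\$[\d,]+\.\d{2}"))
```

Because the extracted text keeps a layout close to the original page, line numbers in the results can help locate a match in the source document.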

Published By: pdfDX.com

License Type: Shareware

Date Added: 06 August, 2012

Size: 2.3 MB

Platform: Macintosh
