Document Management Tools

iFilter and OCR Resource Library

Microsoft Windows includes a set of standard iFilters (indexing filters). These filters allow users to perform full-text searches on Microsoft Office files, text documents, HTML files, and many other formats. To get the full benefit of DocuXplorer's functionality, you can install any of the iFilters listed below. Depending on which filter you install, DocuXplorer can index and full-text search additional file types such as TIFF, PDF, and CAD files, among others.

While many of the filters are free of charge, others are offered by third-party developers and may need to be purchased.

Corel WordPerfect

32/64bit

Search text inside WordPerfect documents using DocuXplorer.

DWG iFilter - CAD files

32/64bit

Search text inside DWG format CAD files using DocuXplorer.

Microsoft Office 2010 iFilter Pack

32/64bit

These iFilters are used by DocuXplorer to index the contents of specific document formats: .docx, .docm, .pptx, .pptm, .xlsx, .xlsm, .xlsb, .zip, .one, .vdx, .vsd, .vss, .vst, .vsx, and .vtx. Be sure to install all related service packs.

Adobe PDF iFilter

32bit

Starting with Acrobat and Reader 7.0.5, iFilter functionality is bundled with the Acrobat and Reader products. Improvements to the iFilter in Acrobat and Reader 8 include support for Vista and Windows Desktop Search, as well as improved performance and stability. To get the most current iFilter functionality, it is recommended that you update your copy of Adobe Acrobat or Adobe Reader rather than downloading and installing the stand-alone iFilter plug-in.

Adobe PDF iFilter

64bit

Adobe PDF iFilter allows searching for PDF files on Microsoft® Windows® 64-bit platforms.

Tesseract OCR Training Tool

OCR

jTessBoxEditor is a box editor and trainer for Tesseract OCR, providing editing of box data of both Tesseract 2.0x and 3.0x formats and full automation of Tesseract training. It can read images of common image formats, including multi-page TIFF. The program requires Java Runtime Environment 6.0 or later.
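
For readers unfamiliar with box data: a Tesseract box file is plain text with one line per glyph. The short Python sketch below is only an illustration and is not part of jTessBoxEditor or DocuXplorer. It assumes the commonly documented layout of symbol, left, bottom, right, top, and (in 3.0x files) a page number, with coordinates measured from the bottom-left corner of the image. The names Box, parse_box_line, and parse_box_text are hypothetical.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """One glyph entry from a box file (hypothetical helper type)."""
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Box:
    """Parse one box line; assumes whitespace-separated fields."""
    parts = line.split()
    if len(parts) == 5:
        # Assumed 2.0x layout: no page column, so default to page 0.
        symbol, coords, page = parts[0], parts[1:5], 0
    elif len(parts) == 6:
        # Assumed 3.0x layout: the last column is the page number.
        symbol, coords, page = parts[0], parts[1:5], int(parts[5])
    else:
        raise ValueError(f"unexpected box line: {line!r}")
    left, bottom, right, top = (int(value) for value in coords)
    return Box(symbol, left, bottom, right, top, page)


def parse_box_text(text: str) -> List[Box]:
    """Parse a whole box file's text, skipping blank lines."""
    return [parse_box_line(line) for line in text.splitlines() if line.strip()]
```

For example, parse_box_line("T 10 20 30 40 0") yields a Box for the glyph "T" whose bounding box runs from (10, 20) to (30, 40) on page 0.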

Tesseract support files for Indian languages

OCR

These support files for the open source Tesseract OCR engine extend DocuXplorer's OCR capabilities to Bengali, Gujarati, Hindi, Kannada, Malayalam, Oriya, Punjabi, Tamil, and Telugu.

iFilter Shop

Vendor

The IFilterShop line of iFilters is free for non-commercial/non-institutional use. Commercial and institutional users must purchase a license.

iFilter.org

Vendor

Open source iFilters.

PDFlib TET iFilter

Vendor

Supports Western text; Chinese, Japanese, and Korean (CJK) text; and right-to-left languages such as Arabic and Hebrew. Indexes protected documents and extracts text even from PDFs where Acrobat fails. Supports Unicode folding, decomposition, and normalization. Deployment: thread-safe, fast, and robust, with 32- and 64-bit versions. Automatic script and language detection for improved search.
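
To show why Unicode folding, decomposition, and normalization matter for full-text search, here is a minimal Python sketch using only the standard library. It is not PDFlib TET's implementation; the function name search_key and the particular steps chosen (compatibility decomposition, removal of combining marks, case folding) are assumptions made for illustration.

```python
import unicodedata


def search_key(text: str) -> str:
    """Build a simplified comparison key for matching search terms.

    Hypothetical helper: a real indexing filter may apply different
    or configurable rules.
    """
    # Decomposition: split accented letters and ligatures into parts.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop combining marks so accented and unaccented forms compare equal.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Folding: case-insensitive form, then recompose to a canonical form.
    return unicodedata.normalize("NFC", stripped.casefold())


# The same word stored in precomposed and decomposed form gets one key.
print(search_key("Caf\u00e9") == search_key("Cafe\u0301"))  # True
# A typographic ligature (U+FB01) is matched by its plain letters.
print(search_key("\ufb01le"))  # file
```

With a key like this, a search for "cafe" would find a document containing "Café" regardless of how the accented letter was encoded in the file.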