IFilterShop XML IFilter Server Edition Release 3.0 README

XML IFilter supports the following Microsoft server operating systems:

Windows Server 2012
Windows Server 2016
Windows Server 2019
Windows Server 2022

XML IFilter supports the following Microsoft desktop operating systems:

Windows 10
Windows 11

XML IFilter supports the following Microsoft Search products:

SharePoint Server 2013
SharePoint Server 2016
SharePoint Server 2019
SQL Server 2014
SQL Server 2016
SQL Server 2017
SQL Server 2019
Windows Search

When parsing an XML file, XML IFilter follows the rules defined in its configuration file. A default configuration file (XmlFilterConfig.xml) is installed with XML IFilter and is stored in the XML IFilter installation directory (C:\Program Files\IFilterShop\XmlFilter by default). XML IFilter can also be set up to use a user-specified configuration file. To change the configuration file name and/or location:

Stop all appropriate Search services.
Open the registry key "HKEY_LOCAL_MACHINE\SOFTWARE\IFilterShop\XmlFilter".
Add a new String value named "ConfigFile" and enter the full path to the new configuration file. If this value is missing or empty, or the path does not point to a valid file, the default configuration file is used.
Start all appropriate Search services and re-index the catalogs containing .xml files.

The XML IFilter configuration file is an XML document with the following structure:

<config> -- Root element, MUST exist
   <content> -- Defines xpaths that should or should not be indexed as document content, MUST exist.
                All XML elements that are not explicitly excluded in this section will be extracted as document
                content.
      <exclude> -- Defines xpaths that should not be indexed as document content, MUST exist.
                   Can be empty or can include one or several <xpath> elements.
         <xpath> </xpath> -- Value of this node defines xpath that should not be
                             indexed as document content, MAY exist.
         <xpath> </xpath>
         ...
      </exclude>
      <include> -- Defines xpaths that should be indexed as document content, MUST exist.
                   Can be empty or can include one or several <xpath> elements.
         <xpath> </xpath> -- Value of this node defines xpath that should be
                             indexed as document content, MAY exist.
         <xpath> </xpath>
         ...
      </include>
   </content>

   <metadata> -- Defines xpaths that should or should not be indexed as document metadata, MUST exist.
                 All XML elements that are not explicitly excluded in this section will be extracted as document
                 metadata with the default Property GUID and Property Name set to the node name.
      <default> -- MUST exist, can be empty
         <guid> </guid> -- Defines Property GUID to be assigned by default to all
                           document metadata. MAY exist, cannot be empty. If this node does not exist
                           then Microsoft defined GUID for user defined metadata which is
                           {D5CDD505-2E9C-101B-9397-08002B2CF9AE} will be used as a default GUID.
      </default>
      <exclude> -- Defines xpaths that should not be indexed as document metadata, MUST exist.
                   Can be empty or can include one or several <xpath> elements.
         <xpath> </xpath> -- Value of this node defines xpath that should not be indexed as
                             document metadata, MAY exist.
         <xpath> </xpath>
         ...
      </exclude>
      <include> -- Defines xpaths that should be indexed as document metadata, MUST exist.
                   Can be empty or can include one or several <mapping> elements.
         <mapping> -- Defines xpath that should be indexed as document metadata, MAY exist, cannot be empty.
            <xpath> </xpath> -- Value of this node defines xpath that should be indexed as document metadata,
                                MUST exist, cannot be empty.
            <property> -- Optional element that defines mapping between XML element and Indexing Service Property.
                          MAY exist, cannot be empty. If this node does not exist then the element defined in
                          corresponding <xpath> node will be output with the default settings.
               <guid> </guid> -- Defines Property GUID, MAY exist, cannot be empty.
               <name> </name> -- Defines Property Name, MAY exist, cannot be empty.
                                 Exclusive with <id> below.
               <id> </id> -- Defines Property ID, MAY exist, cannot be empty.
                                 Exclusive with <name> above.
               <type> </type> -- Defines Property type, MAY exist, cannot be empty.
                                 VT_LPWSTR type is used by default. XML IFilter can also
                                 output properties of VT_FILETIME and VT_INT types.
            </property>
            ...
         </mapping>
         ...
      </include>
   </metadata>

   <namespaces> -- Elements in this section define mappings between namespaces and their aliases,
                   MAY exist, can be empty or can include one or several <namespace> elements.
      <namespace> -- Defines alias for XML schema, MAY exist, can be empty.
         <alias> </alias> -- MAY exist, cannot be empty. Missing <alias> element denotes the default namespace.
         <schema> </schema> -- MAY exist, cannot be empty.
      </namespace>
      ...
   </namespaces>
</config>
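
As a minimal sketch of the structure above, the following configuration contains only the elements marked MUST exist and leaves every list empty. It is written from the rules in this README, not copied from the shipped XmlFilterConfig.xml, so the default file may look different. The optional <namespaces> section is omitted.

```xml
<!-- Illustrative skeleton only; not the shipped default configuration. -->
<config>
   <content>
      <exclude></exclude>
      <include></include>
   </content>
   <metadata>
      <default></default>
      <exclude></exclude>
      <include></include>
   </metadata>
</config>
```

Because nothing is excluded and no default <guid> is given, the rules above suggest that every element would be extracted as content and as metadata under the Microsoft-defined GUID for user-defined metadata.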

When parsing an XML file, XML IFilter first extracts the values of all XML elements and attributes and outputs them as document content, excluding the nodes matched in the <content><exclude> </exclude></content> section. For example, <xpath>//*</xpath> excludes all XML elements from indexing. XML IFilter then indexes the content of the XML elements matched in the <content><include> </include></content> section.
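
As an illustration of the content rules, the sketch below excludes every element with //* and then lists two elements in <include>. Going by the description above, only those two elements would then be indexed as document content. The element names (book, title, summary) are hypothetical and stand in for elements of your own documents; the metadata section is left at its minimum.

```xml
<!-- Illustrative sketch; the xpaths below are hypothetical examples. -->
<config>
   <content>
      <exclude>
         <!-- Exclude all XML elements from content indexing. -->
         <xpath>//*</xpath>
      </exclude>
      <include>
         <!-- Index only these elements as document content. -->
         <xpath>/book/title</xpath>
         <xpath>/book/summary</xpath>
      </include>
   </content>
   <metadata>
      <default></default>
      <exclude></exclude>
      <include></include>
   </metadata>
</config>
```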

Next, XML IFilter extracts the values of all XML elements and outputs them as document metadata, excluding the nodes matched in the <metadata><exclude> </exclude></metadata> section. In accordance with the Microsoft IFilter specification, XML IFilter identifies each metadata item by a combination of Property Set and Property Name. During this step, XML IFilter assigns the Property Set GUID defined in the <metadata><default><guid> </guid></default></metadata> section to all document properties and uses node names as Property Names. If no default Property Set GUID is defined in the configuration file, XML IFilter uses the Microsoft-defined GUID for user-defined metadata, {D5CDD505-2E9C-101B-9397-08002B2CF9AE}. XML IFilter then indexes the content of the XML elements defined in the <metadata><include> </include></metadata> section. This section lets you output values of XML elements under different Property Set GUIDs and Property IDs, and index XML nodes as non-text properties, such as properties of the VT_FILETIME and VT_INT types.
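
The metadata rules can be sketched in the same way. The configuration below sets a default Property Set GUID, excludes one element from metadata, and maps three elements explicitly. The element names (book, title, author, published, internalNotes) and the property names are hypothetical. Two details are assumptions that this README does not confirm: that GUIDs are written in the braced form shown above, and that <type> takes the VT_ name as its literal value.

```xml
<!-- Illustrative sketch; element names and mappings are hypothetical. -->
<config>
   <content>
      <exclude></exclude>
      <include></include>
   </content>
   <metadata>
      <default>
         <!-- Assumption: the GUID is written in the braced form used in this README. -->
         <guid>{D5CDD505-2E9C-101B-9397-08002B2CF9AE}</guid>
      </default>
      <exclude>
         <!-- Do not output this element as metadata. -->
         <xpath>/book/internalNotes</xpath>
      </exclude>
      <include>
         <mapping>
            <!-- No property element: output with the default settings. -->
            <xpath>/book/title</xpath>
         </mapping>
         <mapping>
            <xpath>/book/author</xpath>
            <property>
               <guid>{D5CDD505-2E9C-101B-9397-08002B2CF9AE}</guid>
               <name>Author</name>
            </property>
         </mapping>
         <mapping>
            <xpath>/book/published</xpath>
            <property>
               <name>Published</name>
               <!-- Assumption: the type is given by its VT_ name. -->
               <type>VT_FILETIME</type>
            </property>
         </mapping>
      </include>
   </metadata>
</config>
```

Each <property> here uses <name> rather than <id>, since the two are mutually exclusive.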

XML IFilter uses the settings defined in the configuration file to parse XML files. If the format of the configuration file is invalid, XML IFilter will not operate properly. The IFilter comes with the XmlConfigTest.exe utility for testing an XML IFilter configuration file. XmlConfigTest.exe is stored in the XML IFilter installation directory (C:\Program Files\IFilterShop\XmlFilter by default). It is a command-line application that accepts the full path to the configuration file as its single command-line argument.

Installation Instructions

The setup file is a self-extracting archive that must be downloaded and opened on the machine where you wish to use XML IFilter.

Stop all appropriate Search services.
Uninstall any previous version of XML IFilter.
Start the setup file and follow the on-screen instructions.
Start all appropriate Search services.
Re-index catalogs containing XML files.

Some Microsoft Search products require additional setup steps as described below:

SharePoint Server:

In SharePoint Central Administration, go to the "General Application Settings" page.
In the "Search" section, click "Farm Search Administration".
Click the "Search Service Application" link.
In the left-side menu, select "File Types".
Make sure that the xml file type is included.
Restart the SharePoint Search Host Controller service.

Microsoft SQL Server:

Restart the SQL Server service.
Perform a full population.

Windows Search:

When integrated with Windows Search, XML IFilter uses a temporary directory to process XML files. Because of Windows Search security restrictions, IFilters cannot use the default system temporary directory, so XML IFilter must be configured to use a user-specified temporary directory. To change the XML IFilter temporary directory setting:

Stop the Windows Search service.
Open the registry key "HKEY_LOCAL_MACHINE\SOFTWARE\IFilterShop\XmlFilter".
Add a new String value named "TempPath" and enter the full path to the new temporary directory. If this value is missing or empty, or the path does not point to a valid directory, the system temporary directory is used. Make sure that the "Users" or "Authenticated Users" group has "Full Control" permissions on the custom temporary directory.
Start the Windows Search service.

When using a custom temporary directory, we recommend that you mark it as "not indexable" in all your indexing products. Otherwise, temporary files may be indexed, which pollutes the index and can prevent XML IFilter from removing the temporary files properly.

How to Uninstall

If you ever need to uninstall XML IFilter, you can do so using either of the following methods:

Through "Add/Remove Programs". Open Control Panel -> Add/Remove Programs. Select "IFilterShop XML IFilter" in the list of currently installed programs. Press the "Remove" button.
Through the XML IFilter uninstall program. Go to the directory where XML IFilter was installed (C:\Program Files\IFilterShop\XmlFilter by default). Run uninstall.exe.
