Configuring Import Data And Docs With XML

Import Data and Docs can be used to batch import several files together with their Index Field data, which is defined in either a CSV or an XML file. Sometimes it is more convenient to use XML files instead of CSVs. However, GlobalCapture only recognizes the data if the XML follows a specific schema. The SSImp Engine service must also be running in Windows Services.

Import Data and Docs GlobalCapture Import

A standard Import Data and Docs import uses a very simple XML format. GlobalCapture picks up the XML from a hot folder and then looks for the physical files at the locations specified in the XML. Ensure that the App Pool user has access to the repository or directory where the files live. A new batch is created for the import, and the imported documents then follow the workflow process. This is the most common method of importing and the easiest to configure.

XML

<Import>
	<Archive>
		<Document>
			<DocFile FileLoc="C:\GetSmart\Import\Document1.txt"/>
			<Fields>
				<Field Name="Status" value="Processing" />
				<Field Name="Other Field" value="Value1" />
			</Fields>
		</Document>
	</Archive>
</Import>
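
Because Import Data and Docs is meant for batch imports, a single XML can describe several files. The sketch below extends the example above to two documents, with one <Document> element per file. The second file name and its field value are placeholders invented for illustration, and the sketch assumes the standard format accepts repeated <Document> elements under one <Archive>, as the schema overview in this article suggests.

```xml
<Import>
	<Archive>
		<!-- One <Document> element per file to import. -->
		<Document>
			<DocFile FileLoc="C:\GetSmart\Import\Document1.txt"/>
			<Fields>
				<Field Name="Status" value="Processing" />
				<Field Name="Other Field" value="Value1" />
			</Fields>
		</Document>
		<Document>
			<!-- Hypothetical second file; the App Pool user must be able to read this path. -->
			<DocFile FileLoc="C:\GetSmart\Import\Document2.txt"/>
			<Fields>
				<Field Name="Status" value="Processing" />
				<Field Name="Other Field" value="Value2" />
			</Fields>
		</Document>
	</Archive>
</Import>
```

The XML comments are explanatory only and can be removed before the file is placed in the hot folder.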

Direct XML Import

XML can also be imported directly by dropping the XML file into the processing folder, which is C:\GetSmart\Processing by default. This location can be changed by editing the SSIMPORTERWS.exe.config file. This performs a direct import, bypassing the Engine, so no workflow needs to be constructed for this type of import. It can be a faster, more efficient method for importing bulk data.

Please note that no batch data will be created, and any error will cause the entire XML file to import incorrectly. For this reason, this method is not recommended in most cases.

The XML must be formatted slightly differently from the XML used for a standard Import Data and Docs import through GlobalCapture.

Below is an example of such an XML file:

XML

<Import>
	<Archive ConnectionID="1" Name="2">
		<Document pass="True">
			<DocFile FileLoc="C:\GetSmart\Processing\Test\test.pdf"/>
			<Fields>
				<Field Name="Vendor Name" pass="True" value="POP"/>
				<Field Name="PO Amount" pass="True" value="110.00"/>
				<Field Name="Vendor email" pass="True" value="pop@pop.com"/>
			</Fields>
		</Document>
		<Document pass="True">
			<DocFile FileLoc="C:\GetSmart\Processing\Test\test1.pdf"/>
			<Fields>
				<Field Name="Vendor Name" pass="True" value="Amazon"/>
				<Field Name="PO Amount" pass="True" value="10000.00"/>
				<Field Name="Vendor email" pass="True" value="cs@amazon.com"/>
			</Fields>
		</Document>
	</Archive>
</Import>

Overview of Schema

<Import> ... </Import>
- Master tag for the entire XML file.
<Archive> ... </Archive>
- Secondary tag that wraps all of the documents, nested inside the <Import> tag. This is necessary for proper functionality. The ConnectionID property refers to the Database ID, and the Name property refers to the Archive ID.
<Document> ... </Document>
- Information about the document to be imported. You need one <Document></Document> tag for each document being imported via this XML. This tag MUST have pass="True".
<DocFile FileLoc="X:\path\to\document" />
- The FileLoc property in the DocFile is the full file path to the document. The GlobalCapture engine will look for the file in this location.
<Fields> <Field pass="True" Name="Field Name" value="Field value" /> </Fields>
- Within the <Fields></Fields> tag, you specify the relevant index field information for the incoming document. You need one <Field /> tag for each index field. The Name and value properties are the Field Name and Field Value, respectively. Each <Field /> tag MUST have pass="True".
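
The schema rules above can be read off a single annotated skeleton. This is only a sketch of the direct import layout: the path, field name, and values are placeholders, and the ConnectionID and Name values are copied from the earlier example rather than taken from a real system. The XML comments are explanatory, so it may be safest to remove them before dropping a real file into the processing folder.

```xml
<!-- Master tag for the entire file. -->
<Import>
	<!-- ConnectionID is the Database ID; Name is the Archive ID. -->
	<Archive ConnectionID="1" Name="2">
		<!-- One <Document> per imported file; pass="True" is required. -->
		<Document pass="True">
			<!-- Full path to the file to import (placeholder path). -->
			<DocFile FileLoc="X:\path\to\document"/>
			<Fields>
				<!-- One <Field /> per index field; pass="True" is required. -->
				<Field Name="Field Name" pass="True" value="Field value"/>
			</Fields>
		</Document>
	</Archive>
</Import>
```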

Other important information

All tag names and property names are case-sensitive. If your <Field /> tag says Value instead of value, then the data will not be captured.
All index fields listed in the XML file must be process fields in the workflow. If they are not process fields, the index field data will be ignored.
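
As an illustration of the case-sensitivity rule, the fragment below contrasts a field whose data would not be captured with one that follows the schema. It is shown for comparison only and is not meant to be imported as written; the field name and value are reused from the first example in this article.

```xml
<Fields>
	<!-- Not captured: the property is written "Value" instead of "value". -->
	<Field Name="Status" Value="Processing" />
	<!-- Matches the schema: tag and property names use the exact expected casing. -->
	<Field Name="Status" value="Processing" />
</Fields>
```

Even with correct casing, the field is only populated if Status is a process field in the workflow.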