OpenText Information Extraction Service For SAP Solutions 16.5 - Administration Guide English (CPIE160500-AGD-En-09)
OpenText™ Information Extraction Service for SAP® Solutions
Administration Guide
CPIE160500-AGD-EN-09
Rev.: 2019-Sept-20
This documentation has been created for software version 16.5.
It is also valid for subsequent software versions as long as no new document version is shipped with the product or is
published at https://knowledge.opentext.com.
Tel: +1-519-888-7111
Toll Free Canada/USA: 1-800-499-6544 International: +800-4996-5440
Fax: +1-519-888-0677
Support: https://support.opentext.com
For more information, visit https://www.opentext.com
One or more patents may cover this product. For more information, please visit https://www.opentext.com/patents.
Disclaimer
Every effort has been made to ensure the accuracy of the features and techniques presented in this publication. However,
Open Text Corporation and its affiliates accept no responsibility and offer no warranty whether expressed or implied, for the
accuracy of this publication.
Table of Contents
1 About Information Extraction Service
1.1 How does it work?
1.2 Technology
1.3 Machine learning
1.4 IES architecture and SAP integration
1.5 Solutions and profile configuration
1.6 Supported languages
1.7 Supported image formats
7 Troubleshooting
• Solution for Incoming Sales Orders. For more information, see part III
“Solution for Incoming Sales Orders” in OpenText Business Center for SAP
Solutions - Scenario Guide (BOCP-CCS).
• Solution for Incoming Quotations. For more information, see part IV
“Solution for Incoming Quotations” in OpenText Business Center for SAP
Solutions - Scenario Guide (BOCP-CCS).
• Solution for Incoming Delivery Notes. For more information, see part V
“Solution for Incoming Delivery Notes” in OpenText Business Center for SAP
Solutions - Scenario Guide (BOCP-CCS).
• Solution for Incoming Order Confirmations. For more information, see part
VI “Solution for Incoming Order Confirmations” in OpenText Business Center
for SAP Solutions - Scenario Guide (BOCP-CCS).
• Solution for Incoming Remittance Advices. For more information, see part
VII “Solution for Incoming Remittance Advices” in OpenText Business Center
for SAP Solutions - Scenario Guide (BOCP-CCS).
• Solution for Incoming HR Documents. For more information, see part VIII
“Solution for Incoming HR Documents” in OpenText Business Center for SAP
Solutions - Scenario Guide (BOCP-CCS).
1.2 Technology
IES recognition technology is based on adaptive learning algorithms that learn
from user feedback. This allows a fast project setup, because no rules or parameters for
data extraction need to be defined, and it quickly leads to good recognition results.
IES combines a proven invoice knowledge base with learning algorithms, which
raises the recognition rate within a short time of productive use.
VIM and BC integration
Beyond excellent recognition results, the tight integration of IES with VIM and BC
provides an ideal technical footprint: IES does not require an external repository but uses the
SAP repository, and it does not duplicate any data from the SAP system but keeps all
data in the SAP system. The connection to SAP is implemented with REST-based
web services supporting HTTP and HTTPS. This architecture is ideal for hosting IES
on cloud platforms as an alternative to the on-premises installation.
ICC and BCC integration
Customers using OpenText™ Invoice Capture Center for SAP® Solutions (ICC) or
OpenText™ Business Center Capture for SAP® Solutions (BCC) can integrate IES, as
it can run in parallel, also connecting to the BC Inbound framework. You can use
existing Validation Clients for processing IES profiles in parallel to ICC and BCC
applications.
All scenarios are based on intelligent field types for the data transfer to the VIM or
BC solution. These fields include:
• Amount
• Business partner
• Classification
• Date
• Decimal
• List
• Lookup
• String
• Table
Intelligent field types include knowledge about the data structure, potential local
formats, their meaning in different business contexts, tax compliance, and how they
are used in SAP. For example, the date field type covers date formats such as
MMDDYYYY, DDMMYYYY, DD.MM.YYYY, and DD/MM/YYYY as well as date writing styles such as
a spelled-out month, so these do not need to be defined for each solution.
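To illustrate what a date field type has to cover, the following minimal Python sketch tries the formats named above in turn. It is not the IES implementation. The function name parse_date and the order of the formats are assumptions made for this illustration only.

```python
from datetime import date, datetime
from typing import Optional

# Formats taken from the text above, written as strptime patterns.
# The order is an assumption; ambiguous input is resolved by
# whichever pattern matches first.
CANDIDATE_FORMATS = [
    "%d.%m.%Y",   # DD.MM.YYYY
    "%d/%m/%Y",   # DD/MM/YYYY
    "%m%d%Y",     # MMDDYYYY
    "%d%m%Y",     # DDMMYYYY
    "%B %d, %Y",  # month spelled out (English names in the default locale)
]


def parse_date(text: str) -> Optional[date]:
    """Return the first successful interpretation of text, or None."""
    for pattern in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), pattern).date()
        except ValueError:
            continue  # this pattern does not fit, try the next one
    return None
```

Because MMDDYYYY is tried before DDMMYYYY in this sketch, a string such as 12102018 is read as December 10, which shows why the surrounding context matters for purely numeric dates.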
Learning algorithms
For the invoice solution, the invoice knowledge base has been integrated and runs
in combination with the intelligent field types, leveraging a learning voting algorithm.
Thus IES starts learning as soon as the first document is processed. The learning is
based on two different algorithm types, which are also used in combination:
• Layout-based algorithms, for example business partner determination and item
table recognition. These work based on the layout and where the data is typically
located on the document.
Layout-based algorithms learn very fast, providing automatic recognition results
from the second pass of the same layout. For complex tables, more passes may be
necessary.
• Layout-independent algorithms, for example for data such as amounts, dates, and
document references. These algorithms identify relevant data structures and
keywords and learn to extract this data on documents with an entirely different
layout.
Layout-independent algorithms take more time to learn data structures, related
keywords, and surrounding text. What has been learned by this algorithm type
is also applied to completely unknown documents containing the same data
structures, keywords, and surrounding text. With this technique, the data entry
effort keeps diminishing: the longer the system is in production, the more
document layouts it learns.
Learning system
Before using IES, users entered data from business documents manually. With
IES, users continue to do their daily work and thereby provide feedback data
to the learning service, which learns to capture the data automatically starting from
the second pass of a document that has either the same layout or the same data
type, for example a date field with the same format and keyword.
For invoices, users start from a higher level because they get “out of the box”
recognition results from the invoice knowledge base where applicable. They
only need to enter data for additional fields that are not supported by the invoice
knowledge base, for example a material number, or that the invoice knowledge base
could not extract for some reason, for example complex invoice line
items.
IES connects to VIM and BC using the ABAP component Inbound Framework (IDF),
where an Information Extraction Service API complements the existing BCC APIs.
IES provides three services which are called from SAP by the IES API:
Profile configuration
For every IES scenario, at least one profile must be configured. Neither the number of
profiles per scenario nor the overall number of profiles is limited.
A profile defines the set of fields that are required for extraction, based on the fields
of the standard scenario. The profile also defines the sequence of the fields in the
Validation Client and their behavior: each field can be set to required, read-only, or hidden.
Which profiles you create depends on the specific business requirements, such as different
company codes or countries.
The default fields can be extended by adding custom fields of the following types: Date,
Decimal, String. The custom fields are also shown in the Validation Client.
Learning data configuration
The learning data is stored in the repository of the SAP system. In the current
version, IES connects to one SAP system only. That means only one feedback link can
be configured.
Validation Client configuration
The Validation Client is the user interface for convenient data completion and
manual data entry. All user actions contribute to the continuous learning process.
The client itself can be used on an unlimited number of workstations without
charge.
The Validation Client has no connection to the IES Server. It connects only to the
SAP system using an RFC call and polls for assignments. If a document is assigned
to the validation step, the Validation Client loads the profile settings, notably the list
and sequence of fields, as well as the invoice image, including the full text
recognition details, but without any connection to the archive system.
You can configure the Validation Client in BC Inbound IMG. The configuration
defines if validation is used or not, which are the criteria for documents used for
validation, and which validation agents are assigned to the different profiles.
The profile configuration defines how data fields are shown in the Validation Client.
Data capturing
The Validation Client provides convenience functions for fast and secure data capture:
• With Single Click Entry (SCE), the user can pick the data directly from the image
into the data fields and does not need to type the data. This avoids errors due to
typos.
• With Table Auto Completion (TAC), the user only needs to pick the data of the
first item line with Single Click Entry (SCE). Table Auto Completion (TAC)
then automatically captures the data of all line items.
The image view highlights all data that has been recognized automatically, so the
user can see at first glance if data is missing, for example line items.
• Character sets supported by the text recognition engine, which supports most
languages worldwide with high precision.
• Data interpretation for different cultures and languages. Culture refers to the
typical writing styles for data in a country. In Germany, for example, the date
format is typically 10.12.2018, and spelled out in words 10. Dezember 2018.
In the United States it is 12/10/2018 and December 10, 2018. For the latter, the
language must also be understood for the correct interpretation of the month,
and the context may be required to determine whether a numeric date is in
European or U.S. format. In some countries, very
specific date formats need to be considered, such as the Emperor date in Japan.
Amounts are also culture-specific with regard to the decimal separator
and the number of decimals used. The same applies to quantities.
• For some documents, such as invoices, country-specific reference fields, local
taxes, and tax compliance requirements, such as a mandatory tax invoice, need to
be considered.
For more information about supported languages and character sets, see the Release
Notes of VIM and BC.
• Country groups using the same data fields. A profile also defines data fields for
the screen in the Validation Client for data completion.
In Inbound IMG, the preferred culture for data interpretation can be configured. For
example, if you place the UK first in the sequence, the date and amount fields are
interpreted with first priority for the UK format, but the formats of other cultures are
still recognized if no UK date or amount exists on the document.
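The effect of such a priority sequence can be sketched in a few lines of Python. This is an illustration only, not the IES logic. The culture codes, the separator table, and the function names parse_amount and interpret_amount are assumptions introduced for this example.

```python
from decimal import Decimal
from typing import List, Optional

# Hypothetical culture table: (thousands separator, decimal separator).
# The codes and entries are assumptions for this sketch, not IES settings.
SEPARATORS = {
    "UK": (",", "."),
    "US": (",", "."),
    "DE": (".", ","),
}


def parse_amount(text: str, culture: str) -> Optional[Decimal]:
    """Interpret text with the separators of one culture, or return None."""
    thousands, decimal_sep = SEPARATORS[culture]
    whole, _, fraction = text.strip().partition(decimal_sep)
    # A thousands separator after the decimal separator, or a second
    # decimal separator, means the text does not follow this culture.
    if thousands in fraction or decimal_sep in fraction:
        return None
    groups = whole.split(thousands)
    # Every group after the first must have exactly three digits.
    if any(len(group) != 3 for group in groups[1:]):
        return None
    digits = "".join(groups)
    if not digits.isdigit() or (fraction and not fraction.isdigit()):
        return None
    return Decimal(digits + ("." + fraction if fraction else ""))


def interpret_amount(text: str, priority: List[str]) -> Optional[Decimal]:
    """Try the cultures in the configured order; the first match wins."""
    for culture in priority:
        value = parse_amount(text, culture)
        if value is not None:
            return value
    return None
```

With the priority ["UK", "DE"], the text 1.234,56 does not fit the UK format and is therefore read with the German separators, whereas the ambiguous text 1.234 is read as a UK decimal number. Reversing the priority turns the same text into one thousand two hundred thirty-four.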
• All document formats used for invoices, such as DIN A4, Legal, Letter, and
smaller formats, in portrait and landscape orientation with a minimum font size
of 1.5 mm high and 0.5 mm wide.
• JPEG for color and gray images with a maximum image resolution of 400 dpi.
• PDF
• TIFF 6.0 binary images in the compression modes:
– Uncompressed
– Fax Group 3
– Fax Group 4
– Packbits
– LZW
1. Install IES.
2. Configure BC or VIM, respectively. For more information, see section 4.5.2
“Customizing Information Extraction Service” in OpenText Business Center for
SAP Solutions - Configuration Guide (BOCP-CGD).
5. Open this node and its subnode Application Development, and then select
ASP.NET 4.5 and later versions.
3. In the End-User License Agreement dialog box, accept the license agreement,
and then click Next.
4. In the Destination Folder dialog box, check the path to the installation folder,
and then click Next.
If you want to install IES at a different location, click Change, and then choose
the desired path in the folder dialog box that appears.
5. Click Install.
6. Click Finish.
Service packs are cumulative, that is, a service pack contains all changes contained in
previous service packs of the respective IES version. Therefore, it is sufficient to
install only the latest service pack after installing IES. For the same reason, the
Release Notes of a service pack list the changes of all previous service packs.
However, you can also install a service pack on an IES system that already has any
previous service pack installed.
Patches are related to a service pack, that is, you can only install a patch on an IES
system where the corresponding service pack has been installed. Patches are
cumulative, that is, a patch contains all changes contained in previous patches for the
same service pack.
1. In the Server group of the General tab, click Cluster, and wait until all cluster
nodes have finished processing, that is, until all are in state Ready.
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=RFC1
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=RFC_METADATA
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=SDIFRUNTIME
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=SYSU
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=SYST
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=SRFC
• S_RZL_ADM:ACTVT=03
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=/OTX/PF11_VALIDATION
• For BC Inbound Framework version 16.3.1 and later, full authorization for the
J_6NPF_RFC object is required for all users. For more information, see section 7.4
“Authorization objects” in OpenText Business Center for SAP Solutions -
Configuration Guide (BOCP-CGD).
As of SAP Basis Release 7.10 you can choose a finer granularity for authorizations.
For more information, see SAP Note 460089. You can execute the authorization
check on individual function modules, instead of entire function groups.
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=SYST
replace with
S_RFC:ACTVT=16,RFC_TYPE=FUNC,RFC_NAME=RFCPING
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=SRFC
replace with
S_RFC:ACTVT=16,RFC_TYPE=FUNC,RFC_NAME=RFC_PING
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=RFC1
replace with
S_RFC:ACTVT=16,RFC_TYPE=FUNC,RFC_NAME=RFC_FUNCTION_SEARCH
S_RFC:ACTVT=16,RFC_TYPE=FUNC,RFC_NAME=RFC_GET_FUNCTION_INTERFACE
• S_RFC:ACTVT=16,RFC_TYPE=FUGR,RFC_NAME=SDIFRUNTIME
replace with
S_RFC:ACTVT=16,RFC_TYPE=FUNC,RFC_NAME=DDIF_FIELDINFO_GET
You can also replace the other function group authorizations with function module
authorizations, but this is not necessary because nearly every function module within
these groups is used.
1. Create or purchase the certificate. Ensure that you also have the private key.
2. Import the certificate into the Local Computer account. Make sure to select Allow
private key to be exported.
3. Assuming that the Internet Information Services (IIS) website is running under
ApplicationPoolIdentity, do the following:
a. Run certlm.msc.
b. In the console tree, expand Certificates - Local Computer > Personal >
Certificates.
c. In the result pane, right-click the appropriate certificate, and then click All
tasks > Manage Private Keys.
d. Add the %LOCALSYSTEM%\IIS_IUSRS user and grant it Full control.
e. Click Apply.
2. On the Connections view, right-click Sites, and then click Add Web Sites.
4. In the Physical path box, enter the installation path of IES with its subfolder
WebServiceAPI.
5. In the Type list, click http or https depending on your connection type, and
then enter the port in the Port box.
6. Optional: If you want to use an SSL certificate for HTTPS communication, click
Select, and then browse to the certificate file. For more information,
see “Adding a certificate for client/server authentication for secure
communication using HTTPS (optional)” on page 20.
8. On the Actions view on the right side, under Manage Web Site, check whether the web
site is already started. If not, click Start, and then click Browse.
The default web browser opens and shows a start message which indicates that
the web service is running.
9. In the address bar of the web browser, add status.aspx to the URL that is already
open, and then press RETURN.
A web page opens that shows information about the web service
and additional status information on the processed jobs.
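The same check can be scripted, for example for monitoring. The following Python sketch requests the status page and reports whether it answers. The function names and any host or port you pass in are assumptions for this illustration. The sketch only relies on the fact stated above that status.aspx is appended to the URL of the web site.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def status_url(site_url: str) -> str:
    """Append status.aspx to the base URL of the web site."""
    return urljoin(site_url.rstrip("/") + "/", "status.aspx")


def service_is_running(site_url: str, timeout: float = 10.0) -> bool:
    """Return True if the status page answers with HTTP 200."""
    try:
        with urlopen(status_url(site_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False
```

A call such as service_is_running("http://localhost:8080") uses a placeholder address. Replace it with the binding you configured for the web site.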
3. In the Edit Application Pool dialog box, select .NET CLR v4.0.30319 in
the .NET CLR version list if it is not already selected, and then click OK.
4. Under Client certificates, click the Require option, and then click Apply on the
Actions view.
3.3.3 Increasing the upload size (IIS 7.0 and later versions)
To increase the upload size to the web service, change the properties
maxAllowedContentLength and maxRequestLength:
6. On the Configuration Editor view, in the Section list, open the System.web
node.
7. Click httpRuntime, select the maxRequestLength property, and then enter the
maximum upload size you want to support (up to 2 GB). The value is specified in
kilobytes; the default value is 4096 (4 MB).
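The two properties use different units: maxRequestLength (ASP.NET httpRuntime) is expressed in kilobytes, whereas maxAllowedContentLength (IIS request filtering) is expressed in bytes. The following Python sketch converts one target size into both values. The function name upload_limits is an assumption for this illustration.

```python
def upload_limits(max_upload_mb: int) -> dict:
    """Convert a size in megabytes to the units of the two properties."""
    size_bytes = max_upload_mb * 1024 * 1024
    return {
        "maxAllowedContentLength": size_bytes,   # bytes
        "maxRequestLength": size_bytes // 1024,  # kilobytes
    }
```

For an upload limit of 100 MB, for example, this yields 104857600 for maxAllowedContentLength and 102400 for maxRequestLength.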
To increase the upload size for the IES web application which communicates
via HTTPS, change the property uploadReadAheadSize:
CompressMultipartResponse
Compresses the HTTP multipart response that is sent to the leading (SAP) system.
The default value is true and must not be changed.
DefaultOcrTimeout
Maximum runtime of an operation. If runtime exceeds this value, the
operation will be cancelled. The default value is 50 minutes and must not be
changed.
LoadManagerMasterServer
Hostname of the Load Manager Server. The default value is 127.0.0.1
(localhost) and must not be changed.
NoOfOcrRetries
The number of retries if an operation fails. Not used in production
because retry handling is controlled by the leading (SAP) system. The default value
is 0 and must not be changed.
SupportedSecurityProtocolTypes
Security protocols that are supported for secure communication via HTTPS.
The TLS protocol types Tls,Tls11,Tls12 are supported by default. You can enter
a comma-separated string or a single value. If the value is
empty, the default system setting is used. For security reasons,
some SAP systems do not support TLS protocols older than Tls12.
In this case, change the setting to the value Tls12 only.
TracingConfigFileName
Name of the configuration file which is used for application tracing setup.
The default value is InformationExtractionService.traceconfig and must
not be changed.
WebProxyAddress
Web proxy address if a proxy is used for outbound calls.
When you use client certificates for the communication via HTTPS (see “Adding
a certificate for client/server authentication for secure communication using
HTTPS (optional)” on page 20), set the following configuration parameters
accordingly:
StoreCertLocation
Each of the Microsoft Windows certificate stores has the following types
which you can use:
ClientCertificateOption
Value that indicates if the certificate is automatically picked from the
certificate store or if the caller is allowed to pass in a specific client
certificate.
DistinguishedCertName
Subject value of the client certificate which is used for sending HTTP
requests to the leading SAP system(s). This value must match an existing
subject name located in the local certificate store, for example
CN=*.opentext.net, OU=SAP Solutions Development, O=OpenText, C=DE.
Additionally, there are a few settings that support error analysis, for
example StoreRequestToFolder, which writes the HTTP multipart request
to a text file on the local storage. These settings should only be used by
administrators or customer service for temporary analysis.
3.4.1 Tracers
A tracer represents the source of a trace message. Tracers are named entities with a
hierarchical structure. Hierarchy levels are separated by dots. This corresponds to
the naming of .NET classes including namespaces. Therefore usually the class name,
for example DOKuStar.Runtime.Server, is used as name of the respective tracer.
The tracer name is used to configure the properties of the tracer such as the trace
level. You may use the full name of the tracer or only a part of the name. If a
particular tracer has not been configured explicitly, it inherits its configuration from
its parent tracer.
A special tracer, called the RootTracer, is the parent of all tracers and has no name.
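The inheritance rule can be illustrated with a short Python sketch. It is a simplified model and not the DOKuStar tracing code. The function name effective_level and the dictionary of configured levels are assumptions for this example.

```python
from typing import Dict


def effective_level(tracer: str, configured: Dict[str, str], root_level: str) -> str:
    """Return the level of the closest configured ancestor of tracer."""
    name = tracer
    while name:
        if name in configured:
            return configured[name]
        # Drop the last dot-separated part to move up one hierarchy level.
        name = name.rpartition(".")[0]
    # No explicit configuration found: fall back to the root tracer.
    return root_level
```

If only DOKuStar.Cluster is configured with the level Fine, the tracer DOKuStar.Cluster.Operation resolves to Fine in this model, while DOKuStar.Runtime.Server falls back to the level of the root tracer.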
RollingFileListener
This trace listener writes trace messages to a file. It generates a new file with a
unique file name for every process by merging the current time and the process
identifier into the file name. The output is flushed every 30 seconds. A new file
is created each day. Files older than 10 days (configurable) are removed.
RemotingTraceListener
This listener writes trace messages to a remote sink, for example the Trace
Viewer in order to display trace messages on-the-fly.
ConsoleTraceListener
Writes trace messages to console output.
Trace listeners may trace at different levels. Every tracer may be assigned its own
collection of trace listeners, but usually only the root tracer has them all, and all other
tracers inherit them.
Note: The Application Data folder is hidden. Therefore, select the respective
option in the Folder Options of Microsoft Windows Explorer to display the
folder.
1. Set the registry value TraceRootPath (type REG_SZ) at the key
HKLM\SOFTWARE\Open Text, and on 64-bit systems also at the key
HKLM\SOFTWARE\Wow6432Node\Open Text, to a new path.
3. If you want to use a remote trace folder, run the service with a domain user
account instead of the LocalSystem account.
4. Restart the DOKuStar Load Manager Microsoft Windows service and the
DOKuStar Tracing Microsoft Windows service.
Trace configuration files are stored directly under the root path. Trace files are
written to a subfolder, typically the folder name is the application name, for example
DOKuStar Load Manager.
Fatal
Used in case of errors where the administrator has to be called immediately.
Such errors risk crashing the processing, for example a full disk.
Error
Used in case of errors that aborted the current job. Probably processing
continues with the next job.
Warning
An unexpected issue occurred and should be traced in order to be able to
analyze it later especially if it occurs again.
Info
This is the default level. It is used to trace normal progress as an administrator
would see it in a monitor, for example:
Fine
First debug level, for example to additionally trace failed polling calls or
important parameters, and so on.
Finer
Next debug level.
Finest
Highest debug level.
Note: The default level is Info. Do not overload the Info level with
your debug messages.
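The levels form an ordered scale: a configured level lets through all messages of that level and of every more severe level. A minimal Python sketch of this rule follows. The function name is_traced is an assumption for illustration.

```python
# Trace levels from most severe to most detailed, as listed above.
LEVELS = ["Fatal", "Error", "Warning", "Info", "Fine", "Finer", "Finest"]


def is_traced(message_level: str, configured_level: str) -> bool:
    """Return True if a message of message_level passes configured_level."""
    return LEVELS.index(message_level) <= LEVELS.index(configured_level)
```

With the default level Info, an Error message is traced in this model but a Fine message is not.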
The trace configuration files must reside in the current trace folder. Therefore do not
forget to copy the trace configuration files to the new location if you change the trace
root path.
You may configure traces by editing the appropriate trace configuration file directly
or by selecting a predefined trace configuration in the Trace Viewer. For more
information, see “Analyzing trace files using Trace Viewer” on page 31. Selecting a
predefined trace configuration in the Trace Viewer changes the trace configuration
files automatically.
listener
This element configures a trace listener by specifying the following:
name
Any descriptive name.
type
The fully qualified name of the class which implements the listener.
threshold
Trace level (optional, default is Finest).
It also specifies other trace listener-specific parameters.
<listener name="file"
type="DOKuStar.Diagnostics.Tracing.RollingFileTraceListener">
<file value="DOKuStar Load Manager\\DOKuStarClusterNode.log" />
</listener>
<listener name="remote"
type="DOKuStar.Diagnostics.Tracing.RemoteTraceListener">
<url value="tcp://localhost:20304/DOKuStar.Diagnostics.Tracing" />
<threshold value="fine" />
</listener>
root
This element configures the root tracer by specifying the following:
level
Trace level.
listener-ref
The listeners. Usually all listeners are configured only at the root tracer, not
at the categories.
<root>
<listener-ref ref="file" />
<listener-ref ref="remote" />
<level value="fine" />
</root>
category
This element configures certain tracers by specifying the following:
name
Name of the tracer or part of its hierarchical name.
level
Trace level.
<category name="DOKuStar.Cluster.Operation">
<level value="fine" />
</category>
merge (default)
Configures only a few items and merges them with a configuration that may
have been initialized by software and already has specified its root tracer and
trace listeners so that you only want to change trace levels of certain categories.
Example:
overwrite
Configures all features and completely resets an initial configuration made
through software. You must specify the root tracer and its listeners.
Example:
type="DOKuStar.Diagnostics.Tracing.RollingFileTraceListener">
<file value="DOKuStar Load Manager\\DOKuStarClusterNode.log" />
</listener>
<listener name="remote"
type="DOKuStar.Diagnostics.Tracing.RemoteTraceListener">
<url value="tcp://localhost:20304/DOKuStar.Diagnostics.Tracing" />
<threshold value="fine" />
</listener>
<root>
<listener-ref ref="file" />
<listener-ref ref="remote" />
<level value="fine" />
</root>
<category name="DOKuStar.Cluster.Operation">
<level value="fine" />
</category>
<category name="DOKuStar.Cluster.ClusterNode">
<level value="fine" />
</category>
<category name="DOKuStar.Cluster.Extraction">
<level value="fine" />
</category>
</trace>
Tip: You can find examples for trace configuration files in the <Information
Extraction Service_install>\TraceConfig folder.
• Trace files
<TraceRoot>\DOKuStar Load Manager\DOKuStarLoadManager*.log
<TraceRoot>\DOKuStar Load Manager\DOKuStar.ClusterNode*.log
• Trace config file
<TraceRoot>\DOKuStarLoadManager.traceconfig
<TraceRoot>\DOKuStarClusterNode.traceconfig
Customizing Client
• Trace files
<TraceRoot>\rda1\rda1*.log
• Trace config file
<TraceRoot>\rda.traceconfig
All services
The Microsoft Windows event log is used to log creating, starting, and stopping
of the services.
DateTime
Time when the trace message was written, sortable format
yyyy.MM.ddTHH:mm:ss.
Level
Trace level Fatal, Error, Warning, Info, Fine, Finer or Finest.
Computer
Name of the host where the trace message was written.
Application
Name of the application that wrote the trace message.
PID
ID of the process that wrote the trace message.
Category
Trace category (hierarchically to reflect classes and modules).
ThreadId
ID of the thread that created the trace message.
Message
Trace message enclosed in two square brackets at the beginning and at the end. Note: A
trace message can span multiple lines; it may contain carriage returns and line feeds.
1. In the Customizing Client, on the General tab, in the Tools group, click Trace
viewer.
Tip: Alternatively, you can start the Trace Viewer on the Microsoft
Windows start menu in the OpenText Information Extraction Service for
SAP® Solutions program group.
The Trace Viewer shows a list of all trace files found on the local computer. The
list is grouped by the different components.
Tip: If the toolbar is not shown, right-click in the right area, and then click
Toolbar.
Local Machine
You can open the trace folder by clicking the link in the header.
Filter
You can display only trace files containing messages of the respective types
by clicking All, Only errors, or Only errors and warnings in the Filter list.
Trace configuration
You can select one of three trace configurations. For more information, see
“Selecting a trace configuration” on page 33.
Display Level
You can specify one of seven different trace levels. At the most restrictive
level, only fatal error messages are shown. At the most verbose level,
messages of all message types are shown.
The toolbar text indicates the currently selected trace level.
Display filter
You can set different kinds of filters that control which log messages are
shown. For more information, see “Filtering messages” on page 34.
Find
You can search in the currently open file. For more information, see
“Searching trace messages” on page 35.
Tip: You can enable more functions in the main menu, or in the context
menu of the right area.
• This configuration affects the local computer. If you want to search an error
on a processing cluster, you may need to modify the trace configurations on
all computers of the cluster.
• The dialog box cannot indicate the current trace configuration, because you
could modify the trace configuration files using a text editor at any time,
creating a custom configuration differing from all three default
configurations described below.
To filter messages:
1. In the toolbar, click Display filter. Alternatively, in the context menu of the
message list view, click Filter.
Level
This filter lets you switch all trace messages off or specify a trace level.
If the trace level is set to Error, only error messages are shown. The other
values add messages of other types successively. If the trace level is set to
Finest, messages of all types are shown.
Computer
Lists computers of the cluster used by the project. By default, messages
from all computers are shown. If you work with a cluster you can exclude
some computers or restrict output to the messages of a single computer.
Application
Lists all applications of the current project. By default, all applications are
enabled.
PID
Lists process IDs of all processes of the current project. By default, all
processes are enabled.
ThreadId
Lists thread IDs that created the trace message.
EventId
Lists event IDs that created the trace message.
Category
A category is a group of classes. This filter lets you restrict messages of
the type Info to functional units within the Document Reader during
debugging.
Find filter
If the text box in this area is not empty, only matching messages are shown
in the messages list view. The check boxes let you control text matching.
• If the Match case check box is selected, the message must contain the
string in exactly the same spelling with respect to upper and lower case
letters.
• If the Match whole word only check box is selected, the string will not
be matched against a part of a word.
1. In the toolbar, click Find. Alternatively, in the context menu of the message list
view, click Find.
2. In the Find dialog box, enter the search string. You have the following
additional options: Match case, Match whole word only, Regular expression.
3. Click Find Next or Find Previous to search the message list. The next or
previous matching trace message is selected in the messages list view.
By default, the number of cache entries in each project cache is controlled in relation
to a specified cache size. If the number of entries reaches twice the specified number,
the oldest entries are deleted until the number of entries equals the specified cache
size. The default value of the cache size is 32.
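The trimming rule described above can be modeled as follows. This Python sketch is an illustration only and not the IES cache. The class name ProjectCache is hypothetical, and "oldest" is interpreted here as the order in which entries were first added.

```python
from collections import OrderedDict


class ProjectCache:
    """Keeps entries in insertion order and trims them as described above."""

    def __init__(self, cache_size: int = 32):
        self.cache_size = cache_size
        self.entries = OrderedDict()

    def add(self, key, value) -> None:
        self.entries[key] = value
        # Trimming starts only when twice the cache size is reached ...
        if len(self.entries) >= 2 * self.cache_size:
            # ... and removes the oldest entries down to the cache size.
            while len(self.entries) > self.cache_size:
                self.entries.popitem(last=False)
```

With the default size of 32, the cache in this model grows to 64 entries before the 32 oldest entries are removed in one step.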
1. Go to the IES program files folder, and then open the configuration file
DOKuStarClusterNode.exe.config.
2. Go to the tag appSettings, and then change the parameters which are specified
as key and value attributes of an add tag.
<appSettings>
<add key="CacheSize" value="20" />
</appSettings>
CacheLifeTime
Minimizes the cache size on a production system by switching to non-
buffered mode. The cache entries are deleted as soon as a job is terminated.
The specified cache size does not take effect in this case.
SaveIntermediateResult
Reduces further cache size by specifying that intermediate results that are
useful for error analysis but are not needed for the following processing
steps should not be stored in the cache folder.
Example: The example shows the respective part of the configuration file where the
appSettings tag and the two additional parameters have been added:
<?xml version="1.0"?>
<configuration>
. . .
<appSettings>
<add key="CacheLifetime" value="NonBuffered" />
<add key="SaveIntermediateResult" value="false" />
</appSettings>
</configuration>
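If you prefer to script such a change instead of editing the file by hand, the following Python sketch shows one possible approach with the standard library. It works on an in-memory sample; the helper name `set_app_setting` is hypothetical, and the sample content is a shortened stand-in for the real configuration file.

```python
import xml.etree.ElementTree as ET


def set_app_setting(root, key, value):
    """Set <add key=... value=... /> under appSettings, creating it if needed."""
    app_settings = root.find("appSettings")
    if app_settings is None:
        app_settings = ET.SubElement(root, "appSettings")
    for add in app_settings.findall("add"):
        if add.get("key") == key:
            add.set("value", value)  # key exists: only change the value
            return
    ET.SubElement(app_settings, "add", {"key": key, "value": value})


# Shortened, made-up stand-in for DOKuStarClusterNode.exe.config.
sample = (
    "<configuration><appSettings>"
    '<add key="CacheSize" value="32" />'
    "</appSettings></configuration>"
)
root = ET.fromstring(sample)
set_app_setting(root, "CacheSize", "20")
set_app_setting(root, "SaveIntermediateResult", "false")
print(ET.tostring(root, encoding="unicode"))
```

For a real file you would load it with `ET.parse` and write it back with `ElementTree.write`. Keep a backup copy first, because ElementTree does not preserve XML comments by default.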
The following sections describe which components are located on which participants
and how the different participants interact in respect to the components.
If the Load Manager Service of an IES Cluster Node is running using the
LocalSystem account, which is the default, the temporary files are written into the
temp folder defined in the system temp variable.
2. At this key, create a string value tempPath containing the path for these data.
3. For the change to take effect, restart the service DOKuStar Load Manager.
Alternatively, you can move the location of the whole temporary files folder using
Control Panel.
You can move the program data files, for example to a different drive.
The Load Manager is a Microsoft Windows service which runs on the IES Server.
This Microsoft Windows service is started automatically after installation and at
every system start. The Load Manager on the IES Server is the master and controls
the load balancing. This master receives job requests and distributes these tasks.
Usually the IES Cluster contains as many cluster nodes as processors. If the IES
Server is a two-processor computer and if there are no further IES Server Nodes, the
IES Cluster will consist of two Cluster Nodes.
A Cluster Node hosts a service of a specified type called the Service Type. For the
IES Server, the default services types are Document Extraction, Learning (Feedback)
and Configuration.
The IES Server automatically creates and configures the appropriate jobs, which run
within specific Cluster Nodes at the Load Manager. You do not need to configure
them manually.
The Load Manager processes a job request by creating an operation. The Load
Manager keeps a list of waiting and active operations and manages them until their
execution has been completed.
The Jobs view and the Cluster view on the IES Server Cluster Monitor let you
monitor the Load Manager.
The cluster configuration is managed centrally at the IES Server. You can configure
the IES Server Cluster with the Load Manager Configuration tool hosted within the
Cluster Monitor.
The pre-configuration of the IES Server Cluster sets up a defined number
of cluster nodes that depends on the available system hardware, that is, one IES
Server Node for each processor of the IES Server. The pre-configuration of the IES
Server Cluster looks as follows on a computer with four processors:
With a high number of processing nodes, you also need more RAM to use them
effectively. Otherwise, throughput decreases because of increased swapping.
Typically, you need about 1.5 GB per Runtime Node.
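As a rough worked example of this rule of thumb, the following Python sketch estimates the memory needed for the Runtime Nodes of a pre-configured cluster. The function name `estimate_cluster` is hypothetical, and the figure covers the nodes only, not the operating system or other services.

```python
GB_PER_RUNTIME_NODE = 1.5  # rule of thumb stated in this guide


def estimate_cluster(processors, gb_per_node=GB_PER_RUNTIME_NODE):
    """Return (number of nodes, RAM in GB) for one node per processor."""
    nodes = processors  # pre-configuration: one node for each processor
    return nodes, nodes * gb_per_node


nodes, ram_gb = estimate_cluster(4)
print(f"{nodes} nodes need about {ram_gb:.1f} GB of RAM")
```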
2. In the Load Manager Configuration dialog box, click the Cluster node you
want to change in the tree view.
3. Click the property you want to change, define the value, and then click OK.
LookupInterval
Time interval after which the Load Manager checks whether the cluster
node is still alive.
Operation Types
Load Manager operation type.
Optimization Time Window
To avoid loading time, the Load Manager tries to assign an operation of
the currently loaded operation type to a node that becomes available. As
soon as the first operation in the Load Manager queue has waited longer than the
specified time, it is assigned to the next available node. Enter the time in the
format hh:mm:ss.
Process Priority
With the default value BelowNormal, priority of the computationally
intensive Runtime Node processes is reduced. Otherwise other important
processes would often have to wait for processor time and would respond
slowly. If you set this property to Normal, priority is not reduced.
Reset Cycle
After the specified number of operations the cluster node is reset
automatically. This can also be used to ensure that the project is reloaded.
Reset on failure
If this property has the value True, the Load Manager tries to reset the
cluster node in case of failure.
Startup Time
Specifies the maximum time the Runtime Nodes should need for start-up. If
the start-up of a Runtime Node exceeds this limit, the corresponding
process is terminated. Enter the time in the format hh:mm:ss.
Use IPC Channel
As default, the Load Manager uses TCP for communication with the local
cluster nodes. If this property is set to True, it uses IPC channels (named
pipes) instead.
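The Optimization Time Window behavior described above can be pictured with the following Python sketch. It is a simplified model under stated assumptions, not the Load Manager's actual scheduler: the functions `parse_window` and `pick_operation`, the queue layout, and the operation type labels are all hypothetical.

```python
from datetime import datetime, timedelta


def parse_window(text):
    """Parse a time window given in the format hh:mm:ss."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def pick_operation(queue, loaded_type, window, now):
    """Pick the next operation for a node that has just become available.

    queue is a list of (enqueued_at, operation_type), oldest entry first.
    """
    if not queue:
        return None
    # The first operation has waited longer than the window: serve it now.
    if now - queue[0][0] > window:
        return queue.pop(0)
    # Otherwise prefer an operation whose type is already loaded on the node.
    for index, (_, operation_type) in enumerate(queue):
        if operation_type == loaded_type:
            return queue.pop(index)
    # Nothing of the loaded type is waiting: fall back to the oldest entry.
    return queue.pop(0)


now = datetime(2019, 9, 20, 12, 0, 0)
window = parse_window("00:00:30")
queue = [
    (now - timedelta(seconds=10), "Learning"),
    (now - timedelta(seconds=5), "Extraction"),
]
first = pick_operation(queue, "Extraction", window, now)
print(first[1])
```

In this model a larger window favors throughput (fewer type switches), while a smaller window limits how long an operation of another type can wait.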
2. In the Load Manager Configuration dialog box, click the IES Server Node you
want to change in the tree view.
3. Click the property you want to change, define the value, and then click OK.
Name
Cannot be edited. It is composed of the computer name, service type, and
instance number.
Agent Url
URL of the agent service for this cluster node within the cluster.
Description
Only for logging purposes.
Exclusive Operation Types
Restricts a cluster node to the specified Load Manager operation types.
Priority
Priority of the cluster node. The Load Manager prefers Runtime Nodes with
a higher priority value when it assigns an operation to a Runtime
Node. The default value is 0. To give a node a higher priority, enter a value
greater than 0. To give a node a lower priority, enter a negative value.
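How the Priority and Exclusive Operation Types properties could interact when an operation is assigned is sketched below in Python. This is an illustrative model only: the `Node` class, the `choose_node` function, the node names, and the assumption that a node without exclusive types accepts every operation type are not taken from the product.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    priority: int = 0  # default value is 0; negative values lower the priority
    exclusive_types: frozenset = field(default_factory=frozenset)


def choose_node(free_nodes, operation_type):
    """Return the eligible free node with the highest priority, or None."""
    eligible = [
        node for node in free_nodes
        # Assumption: no exclusive types means the node accepts any type.
        if not node.exclusive_types or operation_type in node.exclusive_types
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda node: node.priority)


nodes = [
    Node("node-1"),
    Node("node-2", priority=5, exclusive_types=frozenset({"Learning"})),
    Node("node-3", priority=-1),
]
print(choose_node(nodes, "Extraction").name)
print(choose_node(nodes, "Learning").name)
```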
These services are started automatically after the installation and at every system
start. In case of failures, you may need to stop, start, or restart a service.
3. Press Enter.
Tip: If after system reboot a service does not automatically start although its
Startup Type is set to Automatic, set the Startup Type to Automatic (Delayed
Start).
Note: This file is delivered with a patch. Make sure you have installed
the latest patch. For more information, see “Installation procedure
overview” on page 15, and “Installing patches and service packs”
on page 17.
2. Open the IIS manager, and then click the site with the default name
InformationExtractionService.
The settings should be the same as listed in step 2 and 3.
3. In the Actions view, click Basic Settings, and then copy the following settings:
4. In the Actions view, click Bindings, and then copy the following settings:
• Type: http
• Binding Information:
1. Install SAP Diagnostics Agent, and then connect it to Solution Manager. For
more information, see SAP Note 1365123.
2. Upgrade the SAP Host Agent to the latest version. For more information, see
SAP Note 2598404.
3. Open the Agent Administration in Solution Manager, and then click the Non-
authenticated Agents tab.
4. Update the Diagnostics Agent, and then click Trust Agents.
5. Open Landscape Management (LMDB), click Technical Systems > Select Type:
Microsoft Internet Information Services.
a. Click Software, and then add and configure the Product Instances, and
Software Component Versions.
b. Click Technical Instances, and then copy the IES application specific data
into the corresponding areas.
MSIIS Applications
MSIIS Pools
MSIIS Sites
For monitoring of an IES system, you can use external monitoring tools, and the
Inbound Configuration and Inbound Administration work center in Business Center
or VIM. For more information, see section 4 “Business Center Workplace: Inbound
Administration work center” in OpenText Business Center for SAP Solutions - User
Guide (BOCP-UGD), and section 4 “Inbound Configuration” in OpenText Business
Center for SAP Solutions - Configuration Guide (BOCP-CGD).
If you monitor IES with external monitoring tools, take the following considerations
into account:
• CPU usage: On the server and on Recognition Nodes, CPU usage will often be at
100% while a document is being processed. Whereas a single page is processed in
a few seconds, processing may take up to several minutes for a document with a
large number of pages. Therefore high CPU usage may indicate a problem only if
it persists for more than about 5 to 10 minutes depending on the maximum
number of pages of your documents.
• Disk space: IES does not collect and accumulate data. Temporary files written
during document processing are deleted when the document is exported.
Trace files are deleted automatically after several days. Therefore disk space
should pose no problems if you have provided sufficient resources.
• Main memory: Because the Cluster Node processes on the IES Server and the
Recognition Nodes are reset automatically after a certain number of documents
has been processed, main memory usage should not grow in the long run. If you
have provided sufficient main memory for the number of Cluster Nodes,
main memory problems should not occur.
• Microsoft Windows services: For document processing, the Microsoft Windows
services DOKuStar Load Manager and DOKuStar Tracing must be available.
It is therefore useful to monitor whether these services are running. If a
service is not running, start it.
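The CPU consideration above implies that an external monitor should alert on sustained load rather than on single peaks. A minimal Python sketch of such a check follows. The function name `sustained_high_cpu`, the 95% threshold, and the sample format are assumptions; the 600-second limit corresponds to the upper end of the 5 to 10 minutes mentioned above.

```python
def sustained_high_cpu(samples, threshold=95.0, limit_seconds=600):
    """Return True if CPU usage stayed high for longer than limit_seconds.

    samples is a list of (timestamp_in_seconds, cpu_percent) in time order.
    """
    run_start = None
    for timestamp, cpu in samples:
        if cpu >= threshold:
            if run_start is None:
                run_start = timestamp  # a new high-usage period begins
            if timestamp - run_start > limit_seconds:
                return True
        else:
            run_start = None  # usage dropped: the period is over
    return False


# Made-up samples, one per minute.
long_run = [(t, 100.0) for t in range(0, 721, 60)]
short_burst = [(t, 100.0) for t in range(0, 301, 60)] + [(360, 10.0)]
print(sustained_high_cpu(long_run), sustained_high_cpu(short_burst))
```

Adjust the limit to the maximum number of pages of your documents, because long documents legitimately keep the CPU busy for several minutes.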
The most critical part of an IES system is the BC Inbound Configuration, or the ICC
Dispatcher, respectively. These components are parts of BC or VIM in SAP ERP.
Therefore their high availability is guaranteed.
The application configuration data are stored in SAP ERP. Therefore, the high
availability of the application configuration data is guaranteed.
The Operation Nodes that run on the IES Server are not critical at all. Therefore,
there is no fail-over scenario defined for them.
For more information about high availability in SAP ERP, see the SAP NetWeaver
Technical Operations Manual (https://help.sap.com/doc/erp2005_ehp_03/6.03/en-US/
72/cd1e4261ea5433e10000000a155106/frameset.htm).
By default, the Load Manager service runs under Local System. There is no need to
change this setting to a domain user account as long as no remote communication
is set up.
If you want to use a domain user account for running the service, you must prepare
it before the installation and enter it during installation. In this case, the Load Manager
Service user must have local administrator rights.
For more information about using the Microsoft Windows user management, see
Best Practice Guide for Securing Active Directory Installations (https://
docs.microsoft.com/en-us/windows-server/identity/ad-ds/plan/security-best-
practices/best-practices-for-securing-active-directory).
Security-relevant events on the SAP ERP side can be logged using SAP ERP means.
6.5 Responsibilities
In an IES system, no real users are involved.
On the BC/VIM SAP system, there are additional user types and responsibilities.
BC or VIM manages the documents which are then processed by IES. The components
manage a list of all documents and their current states. This list can be monitored
with defined tools in SAP ERP. When a specific request cannot be processed by the
IES Server, the corresponding entry in the SAP ERP system gets an error status.
Communication issues
All communication between the SAP ERP system and the IES Server takes place
using HTTP(S). All HTTP(S) requests are logged both at the IES Server level and
at the IES application level. In general, there can be various reasons for
communication problems using HTTP(S).
IIS web server
Before the IES web application receives any incoming request, IIS first checks
that it is a valid HTTP(S) request. By default, all IIS logs are written to the local folder
C:\inetpub\logs\LogFiles. The incoming requests are logged to specific trace
files that have timestamp information in the file name.
Example: No client certificate is sent with the HTTPS request, although it is requested
by the service. In this case, IIS refuses the request with the following response: “HTTP
error code: 403 / HTTP error message: The page you are attempting to access requires
your browser to have a Secure Sockets Layer (SSL) client certificate that the Web server
recognizes.”
If IIS reports the HTTP error code 403.13, disable the client certificate revocation
check on the IIS web server as described in https://blogs.msdn.microsoft.com/
kaushal/2012/10/15/disable-client-certificate-revocation-crl-check-on-iis.
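To locate refused requests in the IIS log files, you can filter the entries by status code. The following Python sketch assumes that the logs use the W3C extended log file format with a #Fields header line that includes cs-uri-stem, sc-status, and sc-substatus; verify the logging settings of your site. The function name `find_status` and the sample lines are made up for illustration.

```python
def find_status(lines, status="403"):
    """Yield (uri, status, substatus) for log entries with the given status."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the entries that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip other directives and blank lines
        entry = dict(zip(fields, line.split()))
        if entry.get("sc-status") == status:
            yield (entry.get("cs-uri-stem"), entry["sc-status"],
                   entry.get("sc-substatus"))


# Made-up sample content, not taken from a real system.
sample_log = [
    "#Software: Microsoft Internet Information Services",
    "#Fields: date time cs-method cs-uri-stem sc-status sc-substatus",
    "2019-09-20 10:00:00 GET /status.aspx 200 0",
    "2019-09-20 10:00:05 GET /status.aspx 403 13",
]
hits = list(find_status(sample_log))
print(hits)
```

On a real server, the lines would come from the files under C:\inetpub\logs\LogFiles instead of the sample list.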
IES web application
The IES web application defines the following specific HTTP response codes for
temporary and permanent error cases:
If an error occurs, refer to the application traces for more information. The IES
web application logs are written to a specific trace folder and can be viewed
using the Trace Viewer Tool. For more information, see “Configuring tracing”
on page 24.
Monitoring activity
The IES setup installs the following tools, which help you monitor the requests:
Additionally, IES offers a web page that provides general information about the service
and the jobs that have been processed. You can open the web page
using http(s)://<hostname>:<port>/status.aspx.
You can see the last jobs that were processed, together with any errors.
Only a limited number of jobs is shown. It may even happen that memory was
recently cleared and no details are shown. This is not an error. In this case, check
the status again a few minutes later.
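A monitoring script can poll the status page in the same way as a browser. The following Python sketch builds the URL from the pattern given above and checks whether the page answers with HTTP 200. The host name and port are placeholders, the function names are hypothetical, and the sketch does not handle client certificates, which your installation may require.

```python
from urllib.parse import urlunsplit
from urllib.request import urlopen


def status_url(hostname, port, secure=False):
    """Build http(s)://<hostname>:<port>/status.aspx."""
    scheme = "https" if secure else "http"
    return urlunsplit((scheme, f"{hostname}:{port}", "/status.aspx", "", ""))


def status_page_reachable(url, timeout=10):
    """Return True if the status page answers with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error responses.
        return False


# "ies-host" and 8080 are placeholders for your own server and port.
url = status_url("ies-host", 8080)
print(url)
```

Because the page may briefly show no details after memory was cleared, a script should treat a reachable page without job details as normal and check again later.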
Restart web service and Microsoft Windows services
Sometimes temporary issues can be resolved by restarting the IES web service
on the IIS Manager and additionally restarting the Microsoft Windows services
DOKuStar Load Manager and/or DOKuStar Tracing.
If the status web page cannot be displayed on a web browser running on a
remote system, wait for a few minutes to see whether this is only a temporary
network problem. If the issue persists, restart the web service.
If some of the runtime jobs run into the same error, first stop the IES web
service in the IIS Manager. When the service has stopped, wait for a few
minutes, and then restart the Microsoft Windows service DOKuStar Load
Manager.