xBus

Error Handling

Errors can occur while messages are being processed. Typical causes are unavailable external resources, for example an HTTP call to a URL whose server is not responding, and corrupt or inconsistent messages, for example an XML message that does not conform to its DTD. When an error occurs, the xBus takes the following actions to handle the situation in a controlled way:

  • All modules of the xBus are built to detect problems quickly. If an error occurs, processing of the message stops, i.e. no further senders are called. Instead, the transaction handling performs a rollback. Some receivers offer different ways of handling the incoming message during a rollback.
  • The problem is reported in up to three ways, see Reporting Errors.
  • For background receivers, a configuration entry stops them after a number of consecutive errors, to avoid reporting large numbers of errors.

 Handling of Messages

Most of the receiver and sender modules work together with the transaction handling of the xBus. This means they perform a commit after the successful processing of a message or a rollback when an error has occurred.
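
The commit-or-rollback behaviour can be pictured with a minimal Java sketch. It is not xBus code: `Transactional`, `ProcessingChain` and the use of `Consumer<String>` as a stand-in for a sender are assumptions made only for illustration.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a transaction-aware resource; not an xBus interface. */
interface Transactional {
    void commit();
    void rollback();
}

/** Hypothetical sketch of a processing chain that stops at the first failing sender. */
final class ProcessingChain {
    private ProcessingChain() {}

    /**
     * Passes the message to each sender in order.
     * Returns true after a commit, false after a rollback.
     */
    static boolean process(String message, List<Consumer<String>> senders, Transactional tx) {
        try {
            for (Consumer<String> sender : senders) {
                sender.accept(message); // a failing sender throws and ends the loop
            }
            tx.commit();
            return true;
        } catch (RuntimeException e) {
            tx.rollback(); // no further senders are called after the failure
            return false;
        }
    }
}
```

In this sketch a failure in the second of three senders means the third is never called and the transaction is rolled back, which mirrors the behaviour described above.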

The chapter about transactions has more information, including a list of all transaction-aware receivers and senders and how they work on commit and rollback.

Some of the receivers reading a message from a resource (e.g. the FileReceiver) are able to delete the message in case of an error. That avoids trying to process the same message again and again. Before deleting, the message can be stored in the Deleted Message Store, which is described in more detail below.

 Reporting Errors

If an error occurs, an exception is thrown internally. This exception contains an error code and a text explaining the error.

The error code consists of four sections:

  • Location - indicates whether this is an internal or an external problem
  • Layer - indicates the layer where the error has occurred
  • Package - indicates the package where the error has occurred
  • Number - to identify the error

The text explaining the error is normally read from a properties file named $XBUS_HOME/etc/errors*_en.properties or $XBUS_HOME/plugin/etc/errors*_en.properties. The only exception is when the Number has the value 0. In this case a Java exception has been caught and the text is the message of that Java exception.

More information on error codes can be found below.

An error may be reported in up to three ways:

  • Writing an entry to the Trace with the trace level error (1). The trace entry contains the Java stack trace, which helps an experienced administrator locate the problem.
  • Setting the returncode of the message to RC_NOK. Entries written to the Journal will contain the returncode, the number of the error and the text explaining the error.
  • Sending a notification to an administrator, for example by mail. The technical way to perform this is configurable.

The error code together with the error text tells the administrator the reason for the error and the part of the xBus in which it occurred. The four parts of the error code are joined by underscores, so an error code looks like Location_Layer_Package_Number.

  • The first part identifies the location where the error has occurred:
    • E - External - A system outside the xBus is responsible, e.g. a resource to which a message is to be sent is not reachable.
    • I - Internal - The reason is inside the xBus, e.g. a mandatory entry in the configuration is missing.
  • The next two parts are a two-digit number specifying the layer and a three-digit number specifying the package where the error has occurred.

    Here is a table with the definition of the layers and packages:


    Layer                       Package
    00 = Base functions         000 = Base functions
                                001 = Configuration
                                002 = Trace
                                003 = TimeOutCall
                                004 = XML support
                                005 = String support
                                006 = Arithmetic support
                                007 = Reflection support
    01 = Technical              000 = Technical base
                                001 = File
                                002 = AS/400
                                003 = Database
                                004 = HTTP
                                005 = Message Queueing
                                006 = Java
                                007 = Mail
                                008 = Socket
                                010 = FTP
                                011 = LDAP
    02 = Protocol               000 = Protocol base
                                001 = AS/400
                                002 = ByteArrayList
                                003 = Record Types
                                004 = SimpleObject
                                005 = SimpleText
                                006 = SOAP
                                007 = XML
                                009 = CSV
    03 = Application            001 = Router
                                002 = Adapter
                                003 = ApplicationFactory
    04 = xBus base functions    001 = NotifyError
                                002 = Journal
                                003 = XBUSSystem
    05 = Administration         000 = Administration base
                                001 = JMX
                                002 = HTML
    06 = Bootstrap              000 = Bootstrap base

  • The last part is a sequential number that identifies the error in more detail. This number is used to find the error message. The number 0 has a special meaning: in this case a Java exception has been caught and the text is the message of that Java exception.
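
The structure of an error code can be illustrated with a small parser. This is a sketch only: the class `ErrorCode` and its methods are hypothetical and not part of the xBus API, and since the length of the Number part is not specified here, the sketch accepts any run of digits. The sample code E_01_004_0 is constructed from the table above (external, Technical layer, HTTP package, wrapped Java exception).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the xBus API: splits an error code into its four parts. */
final class ErrorCode {
    // Location (E or I), two-digit layer, three-digit package, sequential number.
    // Assumption: the number is one or more digits; the source does not state its length.
    private static final Pattern FORMAT =
            Pattern.compile("([EI])_(\\d{2})_(\\d{3})_(\\d+)");

    final char location;
    final String layer;
    final String pkg;
    final int number;

    private ErrorCode(char location, String layer, String pkg, int number) {
        this.location = location;
        this.layer = layer;
        this.pkg = pkg;
        this.number = number;
    }

    /** Parses a code of the form Location_Layer_Package_Number. */
    static ErrorCode parse(String code) {
        Matcher m = FORMAT.matcher(code);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an xBus error code: " + code);
        }
        return new ErrorCode(m.group(1).charAt(0), m.group(2), m.group(3),
                Integer.parseInt(m.group(4)));
    }

    /** True if a system outside the xBus is responsible. */
    boolean isExternal() {
        return location == 'E';
    }

    /** True if the text is the message of a caught Java exception (Number is 0). */
    boolean wrapsJavaException() {
        return number == 0;
    }
}
```

For example, `ErrorCode.parse("E_01_004_0")` yields location `E`, layer `01`, package `004` and number `0`.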

 Background Receivers

When a background receiver, e.g. the MQReceiverThread, tries to process a message and a sender has a problem, e.g. an HTTPSender whose server is not available, the background receiver tries to process the message again after a waiting period. The time to wait between attempts is configurable.

This can lead to a large number of error messages when the resource to which the message is to be sent is unavailable for a longer time. To avoid being flooded with error messages, the background receiver can be stopped automatically after a specified number of consecutive errors, i.e. errors without a successfully processed message between them. There is a configuration entry for each type of background receiver that controls this number.
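
The stop rule amounts to a counter that is reset by every success. The following sketch shows the idea; `ConsecutiveErrorGuard` and its methods are hypothetical names, and the limit stands in for whatever value the receiver's configuration entry provides.

```java
/** Hypothetical sketch: counts errors in a row and signals when a receiver should stop. */
final class ConsecutiveErrorGuard {
    private final int maxErrors; // assumed to come from the receiver's configuration entry
    private int errorsInARow = 0;

    ConsecutiveErrorGuard(int maxErrors) {
        if (maxErrors < 1) {
            throw new IllegalArgumentException("maxErrors must be at least 1");
        }
        this.maxErrors = maxErrors;
    }

    /** A successfully processed message ends the current run of errors. */
    void recordSuccess() {
        errorsInARow = 0;
    }

    /** Records one error; returns true when the receiver should be stopped. */
    boolean recordError() {
        errorsInARow++;
        return errorsInARow >= maxErrors;
    }
}
```

With a limit of 3, two errors followed by a success and two more errors do not stop the receiver; only a third error in a row does.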

The administrator is notified when a background receiver is stopped, by the same mechanism as for other errors. In addition, the state of all background receivers (running or stopped) can be viewed with the remote administration. The remote administration provides services to start a stopped background receiver and to stop a running one.

 Deleted Message Store

Some of the receivers offer different options for dealing with the incoming message when an error occurs while it is being processed. These options are explained in the table in the chapter Transactions. To avoid trying to process the same message again and again, the receivers that read a message from a resource (e.g. the FileReceiver) are able to delete the message in case of an error. Before it is deleted, the message can be stored in a file, the so-called Deleted Message Store. After the cause of the error has been remedied, the message can be resent to its destinations with the remote administration.

Whether the Deleted Message Store is used, and the directory in which the files containing the deleted messages are placed, must be specified in the Configuration.
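
The idea of saving a message to a file before deleting it can be sketched as follows. This is not the xBus implementation: the class `DeletedMessageStoreSketch`, the `.msg` file extension and the one-file-per-message layout are assumptions for illustration, and the real file format and naming may differ.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a file-based store for messages deleted after an error. */
final class DeletedMessageStoreSketch {
    private final Path directory; // assumed to be the directory named in the configuration

    DeletedMessageStoreSketch(Path directory) {
        this.directory = directory;
    }

    /** Writes the message to <directory>/<messageId>.msg and returns the file path. */
    Path store(String messageId, String content) {
        try {
            Files.createDirectories(directory);
            Path target = directory.resolve(messageId + ".msg");
            return Files.write(target, content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a stored message back, e.g. to resend it once the cause of the error is fixed. */
    String load(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A receiver following this pattern would call `store` before deleting the message from its source, so that the content remains available for a later resend.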