
5.4.13 Designing an Event Log Analysis Application

If logging is to be used to accumulate event logs in a Hadoop system, design an application to analyze the content of the accumulated event logs. The application will use the Hadoop API and operate on the Hadoop system.

Refer to the Interstage Big Data Parallel Processing Server (hereafter, referred to as "BDPP") manuals for information on designing and developing applications to operate on a Hadoop system.
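As a rough orientation only (the BDPP manuals remain the authoritative reference for application development), the following is a minimal sketch of a MapReduce mapper for such an application. It assumes the org.apache.hadoop.mapreduce API and the SequenceFile key/value types described in "5.4.13.1 Output Destination and File Format of an Event Log" below; the class name EventLogMapper and the per-timestamp event count are illustrative assumptions only.

import java.io.IOException;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch of a mapper that counts events per write timestamp.
// key:   SequenceFile key   = date and time the event data was written (yyyyMMddHHmmss)
// value: SequenceFile value = the input event, unchanged
public class EventLogMapper extends Mapper<Text, BytesWritable, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(Text key, BytesWritable value, Context context)
            throws IOException, InterruptedException {
        // Real analysis would parse value (the event payload) according to its
        // event type definition; counting events per timestamp is a placeholder.
        context.write(key, ONE);
    }
}

A job using this mapper would set org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat as its input format so that each log record is delivered as a (Text, BytesWritable) pair.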

The data formats of the event logs to be analyzed by this application are shown below.

5.4.13.1 Output Destination and File Format of an Event Log

Event logs are output to a log storage area specified in the event type definition or in the logging listener in a complex event processing statement. The log storage area that will be the output destination is generated automatically.

If the output destination is a Hadoop system, the details are as follows:

Output destination

The output destination can be changed using the value specified in the directory element of the engine configuration file.

If a directory name is specified in the directory element, the output destination will be a path made by joining the following values:

  • Value set in "pdfs.fs.local.basedir" (*1)

  • Directory name specified in the engine configuration file

  • Log storage area specified in the event type definition or logging listener

  • Automatically generated log file name

*1: "pdfs.fs.local.basedir" is the Hadoop mount directory. Refer to the BDPP manuals for details.

If a slash (/) only is specified in the directory element, the output destination will be a path made by joining the following values:

  • Value set in "pdfs.fs.local.basedir"

  • Log storage area specified in the event type definition or logging listener

  • Automatically generated log file name

Example

Example of output destination

The output destination will be "/mnt/pdfs/hadoop/tmp/logFileName" for the following conditions:

  • If the value set in "pdfs.fs.local.basedir" is "/mnt/pdfs"; and

  • If "hadoop" is specified as the directory name in the engine configuration file; and

  • If "/tmp" is specified as the log storage area specified in the event type definition or logging listener of the complex event processing statement

The output destination will be "/mnt/pdfs/tmp/logFileName" for the following conditions:

  • If the value set in "pdfs.fs.local.basedir" is "/mnt/pdfs"; and

  • If a slash (/) is specified as the directory name in the engine configuration file; and

  • If "/tmp" is specified as the log storage area specified in the event type definition or logging listener of the complex event processing statement

Note

If the same output destination is specified for multiple event logs and the event data format is the same, event data of different event types will be output to the same file. If analysis is to be performed per event type or per logging listener output, use separate output destinations.

Log file format

The format will be Hadoop SequenceFile (a binary file format).

Log file name

A log file will be automatically generated in the log storage area using the file name shown below.

By default, this file is renamed with the ".done" extension after 300 seconds.

dateTime_VMname_branchNumber
  • dateTime: yyyyMMddHHmmssSSS

  • VMname: processID@CEPserverHostName

  • branchNumber: 0000000001 to 0000000122

Point

A file with the ".done" extension will be analyzed by the event log analysis application. Move it to an arbitrary directory to analyze it.
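For example, this collection step might look like the following minimal sketch, which uses the Hadoop FileSystem API to move completed files. The log storage directory is the example path used earlier in this section, and the analysis directory is an assumed value; replace both with the paths of the actual environment.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

// Moves completed (".done") log files out of the log storage area so that
// the event log analysis application can process them.
public class CollectDoneFiles {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        Path logDir  = new Path("/mnt/pdfs/hadoop/tmp");  // log storage area (example value)
        Path workDir = new Path("/mnt/pdfs/analysis");    // arbitrary analysis directory (assumed)

        FileSystem fs = logDir.getFileSystem(conf);
        fs.mkdirs(workDir);

        // Only files already renamed to ".done" are complete; files with any
        // other extension are still being written and must not be touched.
        FileStatus[] done = fs.listStatus(logDir, new PathFilter() {
            public boolean accept(Path p) {
                return p.getName().endsWith(".done");
            }
        });

        for (FileStatus st : done) {
            Path src = st.getPath();
            fs.rename(src, new Path(workDir, src.getName()));
        }
    }
}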

Note

A file with an extension other than ".done" is still being written, so do not perform any operation on it.

Upper limit of file size

The upper limit of the file size is LONG MAX (2^63 - 1).

Upper limit of number of files

None

Key of SequenceFile

The date and time information (yyyyMMddHHmmss) will be the key. The corresponding Hadoop type (API) is "org.apache.hadoop.io.Text".

The date and time above will be the date and time at which the event data was written. (This may differ from the date and time at which the CEP engine received the events.)

Value of SequenceFile

Input events are output as they are. The corresponding Hadoop type (API) is "org.apache.hadoop.io.BytesWritable".

Compression format of SequenceFile

Record compression

Version of SequenceFile

6
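As a minimal sketch of reading one collected log file directly, the following uses org.apache.hadoop.io.SequenceFile.Reader with the key and value types described above. The Hadoop 1.x-style Reader constructor, the class name, the command-line argument, and the assumption that the event payload is UTF-8 text are all illustrative.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Dumps every record of one event log file: write timestamp and event payload.
public class EventLogDump {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        Path file = new Path(args[0]);                  // path to a ".done" log file
        FileSystem fs = file.getFileSystem(conf);

        SequenceFile.Reader reader = new SequenceFile.Reader(fs, file, conf);
        try {
            Text key = new Text();                      // write timestamp (yyyyMMddHHmmss)
            BytesWritable value = new BytesWritable();  // input event, unchanged
            while (reader.next(key, value)) {
                // Assumes the event data is text (e.g. CSV or XML) encoded in UTF-8.
                String event = new String(value.getBytes(), 0, value.getLength(), "UTF-8");
                System.out.println(key + "\t" + event);
            }
        } finally {
            reader.close();
        }
    }
}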

Information

If outputting to the engine log

Input events are output to the engine log unchanged.