Interstage Big Data Parallel Processing Server V1.0.0 User's Guide

1.1.1 Background

Enormous amounts of data are now collected from smart devices, such as smartphones and tablets, and from sensors. The formats and structures of this data are many and varied, and both the volume and the variety are continuously increasing.

This is known as Big Data. It is a major focus, for leading corporations in particular, because the use of Big Data is progressing and creating unprecedented business advantages.

Features of Big Data

Big Data has the following features:

  1. Massive size of data
    Enormous amounts of data, with data sizes reaching the terabyte to petabyte range

  2. Variety of data
    Data in a variety of formats: structured data (such as database data), non-structured data (such as sensor information and text data, for example access logs), and semi-structured data (data that has qualities of both structured and non-structured data)

  3. Frequent generation of data
    Continuous generation of new data from sensors and similar sources

  4. Need to use data in real-time
    Performing analysis in a short amount of time and using the data in real-time

"Apache Hadoop" (*1) is widely used and has become a de facto standard for applications that address Items 1 and 2 above (massive size of data and variety of data) in Big Data processing.

*1: Apache Hadoop: Open source software, developed by the Apache Software Foundation (ASF), that efficiently performs distributed and parallel processing of Big Data