Guide:

Using the Data Lake

This guide gives an overview of how to access and use the Data Lake. With ITBI™ Data Lake access, you can analyze your own data directly, using the tools of your choice.

What data is in the Data Lake?

MXG-like Tables:

Only selected fields from the SMF types we support
Same naming and structure as MXG

Raw SMF in Tables:

Essentially all fields from all SMF types we support
Mostly decoded
Same naming and structure as standard IBM SMF

Raw SMF files:

As sent from the customer

Ways of working with the Data Lake

Direct access from the ITBI portal using Amazon Athena

Prototyping
Ad hoc queries
Hosted by SMT Data

Programs using a remote ODBC connection

Programming
Re-use of existing SAS/MXG
Any language that supports remote ODBC
Hosted by the customer
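
As an illustration of the remote ODBC option, here is a minimal Python sketch. It assumes the third-party pyodbc package is installed and that an ODBC data source for the Data Lake has already been configured on the customer side. The DSN name itbi_datalake is hypothetical, and the actual driver and connection settings are not described in this guide.

```python
def build_connection_string(dsn: str) -> str:
    """Build a DSN-based ODBC connection string.

    The DSN itself (driver, region, credentials) is assumed to be
    configured in the local ODBC manager; none of that is shown here.
    """
    return f"DSN={dsn}"


def run_query(dsn: str, sql: str) -> list:
    """Run one SQL statement over ODBC and return all rows."""
    import pyodbc  # third-party package, imported only when a query is run

    # autocommit=True is an assumption: query services such as Athena
    # do not use transactions, so none is opened here.
    connection = pyodbc.connect(build_connection_string(dsn), autocommit=True)
    try:
        cursor = connection.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        connection.close()


# "itbi_datalake" is a hypothetical DSN name used for illustration only.
print(build_connection_string("itbi_datalake"))
```

Calling run_query("itbi_datalake", "SELECT 1") would then return rows that can be processed like any other Python data. The same pattern applies to SAS or any other tool that can open an ODBC connection.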

What can the Data Lake be used for?

General Use Cases

Reporting on recent data – within minutes of data being received by SMT Data
Reporting on fields that are not supported in the cubes
Reporting on details that are aggregated away in the cubes
Reporting on ‘event’ based data
Complex logic or calculations that are not easily implemented in the BI tool
Graphics or formatting not supported in the BI tool
Integration with customer or third-party systems

Raw tables

Reporting on fields that are not supported in the cubes
Reporting on details that are aggregated away in the cubes
Reporting on ‘event’ based data
Requires a good understanding of SMF

MXG-like tables

Reuse of existing SAS programs that work on MXG tables (note that only selected fields are supported)
Taking advantage of existing skills with SAS/MXG
Requires a good understanding of MXG

Accessing the Data Lake

SQL from Athena

Log on to the Portal
Choose Data Lake
Choose AI Developer

Accessing the Raw tables

Choose the Raw Database

Choose a ‘nodup’ view based on an SMF table

View names start with v_smf_*
The main views are named v_smf_smfyyyzz_nodup
where yyy is the three-digit SMF record type
and zz is the two-digit subtype
For example, v_smf_smf03001_nodup is SMF type 30, subtype 1

(Note: the difference between the ‘nodup’ view and the underlying raw table is that the ‘nodup’ view removes duplicate records.)
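
The view naming pattern above can be sketched as a small Python helper. The function names are made up for this example, and the preview query is only an illustration of what could be pasted into the Athena query editor once the Raw database is selected. How views are named for SMF types without subtypes is not covered in this guide, so the sketch does not handle that case.

```python
def nodup_view_name(smf_type: int, subtype: int) -> str:
    """Build a view name of the form v_smf_smfyyyzz_nodup."""
    # The pattern leaves three digits for the SMF type and two for the subtype.
    if not 0 <= smf_type <= 999:
        raise ValueError("SMF type must fit in three digits")
    if not 0 <= subtype <= 99:
        raise ValueError("subtype must fit in two digits")
    return f"v_smf_smf{smf_type:03d}{subtype:02d}_nodup"


def preview_query(smf_type: int, subtype: int, limit: int = 10) -> str:
    """Return a SELECT statement that samples a few rows from the view."""
    view = nodup_view_name(smf_type, subtype)
    return f"SELECT * FROM {view} LIMIT {int(limit)}"


print(nodup_view_name(30, 1))  # v_smf_smf03001_nodup
print(preview_query(30, 1))    # SELECT * FROM v_smf_smf03001_nodup LIMIT 10
```

Keeping a LIMIT on exploratory queries is a common habit with Athena, since queries are charged by the amount of data scanned.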

Accessing the MXG-like Tables