Working with XXL or XXS tables

Introduction

XXL

XXL table rules are performance related rules that help detect incorrect or poorly performing SQL queries running on XXL tables. XXL tables can be defined as extremely large tables, containing a large amount of data. The goal is to use table size from production systems because development / integration systems may not feature really large tables and would not help detect real threat on application performance levels. If the information is not physically accessible (e.g.: first release of an application with no production environment), it is worth simulating the information by identifying tables that are expected to be large and by inputting a fake size, yet greater than the threshold in use in the “XXL tables” diagnostics.

Each XXL rule has two versions: one for XXL and another for non-XXL:

  • Violations and grades are always calculated for the non XXL rule whatever configuration is provided
  • If a configuration for XXL is provided for one or several schemas, the extra XXL rule is activated on those.

XXS

The rule Avoid SQL queries that no index can support for artifacts with high fan-in is configured to ignore tables that are too small (i.e. XXS), simply because adding an index on a very small table is generally considered as useless.

Enabling XXL/XXS rules

To enable these XXL/XXS rules it is necessary to provide the analyzer with table row size information in XML based *.sqltablesize files. This can be done as follows:

Using the “out of the box” analyzers

If you are using the “out of the box” analyzers in CAST Imaging Core for Microsoft SQL Server, Sybase ASE, Oracle Server, SAP ABAP and Oracle Forms/Reports then:

Using CAST Imaging Console

Deliver the *.sqltablesize files that define your table row size information along with your extraction files in the ZIP file or the source code upload folder:

Using CAST Management Studio (legacy)

You can configure the Table Size Folder option in the relevant Analysis Unit editor in the CAST Management Studio. This folder must be populated with the *.sqltablesize files that define your table row size information. For example:

Using the SQL Analyzer (com.castsoftware.sqlanalyzer) extension

If you are using the com.castsoftware.sql extension then you can deliver the *.sqltablesize files that define your table row size information in the same folder alongside the source code.

The Table Sizes Folder option in the relevant Analysis Unit editor in the CAST Management Studio does not exist for Universal Analyzer Analysis Units which are used for the SQL Analyzer extension.

.sqltablesize file samples

There are two formats used for sqltablesize files:

  • legacy - this format is accepted by both the “out of the box” CAST Imaging Core Analyzers and the com.castsoftware.sql extension, however, this format can only be used for XXL rules.
  • new - this format is accepted ONLY by the com.castsoftware.sql extension and therefore cannot be used with “out of the box” CAST Imaging Core nalyzers. This format can be used for XXL and XXS rules.

Both formats are explained below.

There are no examples for SAP ABAP since the CAST SAP Extractor NG generates the file automatically.

Legacy format

The following “legacy” format is accepted by both the “out of the box” CAST Imaging Core Analyzers and the com.castsoftware.sql extension. This format can only be used for XXL rules:

Oracle Server

<?xml version="1.0" encoding="UTF-8" ?>
<config name="SQL Table Size" version="1.0.0.0" extraction-date="2009/02/10" >
<oracle castformat="7.0">
<!-- "server name" is equal to the value defined for the CAST_Oracle_Instance item in the .UAXdirectory file resulting from the extraction process -->
<server name="ORA10G" >
<schema name="CASTPUBS">
...
<table name="ORDER_LINE" rows="2000000000"/>
...
</schema>
</server>
</oracle>
</config>

Microsoft SQL Server

<?xml version="1.0" encoding="UTF-8" ?>
<config name="SQL Table Size" version="1.0.0.0" extraction-date="2009/02/10" >
<microsoft>
<!-- server name can be written as HOST\INSTANCE_NAME but INSTANCE_NAME alone will also function -->
<server name="PDOXPLAP2\SQLS2K5" >
<!-- schema name is optional - if it is not applied, then the table row value is applied to all tables of that name in the database -->
<schema name="schema or user name">
<database name="CASTPUBS">
...
<table name="Order_line" rows="20000000000"/>
...
</database>
</schema>
</server>
</microsoft>
</config>

Sybase ASE Server

<?xml version="1.0" encoding="UTF-8" ?>
<config name="SQL Table Size" version="1.0.0.0" extraction-date="2009/02/10" >
<sybase>
<!-- server name can be written as HOST\INSTANCE_NAME but INSTANCE_NAME alone will also function -->
<server name="Bordeaux" >
<!-- schema name is optional - if it is not applied, then the table row value is applied to all tables of that name in the database -->
<schema name="schema or user name">
<database name="cwmm">
...
<table name="ACC" rows="20000000000"/>
...
</database>
</schema>
</server>
</sybase>
</config> 

IBM DB2-UDB Server

<?xml version="1.0" encoding="UTF-8" ?>
<config name="SQL Table Size" version="1.0.0.0" extraction-date="2009/02/10" >
<ibm-udb>
<server name="Bordeaux" >
<schema name="cwmm">
...
<table name="ACC" rows="1000000"/>
...
</schema>
</server>
</ibm-udb>
</config> 

IBM DB2 z/OS Server

<?xml version="1.0" encoding="UTF-8" ?>
<config name="SQL Table Size" version="1.0.0.0" extraction-date="2009/02/10" >
<ibm-zos>
<server name="Bordeaux" >
<database name="cwmm">
<schema name="cwmm">
...
<table name="ACC" rows="1000000"/>
...
</schema>
</database>
</server>
</ibm-zos>
</config> 

SAP

<?xml version="1.0" encoding="utf-8"?>
<config name="SQL Table Size" version="v8.0.3" extraction-date="2017/12/04/15/33/19">
<!--Following data are extracted from MANDANT:100-->
<sap>
...
<table name="/.../..." client-dependant="X" rows="9"/>
<table name="/.../..." client-dependant="X" rows="2"/>
...
</sap>
</config>

New format

The following format is accepted ONLY by the com.castsoftware.sql extension and therefore cannot be used with the “out of the box” CAST Imaging Core Analyzers. This format can be used for XXL and XXS rules. The new format uses the pattern: schema_name.table_name=\<table row count\> i.e. one line per XXL table. xxl_threshold and xxs_threshold options allow the default thresholds which will trigger the rules (100,000 for XXL and 10 for XXS) to be overridden: 

xxl_threshold=100000
xxs_threshold=10
 
CASTPUBS.ORDER_LINE=2000000000
cwmm.ACC=2000000000

Starting with com.castsoftware.sql extension 3.4.1-funcrel, the pattern database_name.schema_name.table_name is also accepted.

Managing thresholds

Each XXL/XXS rule has a threshold which determines when a table should be considered as “XXL” or “XXS”. This threshold is, by default set to 100,000 for XXL (i.e. anything over 100,000 will trigger the XXL rules) and 10 for XXS (i.e. anything under 10 will trigger the XXS rules). You can override the default threshold in two ways:

Using the CAST Management Studio

This method is valid for both new and legacy .sqltablesize formats.

Each XXL/XXS rule has its own specific Contextual Parameter that defines the threshold. For example, the Quality Rule 7666 “Avoid using SELECT … ENDSELECT statement on XXL Tables” uses a Contextual Parameter as follows:

Editing the parameter shows the threshold value:

It is therefore possible to change this value and therefore the threshold at which a table is deemed to be “XXL”.

Using a parameter in the .sqltablesize file

This method is valid for the new .sqltablesize format only.

If you are using the new .sqltablesize format accepted ONLY by the com.castsoftware.sql extension, then you can override all threshold values for all XXL / XXS rules by adding the following to the top of your .sqltablesize file:

xxl_threshold=200000
XXS_threshold=10

.sqltablesize file generation methods

You can generate .sqltablesize files by executing the following scripts:

  • If you are collecting table size from a production system while analyzing a development system, please remember to change the server name value in the generated file.
  • There are no examples for SAP ABAP since the CAST SAP Extractor NG generates the file automatically.
  • When running any scripts for generating legacy format .sqltablesize files, please make sure that the very first line of the output file is not blank, and if it is, remove the blank line as this will cause issues during the analysis procesS.

Oracle Server

Legacy format

Execute the following script using the owner of the schema you are interested in. This will generate output in the “legacy format”.

set pagesize 0;
set echo off;
set heading off;
set feedback off;
spool ORACLE.SQLTABLESIZE;
select '<?xml version="1.0" encoding="UTF-8" ?>' from dual;
select '<config name="SQL Table Size" version="1.0.0.0" extraction-date="' ||to_char(sysdate, 'YYYY' )||'/'||to_char(sysdate, 'MM' ) ||'/'||to_char(sysdate, 'dd' )||'" >' from dual;
select '<oracle castformat="7.0">' from dual;
select '<server name="'|| sys_context('USERENV', 'INSTANCE_NAME')||'" >' from dual;
select '<schema name="'||sys_context('USERENV', 'CURRENT_SCHEMA')||'">' from dual;
select '<table name="'||table_name||'" rows="'||nvl(num_rows,0) ||'"/>' from user_tables ;
select '</schema>' from dual;
select '</server>' from dual;
select '</oracle>' from dual;
select '</config>' from dual;
spool off;
exit;

You can also automate it as follows:

sqlplus <owner>/<password>@<connect identifier> @<script name>

New format for the SQL Analyzer (com.castsoftware.sqlanalyzer) extension

This will generate output in the “new format” for the com.castsoftware.sql extension (note that this is provided as an example with no official support):

set echo off;
set heading off;
set feedback off;
spool ORACLE.SQLTABLESIZE;
select 'xxl_threshold=200000' from dual
union all
select 'xxs_threshold=10' from dual
union all
select '   ' from dual
union all
select sys_context('USERENV', 'CURRENT_SCHEMA')||'.'||table_name||'='||to_char(nvl(num_rows,0)) from user_tables;
spool off;
exit;

Microsoft SQL Server

Legacy format

Execute the following script against the database you are interested in. This will generate output in the “legacy format”.

set nocount on
select '<?xml version="1.0" encoding="UTF-8" ?>'
select '<config name="SQL Table Size" version="1.0.0.0" extraction-date="' + convert(varchar(10),GETDATE()) + '" >'
select char(9) + '<microsoft>'
select char(9) + char(9) + '<server name="'+ @@servername +'">'
select char(9) + char(9) + char(9) + '<database name="'+ DB_NAME() +'">'
select char(9) + char(9) + char(9) + char(9) +'<table name="' + so.name + '" rows="' + convert(varchar(10),convert(int,MAX(si.rows))) +'"/>'
FROM
sysobjects so,
sysindexes si
WHERE
so.xtype = 'U'
AND
si.id = so.id
GROUP BY
so.name
select char(9) + char(9) + char(9) + '</database>'
select char(9) + char(9) + '</server>'
select char(9) + '</microsoft>'
select '</config>'
go

You can automate it as follows:

sqlcmd -U sa -P <password> -S <server> -H <host> -d <target database> -i <script> -o SQLSERVER.SQLTABLESIZE -h -1

If you run the above query in Microsoft SQL Server Management Studio, the default output mode cannot easily be copied out of the results window. Therefore CAST recommends that you change the query output to either Text or File as follows:

  • Click the Query menu (available when working in the Query window)
  • Select Results To and then choose either:
    • Results To Text
    • Results To File

New format for the SQL Analyzer (com.castsoftware.sqlanalyzer) extension

This will generate output in the “new format” for the com.castsoftware.sql extension (note that this is provided as an example with no official support):

SELECT s.Name + '.' + t.Name + '=' + CAST(SUM(p.rows) AS VARCHAR)
FROM sys.tables t
INNER JOIN sys.indexes i ON t.object_id = i.object_id
INNER JOIN sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
GROUP BY s.Name, t.Name
ORDER BY s.Name, t.Name

Sybase ASE

Execute the following script against the database you are interested in. This will generate output in the “legacy format”.

select "<?xml version="1.0" encoding="UTF-8" ?>"
select "<configSQL Table Size"" version=""1.0.0.0"" extraction-date="""|| convert(varchar(12),getdate()) || """>"
select char(9) ||"<sybase>"
select char(9) || char(9) || "<server name="'+ @@servername +'">"
select char(9) || char(9) || char(9) || "<database>"
begin SELECT "<table" || o.name || """ rows=""" || convert(varchar(50),convert(int,s.rowcnt)) || """ />"
FROM sysobjects o, systabstats s
WHERE o.id = s.id AND s.indid IN (0,1)
AND o.type = 'U'
ORDER BY o.type, o.name end
select char(9) || char(9) || char(9) || "</database>"
select char(9) || char(9) || "</server>"
select char(9) || "</sybase>"
select "</config>"
go

IBM DB2 UDB

Execute the following script against the schema you are interested in.

Legacy format

This will generate output in the “legacy format”.

connect to <server> user <user> using <role>;
select '<?xml version="1.0" encoding="UTF-8" ?>' from SYSIBM.SYSDUMMY1;
select '<config name="SQL Table Size" version="1.0.0.0" extraction-date="' || varchar(current date) || '" >' from SYSIBM.SYSDUMMY1;
select CAST(char(' ') as char(4)) || '<ibm-udb>' from SYSIBM.SYSDUMMY1;
select CAST(char(' ') as char(8)) || '<server name="' || HOST_NAME || '">' FROM TABLE(SYSPROC.ENV_GET_SYS_INFO()) AS SYSTEMINFO;
select CAST(char(' ') as char(12)) || '<database name="'DB2UDB'">' from SYSIBM.SYSDUMMY1;
select CAST(char(' ') as char(16)) || '<schema name="CASTPUBS">' from SYSIBM.SYSDUMMY1;
select CAST(char(' ') as char(20)) || '<table name="' || tabname || '" rows="' || TRIM(TRAILING FROM CAST(card as char(128))) || '" />' FROM syscat.tables where tabschema = 'CASTPUBS' and type = 'T' order by tabname;
select CAST(char(' ') as char(16)) || '</schema>' from SYSIBM.SYSDUMMY1;
select CAST(char(' ') as char(12)) || '</database>' from SYSIBM.SYSDUMMY1;
select CAST(char(' ') as char(8)) || '</server>' from SYSIBM.SYSDUMMY1;
select CAST(char(' ') as char(4)) || '</ibm-udb>' from SYSIBM.SYSDUMMY1;
select '</config>' from SYSIBM.SYSDUMMY1;

Please ensure that you:

  • Modify the first line to use your own login credentials.
  • You will need to run this script against each schema you are interested in and for each schema you will need to modify the lines containing:
    • <schema name=“CASTPUBS”>
    • where tabschema = ‘CASTPUBS’
  • The resulting DB2UDB.SQLTABLESIZE file is not valid XML, please ensure that you remove the first few lines that are not part of the valid generated XML.

New format for the SQL Analyzer (com.castsoftware.sqlanalyzer) extension

This will generate output in the “new format” for the com.castsoftware.sql extension.

SELECT concat(concat(trim(TABSCHEMA), '.') , trim(TABNAME)) concat '='  concat trim(cast(CARD as char(50))) TableSize
FROM SYSCAT.TABLES
WHERE type = 'T'
AND TABSCHEMA = 'YOUR SCHEMA NAME'
ORDER BY TableSize 

For example:

SELECT concat(concat(trim(TABSCHEMA), '.') , trim(TABNAME)) concat '='  concat trim(cast(CARD as char(50))) TableSize
FROM SYSCAT.TABLES
WHERE type = 'T'
AND TABSCHEMA = 'TT3'
ORDER BY TableSize

Will generate the following ouput :

TT3.BAN_IM_1=7
TT3.BAN_IM_2=7
TT3.BATCH_REF=4
TT3.BUS_LIST_1=18
....

IBM DB2 z/OS

Legacy format

There is no script available to generate XXL table size information in the legacy format for a schema hosted on an IBM DB2 z/OS server. In previous releases of CAST AIP the DB2 z/OS Extractor would automatically generate the raw XXL table size data and therefore no manual file creatiion was necessary, however, this extractor is no longer used for IBM DB2 z/OS analyses in CAST AIP ≥ 8.3.x. Please use the “new format” explained below instead.

New format for the SQL Analyzer (com.castsoftware.sqlanalyzer) extension

This JCL statement can be used as is to generate output in the “new format” for the com.castsoftware.sql extension:

/*
//*------------------------------------------------------
//*
//* 26 - EXTRACTING NUMBER OF ROWS FOR TABLES
//*
//* COLUMN        TYPE     COL-SIZE EXTRACT-SIZE
//* ============= ======== ======== ============
//* CREATOR       VARCHAR  128      128
//* NAME          VARCHAR  128      128
//* CARDF         FLOAT    11       11
//*------------------------------------------------------
//* Following filters can be inserted in the WHERE clause:
//*   - CREATOR IN ('xxx','yyy', ...)
//*   - CHAR(DBNAME) IN ('xxx','yyy', ...)
//*------------------------------------------------------
//STEP26   EXEC PGM=IKJEFT01,DYNAMNBR=20
//SYSTSPRT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//*
//SYSREC00 DD DSN=CAST.DB2.TABROWS,DISP=(NEW,CATLG,),
//            SPACE=(TRK,(100,100),RLSE)
//SYSPUNCH DD SYSOUT=*
//SYSPUNCH DD DUMMY
//SYSTSIN  DD *
  DSN SYSTEM(DBx)
  RUN  PROGRAM(DSNTIAUL) PLAN(DSNTIBxx) -
      LIB('DSNxxx.RUNLIB.LOAD') PARMS('SQL')
//SYSIN    DD *

  SELECT
      CONCAT(CREATOR, CONCAT('.', CONCAT(NAME, CONCAT('=', CARDF))))
  FROM SYSIBM.SYSTABLES
  WHERE TYPE = 'T'
  ORDER BY DBNAME, CREATOR, NAME;

MySQL/MariaDB

This will generate output in the “new format” for the com.castsoftware.sql extension (note that this is provided as an example with no official support):

select concat(TABLE_SCHEMA, '.', TABLE_NAME, '=', TABLE_ROWS) as TABLESIZE 
from INFORMATION_SCHEMA.TABLES 
where TABLE_TYPE = 'BASE TABLE'
and TABLE_SCHEMA not in ('INFORMATION_SCHEMA', 'MYSQL', 'PERFORMANCE_SCHEMA', 'SYS')
and TABLE_SCHEMA = 'XXX' --  to be specified

PostgreSQL

This will generate output in the “new format” for the SQL Analyzer extension (note that this is provided as an example with no official support):

SELECT concat(n.nspname, '.',relname, '=', reltuples::bigint) as TABLESIZE 
    -- n.nspname = schema_name, 
    -- relname = Table_Name, 
    -- reltuples = number of rows in the table 
FROM pg_class, pg_namespace n 
WHERE pg_class.relnamespace = n.oid 
    AND relkind = 'r' -- r = ordinary table
    AND n.nspname NOT LIKE 'pg_%' -- we don't want the system tables 
    AND n.nspname = 'XXX' -- to be specified

Teradata

This will generate output in the “new format” for the com.castsoftware.sql extension (note that this is provided as an example with no official support). Replace MY_DB with your database name:

select 'show ' || case 
  when tablekind = 'T' then 'table' 
  when tablekind = 'V' then 'view' 
  when tablekind = 'P' then 'procedure' 
end || ' ' || trim(databasename) || '.' || trim(tablename) || ';'
from dbc.tables 
where tablekind in ( 'T', 'V', 'P') 
  and databasename = 'MY_DB'
order by 1;

Informix

This will generate output in the “new format” for the com.castsoftware.sql extension (note that this is provided as an example with no official support). Replace MY_DB with your database name:

database sysmaster;
set isolation to dirty read;
select trim(n.owner) || "." || trim(n.tabname) || "=" || cast(h.nrows as varchar(50)) as TableSizeFile
from sysptnhdr h, MY_DB:systabnames n
where h.dbsname = 'MY_DB'
and h.partnum = n.partnum;

SQLite

This will generate output in the “new format” for the com.castsoftware.sql extension (note that this is provided as an example with no official support):

SELECT "xxl_threshold=100000" AS TableSizeFile
UNION ALL
SELECT "xxs_threshold=10" AS TableSizeFile
UNION ALL
SELECT "" AS TableSizeFile
UNION ALL
SELECT tbl || "=" || stat AS TableSizeFile FROM sqlite_stat1;

Tips - Server, database, schemas, table identification

The server, database/schema and table nodes in both new and legacy formats of the sqltablesize file must match the information stored in the Analysis Service schema following an initial analysis. You can determine this information in several ways:

Find the server name

Using CAST Enlighten
  • Right click the Instance either in the Object Browser or in the Graphical View:

  • Select Properties to update the Property window. The “host” name highlighted below can be used to define the “Server” node in the .SQLTABLESIZE file:

Using CAST System Views

You can query the CAST System Views in the Analysis Service schema to identify the “server” name. Run the following query (this is customised for CAST Storage Service/PostgreSQL instances servers, but can be adapted to Microsoft SQL Server and Sybase ASE):

select DESCRIPTION
from <analysis_service>.CSV_OBJECT_DESCRIPTIONS
where DESC_TYPE = 'host name'

Find the Database and Table name

Databases/Schemas and Tables can be easily identified in CAST Enlighten using the Object Browser or the Graphical View:

Troubleshooting

Checking sqltablesize information upload

To identify which tables have been tagged with the sqltablesize information, you can run on the following query against your Analysis Service schema, following an analysis:

Using the legacy “out of the box” analyzers

Select OBJECT_NAME, OBJECT_FULLNAME, OI.InfVal
From CDT_OBJECTS CO, ObjInf OI
Where OI.IdObj = CO.OBJECT_ID
And OI.InfTyp = 115
And OI.InfSubTyp = 1

Using SQL Analyzer (com.castsoftware.sqlanalyzer) extension

Select OBJECT_NAME, OBJECT_FULLNAME, OI.InfVal
From CDT_OBJECTS CO, ObjInf OI
Where OI.IdObj = CO.OBJECT_ID
And OI.InfTyp = 1101000
And OI.InfSubTyp = 2

Finding XXL tables

To identify the XXL tables, you can run the following query against your CAST Analysis Service schema, following an analysis:

Using the legacy “out of the box” analyzers

Select OBJECT_NAME, OBJECT_FULLNAME, OI.InfVal
From CDT_OBJECTS CO, ObjInf OI
Where OI.IdObj = CO.OBJECT_ID
And OI.InfTyp = 115
And OI.InfSubTyp = 1
And OI.InfVal >= 100000 -- change the value if you changed the threshold

Using SQL Analyzer (com.castsoftware.sqlanalyzer) extension

Select OBJECT_NAME, OBJECT_FULLNAME, OI.InfVal
From CDT_OBJECTS CO, ObjInf OI
Where OI.IdObj = CO.OBJECT_ID
And OI.InfTyp = 1101000
And OI.InfSubTyp = 2
And OI.InfVal >= 100000 -- change the value if you changed the threshold