Selecting the right datatype to improve database performance

By Holger Klemt

Here is a further example of just one more method to improve your database performance: use the right data type!

We were set the challenge to find out how much influence the changes between GUID and Int32 or Int64 primary keys have in the database design regarding performance. So we created 3 different databases on a Windows machine, each with two simple tables (m for master, d for detail).

Here is the database structure for Int32 IDs:

       CREATE TABLE M (
           ID   INTEGER NOT NULL PRIMARY KEY,
           TXT  VARCHAR(30));

       CREATE TABLE D (
           ID    INTEGER  NOT NULL PRIMARY KEY,
           M_ID  INTEGER REFERENCES M(ID),
           TXT  VARCHAR(30));

Here is the database structure for Int64 IDs:

      CREATE TABLE M (
          ID   BIGINT NOT NULL PRIMARY KEY,
          TXT  VARCHAR(30));

      CREATE TABLE D (
          ID    BIGINT  NOT NULL PRIMARY KEY,
          M_ID  BIGINT REFERENCES M(ID),
          TXT  VARCHAR(30));

Here is the database structure for GUIDs:

     CREATE TABLE M (
         ID   CHAR(32) NOT NULL PRIMARY KEY,
         TXT  VARCHAR(30));

     CREATE TABLE D (
         ID    CHAR(32)  NOT NULL PRIMARY KEY,
         M_ID  CHAR(32) REFERENCES M(ID),
         TXT  VARCHAR(30));

To create the database for the GUID, we used a UDF from https://www.ibexpert.com/download/udf/uuidlibv12.zip.

   DECLARE EXTERNAL FUNCTION GUID_CREATE
       CSTRING(36) CHARACTER SET NONE
       RETURNS PARAMETER 1
       ENTRY_POINT 'fn_guid_create' MODULE_NAME 'uuidlib';

Next we created a stored procedure to generate the data in the GUID database.

  CREATE PROCEDURE INITDATA (ANZ INTEGER)
  AS
  declare variable m varchar(40);
  declare variable d varchar(40);
  declare variable dx integer;
  begin
    while (anz>0) do
    begin
      m=guid_create();
      m=strreplace(m,'-','');
      insert into m(id,txt) values (:m,current_timestamp);
      dx=10;
      while (dx>0) do
      begin
        select guid_create() from rdatabase$database into :d;
        d=strreplace(d,'-','');
        insert into d(id,txt,m_id) values (:d,current_timestamp,:m);
        dx=dx-1;
      end
      anz=anz-1;
    end
  end

The procedure to create the Integer ID data is much easier using a generator.

After we created all 3 databases with the parameter 500000 (i.e. 500,000 master and 5,000,000 detail records were created), we disconnected and reconnected again to the database to ensure that any cache influence did not alter the results.

To perform a typical SQL operation, we started a SELECT that joins all records from all tables:

 select count(*) from m join d on d.m_id=m.id

Here are the results:

 Operation/Info   Int32   Int64   GUID 
 Database Size    505 MB   550 MB   1030 MB 
 INITDATA(500000) 271s   275s   420s 
 Backup 49s   54s   90s 
 Restore    124s   127s   144s 
 Select     22s   22s   49s

Résumé

The changes between Int64 and Int32 are negligible, but the changes to a GUID is a problematic design. The integer datatypes will give you better performance.

To discover more hints and tips about where you can improve the performance of your database, just open the IBExpert menu item Tools / Stored Procedure/Trigger/View Analyzer and press [F9]. This analyzes all objects and displays all parts that do not use an index in a red color. To modify these objects, simply double click the line. A well-designed database should have no red line at all!

This feature is not available in the free IBExpert Personal Edition.

The IBExpert Full Version gives you unlimited access to these performance-tuning tools and is available for just EUR 199.00 at https://ibexpert.net/ibe/pmwiki.php?n=Main.OnlineShop.

Search

German

Follow IBExpertise on:

edit SideBar

the database experts

Selecting the right datatype to improve database performance

Selecting the right datatype to improve database performance

Search