Selecting the right datatype to improve database performance
<< | Database technology articles | >>
Selecting the right datatype to improve database performance
By Holger Klemt
Here is a further example of just one more method to improve your database performance: use the right data type!
We were set the challenge to find out how much influence the changes between GUID and Int32 or Int64 primary keys have in the database design regarding performance. So we created 3 different databases on a Windows machine, each with two simple tables (m
for master, d
for detail).
Here is the database structure for Int32 IDs:
CREATE TABLE M ( ID INTEGER NOT NULL PRIMARY KEY, TXT VARCHAR(30)); CREATE TABLE D ( ID INTEGER NOT NULL PRIMARY KEY, M_ID INTEGER REFERENCES M(ID), TXT VARCHAR(30));
Here is the database structure for Int64 IDs:
CREATE TABLE M ( ID BIGINT NOT NULL PRIMARY KEY, TXT VARCHAR(30)); CREATE TABLE D ( ID BIGINT NOT NULL PRIMARY KEY, M_ID BIGINT REFERENCES M(ID), TXT VARCHAR(30));
Here is the database structure for GUIDs:
CREATE TABLE M ( ID CHAR(32) NOT NULL PRIMARY KEY, TXT VARCHAR(30)); CREATE TABLE D ( ID CHAR(32) NOT NULL PRIMARY KEY, M_ID CHAR(32) REFERENCES M(ID), TXT VARCHAR(30));
To create the database for the GUID, we used a UDF from https://www.ibexpert.com/download/udf/uuidlibv12.zip.
DECLARE EXTERNAL FUNCTION GUID_CREATE CSTRING(36) CHARACTER SET NONE RETURNS PARAMETER 1 ENTRY_POINT 'fn_guid_create' MODULE_NAME 'uuidlib';
Next we created a stored procedure to generate the data in the GUID database.
CREATE PROCEDURE INITDATA (ANZ INTEGER) AS declare variable m varchar(40); declare variable d varchar(40); declare variable dx integer; begin while (anz>0) do begin m=guid_create(); m=strreplace(m,'-',''); insert into m(id,txt) values (:m,current_timestamp); dx=10; while (dx>0) do begin select guid_create() from rdatabase$database into :d; d=strreplace(d,'-',''); insert into d(id,txt,m_id) values (:d,current_timestamp,:m); dx=dx-1; end anz=anz-1; end end
The procedure to create the Integer ID
data is much easier using a generator.
After we created all 3 databases with the parameter 500000
(i.e. 500,000 master and 5,000,000 detail records were created), we disconnected and reconnected again to the database to ensure that any cache influence did not alter the results.
To perform a typical SQL operation, we started a SELECT
that joins all records from all tables:
select count(*) from m join d on d.m_id=m.id
Here are the results:
Operation/Info Int32 Int64 GUID Database Size 505 MB 550 MB 1030 MB INITDATA(500000) 271s 275s 420s Backup 49s 54s 90s Restore 124s 127s 144s Select 22s 22s 49s
Résumé
The changes between Int64 and Int32 are negligible, but the changes to a GUID is a problematic design. The integer datatypes will give you better performance.
To discover more hints and tips about where you can improve the performance of your database, just open the IBExpert menu item Tools / Stored Procedure/Trigger/View Analyzer and press [F9]. This analyzes all objects and displays all parts that do not use an index in a red color. To modify these objects, simply double click the line. A well-designed database should have no red line at all!
This feature is not available in the free IBExpert Personal Edition.
The IBExpert Full Version gives you unlimited access to these performance-tuning tools and is available for just EUR 199.00 at https://ibexpert.net/ibe/pmwiki.php?n=Main.OnlineShop.