This first paper speaks about some comparisons between PostGreSQL and SQL Server in pointing out the differences in terms of performances of somes administrative queries that’s ordinary DBA needs to execute.
Our test material is a HP Z840 workstation with:
- 2 sockets (Intel Xeon E5-2680 v3);
- 128 Gb of RAM;
- OS Windows 10 Enterprise.
The RDBMS are:
- SQL Server 2019 Developper’s edition;
- PostGreSQL v 13.1 community edition.
The client applications used for testing are:
- SQL Server management Studio v 18.5 (for SQL Server);
- PGadmin 4 v 4.29 (for PostGreSQL).
Configuration of RDBMS:
- SQL Server has no particular configuration. The server collation is French_100_BIN2;
- PostGreSQL has a specific parametarization:
1 2 3 4 5 6 7 |
ALTER SYSTEM SET effective_cache_size = 12000000; -- 12 000 000 x 8 192 b : 96 Gb ALTER SYSTEM SET shared_buffers = 2560000; -- 2 560 000 x 8 192 b : 20 Gb ALTER SYSTEM SET work_mem = 500000; -- 500 000 x 1 024 b : 500 Mb ALTER SYSTEM SET temp_buffers = 512000; -- 512 000 x 8 192 b : 4 Gb ALTER SYSTEM SET max_worker_processes = 48; ALTER SYSTEM SET max_parallel_workers = 48; ALTER SYSTEM SET max_parallel_workers_per_gather = 8; |
Configuration of the database:
SQL Server has a storage configuration as follow:
- Data file : 12 Gb;
- Transaction log file : 8 Gb.
PostGreSQL has a specific collation comming with the OS parameterization:
- Encoding = UTF8;
- Collation = French_France.1252;
- Character type = French_France.1252.
Our test protocol was:
- running 10 times each queries;
- eliminating best and worst response time;
- computing the average response time of the 8 remaining elapsed time;
- we stopped the other RDBMS service while testing one.
The SQL queries to reproduce the test can be download at:
https://1drv.ms/u/s!AqvZfiQYoNpBihI4efgbhodhRZec?e=07igoa
- The file « Benchmark test PostGreSQL.sql » contains the SQL script to create the table and populate it, and also all the queries we tested for PostGreSQL;
- The file « Benchmark test SQL Server.sql » contains the SQL script to create the table and populate it, and also all the queries we tested for SQL Server;
- The file « personnes.zip » contains the data to load into the table of our tests (table have two versions). It’s a ZIP file containing a text file of 239 820 Kb inside which you will find 10 million rows of persons.
The tables are:
SQL Server | PostGreSQL |
---|---|
CREATE TABLE T_PERSONNE_PRS (PRS_ID INT IDENTITY PRIMARY KEY, PRS_NOM VARCHAR(32), PRS_PRENOM VARCHAR(25)) | CREATE TABLE T_PERSONNE_PRS (PRS_ID SERIAL PRIMARY KEY, PRS_NOM VARCHAR(32), PRS_PRENOM VARCHAR(25)); |
CREATE TABLE T_PERSONNE_CI_AI_PRS (PRS_ID INT PRIMARY KEY, PRS_NOM VARCHAR(32) COLLATE french_ci_ai, PRS_PRENOM VARCHAR(25) COLLATE french_ci_ai); | CREATE COLLATION fr_ci_ai (provider = icu, locale = 'fr-u-ks-level1-kc-false', deterministic=false); CREATE TABLE t_personne_ci_ai_prs (prs_id INT PRIMARY KEY, prs_nom VARCHAR(32) COLLATE fr_ci_ai, prs_prenom VARCHAR(25) COLLATE fr_ci_ai); |
We had to create a collation for PostGreSQL’s table 2 because PostGreSQL has very few internal collation unlike SQL Server which has over 5500.
The DBA queries are:
SQL Server | time ms | PostGreSQL | time ms | ratio (MS/PG) | |
---|---|---|---|---|---|
DBA 1 | BULK INSERT T_PERSONNE_PRS FROM 'C:\tmp\Personnes.txt' WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', KEEPIDENTITY, BATCHSIZE = 10000000, CODEPAGE = 'ACP'); | 19 729 | COPY T_PERSONNE_PRS (PRS_ID, PRS_NOM , PRS_PRENOM) FROM 'C:\tmp\Personnes.txt' WITH (DELIMITER ',', ENCODING 'WIN1252'); | 50 850 | 2.58 |
DBA 2 | ALTER TABLE T_PERSONNE_PRS REBUILD; GO | 2 269 | VACUUM FULL t_personne_prs; REINDEX TABLE t_personne_prs; | 80 000 | 35.26 |
DBA 3 | CREATE INDEX X_1 ON T_PERSONNE_PRS (PRS_NOM); | 2 348 | CREATE INDEX X_1 ON T_PERSONNE_PRS (PRS_NOM); | 37 713 | 16.06 |
DBA 4 | CREATE INDEX X_2 ON T_PERSONNE_PRS (PRS_PRENOM); | 2 312 | CREATE INDEX X_2 ON T_PERSONNE_PRS (PRS_PRENOM); | 35 341 | 15.29 |
DBA 5 | CREATE INDEX X_3 ON T_PERSONNE_PRS (PRS_NOM, PRS_PRENOM); | 4 779 | CREATE INDEX X_3 ON T_PERSONNE_PRS (PRS_NOM, PRS_PRENOM); | 57 825 | 12.04 |
DBA 6 | ALTER TABLE T_PERSONNE_PRS REBUILD WITH(DATA_COMPRESSION = PAGE); | 2 338 | PG does not have any compression of relational data | ||
DBA 7 | ALTER INDEX X_1 ON T_PERSONNE_PRS REBUILD WITH(DATA_COMPRESSION = PAGE); | 1 465 | PG does not have any compression of relational indexes | ||
DBA 8 | ALTER INDEX X_2 ON T_PERSONNE_PRS REBUILD WITH(DATA_COMPRESSION = PAGE); | 1 632 | PG does not have any compression of relational indexes | ||
DBA 9 | ALTER INDEX X_3 ON T_PERSONNE_PRS REBUILD WITH(DATA_COMPRESSION = PAGE); | 1 760 | PG does not have any compression of relational indexes | ||
DBA 10 | ALTER TABLE T_PERSONNE_PRS REBUILD; ALTER INDEX ALL ON T_PERSONNE_PRS REBUILD; | 8 556 | VACUUM FULL t_personne_prs; REINDEX TABLE t_personne_prs; | 266 000 | 31.09 |
DBA 11 | INSERT INTO T_PERSONNE_CI_AI_PRS SELECT * FROM T_PERSONNE_PRS; | 13 197 | INSERT INTO T_PERSONNE_CI_AI_PRS SELECT * FROM T_PERSONNE_PRS; | 102 000 | 7.73 |
DBA 12 | UPDATE STATISTICS T_PERSONNE_PRS(X_1) | 173 | ANALYZE T_PERSONNE_PRS (prs_nom); | 1 376 | 7.95 |
DBA 13 | UPDATE STATISTICS T_PERSONNE_PRS(X_3) | 338 | ANALYZE T_PERSONNE_PRS (prs_nom, prs_prenom); | 1 492 | 4.41 |
DBA 14 | EXEC sp_updatestats | 316 | ANALYZE; | 5 729 | 18.13 |
Some explanations about statistics: computing parameters are close from the one to the other. The entries are limited to 100 in PG and 200 in SQL Server. SQL Server uses a 0.5 % sample, while PostGreSQL uses a 0.4 % sample. So SQL Server does a little more extra jobs rather than PostGreSQL. For the complete database statistics (which includes system tables), SQL Server has 236 stats while PG has 423.
What we can say about the performance of the database administration queries, are the following:
- Inserting data is about 5 times slower in PostGreSQL vs SQL Server;
- recomputing stats is about 10 times slower in PostGreSQL vs SQL Server;
- Creating indexes is about 14 times slower in PostGreSQL vs SQL Server;
- Index maintenance is about 32 times slower in PostGreSQL vs SQL Server.
Creating indexes is a rare and non recurrent process. This difference will be problemtic only for big tables. The same for inserting data except if you have a sheduled import of data.
On the other hand, maintenance, like recomputing stats or defragmenting indexes is a very recurent process if you want to maintain a high level of performances. Unfortunately PosGreSQL will be nearly 30 times slower than SQL Server in this process. It is a huge difference when dealing with big databases…
See also:
- PostGreSQL vs Microsoft SQL Server – Comparison part 2 : COUNT performances
- PostgreSQL vs. SQL Server (MSSQL) – part 3 – Very Extremely Detailed Comparison
Frédéric Brouard - SQLpro - ARCHITECTE DE DONNÉES - expert SGBDR et langage SQL
* Le site sur les SGBD relationnels et le SQL : https://sqlpro.developpez.com *
* le blog SQL, SQL Server, SGBDR... sur : https://blog.developpez.com/sqlpro/ *
* Entreprise SQL SPOT : modélisation, conseil, audit, optimisation, formation *
* Site d'entreprise : https://www.sqlspot.com -- email : SQLpro[@]SQLspot.com *
* * * * * Enseignant CNAM PACA - ISEN Toulon - CESI Aix en Provence * * * * *
A french version of this paper can be read at URL : http://mssqlserver.fr/postgresql-vs-microsoft-sql-server-partie-1-performances-des-commandes-pour-le-dba/