Restarting MS SQL Server : bad or good idea ?

I am often appalled by the lack of knowledge of the database culture of computer scientists in general and more particularly of systems engineers, as well as developers who naively believe that restarting a server is a necessity, it feels good. In particular in the RDBMS world it hurts… Very bad! Including when you believe you are doing the right thing, such as applying patches, CUs or other fixes. Some have paid dearly for this very serious mistake…

It is strongly advised never to shut down an MS SQL Server instance. This practice is the worst of all. In recent years, this bad practice that seems to have disappeared has come back at a gallop in my audits. In question, the fear of attacks combined has a lack of culture from system administrators

Restarting SQL Server – always a good idea?

https://www.sqlpassion.at/archive/2016/08/08/restarting-sql-server-always-a-good-idea/

https://www.linkedin.com/pulse/frequent-sql-server-restart-bad-practice-vitaly-bruk-1e

https://www.dbrnd.com/2018/01/sql-server-frequently-restart-a-sql-server-is-a-bad-practice/

https://dba.stackexchange.com/questions/160290/what-reasons-are-there-to-stop-the-sql-server

Why ?

Buffer

As soon as you restart SQL Server, you lose all cache content. It should be understood that the machine’s RAM serves as a cache for SQL Server to avoid accessing the disk or recalculating certain complex processing.

Once in the cache, this data persists for as long as possible and, thanks to sophisticated algorithms based on the general principle of LRU (Least Recent Used), the data is only released from the cache if there is no more RAM. available starting with the least frequently used data…

In order of volume, the different cache strata include:

Data from tables and index
Components and code for query execution plans
Metadata from system tables

and many other entries (nearly a hundred… see DBCC MEMORYSTATUS).

Clearing the cache has the effect of making the server restart extremely slow overall…

Queries

No query can be executed if the data is not present in the cache in RAM. This means that all data access goes through the cache, even for writes. If the data is not in cache, the query is put to sleep while the storage engine reads the data from disk back into the cache. Once the data is fetched from the cache, the storage engine wakes up the query so that it can finally be executed. Queries therefore read and write data logically, ie in memory. As far as writes are concerned, they are first carried out in the cache and then reflected in the data files from time to time… All this for reasons of efficiency and performance: no need to slow down the system, if the data can be written asynchronously… Don’t worry, the transaction log is a physical file that contains the changes made to the data and which is written synchronously for any transaction that changes the state of the database, whether it concerns the data or the structure of database objects.

In the event of a restart, the queries launched when SQL Server restarts will have to access the data in two stages: reading from disk then reading from memory. Knowing that a disk is 1,000 (very good SSD) to 10,000 (magnetic disk) times slower in access time than memory, your queries will suffer terribly!

Query plans

A query execution plan is the set of unit tasks that the SQL engine must perform to process a query. It should not be forgotten that the SQL commands launched on a server, are only « requests » (hence the name of request) and not the executable code. This SQL code does not correspond in any way to what must be executed, because writing a query is concerned with defining the expected result and not how to achieve this result.

To respond to a query, the engine must build a query execution plan comprising numerous unit tasks (a bit like a workflow) each of which is decided according to numerous optimization parameters (heuristics, constraints, cardinality, etc.). Each unit task in the query plan takes time to build:

syntactic analysis of the request (is the code consistent)
access to metadata to check if the tables exist, what are the data types of the columns, if the user is authorized…
translation of the SQL code into an algebraic tree (a mathematical formula of sorts)
algebraic simplification of the query (substitution, factorization, etc.)
heuristic optimization (trivial simplifications when only reading the query)
semantic optimization (based on table structure, column data types, constraints, etc.)
statistical optimization (choice of indexes to use and algorithms for each stage of the plan based on distribution statistics)

All of this takes time, and to avoid recalculating very similar execution plans, SQL Server caches query plans.

The restart loses all these cache entries and forces to recalculate the thousands of plans…

The very latest versions of SQL Server use an intelligent « retro feedback » technique to analyze all executions of similar plans with slight variations in order to establish the best common denominator, linked to the best performance (memory size of execution of the query, number of threads in parallel to execute it, choice of indexes, algorithms, etc.). All this fine-tuning requires running these plans many times and therefore a long term, all work lost if you restart SQL Server!

The consequence is that the overall performance will suffer considerably…

The DMVs : Data Management Views

These system views (several hundred) allow you to diagnose a whole lot of things about the operation of your SQL Server instance. To do this, the engine stores thousands of statistics in a reserved memory area and aggregates them permanently. For example, for each query execution, more than a hundred metrics are collected, added, correlated so that lessons can be learned about the most penalizing queries either in intrinsic duration, or in number of executions, or on any other criteria such as memory usage, number of threads, reads, writes… All things that can quickly solve many performance problems…

When you restart SQL Server, the process of collecting this information is destroyed and then recreated and of course these statistics cleared from the cache. You then lose all the data accessible via the various DMVs… From the point of view of troubleshooting malfunctions, restarting a server is like sawing off the branch on which we sat… no possibility of finding what happened in the execution of SQL Server in the hours, days, weeks, months preceding the shutdown. Therefore any diagnosis of past problems becomes impossible…

Database recovery

Restarting a SQL Server instance is an inappropriate operation in the sense that it is not intended to be executed on a regular basis… Indeed, stopping SQL Server causes the automatic abandonment of current transactions ( in SQL Server any operation or SQL command is a transaction…), ie the system forces a rollback of transactions (ROLLBACK). And this process can take time…

Upon restart, SQL Server goes through a RECOVERY phase during which each database performs a rollback of transactions whose data has not yet been physically written. Remember… in the “Cache” section we said that the data is modified in memory, but not immediately on disk! This is not necessary because new data can be reconstructed from old data to which changes that have been recorded in the transaction log are reapplied. To guarantee the successful completion of this process, the database becomes temporarily inaccessible as long as the re-execution of these transactions lasts… After a certain time, the recovery phase ends and the databases having become consistent again, they finally become accessible.

And this process also takes time!

Cache recovery

Once the server is restarted, the acquisition of the memory used by SQL Server is done gradually according to the necessities of caching. Which will also take time and can be prevented by concurrent processes that have already taken up some of the RAM…

tempdb performances

Temporary tables (which are created in the tempdb database) have a particular behavior. When deleted by the DROP TABLE command or by the end of the session, these tables are not physically deleted, but simply marked as being deleteable.
In fact, this allows two complementary optimizations:

deleting a table requires logging all the entries in this table in the transaction log and this takes time.
when a temporary table creation is initiated, SQL Server attempts to find a similar temporary table among the tables marked as deleteable rather than having to recreate the table, because a recreation is more expensive than reusing an empty object already existing

Vulnerability

Icing on the cake, it is when it is stopped that SQL Server is most vulnerable…! Many computer service technicians, because they have no knowledge of databases, naively believe that passing « Windows Updates » and « SQL Server CU » should be done as frequently as possible. However, most of these patches concern minor, even cosmetic, modifications that have nothing to do with security.

First, regarding Windows patches, no rush. SQL Server hardly uses Windows in the sense that its thread management is internal, as is memory management and disk access. SQL Server does not delegate any of these operations to the OS and performs them directly in a layer called SQL Operating System (SOS for short). Only network access and Windows service start is performed by Windows (or Linux) commands. In particular SQL Server knows how to shut itself down with the internal Transact SQL command SHUTDOWN.

It is therefore only necessary to pass Windows patches which relate to security vulnerabilities as well as those relating to network access (very rare in practice…).

If we analyze the SQL Server patches, the results are as follows:

version 2017: 31 CUs from 23/10/2017 to 14/02/2023 (i.e. 5 years, 4 months) including 7 security patches, i.e. 1 every 277 days (or 1 every 9 months).
version 2019: 20 CUs from 7/1/2020 to 13/04/2023 (i.e. 5 years, 4 months) including 4 security patches, i.e. 1 every 298 days (or 1 every 10 months).

But the biggest security problem is when SQL Server is shut down, when the SQL Server service is not running. Indeed, when this service is active and run onn Windows, all the files of all the databases are locked for the exclusive use of the SQL Server engine. It is therefore not possible to access these files either in read/write mode or to delete or move them. But in case of server shutdown, these files become extremely vulnerable…

In conclusion, stopping SQL Server is the surest way to have its databases attacked by ransomware!

A PASSING NOTE…

Although Microsoft products appear quite prominently in patch lists, compared to similar products, they are paradoxically the most reliable tools. In particular, Microsoft SQL Server is by far the safest Relational DBMS compared to Oracle and any other free product like PostGreSQL, MySQL or MariaDB… It’s the TOYOTA syndrome! Explanation: TOYOTA, often considered as the manufacturer of the most reliable cars in the world (there are many cars still running having done more than a million kilometers) is also the manufacturer who recalls the most models to impose preventive corrections in order to that possible incidents never happen.

Microsoft has been attacked for a long time and, thanks to its experience, has developed methods to fight against these attacks which have been in place since the end of the 90s, which allows it a high level of security for R&D and its products. professionals (databases, operating systems, development tools, etc.).

Just compare the volume of lines of code of a SQL Server instance and its number of vulnerabilities compared to any other RDBMS, including Oracle Database, to measure the difference in terms of security that MS SQL Server brings!

Frédéric Brouard - SQLpro - ARCHITECTE DE DONNÉES - expert SGBDR et langage SQL * Le site sur les SGBD relationnels et le SQL : https://sqlpro.developpez.com * * le blog SQL, SQL Server, SGBDR... sur : https://blog.developpez.com/sqlpro/ * * Expert Microsoft SQL Server, MVP (Most valuable Professional) depuis 14 ans * * Entreprise SQL SPOT : modélisation, conseil, audit, optimisation, formation *

Restarting MS SQL Server : bad or good idea ?

Why ?

Buffer

Queries

Query plans

The DMVs : Data Management Views

Database recovery

Cache recovery

tempdb performances

Vulnerability

A PASSING NOTE…

Articles récents

Commentaires récents

Archives

Catégories

Méta