SQL Server Data Compression was first introduced in SQL Server 2008. Unfortunately, it was an Enterprise Edition-only feature until SQL Server 2016 SP1. This feature lets you compress individual rowstore indexes with either ROW or PAGE compression. Data compression can be a big win for many SQL Server workloads, reducing both your required disk I/O and your memory usage at the cost of some added CPU usage in some scenarios. This post gives you a script you can use for estimating data compression savings in SQL Server. You can get the script from here.
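For context, compression is applied per index with a rebuild. A minimal sketch (the table and index names are illustrative, not from the script in this post):

```sql
-- Apply PAGE compression to a single index by rebuilding it.
-- Add ONLINE = ON if your edition supports online index rebuilds.
ALTER INDEX IX_SalesOrderDetail_ProductID
ON Sales.SalesOrderDetail
REBUILD WITH (DATA_COMPRESSION = PAGE);
```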
By design, there is no “Compress Entire Database” command in SQL Server. Instead, you need to evaluate individual indexes based on their size, estimated compression savings, and volatility. The ideal candidate is a large, read-only table that shows very good compression savings. A terrible candidate is a small, very volatile table that does not show much compression savings.
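One way to gauge size and volatility up front is to query the standard DMVs. This is a sketch I am adding for illustration, separate from the script in this post:

```sql
-- Index size plus read/write activity since the last restart,
-- to help identify large, mostly-read compression candidates.
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       i.name AS index_name,
       SUM(ps.reserved_page_count) * 8 / 1024 AS reserved_mb,
       us.user_updates,
       us.user_seeks + us.user_scans + us.user_lookups AS user_reads
FROM sys.dm_db_partition_stats AS ps
INNER JOIN sys.indexes AS i
    ON i.object_id = ps.object_id
   AND i.index_id = ps.index_id
LEFT JOIN sys.dm_db_index_usage_stats AS us
    ON us.database_id = DB_ID()
   AND us.object_id = ps.object_id
   AND us.index_id = ps.index_id
WHERE OBJECTPROPERTY(ps.object_id, 'IsUserTable') = 1
GROUP BY ps.object_id, i.name, us.user_updates,
         us.user_seeks, us.user_scans, us.user_lookups
ORDER BY reserved_mb DESC;
```

Note that sys.dm_db_index_usage_stats only reflects activity since the instance last started, so interpret the read/write counts accordingly.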
Estimating Data Compression Savings in SQL Server
I long ago got tired of manually running the sp_estimate_data_compression_savings system stored procedure for each index in a database. Instead, I decided to write some T-SQL that would somewhat automate the process. If you set the schema name, table name, and desired data compression type in the variable declarations at the top of the script, you will get some pretty detailed information about all of the indexes in that table.
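For reference, a single manual call looks like this (the object names are illustrative; index_id 1 is typically the clustered index):

```sql
-- Estimate PAGE compression savings for one index of one table.
EXEC sys.sp_estimate_data_compression_savings
    @schema_name      = N'Sales',
    @object_name      = N'SalesOrderDetail',
    @index_id         = 1,
    @partition_number = NULL,
    @data_compression = N'PAGE';
```

Passing NULL for @index_id estimates every index on the table, which is essentially what the script automates across a whole table at once.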
In SQL Server 2019, sp_estimate_data_compression_savings was enhanced by adding the COLUMNSTORE and COLUMNSTORE_ARCHIVE compression types as options.
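On SQL Server 2019 and later, the same procedure accepts the new compression types (object names again illustrative):

```sql
-- Estimate columnstore compression savings for a rowstore table.
EXEC sys.sp_estimate_data_compression_savings
    @schema_name = N'Sales', @object_name = N'SalesOrderDetail',
    @index_id = 1, @partition_number = NULL,
    @data_compression = N'COLUMNSTORE';
```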
You should run the entire query at once, after you have supplied your own values. It may take some time to run, depending on your hardware, your storage, and how large your tables are. You could also use some of this code in a stored procedure that you could call for each table in a database. Then, it could also write the results out to a table that you could easily query later.
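The capture-to-a-table idea can be sketched with INSERT ... EXEC, since the procedure returns a fixed result set. The results table below is hypothetical, with columns matching the procedure's output:

```sql
-- Hypothetical results table matching the procedure's output columns,
-- plus a timestamp so you can compare estimates over time.
CREATE TABLE dbo.CompressionEstimates (
    object_name sysname,
    schema_name sysname,
    index_id int,
    partition_number int,
    size_current_kb bigint,
    size_requested_kb bigint,
    sample_size_current_kb bigint,
    sample_size_requested_kb bigint,
    collected_at datetime2 NOT NULL DEFAULT SYSDATETIME()
);

-- Capture estimates for all indexes on one table (illustrative names).
INSERT INTO dbo.CompressionEstimates (object_name, schema_name, index_id,
    partition_number, size_current_kb, size_requested_kb,
    sample_size_current_kb, sample_size_requested_kb)
EXEC sys.sp_estimate_data_compression_savings
    @schema_name = N'Sales', @object_name = N'SalesOrderDetail',
    @index_id = NULL, @partition_number = NULL,
    @data_compression = N'PAGE';
```

From there, a simple cursor or loop over sys.tables could call this for every table in the database.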
Since the Data Compression feature became available in SQL Server Standard Edition with SQL Server 2016 SP1, I am seeing more organizations use it. Many organizations are moving from legacy versions of SQL Server to a modern version, so this makes perfect sense.
Modern processors are also significantly more powerful than they were back in 2008, so the CPU overhead of data compression is much less of an issue than it used to be. I have a couple of blog posts that describe the best current processors for SQL Server usage.
If you have any questions about this post, please ask me here in the comments or on Twitter. I am pretty active on Twitter as GlennAlanBerry. Thanks for reading!