Digging into SQL Server 2012 columnstore index

bizadmin

February 28, 2023

The SQL Server 11.0 release (code named “Denali”) introduces a new data warehouse query acceleration feature based on a new type of index called the columnstore. Columnstore indexing is officially introduced in SQL Server 2012. It is based on xVelocity memory-optimized technology and it improves data warehouse query performance significantly. Because data warehousing, decision support systems and business intelligence applications are growing very quickly, we need to be able to read very large data sets and process them quickly and accurately into useful information and knowledge. Columnstore index technology is especially appropriate for data warehousing data sets, where it speeds up the common data warehousing queries significantly.

A columnstore index stores the data for each column separately and joins all the columns to complete the index. There are many advantages of using columnstore indexing in comparison with traditional rowstore indexing. The term “rowstore” is used to describe either a heap or a B-tree that contains multiple rows per page. As columnstore indexing is fairly new, it has some restrictions and limitations, so you should be aware of them when you are planning to implement a columnstore index in your data warehouse. In this article we will discuss the following topics:

§ How does a columnstore index work?

§ Benefits of using columnstore indexes

§ Restrictions of columnstore indexes

§ How to create a SQL Server columnstore index

§ Planning for creating a columnstore index

§ Choosing columns for a columnstore index

While rowstore indexing stores multiple rows per page, a columnstore index stores each column in separate disk pages. The following image illustrates the difference between columnstore and rowstore indexing from a storage perspective:

[Figure: rowstore versus columnstore storage layout]

As you can see, C1, C2…C6 are stored in different pages, so:

· only the columns needed in a query are fetched from disk

· because of the redundancy of data within a column, the data is easier to compress

· because of the data compression, the frequently accessed parts of commonly used columns remain in memory, so the buffer hit rate is improved.

As mentioned, columnstore is based on xVelocity technology, which it has in common with the SQL Server Analysis Services Tabular Model as well as PowerPivot. This does not mean that columnstore indexes have to fit in memory; however, they can use available server memory effectively to move portions of columns in and out of memory on demand. As columnstore indexes store all data for separate columns in separate pages, using columnstore indexes improves I/O scan performance significantly.
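To make the column-wise access concrete, here is a minimal sketch. The table `dbo.FactSales`, its columns and the index name `NCCI_FactSales` are hypothetical and exist only for illustration; whether the optimizer actually picks the columnstore index depends on the data and the query plan.

```sql
-- Hypothetical fact table, for illustration only.
CREATE TABLE dbo.FactSales (
    SalesKey    int           NOT NULL,
    DateKey     int           NOT NULL,
    ProductKey  int           NOT NULL,
    StoreKey    int           NOT NULL,
    Quantity    int           NOT NULL,
    SalesAmount decimal(18,2) NOT NULL
);

-- Each listed column is stored and compressed separately.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
    ON dbo.FactSales (SalesKey, DateKey, ProductKey, StoreKey, Quantity, SalesAmount);

-- This query references only DateKey and SalesAmount, so when the
-- columnstore index is used only those two columns have to be read;
-- the pages of the other four columns are not touched.
SELECT DateKey, SUM(SalesAmount) AS TotalSales
FROM dbo.FactSales
GROUP BY DateKey;
```

A rowstore scan of the same table would read every page of every row, including the four columns the query never uses.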

Columnstore indexes have several benefits in comparison with rowstore indexes:

· As mentioned, only the columns needed in a query are fetched from disk, so common data warehouse queries run much faster

· As data is highly compressed using xVelocity technology, the required disk space is reduced considerably

· As the pages are heavily compressed, the pages containing the most frequently accessed columns remain in memory

· As batch mode processing is used, an advanced query execution technology that processes chunks of columns, CPU usage is reduced.
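To get a rough idea of how well the columns of existing columnstore indexes compress, the catalog view `sys.column_store_segments` can be queried. The sketch below is one possible way to summarise it per index; it counts only segment storage (dictionaries are tracked separately in `sys.column_store_dictionaries`), so treat the numbers as an approximation.

```sql
-- Approximate on-disk size of the column segments of every columnstore
-- index in the current database. on_disk_size is reported in bytes.
SELECT
    OBJECT_NAME(i.object_id)          AS TableName,
    i.name                            AS IndexName,
    COUNT(DISTINCT s.column_id)       AS ColumnsInIndex,
    COUNT(*)                          AS SegmentCount,
    SUM(s.row_count)                  AS RowsAcrossSegments,
    SUM(s.on_disk_size) / 1048576.0   AS SegmentSizeMB
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
    ON p.hobt_id = s.hobt_id
JOIN sys.indexes AS i
    ON i.object_id = p.object_id
   AND i.index_id  = p.index_id
GROUP BY i.object_id, i.name
ORDER BY SegmentSizeMB DESC;
```

The query returns an empty result set when the database has no columnstore indexes.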

Columnstore indexing is a new technology, so you should be aware of its restrictions if you are planning to implement columnstore indexes. The following restrictions should be considered:

· Columnstore indexes are available only in the SQL Server Enterprise, Developer and Evaluation editions, so you will get the following error message if you try to create a columnstore index in other editions of SQL Server 2012: “CREATE INDEX statement failed because a columnstore index cannot be created in this edition of SQL Server.”

· Tables containing columnstore indexes cannot be updated. This restriction may be removed in future releases of SQL Server. So how do you insert, update or delete data in a table that contains a columnstore index? There are three solutions; the first is the most straightforward.

1. Drop the columnstore index, perform any INSERT, UPDATE, DELETE or MERGE operations, and recreate the columnstore index.

2. Partition the table and switch partitions. For a bulk insert:

§ insert data into a staging table

§ build a columnstore index on the staging table

§ switch the staging table into an empty partition

For other updates:

§ switch a partition out of the main table into a staging table

§ disable or drop the columnstore index on the staging table

§ perform the update operations

§ rebuild or re-create the columnstore index on the staging table

§ switch the staging table back into the main table.

3. Place static data into a main table with a columnstore index, and put new data and recent data that is likely to change into a separate table with the same schema that does not have a columnstore index. Apply updates to the table with the most recent data. To query the data, rewrite the query as two queries, one against each table, and then combine the two result sets with UNION ALL. The sub-query against the large main table will benefit from the columnstore index. If the updateable table is much smaller, the lack of the columnstore index will have less effect on performance. While it is also possible to query a view that is the UNION ALL of the two tables, you may not see a clear performance advantage. The performance will depend on the query plan, which will depend on the query, the data, and cardinality estimations. The advantage of using a view is that an INSTEAD OF trigger on the view can divert updates to the table that does not have a columnstore index, and the view mechanism would be transparent to the user and to applications. If you use either of these approaches with UNION ALL, test the performance on typical queries and decide whether the convenience of using this approach outweighs any loss of performance benefit.

Note: As discussed, tables containing a columnstore index cannot be updated. However, it is not a good idea to use a columnstore index as a way of making a table read-only, because the columnstore index is not designed for this purpose and Microsoft may remove this restriction in future releases of SQL Server.
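The first and third workarounds described above could be sketched as follows. All object names (`dbo.FactSales_Static`, `dbo.FactSales_Recent`, `NCCI_FactSales_Static`) are hypothetical, and the first workaround is shown in its disable-and-rebuild variant rather than drop-and-recreate. `GO` is the batch separator understood by SSMS and sqlcmd, not a T-SQL statement.

```sql
-- Hypothetical tables with the same schema, for illustration only.
CREATE TABLE dbo.FactSales_Static (
    SalesKey    int           NOT NULL,
    DateKey     int           NOT NULL,
    ProductKey  int           NOT NULL,
    SalesAmount decimal(18,2) NOT NULL
);
CREATE TABLE dbo.FactSales_Recent (
    SalesKey    int           NOT NULL,
    DateKey     int           NOT NULL,
    ProductKey  int           NOT NULL,
    SalesAmount decimal(18,2) NOT NULL
);
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales_Static
    ON dbo.FactSales_Static (SalesKey, DateKey, ProductKey, SalesAmount);
GO

-- Workaround 1: take the columnstore index out of the way, modify the
-- data, then rebuild the index.
ALTER INDEX NCCI_FactSales_Static ON dbo.FactSales_Static DISABLE;

INSERT INTO dbo.FactSales_Static (SalesKey, DateKey, ProductKey, SalesAmount)
VALUES (1, 20120101, 10, 99.50);

ALTER INDEX NCCI_FactSales_Static ON dbo.FactSales_Static REBUILD;
GO

-- Workaround 3: changing data goes to the table without a columnstore index.
INSERT INTO dbo.FactSales_Recent (SalesKey, DateKey, ProductKey, SalesAmount)
VALUES (2, 20120102, 10, 20.00);

-- One aggregate query per table, combined with UNION ALL and then
-- aggregated once more to merge the partial results.
SELECT ProductKey, SUM(TotalSales) AS TotalSales
FROM (
    SELECT ProductKey, SUM(SalesAmount) AS TotalSales
    FROM dbo.FactSales_Static
    GROUP BY ProductKey
    UNION ALL
    SELECT ProductKey, SUM(SalesAmount) AS TotalSales
    FROM dbo.FactSales_Recent
    GROUP BY ProductKey
) AS Combined
GROUP BY ProductKey;
```

Aggregating inside each branch keeps the sub-query against the large static table simple enough to benefit from the columnstore index, while the small recent table is handled by ordinary rowstore access.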

· Columnstore indexes do not support more than 1024 columns

· Only nonclustered columnstore indexes are available (there is no clustered columnstore index)

· A columnstore index cannot be a unique index

· Creating columnstore indexes on a view or indexed view is not supported

· Columnstore indexes cannot include a sparse column (an ordinary column that has optimized storage for null values)

· Columnstore indexes cannot act as primary keys or foreign keys (remember that a columnstore index cannot be a unique index)

· Columnstore indexes cannot be modified using the “ALTER INDEX” statement. However, the “ALTER INDEX” statement can be used to disable and rebuild a columnstore index. So the only way to modify a columnstore index is to drop and recreate it.

· The keyword “INCLUDE” is not supported when creating a columnstore index

· Sorting is not allowed in a columnstore index, so the “ASC” and “DESC” keywords are not supported. Columnstore indexes are ordered according to the compression algorithm. Values selected from a columnstore index may be sorted by the search algorithm, but you must use the ORDER BY clause to guarantee sorting of a result set.

· A columnstore index does not use or even keep statistics as a rowstore index does

· A columnstore index does not support the FILESTREAM attribute, so only the columns in the table that are not used in the columnstore index can have the FILESTREAM attribute.

· As a columnstore index is optimized for in-memory processing, server memory limitations should be considered

· Columnstore indexes do not support SEEK operations, so if the table hint FORCESEEK is used, the optimizer will not consider the columnstore index.

· Columnstore indexes cannot be combined with page and row compression, as columnstore indexes are already compressed in a different format.

· Replication is not supported for tables containing a columnstore index

· Change tracking and change data capture are not supported

· Filestream is not supported

· The following data types cannot be included in a columnstore index:

1. binary and varbinary

2. ntext, text, and image

3. varchar(max) and nvarchar(max)

4. uniqueidentifier

5. rowversion (and timestamp)

6. sql_variant

7. decimal (and numeric) with precision greater than 18 digits

8. datetimeoffset with scale greater than 2

9. CLR types (hierarchyid and spatial types)

10. xml
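Before creating a columnstore index it can help to list the columns of a table that fall under the data type and sparse column restrictions above. The following is a sketch of such a check against the catalog views; the table name `dbo.FactSales` is a hypothetical placeholder, and the filter simply mirrors the list above rather than any official validation routine.

```sql
-- Hypothetical table name; replace it with the table you plan to index.
DECLARE @TableName nvarchar(256) = N'dbo.FactSales';

-- Columns that cannot be part of a SQL Server 2012 columnstore index
-- according to the restrictions listed above.
SELECT
    c.name                        AS ColumnName,
    TYPE_NAME(c.user_type_id)     AS DeclaredType,
    c.max_length,
    c.precision,
    c.scale,
    c.is_sparse
FROM sys.columns AS c
JOIN sys.types AS t
    ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(@TableName)
  AND (
        -- rowversion is listed under its synonym "timestamp"
        TYPE_NAME(c.system_type_id) IN (N'binary', N'varbinary', N'ntext', N'text',
                                        N'image', N'uniqueidentifier', N'timestamp',
                                        N'sql_variant', N'xml')
     OR (TYPE_NAME(c.system_type_id) IN (N'varchar', N'nvarchar') AND c.max_length = -1)
     OR (TYPE_NAME(c.system_type_id) IN (N'decimal', N'numeric') AND c.precision > 18)
     OR (TYPE_NAME(c.system_type_id) = N'datetimeoffset' AND c.scale > 2)
     OR t.is_assembly_type = 1   -- CLR types such as hierarchyid, geometry, geography
     OR c.is_sparse = 1          -- sparse columns
  );
```

Any column returned by this query has to be left out of the column list of the columnstore index.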

Creating a columnstore index is just like creating any other index. There are two ways to create a columnstore index: using T-SQL statements or using SSMS (SQL Server Management Studio).

Creating a columnstore index using T-SQL

In a query editor window execute the following statement:

CREATE NONCLUSTERED COLUMNSTORE INDEX IndexName

ON TableName (Column1, Column2, …)
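As a concrete instance of the statement above, the sketch below creates a columnstore index on a hypothetical table `dbo.FactOrders` and then re-creates it with an extra column. The table, column and index names are assumptions made for illustration. `DROP_EXISTING = ON` drops and re-creates the index in a single statement, which is one way to change its column list given that it cannot be altered in place.

```sql
-- Hypothetical fact table, for illustration only.
CREATE TABLE dbo.FactOrders (
    OrderKey    int           NOT NULL,
    DateKey     int           NOT NULL,
    CustomerKey int           NOT NULL,
    OrderAmount decimal(18,2) NOT NULL
);

-- Initial columnstore index on two columns.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactOrders
    ON dbo.FactOrders (DateKey, CustomerKey);

-- Re-create the index under the same name with an additional column.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactOrders
    ON dbo.FactOrders (DateKey, CustomerKey, OrderAmount)
    WITH (DROP_EXISTING = ON);
```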

Creating a columnstore index using SSMS

Open SQL Server Management Studio (SSMS) and connect to a SQL Server database engine. Remember that columnstore indexes are available only in the SQL Server 2012 Enterprise, Developer and Evaluation editions.

1. From “Object Explorer” -> expand the instance -> expand the databases -> expand the database -> expand the table -> right click on “Indexes” -> New Index -> Non-Clustered Columnstore Index

2. In the “New Index” window -> Index Name (type a name) -> Add -> select the columns -> OK -> OK

Now the columnstore index is created and you can see it under “Indexes” in Object Explorer.
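The same check can be done from a query window instead of Object Explorer. This sketch lists every nonclustered columnstore index in the current database by reading `sys.indexes`:

```sql
-- Nonclustered columnstore indexes in the current database.
SELECT
    OBJECT_SCHEMA_NAME(i.object_id) AS SchemaName,
    OBJECT_NAME(i.object_id)        AS TableName,
    i.name                          AS IndexName,
    i.type_desc,
    i.is_disabled
FROM sys.indexes AS i
WHERE i.type_desc = N'NONCLUSTERED COLUMNSTORE'
ORDER BY SchemaName, TableName, IndexName;
```

The `is_disabled` column shows whether an index has been disabled, for example while data is being loaded into the table.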

As columnstore indexing is a new technology, it has many limitations and restrictions. Although all of the columnstore index restrictions should be considered, one of the most general and important ones is that columnstore indexes are NOT available in all editions of SQL Server 2012. So it is really important to know which edition of SQL Server is going to be used in the production environment. If your organisation is not going to use SQL Server 2012 Enterprise edition, you cannot use columnstore indexes at all, and you have to plan to create rowstore indexes in your data warehouse.
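A quick way to find out which version and edition an instance is running is the `SERVERPROPERTY` function. This is a small sketch to run against each server you plan to deploy to:

```sql
-- ProductVersion starts with 11. for SQL Server 2012.
-- EngineEdition = 3 is returned for the Enterprise, Developer and
-- Evaluation editions, the ones that support columnstore indexes.
SELECT
    SERVERPROPERTY('ProductVersion') AS ProductVersion,
    SERVERPROPERTY('Edition')        AS Edition,
    SERVERPROPERTY('EngineEdition')  AS EngineEdition;
```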

Because indexing is closely tied to the queries, it should be investigated on a case by case basis. Although columnstore indexing generally improves query performance, in some cases it can lead to poorer query performance.
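One way to investigate a particular query is to run it twice, once normally and once with the `IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX` query hint, and compare the timings. The sketch below uses a hypothetical table `dbo.FactReturns` with made-up columns; on an empty table the timings are meaningless, so load representative data before comparing.

```sql
-- Hypothetical fact table and index, for illustration only.
CREATE TABLE dbo.FactReturns (
    ReturnKey    int           NOT NULL,
    DateKey      int           NOT NULL,
    ProductKey   int           NOT NULL,
    ReturnAmount decimal(18,2) NOT NULL
);
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactReturns
    ON dbo.FactReturns (ReturnKey, DateKey, ProductKey, ReturnAmount);

-- Report CPU and elapsed time for each statement in the Messages tab.
SET STATISTICS TIME ON;

-- Run 1: the optimizer is free to use the columnstore index.
SELECT ProductKey, SUM(ReturnAmount) AS TotalReturns
FROM dbo.FactReturns
GROUP BY ProductKey;

-- Run 2: same query, but the columnstore index is ignored.
SELECT ProductKey, SUM(ReturnAmount) AS TotalReturns
FROM dbo.FactReturns
GROUP BY ProductKey
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);

SET STATISTICS TIME OFF;
```

If the second run is consistently faster for a typical query, that query is one of the cases where the columnstore index does not help.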

Some of the performance benefit of a columnstore index is derived from the compression techniques that reduce the number of data pages that must be read and manipulated to process the query. Compression works best on character or numeric columns that have large amounts of duplicated values. For example, dimension tables might have columns for postal codes, cities, and sales regions. If many postal codes are located in each city, and if many cities are located in each sales region, then the sales region column would be the most compressed, the city column would have somewhat less compression, and the postal code would have the least compression. Although all columns are good candidates for a columnstore index, adding the sales region code column to the columnstore index will achieve the greatest benefit from columnstore compression, and the postal code will achieve the least.
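The amount of duplication in each candidate column can be estimated by counting its distinct values. The following sketch uses a hypothetical dimension table `dbo.DimGeography` with a few made-up rows; fewer distinct values relative to the row count suggests better compression.

```sql
-- Hypothetical dimension table and sample rows, for illustration only.
CREATE TABLE dbo.DimGeography (
    GeographyKey int          NOT NULL,
    SalesRegion  nvarchar(50) NOT NULL,
    City         nvarchar(50) NOT NULL,
    PostalCode   nvarchar(15) NOT NULL
);

INSERT INTO dbo.DimGeography (GeographyKey, SalesRegion, City, PostalCode)
VALUES (1, N'North', N'Leeds', N'LS1 1AA'),
       (2, N'North', N'Leeds', N'LS2 7BB'),
       (3, N'North', N'York',  N'YO1 6CC'),
       (4, N'North', N'York',  N'YO10 3DD');

-- For the four sample rows this returns 4 rows, 1 region, 2 cities
-- and 4 postal codes: the region repeats most, the postal code never.
SELECT
    COUNT(*)                    AS TotalRows,
    COUNT(DISTINCT SalesRegion) AS DistinctRegions,
    COUNT(DISTINCT City)        AS DistinctCities,
    COUNT(DISTINCT PostalCode)  AS DistinctPostalCodes
FROM dbo.DimGeography;
```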

References: SQL Server 2012 Books Online; SQL Server Technical Article: Columnstore Indexes for Fast Data Warehouse Query Processing in SQL Server 11.0, November 2010
