I have a data model that contains System-Versioned Tables, one of which had an initial history load of 50+ million rows. The tool I used to load data is an SSIS package run via Azure Data Factory using Azure SSIS Integration Runtime.
I created an SSIS package to accomplish this task, which was deployed as part of a project to the SSIS catalog residing on an Azure SQL Database.
The package uses an OLEDB Source connection to a SQL Server 2014 database residing on an Azure VM and a Target connection to an Azure SQL DB database. As part of the database deployment there is a staging table with an index and a System-Versioned dimension table that contains two indexes. One for the system generated primary index key, and another column-stored index used for extract and reporting.
It was taking over 2 hours to load the data into the staging table.I assumed it would take another 2+ hours to load the data into the destination tables. I wanted to increase load performance and decrease the amount of time it took to load the data. I wanted to drop all the indexes during the load process to increase throughput but had to re-sequence system-versioning table-create process in order to do this. System-versioning requires a primary key during the creation process.
I dropped and recreated the dimension table as a non-system-versioned table containing the necessary columns for system versioning. I also added an identity column that would eventually become the primary index to the table. The table at this point contains no indexes.
CREATE TABLE [dim].[TableTest] ( [Id] BIGINT IDENTITY(1, 1) NOT NULL, [COL1] BIGINT NOT NULL, [COL2] BIGINT NULL, [COL3] VARCHAR(25) NULL, [RowStartTimeUTC] DATETIME2 GENERATED ALWAYS AS ROW START DEFAULT(getutcdate()) NOT NULL, [RowEndTimeUTC] DATETIME2 GENERATED ALWAYS AS ROW END DEFAULT(CONVERT([datetime2], '9999-12-31 23:59:59.9999999')) NOT NULL, PERIOD FOR SYSTEM_TIME(RowStartTimeUTC, RowEndTimeUTC) ) ON [PRIMARY]
I updated the SSIS package to do the following:
ALTER TABLE [dim].[Table] ADD PRIMARY KEY CLUSTERED ( [Id] ASC ) ON [PRIMARY];
ALTER TABLE [dim].[Table] SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = [history].[Table]))
CREATE NONCLUSTERED INDEX [DimCompanyProduct_IX1] ON [dim].[CompanyProduct] ( [COL1] ASC, [COL2] ASC, . [COLN] ASC )WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
By dropping all the indexes during the load process, I was able to decrease load time to 2 hours from 4+ hours for both staging and dimension table. So, how‘s your load time? If you still have questions around the optimization of any load process, please don’t hesitate to reach out to us. We’d love to help you get started.
Necessary cookies are absolutely essential for the website to function properly. These cookies ensure basic functionalities and security features of the website, anonymously.
|cookielawinfo-checbox-analytics||11 months||This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".|
|cookielawinfo-checbox-functional||11 months||The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".|
|cookielawinfo-checbox-others||11 months||This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.|
|cookielawinfo-checkbox-necessary||11 months||This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".|
|cookielawinfo-checkbox-performance||11 months||This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".|
Functional cookies help to perform certain functionalities like sharing the content of the website on social media platforms, collect feedbacks, and other third-party features.
Performance cookies are used to understand and analyze the key performance indexes of the website which helps in delivering a better user experience for the visitors.
Analytical cookies are used to understand how visitors interact with the website. These cookies help provide information on metrics the number of visitors, bounce rate, traffic source, etc.
Advertisement cookies are used to provide visitors with relevant ads and marketing campaigns. These cookies track visitors across websites and collect information to provide customized ads.
Other uncategorized cookies are those that are being analyzed and have not been classified into a category as yet.