As data-platform professionals, we are challenged to come up with unique answers to technical situations on a regular basis. So, keeping a backlog of techniques, tips and tricks to address these situations is a godsend. For instance, I was recently asked to support two sets of data, with virtually an identical column set in the same table. Data was for an operational data store, modeled in 3rd normal form in a SQL Server 2017 database. It shouldn’t have been a big deal, but add the requirement of Referential Integrity for each set of data back to their respective parent tables (and with a common primary key derived from the business key of each set of data loaded), and things start to get interesting.
One option would have the data loaders calculate the primary key column as a derived column. This is a common option, but due to the nature of the loaders used, it was problematic to include the logic, so a more durable solution was desired.
A second option, which allowed us to bake the solution directly into the structure of the table itself, turned out to be more elegant and durable. We had an option to create the table with a computed column as the primary key, and as a result, the logic became part of the table definition.
Other options were shelved, such as a system-generated column using an Identity column or Sequence value, but it would not have met the business requirements so was not used.
We were storing Fund and Index financial benchmark performance data in the same table, as the data attributes were identical, except for the business key from the respective sets of data. The business keys, Ticker and IndexID on the Performance table were Nullable columns that had to support a 0 to n foreign key relationship back to the FundClass and IndexMaster reference tables. It is impossible to create a primary key on nullable columns, yet we had two columns that were just that. One column would always be populated, so a computed column was created to select which one was populated and be included in the table definition as part of the primary key.
Create table DDL snippet
CREATE TABLE [dbo].[Performance ] ( … [EntityID] AS (case when [Ticker] IS NULL then [IndexID] else [Ticker] end) PERSISTED NOT NULL, …
EntityID was included in the primary key. The important item was to specify the computed column as PERSISTED. This instructs SQL Server to store the value of the computed column on the data page of the table, and not materialize it at run time. Without specifying the persisted keyword, it would prevent you from using a computed column in the primary key constraint.
Subsequently, we were able to create foreign key constraints for Ticker and IndexID on the nullable columns to their respective parent table.
The resulting data model around the Performance table looked like this.
An example of how the data was maintained on the Performance table:
The use of the solution above highlights an important point in development practices: always know the environment you’re operating in and maintain options to address design considerations and technical challenges. Identify what options are possible, select the most appropriate, and implement successfully. After all, there are more tools in your toolbox than just a hammer. Use the best tool for each situation.
Finally, if your organization ever needs help finding just the right tool to solve a complex technical issue, please don’t hesitate to reach out to us with any questions or concerns. Our expert consultants will be happy to get you on your way.
Have industry news sent right to your Inbox
Business Intelligence Architect