Designing a Clinical Data Mart (Part 2) - Clinical Multidimensional Cube
Using SQL Server Analysis Services (SSAS) | Online Analytical Processing (OLAP)
In this session we will approach creating an Analysis Services cube for our previously designed Clinical Data Mart (CDM). The cube will allow for us to query against patient encounter measures, based on date range and related dimensional filtering. We will show how to implement Many-to Many relationship type for the Diagnosis and Procedure dimensions.
We will cover the following steps for creating our Multidimensional Cube:
- Defining the Data Source Connection
- Creating a Data Source View
- Creating an Analysis Services Cube
- Create Dimensions
- Creating a Many-to-Many relationship
- Create a Calculated Member to get diagnosis/procedure count
- Configuring Dimensions
- Processing and Deploying the Cube
- Querying the Cube
What this article doesn't cover
The purpose of this article is to provide an entry-level approach to defining the clinical data mart cube. We do not cover optimization techniques for configuring the cube and supporting dimensions. We will cover optimization tips and tricks in a future article.
This article assumes you have experience with SQL Server Analysis Services (SSAS). You will need the following technology in order to complete this tutorial:
- SQL Server 2012 or greater
- SQL Server Analysis Services (SSAS)
- Excel 2010 or greater
Note: We assume you have populated the Clinical Data Mart with dummy data, prior to completing this tutorial.
Let's get started!
Step 1 - Data Source Connection
We assume you already have previous experience creating a data source connection, so we'll run through this step quickly.
1. Open SQL Server Data Tools and create a new Analysis Services Multidimensional and Data Mining Project.
2. Provide a name for the project and location. I'm using ClinicalASM (Analysis Services Multidimensional) for the name.
3. When the Solution Explorer renders, right click on the Data Source folder and click New Data Source...
4. Follow the wizard in selecting the location of your SQL Server database for the clinical data.
5. When you arrive to the Impersonation Information, choose Use the service account.
Note: Ensure that the SSAS service account has permissions to both the CDM and the Analysis Services instance. You can get the name of the services account via Windows Services >> SQL Server Analysis Services (MSSQLSERVER) and look for the Log On As account.
6. Provide the Data Source a name - ClinicalDS and click Finish.
Step 2 - Creating a Data Source View
We will select the tables/views to be used in modeling our dimensions and cube. For this tutorial, we will be using tables, but we recommend you get in the habit of using views for fact tables because defining Named Calculations get computed every time processing the cube , thus hindering performance. Optimizing Analysis Services is a broad topic and we will not cover optimization techniques in this tutorial.
1. Right click on Data Source View folder from within the Solution Explorer and click New Data Source View...
2. Click Next at the first Data Source View Wizard dialog.
3. On the Select a Data Source dialog, select the name of the data source your created in Step 1. and click Next.
4. On the Select Tables and Views dialog, select the following tables and click > to include objects to Data Source View. Click Next.
5. Provide a name for your Data Source View. I've named it ClinicalDSV.
Note: Analysis Services will automatically render the relationships of the tables and/or views in the designer. You should see something similar as below.
6. Save your project.
Step 3 - Create the Analysis Services Cube
In this step we will create an Analysis Services Cube for our clinical data. We won't spend too much time configuring the cube - we will only configure minimum requirement for getting the dimension usage and measures connected, so that we can demonstrate querying the clinical data.
1. Right click on the Cube folder and select New Cube...
2. Click Next on the first Cube Wizard dialog.
3. On the Select Creation Method dialog, select Use existing tables... and click Next.
4. On the Select Measure Group Tables, select the following from the ClinicalDSV and click Next.
5. On the Select Measures dialog, accept the default selected measures and click Next.
6. On the Select New Dimensions dialog, accept the default selected dimensions and click Next.
7. Provide a name for your Cube and click Finish. I've named it ClinicalResearch. Your Cube designer should look something similar to below.
Note: Analysis Services will create the Cube and Dimensions. You will need to now do some minor configurations to the cube and dimensions, before we will process and query the cube.
8. Save your project.
Step 3.2 - Many-to-Many (M2M)
When the cube is created, Analysis Services engine will find the many-to-many relationship between the Encounter fact and Diagnosis / Procedure dimension table(s). This M2M relationship will allow for us to render a grouping of all Diagnosis or Procedures for a given patient encounter, on a specific date, through an intermediate measure group.
Intermediate Measure Group - creates relationship between dimension and measure group. In our example, the Intermediate Measure Group is the BRIDGE_GROUP_DX. The FACT_ENCOUNTER joins to the GROUP_DX >> BRIDGE_GROUP_DX >> DIM_DIAGNOSIS. The Intermediate Measure Group will store the group Key along with one or many diagnosis keys. The single patient encounter fact can join to multiple dimension members, creating a many-to-many relationship. Look at the image below to see how Analysis Services creates the M2M relationship.
1. Click the Dimension Usage tab from the Cube designer.
2. Place pointer in the BRIDGE_GROUP_DX measure group and click the ellipses. This will render the Define Relationship dialog.
Notice the dimension DIM_DIAGNOSIS has a Many-to-Many join to the BRIDGE_GROUP_DX. Behind the scenes, the intermediate measure group is joined to Measure Group Group_DX and to the Fact Table FACT_ENCOUNTER.
3. Now that we have the Measures and Dimensions joined, lets perform a little configuration to Measure Group and Intermediate Measure Group. We don't need our end users to see the measure group or intermediate measure group tables, so we will hide them. While within the Dimension Usage tab, get a focus on the GROUP_DX dimension, right click and select Properties.
4. On the right Properties pane, find the Visible property and set to False. Do the same for the GROUP_PROC dimension.
5. Save project.
6. Select the Cube Structure tab and expand the BRIDGE_GROUP_DX measure and set focus on BRIDGE GROUP DX Count. Right click and select Properties.
7. On the right Properties pane, find the Visible property and set to False. Do the same for the BRIDGE_GROUP_PROC >> BRIDGE GROUP PROC Count measure.
8. Save project.
Step 3.3 - Create a Calculated Member to get diagnosis count
We will create a New Calculated Member to return the diagnosis count from our cube. This is helpful for when we want to get the aggregate for total diagnosis for a specific patient encounter(s). Optionally, you may follow the same steps, but for Procedure dimension.
1. From within the Cube Designer, click on the Calculations tab. Right click anywhere in the Script Organizer pane and select
2. The new calculated editor will render. Complete your configuration as follows:
Parent hierarchy: Measures
[Measures].[FACT ENCOUNTER Count]
Format String: (leave blank)
Non-empty behavior: FACT ENCOUNTER Count
Display folder: (leave blank)
Your edits should look similar to below.
3. Save project.
Step 4 - Configuring Dimensions
In this step we'll go over configuring the Patient dimension, but you will need complete the remaining dimensions based on your requirements. For example, for the Patient dimension, you may only require the following attributes: Pat Psuedo ID, Age, DOB, Ethnicity, Gender, and so on... We will provide a final configuration of the demo solution at the end of this tutorial, however.
1. From the left Solution Explorer pane, double click on DIM PATIENT. The Patient dimension designer should render like so.
2. From the right Data Source View pane, drag over the following fields to the Attribute pane.
3. Right click the PAT_ID and select Properties. Rename the attribute to PAT PSEUDO ID. You may retain the remaining default property settings for this attribute.
4. From the Attributes pane, select Age and changes its Order By property to Key. We make this edit so that when the dimension is processed, Age members will sort in ascending order. Do the same for DOB and YEARLY INCOME.
Note: When you process this dimension, you may run into an error related to duplicate City member. You can resolve this by creating a composite key. Please see the PAT CITY attribute properties configuration below.
5. Save the project.
Step 5 - Processing and Deploying the Cube
In this step we'll process our cube and deploy it to our Analysis Services instance.
1. From within the Solution Explorer, right click on your solution name and click Process...
2. You may see the following prompt. Click Yes.
3. The Process Database - ClinicalASM dialog will render. Click Run... this will render the Process Progress window and show a status of objects processed. When it completes successfully, click Close. Now close the Process Database window.
Your cube should now be available in your Analysis Services server instance.
Step 6 - Querying the Cube
In this final step, we will query the cube from within the Analysis Services Multidimensional solution.
1. From within the Solution Explorer, double click on the ClincialResearch.cube. Now fine the Browser tab and select.
2. Expand the DIME DATE dimension and drag Calendar Year into the filter pane. Filter Expression should equal 2017 (or any year you choose).
3. Expand the Measures folder, expand the FACT ENCOUNTER folder and drag and drop the FACT ENCOUNTER Count to the query pane. Click on 'Click to execute the query.' link. You should see similar results.
4. Now expand the DIM PATIENT and add the PAT PSEUDO ID attribute.
5. Expand the DIM DATE and add the DIM DATE.Date attribute.
5. Expand the DIM DIAGNOIS and add the DX CODE attribute.
6. Click on 'Click to execute the query.' link to run the query. You should see something similar to below.
You have completed querying the cube.
In this tutorial we showed you the basic steps to creating a cube against your previously defined Clinical UDM. We showed you how to connect to a data source, define a data source view and create cube. We configured the cube measures and dimensional usage, in support of many-to-many relationship between diagnosis and procedures dimensions, and developed a calculated member to count total diagnosis. Finally we processed or cube and browsed the data.
Download the full Clinical cube project below.
ClinicalASM.zip (69.62 kb)
Our next article/tutorial will cover implementing a Data Visualization and Dashboards, using Excel and Power BI.