Class Info

Online Course: SQL Server Integration Services End to End

Get Full Access for Only $49.00  

[tabby title=”Course Overview”]

With SQL Server 2005, Microsoft introduced SQL Server Integration Services (SSIS) as the replacement for Data Transformation Services (DTS) and as its primary ETL tool for developers. Over the past nine years SSIS has evolved and now includes tasks and transformations for Change Data Capture as well as Data Quality Services. This workshop serves both as preparation for Microsoft exam 70-463, Implementing a Data Warehouse with Microsoft SQL Server 2012, and as a comprehensive source of SSIS training. Because SSIS has changed over time, each session points out the differences between versions, from SSIS 2005 through 2014. The workshop is broken into 12 modules:

Module 1: Introduction to a Data Warehouse
Module 2: ETL
Module 3: Introduction to SSIS
Module 4: Introduction to Control Flow
Module 5: Connection Managers
Module 6: Introduction to Data Flow
Module 7: Advanced Control Flow
Module 8: Advanced Data Flow
Module 9: SSIS Script Tasks and Components
Module 10: Troubleshooting and Debugging
Module 11: Data Quality Services and Master Data Services
Module 12: Deployment Models

Each module covers specific SSIS and data warehousing topics presented in the exam and includes in-depth examples and real-life demonstrations.

All demonstrations use the AdventureWorks2012 and AdventureWorksDW2012 sample databases, available here.

Completing all demonstrations requires only SQL Server Enterprise, Enterprise Evaluation, or Developer Edition.

Developer Edition is available here

 

[tabby title=”Sessions”]

Module 01 – Introduction to a Data Warehouse – 

A data warehouse is a relational database that serves as a repository for an organization’s business information. It’s time to clear up the myths and misconceptions about data warehouses and provide definitive explanations and methods for implementing a data warehouse in your organization.

This session focuses on the different ways to implement a data warehouse as well as the table structure and internals of an OLAP database:

1. Define an OLTP database
2. Define an OLAP database
3. What is a fact table
4. What is a dimension table
5. What are the different kinds of dimensions
6. What are the different data warehouse schemas
7. How to map and model your data warehouse
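
As a minimal illustration of the fact and dimension tables listed above, the sketch below creates a simple star schema: two dimension tables and one fact table that references them. All table and column names are hypothetical and are not drawn from the AdventureWorks samples.

-- Minimal star schema sketch: one fact table keyed to two dimensions.
-- All object names below are illustrative only.
CREATE TABLE dbo.DimDate (
    DateKey      INT      NOT NULL PRIMARY KEY,  -- e.g. 20140131
    FullDate     DATE     NOT NULL,
    CalendarYear SMALLINT NOT NULL
);

CREATE TABLE dbo.DimCustomer (
    CustomerKey          INT IDENTITY(1,1) NOT NULL PRIMARY KEY, -- surrogate key
    CustomerAlternateKey NVARCHAR(25)      NOT NULL,             -- source system (business) key
    CustomerName         NVARCHAR(100)     NOT NULL
);

CREATE TABLE dbo.FactSales (
    DateKey       INT   NOT NULL REFERENCES dbo.DimDate (DateKey),
    CustomerKey   INT   NOT NULL REFERENCES dbo.DimCustomer (CustomerKey),
    SalesAmount   MONEY NOT NULL,
    OrderQuantity INT   NOT NULL
);

In a snowflake schema the dimension tables themselves would be further normalized into related tables.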

Module 02 – ETL – 

ETL, or extract, transform, and load, is the process of taking data from its source, transforming the data, and loading it into the destination(s). The “E”, or extract, is very straightforward, and the “T”, or transformation, applies the necessary transformations to make the data:

1. Accurate
2. Consistent
3. Relevant
and the “L”, or load, is the final step of inserting, updating, or deleting records in the destination. Although the workshop focuses on using SSIS for ETL solutions, this session discusses the ETL process as well as methods native to SQL Server that can be used to implement your ETL solution.

This session focuses on the different ways to implement an ETL solution using SQL Server and discusses considerations for improving the load performance of a data warehouse (a brief T-SQL sketch follows the list below). Upon completing this session attendees will be familiar with:

1. T-SQL
2. MERGE
3. Bulk copy program (bcp)
4. BULK INSERT
5. OPENROWSET(BULK)
6. Load considerations

a. Foreign keys
b. Primary keys
c. Unique constraints
d. Indexes
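
As a brief sketch of the native load options listed above, the statements below stage a flat file with BULK INSERT and then read the same file in place with OPENROWSET(BULK). The file path, format file, and staging table are hypothetical.

-- Stage a flat file natively in T-SQL; paths and object names are illustrative only.
BULK INSERT dbo.StageCustomer
FROM 'C:\Loads\Customer.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2, TABLOCK);

-- The same file can also be queried in place when a format file is supplied.
SELECT src.*
FROM OPENROWSET(BULK 'C:\Loads\Customer.csv',
                FORMATFILE = 'C:\Loads\Customer.fmt') AS src;

Disabling nonclustered indexes and constraints on the staging table before the load, and rebuilding them afterward, is one of the load considerations covered in this session.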

Module 03 – Introduction to SSIS –  

SSIS, SQL Server Integration Services, was first introduced in SQL Server 2005 as the replacement for DTS. SSIS is Microsoft’s primary ETL tool; it is built on top of .NET and as such makes the full CLR available for creating robust ETL solutions.

This session provides a general overview of SSIS and the different tools available to create, deploy, and manage your SSIS projects. Upon completion of this session attendees will be familiar with:

1. Import/Export Wizard
2. dtexec
3. dtexecui
4. SQL Server Management Studio, SSMS
5. SQL Server Data Tools, SSDT

a. Configuring and working with SSDT

Module 04 – Introduction to Control Flow –  

Control flow is the foundation of an SSIS solution. Control flow tasks encapsulate the units of work in a package, and containers provide the scope in which those tasks execute. It is critical to have a solid understanding of control flow in SSIS, including task properties, precedence constraints, and expressions.

This session focuses on presenting control flow tasks and outlining:

1. Control flow categories and their associated tasks

a. Data Flow Tasks
b. Database Tasks
c. File and Internet Tasks
d. Process Execution Tasks
e. WMI Tasks
f. Custom Tasks
g. Database Transfer Tasks
h. Analysis Services Tasks
i. SQL Server Maintenance Tasks
j. Containers

2. Grouping and annotation
3. Precedence constraints
4. Variables
5. Parameters
6. Expressions

Module 05 – Introduction to Connection Managers –  

Because SSIS is meant to extract data from a source and load it into a destination, providing connections to both source and destination is a must. Connection managers maintain connection information for a variety of connection types and can be scoped to either the package or the project; project connection managers are only available in SSIS 2012 and 2014.

Upon completion of this session attendees will be familiar with:

1. Connection managers
2. Connection types
3. Package connection managers
4. Project connection managers

a. Converting a package connection manager to a project connection manager

5. Connection manager properties
6. Parameter mapping

Module 06 – Introduction to Data Flow –  

Considering that the primary purpose of an ETL solution is to move data from a source, transform the data, and load it into the destination(s), the data flow task serves as the workhorse in SSIS. The data flow task provides the means to encapsulate ETL within a single container.

Within a data flow task you have the ability to define one or many:

1. Data sources
2. Data transformations
3. Data destinations

This session focuses on outlining the data flow task and its place within an SSIS package as well as explaining how the data flow is truly a control flow task and what makes it different from other tasks. Upon completion of this session attendees will be familiar with:

1. Data flow task
2. Data flow sources
3. Data flow transformations

a. Row Transformations
b. Row Set Transformations
c. BI Transformations
d. Split and Join Transformations
e. Auditing Transformations
f. Custom Transformation

4. Data flow “sizing”

Module 07 – Advanced Control Flow –  

With a solid understanding of control flow it is time to explore more advanced concepts. This session presents ways to leverage control flow tasks to accomplish more complex operations and examines the more intricate control flow properties.

This session outlines methods for completing more complex operations, such as the following (a short parameterized-query sketch follows this list):

1. Replacing dynamic SQL using SSIS
2. Passing input parameters to an Execute SQL Task
3. Using output parameters with an Execute SQL Task
4. Capturing an entire result set in an SSIS variable
5. Profiling data from an ADO source
6. Implementing transactions in SSIS
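
To show what parameter passing looks like from the T-SQL side, the statement below is the kind of query an Execute SQL Task might run against an OLE DB connection; each ? marker is mapped to an SSIS variable by ordinal position on the task’s Parameter Mapping page. The variable names are hypothetical, and the table is from AdventureWorks2012.

-- Sketch of an Execute SQL Task statement using OLE DB parameter markers.
-- The two ? placeholders map to SSIS variables such as User::StartDate and
-- User::EndDate (hypothetical names) on the Parameter Mapping page.
SELECT COUNT(*) AS OrderCount
FROM Sales.SalesOrderHeader
WHERE OrderDate >= ? AND OrderDate < ?;

Setting the task’s ResultSet property to Single row captures OrderCount into an SSIS variable, while Full result set stores the entire result set in an Object-typed variable.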

Module 08 – Advanced Data Flow –  

Quite often an ETL solution will require iterating through a set of

1. Files
2. Tables
3. Connections

to import into a destination, or it may need to create a staging table named after the imported source. Also, once a “full load” has been completed, most data warehouses require incremental loads: inserting rows that are new since the last load, updating records that have changed, and performing a soft or hard delete of records that have been removed from the source. Common and more complex requirements also include identifying and merging or removing duplicates as well as logging extended data flow operations.

This session focuses on the more complex data flow tasks that require more configuration than is available out of the box (a brief incremental-load sketch follows the list below). Upon completion of this session attendees will be familiar with:

1. Configuring dynamic connections

a. Source and destination

2. Configuring dynamic logging
3. Using the fuzzy grouping data flow transformation
4. Lookup transformation
5. Incremental data loads
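
As a minimal sketch of an incremental load, assuming a staging table that already holds the latest extract, a single MERGE statement can insert new rows, update changed rows, and soft-delete rows that no longer exist in the source. All object names are hypothetical.

-- Incremental load sketch using MERGE; object names are illustrative only.
MERGE dbo.DimProduct AS tgt
USING dbo.StageProduct AS src
    ON tgt.ProductAlternateKey = src.ProductAlternateKey
WHEN MATCHED AND (tgt.ProductName <> src.ProductName
               OR tgt.ListPrice   <> src.ListPrice) THEN
    UPDATE SET tgt.ProductName = src.ProductName,
               tgt.ListPrice   = src.ListPrice,
               tgt.IsDeleted   = 0
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ProductAlternateKey, ProductName, ListPrice, IsDeleted)
    VALUES (src.ProductAlternateKey, src.ProductName, src.ListPrice, 0)
WHEN NOT MATCHED BY SOURCE THEN
    UPDATE SET tgt.IsDeleted = 1;  -- soft delete: flag rows removed from the source

A similar pattern can be built in the data flow with a Lookup transformation that splits incoming rows into insert and update paths, which ties together the Lookup and incremental load topics listed above.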

Module 09 – Script Tasks and Components –  

One of the primary reasons that Microsoft replaced DTS, Data Transformation Services, with SSIS is that SSIS integrates with .NET and uses the CLR rather than COM, COM+, and DCOM. That CLR integration is exposed through script tasks and script components, which provide the ability to write managed code, C# or Visual Basic .NET, within the control flow as well as the data flow. Script tasks are control flow tasks, while script components live in the data flow and can act as data sources, transformations, or destinations.

This session focuses on using script tasks and components to extend SSIS capabilities. Upon completion of this session attendees will be familiar with:

1. Using a script task
2. Working with variables using a script task
3. Script task properties
4. Using a script component
5. Using a script component as a transformation
6. Custom components

Module 10 – Troubleshooting and Debugging –  

During development, deployment, and production, SSIS provides a number of ways to troubleshoot and debug package and project execution. It is critical that developers are familiar with the means to troubleshoot, log, and debug packages to ensure that packages and projects perform as expected. SSIS fully supports debugging in both the control flow and the data flow, halting execution and providing the ability to view system and user variables and their values as well as data flow buffers. There are several methods to log and troubleshoot deployed packages and projects, and the available methods vary based on the deployment method: the package or the project deployment model.

This session focuses on implementing breakpoints and data grid viewers as well as logging and error handling (a short catalog logging query follows the list below). Upon completion of this session attendees will be familiar with:

1. Breakpoints and debugging in SSIS
2. Data grid viewer in the data flow
3. Configuring logging and log providers
4. Error handling

a. Control flow
b. Data flow
c. Script tasks/components

5. Precedence constraints
6. Custom logging
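
Under the project deployment model, the built-in logging can be queried directly from the SSISDB catalog views. The query below is a small sketch that returns error messages for recent executions; catalog.executions and catalog.event_messages are built-in catalog views.

-- Sketch: review built-in logging from the SSIS catalog (project deployment model).
SELECT e.execution_id,
       e.package_name,
       e.status,
       m.message_time,
       m.message
FROM SSISDB.catalog.executions AS e
JOIN SSISDB.catalog.event_messages AS m
    ON m.operation_id = e.execution_id
WHERE m.event_name = 'OnError'
ORDER BY m.message_time DESC;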

Module 11 – Data Quality Services and Master Data Services –  

Data Quality Services, DQS, was first introduced in SQL Server 2012 and is a knowledge-driven data quality product. DQS enables you to build a knowledge base and use it to perform a variety of critical data quality tasks, including correction, enrichment, standardization, and de-duplication of your data. Master Data Services, MDS, was first introduced in SQL Server 2008 R2 and provides a solution for master data management. Master data management (MDM) describes the efforts made by an organization to discover and define non-transactional lists of data, with the goal of compiling maintainable master lists.

This session focuses on using both DQS and MDS as a means to cleanse data, remove duplicates, and compile and maintain master lists. Upon completion of this session attendees will be familiar with:

1. DQS objects
2. Knowledge base
3. DQS projects

a. Cleansing
b. Matching

4. DQS task and transformation in SSIS
5. MDS components
6. MDS Model
7. MDS entities and attributes
8. MDS Excel add-in

Module 12 – Deployment Models –  

Since its introduction SSIS has supported the package deployment model, which deploys packages to:

1. SQL Server (the msdb database)
2. File system
3. SSIS package store

SSIS 2012 introduced a new deployment model, the project deployment model, which not only changes where an SSIS solution is stored but also adds functionality such as the following (a short execution sketch follows this list):

1. Package parameters
2. The SSIS Catalog
3. Environments
4. The ability to execute packages using a stored procedure
5. Inherent logging
6. SSRS reports included with the SSIS catalog
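
As a brief sketch of executing a deployed package with the catalog stored procedures, the script below creates an execution, sets its logging level, and starts it. The folder, project, and package names are hypothetical.

-- Sketch: execute a package deployed to the SSIS catalog.
-- Folder, project, and package names are illustrative only.
DECLARE @execution_id BIGINT;

EXEC SSISDB.catalog.create_execution
     @folder_name     = N'ETL',
     @project_name    = N'DataWarehouseLoad',
     @package_name    = N'LoadDimCustomer.dtsx',
     @use32bitruntime = 0,
     @execution_id    = @execution_id OUTPUT;

-- Set the logging level for this execution (1 = Basic).
EXEC SSISDB.catalog.set_execution_parameter_value
     @execution_id,
     @object_type     = 50,
     @parameter_name  = N'LOGGING_LEVEL',
     @parameter_value = 1;

EXEC SSISDB.catalog.start_execution @execution_id;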

This session will outline the differences between the two deployment models and provide detail as to which method should be used in specific circumstances. Upon completion of this session attendees will be familiar with:

1. Package deployment model
2. Project deployment model
3. Executing packages
4. Project parameters
5. Package configurations
6. Environments

 

[tabby title=”Speaker”]
David Dye’s primary focus and expertise is with SQL Server, Reporting Services, Integration Services, and Analysis Services, and he was recognized for his work by SQL Server Magazine as “Innovator of the Year” runner-up in 2007. David is an MCITP for SQL Server 2005 and 2008 in both database administration and business intelligence and is a Microsoft Certified Trainer. He regularly posts on the MSDN SQL Server forums, where he also serves as a moderator, and is a contributor at SQLCLR.net. In addition to his knowledge of SQL Server, David works as a software developer using VB.NET and C# and has worked extensively in SharePoint development.

 

[tabbyending]