SSIS – Stopping an Error from Propagating to its Parent (Enabling an SSIS Package to continue when an error happens)

Due to our environment and the packages we create, we have to enable a package to continue even if part of it fails for some reason (for example, it cannot connect to a server) so that the rest of the data will still be captured.

In the past we configured the MaximumErrorCount to be 100, but this is not ideal, because if a critical component fails, the package would still complete with success due to that setting.

 

NOTE: Everything below is configured in SQL Server Data Tools for SQL Server 2012.

 

1.       In our example below, we are going to configure our Data Flow Task called: Extract Data From Live System for Purchases

a.        We are going to configure the above data flow task, so that if an error occurs it will continue with the next system.

2.       The first thing that you need to do is change the Variables window options so that you can see the system variables, by doing the following:

a.        Open your Variables window, either by going to an existing window or by clicking in a blank area of your Control Flow, clicking on SSIS, and then clicking on Variables

                                                               i.      clip_image002[4]

b.       Now in your Variables window you will see a button which says Grid Options:

                                                               i.      clip_image004[4]

                                                              ii.      Click on the Grid Options

                                                            iii.      Now in the Grid Options Window under Filter, click on the Show System variables check box as shown below

1.       clip_image006[4] 

2.       Then click Ok.

                                                            iv.      Now you will see all the System Variables as well as any other Variables you have already configured.

3.       Next click on the Event Handlers Tab within your Package.

a.        Where it says Executable, click on the drop down and select the executable that you want to allow to continue if it fails.

                                                               i.      In our example the executable was called:

1.       Extract Data From Live System for Purchases

b.       This is what you will see once you have clicked on your Executable.

c.        clip_image008[4]

d.       Ensure that the Event Handler is set to OnError

e.       As shown above, where it says: Click here to create an ‘OnError’ event handler for executable ‘Extract Data From Live System for Purchases’

                                                               i.      Click on it, so that once completed the above text will disappear and the Event Handler screen will be blank.

4.       Now click on Variables Window, and scroll down until you see a System Variable called Propagate:

a.        clip_image010[4]

b.       As you can see, by default for the OnError event handler it is set to True, which means that if there is an error it will propagate up to the parent and fail the package.

c.        Click on the drop down next to True and change this to False.

d.       NOTE: Now your package will not propagate the error for your executable

5.       Now save your package and then run your package.

6.       You can see from the image below, how the package continued after an Error

a.        clip_image012[4]

 

SSIS – Fastest data flow task item for transferring data over the network

My goal was to find out which combination of data flow Source and Destinations would enable me to transfer data over a network as quickly and as efficiently as possible.

·         Below are the outcomes of the tests.

·         Finally, based on the fastest data flow items, I detail how to configure the data flow source and destination in order to get the maximum throughput.

Below are the tests that I completed, each with the configuration that was used for that test.

Just some details on how the test was performed:

·         It was performed between two servers which were on a 10 Gbit connection

·         At the time there was nothing else running on the servers; they were both idle.

·         Both the Source and Destination tables had compression enabled on the physical tables.
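The post does not say which type of compression was used on the tables; purely as a reference point, enabling page compression on a table looks like the snippet below (dbo.tb_TableName is just the placeholder table name used later in this post):

-- Rebuild the table with page compression so that less data has to be
-- read from disk and pushed across the network.
ALTER TABLE dbo.tb_TableName
REBUILD WITH (DATA_COMPRESSION = PAGE);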

 

Test 1: ADO NET Source to ADO NET Destination

·         Source settings: Packet Size on the ADO Connection Manager set to 32768

·         Destination settings: Packet Size on the ADO Connection Manager set to 32768; Batch Size: 0

·         Results: 10 million rows in 169.828 seconds (58,883.11 rows/sec); network speed during transfer: 14 MB/sec

Test 2: ADO NET Source to ADO NET Destination

·         Source settings: Packet Size on the ADO Connection Manager set to 32768; T-SQL query hint OPTION (FAST 10000)

·         Destination settings: Packet Size on the ADO Connection Manager set to 32768; Batch Size: 10,000

·         Results: 10 million rows in 158.75 seconds (62,992.13 rows/sec); network speed during transfer: 14 MB/sec

Test 3: OLE DB Source to OLE DB Destination

·         Source settings: OLE DB Source Connection Manager using SQL Server Native Client 11.0; T-SQL query hint OPTION (FAST 10000)

·         Destination settings: OLE DB Destination Connection Manager using SQL Server Native Client 11.0; Maximum insert commit size set to 0

·         Results: 10 million rows in 21.046 seconds (475,149.67 rows/sec); network speed during transfer: 100 MB/sec

Test 4: OLE DB Source to OLE DB Destination

·         Source settings: OLE DB Source Connection Manager using SQL Server Native Client 11.0; Packet Size set to 32767; T-SQL query hint OPTION (FAST 10000)

·         Destination settings: OLE DB Destination Connection Manager using SQL Server Native Client 11.0; Packet Size set to 32767; Maximum insert commit size set to 0

·         Results: 10 million rows in 20.578 seconds (485,955.88 rows/sec); network speed during transfer: 110-115 MB/sec

Test 5: OLE DB Source to OLE DB Destination

·         Source settings: OLE DB Source Connection Manager using the Microsoft OLE DB Provider for SQL Server; Packet Size set to 32767; T-SQL query hint OPTION (FAST 10000)

·         Destination settings: OLE DB Destination Connection Manager using the Microsoft OLE DB Provider for SQL Server; Packet Size set to 32767; Maximum insert commit size set to 0

·         Results: 10 million rows in 20.031 seconds (499,226.20 rows/sec); network speed during transfer: 120 MB/sec

 

Observations from the above tests

·         The difference between the ADO NET source and ADO NET Destinations when compared to the OLE DB Source and OLE DB Destinations is significant.

o    It was just over 8 x faster.

·         It was also interesting to note that the network throughput went from roughly 14 MB/sec to 120 MB/sec (about 8.5 times higher) when changing from ADO NET to OLE DB.

·         So when moving large amounts of data, this would mean that data transfers can go from hours to minutes.

·         Below is a chart to show how significant the difference is:

·         clip_image002[7]

 

Configuration of OLE DB Data Flow Source and Destinations for maximum throughput over a network.

Below are the configuration settings for the final test (Test 5) above.

 

1.       The first part is to configure your OLE DB Source Connection.

a.        Create your new OLE DB Source Connection, and then once it is open ensure that the following is selected:

                                                               i.      Where it says provider, from the drop down select:

1.       clip_image004[6]

b.       Then put in your Server Name and Database so that it will be shown as below:

                                                               i.      clip_image006[7]

c.        Next click on All in the left hand column.

d.       Scroll down near the bottom where it says Packet Size

                                                               i.      Change the value from 4096 to:

1.       32767

2.       NOTE: It must be exactly this number; if you set it any higher the connection WILL fail. (See the query after this step to confirm the packet size actually being used.)

                                                              ii.      This is what it looks like below:

1.       clip_image008[6]
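If you want to confirm that the larger packet size is actually being used at run time, one quick check (a sketch, not part of the original walkthrough) is to query the connection DMVs on the source or destination server while the package is running:

-- Shows the packet size negotiated for each current connection.
-- While the SSIS package is executing, its connections should report
-- a net_packet_size close to the 32767 configured on the connection manager.
SELECT c.session_id,
       s.program_name,
       c.net_packet_size
FROM sys.dm_exec_connections AS c
INNER JOIN sys.dm_exec_sessions AS s
        ON s.session_id = c.session_id
WHERE s.is_user_process = 1;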

2.       Next, for the OLE DB Destination connection you follow all the steps in Step 1, just pointing at your destination server.

3.       Now what sometimes makes the source query faster is using the TSQL Hint: Option (Fast 10000)  

a.        NOTE: This sometimes makes the query faster, and other times it can slow the query down, so test it first. (FAST 10000 tells the optimizer to optimize for returning the first 10,000 rows quickly, which gets rows flowing into the data flow sooner.)

b.       EG:

SELECT RowID, DateAdded
FROM dbo.tb_TableName WITH (NOLOCK)
OPTION (FAST 10000)

4.       The final part is to configure the OLE DB Destination in the data flow task.

a.        Double click or right click on the OLE DB Destination and go into the Properties.

b.       Then ensure that you have configured with the following as shown below:

                                                               i.      clip_image010[11]

c.        NOTE that the Maximum insert commit size is set to zero.

                                                               i.      You must ensure that you have enough memory, because the entire transaction is only committed once.

                                                              ii.      If you do NOT have enough memory, when SSIS tries to commit the transaction, it will then fail.

                                                            iii.      If this happens to you, then configure the Maximum insert commit size, to accommodate your memory allocation.

 

Update – Potential issue with the Maximum Insert Commit size (25 July 2013)

1.       I recently was doing a large insert of data that was roughly 640 million rows, using the above method to get the data down as quickly as possible.

2.       I ran into an issue when it then had to complete the commit, which led me to the findings listed below.

3.       When using a Maximum insert commit size of zero and there is not enough memory on your server, the data spills to TempDB, so you are not really getting the benefit of having your data stored in memory.

4.       Because the data then has to be transferred from TempDB into your working database, rather than coming straight from the memory buffers, the performance was once again really poor.

5.       On top of this, every insert into the working database was now being logged. Because every row was being inserted and logged, my log file drive ran out of space when the log hit 340 GB.

6.       Once it failed, the entire transaction was rolled back.

Lessons Learnt

·         Only use a Maximum insert commit size of zero when you are inserting no more than roughly 10 million rows at a time.

·         If it spills over to the Temp DB the performance is really slow.

·         If it is a large data set that you are moving, rather change the Maximum insert commit size to a value which gives the best performance for the rate at which you can insert the rows (the queries after this list can help you monitor TempDB and log usage while you test).
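To see whether a given Maximum insert commit size is spilling to TempDB or filling up the transaction log, a couple of simple checks (again a sketch, run on the destination server while the load is in progress) are:

-- Percentage of each transaction log currently in use.
DBCC SQLPERF (LOGSPACE);

-- Space currently reserved in TempDB by user and internal objects, in MB.
SELECT SUM(user_object_reserved_page_count) * 8 / 1024 AS user_objects_mb,
       SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_objects_mb
FROM tempdb.sys.dm_db_file_space_usage;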

SSIS – Automating SQL Server Backups, copying files and checking that they were copied

I had an issue where I lost all my backups. So what I have now created is an automated SQL Server backup process, which backs up all the databases, verifies the backups, copies them to the required location, and finally emails the list of copied backup files to ensure that they are all there.

               

Backing up the databases

1.       Below is the script which is created on the master Database.

2.       This backs up all the databases except TempDB to the desired location.

3.       NOTE: WE ARE ALWAYS USING THE SAME FILENAMES AND OVERWRITING THE CURRENT BAK FILE, BECAUSE WE ONLY NEED THE LATEST COPY.

a.       THIS ALSO SAVES SPACE FOR THE BACKUP FILES.

4.       Here is the script to run to create it, with some notes afterwards

USE [master]
GO

/****** Object:  StoredProcedure [dbo].[prc_BackupDatabases]    Script Date: 2013-06-06 01:58:41 PM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

-- =============================================
-- Author:      Gilbertq
-- Create date: 07 June 2013
-- Description: Backs up every database except TempDB,
--              overwriting the previous BAK file (INIT).
-- =============================================
CREATE PROCEDURE [dbo].[prc_BackupDatabases]
AS
BEGIN

    SET NOCOUNT ON;

    DECLARE UserDatabases_CTE_Cursor CURSOR
    FOR
    -- Selecting the database names (dbid 2 is TempDB).
    SELECT name AS DatabaseName
    FROM sys.sysdatabases
    WHERE [dbid] <> 2

    OPEN UserDatabases_CTE_Cursor

    DECLARE @dbName varchar(100);
    DECLARE @backupPath varchar(100);
    DECLARE @backupQuery varchar(500);

    -- Make sure that the below path exists.
    SET @backupPath = 'C:\SQLBackups\'

    FETCH NEXT FROM UserDatabases_CTE_Cursor INTO @dbName
    WHILE (@@FETCH_STATUS <> -1)
    BEGIN
        -- Build the backup statement.
        SET @backupQuery = 'BACKUP DATABASE ' + @dbName + ' TO DISK = ''' + @backupPath + 'SERVERNAME-' + @dbName + '.bak'' WITH COMPRESSION, INIT'

        -- Print the statement so it shows in the package output.
        PRINT @backupQuery

        -- Execute the backup.
        EXEC (@backupQuery)

        -- Get the next database.
        FETCH NEXT FROM UserDatabases_CTE_Cursor INTO @dbName
    END

    CLOSE UserDatabases_CTE_Cursor
    DEALLOCATE UserDatabases_CTE_Cursor

    SET NOCOUNT OFF;
END

GO

a.        NOTE: You will have to modify the following in the above script for it to run on your environment:

                                                               i.      @backupPath

1.       Change this to where you want to back up your SQL BAK Files.

                                                              ii.      Within @backupQuery we have also included the SERVERNAME in the backup file name

1.       This was because we were backing up multiple BAK files from multiple servers.

b.       This is what it looks like in SSIS

c.        clip_image002[4]
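In the package this step is presumably just an Execute SQL Task that calls the procedure; with a connection manager pointing at the master database of the server being backed up, the SQLStatement would be along the lines of:

EXEC [dbo].[prc_BackupDatabases];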

 

Verify the SQL Server Backups

1.       The next step is to then verify that the backups are consistent and can be restored.

2.       Again we created this script on the Master database

3.       Here is the create script:

USE [master]
GO

/****** Object:  StoredProcedure [dbo].[prc_VerifyDatabases]    Script Date: 2013-06-06 02:09:05 PM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

-- =============================================
-- Author:      Gilbertq
-- Create date: 07 Jun 2013
-- Description: Runs RESTORE VERIFYONLY against each backup
--              file created by prc_BackupDatabases.
-- =============================================
CREATE PROCEDURE [dbo].[prc_VerifyDatabases]
AS
BEGIN

    SET NOCOUNT ON;

    DECLARE UserDatabases_CTE_Cursor CURSOR
    FOR
    -- Selecting the database names (dbid 2 is TempDB).
    SELECT name AS DatabaseName
    FROM sys.sysdatabases
    WHERE [dbid] <> 2

    OPEN UserDatabases_CTE_Cursor

    DECLARE @dbName varchar(100);
    DECLARE @backupPath varchar(100);
    DECLARE @backupQuery varchar(500);

    -- Make sure that the below path exists.
    SET @backupPath = 'C:\SQLBackups\'

    FETCH NEXT FROM UserDatabases_CTE_Cursor INTO @dbName
    WHILE (@@FETCH_STATUS <> -1)
    BEGIN
        -- Build the verify statement.
        SET @backupQuery = 'RESTORE VERIFYONLY FROM DISK = ''' + @backupPath + 'SERVERNAME-' + @dbName + '.bak'''

        -- Print the statement so it shows in the package output.
        PRINT @backupQuery

        -- Execute the verification.
        EXEC (@backupQuery)

        -- Get the next database.
        FETCH NEXT FROM UserDatabases_CTE_Cursor INTO @dbName
    END

    CLOSE UserDatabases_CTE_Cursor
    DEALLOCATE UserDatabases_CTE_Cursor

    SET NOCOUNT OFF;
END

GO

a.        NOTE: You will have to modify the following in the above script for it to run on your environment:

                                                               i.      @backupPath

1.       Change this to where you backed up your SQL BAK Files.

                                                              ii.      Within @backupQuery we have also included the SERVERNAME in the backup file name

1.       This was because we were backing up multiple BAK files from multiple servers.

b.       This is what it looked like in SSIS

c.image
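As with the backup step, the verification step would presumably just call the procedure from an Execute SQL Task against master:

EXEC [dbo].[prc_VerifyDatabases];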

 

Copying files to backup location on the network

1.       Here we are then copying our backup files to a network location where they are then backed up.

2.       It is a simple File System Task, where we are copying all the files from one folder to another folder.

a.        The only thing to NOTE is that we are overwriting the files when we copy them over.

3.       This is what it looks like; it is simple to set up and create.

image

 

 

Getting the file list and emailing it to the required users

1.       The final part is where I want to verify that the backups were backed up and copied over to the network location.

2.       The first part is using a PowerShell script to get the file listing into a CSV file.

a.        NOTE: I created a mapped drive to our backup location in order to make it easier in the script.

                                                               i.      And also to ensure that there are no authentication issues.

3.       All the files were saved in a Folder called PowerShell

4.       This is the PowerShell script that was used to get the file listing and exported to CSV

Get-ChildItem r:\*.*  -include SERVERNAME*.* | select name,length,LastWriteTime  | Export-Csv C:\SQLBackups\Powershell\SERVERNAME-BackupFileList.csv

a.        This was saved with the following filename:

                                                               i.      Filelisting_SERVERNAME.ps1

5.       Next this is how we configured it to run the PowerShell script in SSIS

a.        Drag in an Execute Process Task and configure it with the following:

                                                               i.      We gave it the following name:

1.       PowerShell to get Directory Listing for SERVERNAME Files

                                                              ii.      Then click on the Process on the left hand side.

                                                            iii.      This is where you actually configure how you will run the Execute Process

1.       Where it says Executable you have to put in the location for your executable.

2.       In our example this is the location for PowerShell

C:\Windows\System32\WindowsPowerShell\v1.0\PowerShell.exe

3.       Next is the Arguments that you want to pass with your executable

4.       In our example we put in the PowerShell script to call:

-ExecutionPolicy ByPass -command ". 'M:\SQLBackups\Powershell\Filelisting_SERVERNAME.ps1'"

b.       You can leave all the other defaults so that it looks like the following below:

image

6.       The next step is to import the data from the CSV file into the table called:

Staging.tb_BackupFileListing

a.        NOTE: In this table we store when the file was modified as well as when data was imported into the table.

7.       We then import this data into the fact table, so that the history is stored, into the table called:

a.       TF_BackupFileListing_Converted
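As a rough illustration of that load (the column names below are assumptions for illustration only; just the two table names come from the package), the staging-to-fact step amounts to something like:

-- Hypothetical columns: Name, Length and LastWriteTime come from the CSV,
-- and DateImported is stamped when the staging table is loaded.
INSERT INTO dbo.TF_BackupFileListing_Converted
       (BackupFileName, FileSizeBytes, FileLastModified, DateImported)
SELECT s.Name,
       s.Length,
       s.LastWriteTime,
       s.DateImported
FROM Staging.tb_BackupFileListing AS s;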

8.       We then export just the current backup file listing data into an Excel spreadsheet, which is saved locally in the PowerShell folder.

9.       The final step is where we attach the exported spreadsheet and email it to the required people.

image

10.    This is what this package looks like:

a.image