This is continued live blogging from the Donald Farmer seminar from the SQL & BI convention in Tel Aviv of Data Integration in Large Organizations.
SSIS 2008 improvements over SSIS 2005
C# scripting can reference all .net assemblies
VSA replaced by VSTA
Pipeline Scale Up
Tuning SSIS through looking at the buffers on the Data Flow.
In SSIS 2005 you would add buffers and break up flows
Tools > Options > BI Designers > IS Designers:
Control Flow Data Connect
Data Flow Data Connect
Check them both to connect the components automatically
Use Row Sampling component to do performance testing.
One buffer structure goes all through the process.
We load data in the beginning but we don’t need it to move in memory. The problem is when one buffers slows down the process, it slows down for all the buffers before him as well.
In 2008 we have benchmarks for each buffer and if needed, create buffers artificially (rather than manually as we had to in 2005).
Asynchronous component – the input and output are different. We don’t know what have in the middle.
Sort – asynchronous, creates a new buffer as you can exclude a column.
The other components don’t need to be held up when an asynchronous component doesn’t run quickly, as they don’t depend on it.
Advanced Editor for the component: SynchronousInputID exists for synchronous components (and not for asynchronous).
ADO .net support – for improved source UI and added destination adapter (enabling you to use SSIS to load data into providers other than SQL Server).
The performance is still 30% slower because of the use of ADO .net provider, but it enables greater flexibility.
Use OLEDB providers as is, preferably don’t use .Net OLEDB providers through ADO .net providers unless you have to.
When you connect using the ADO .net, you will only be getting a list of tables. If the data isn’t in shape of tables, you won’t see anything in the dropdown list.
Use ADO .net if you have to use ODBC, or if the component you’re using requires it, or if you need it for your Script component.
Your default should still be OLEDB.
Import \ Export Wizard is much smarter:
ADO .net support for source and destination
Data types conversion – does only accurate conversions
Starting Import \ Export from the program menu enables straight execution. Starting it in SSIS Package requires execution to be done manually.
Persistent cache usage reducing DB and memory usage.
Cache in Lookup enabling more control over sharing and lifetime of reference data.
Cache potential cons – requiring more disk storage and data in it available only for SSIS.
New to SQL Server 2008 that’s useful for SSIS 2008:
SQL Merge Syntax
Change Data Capture CDC (works only against SQL Server or through partner product against other sources)
In case SSIS crashes, SuperDump collects all the data needed to analyze why it crashed. Enables quicker time to analyze and treat the problem.
Data Profiling which enables you to react to the data quality. Enables analysis of the amount of nulls in a column, the uniqueness of a field etc.
Community Samples on Codeplex – code for you to use freely (SharePoint List adapter for instance).
More connectivity out of the box in SSIS 2008. The Attunity connector which is inside SSIS 2008 moves data into Oracle quicker than with the connector Oracle gives you.