Loading 1TB in 10 Minutes in SSIS – SQL Server Scalability on high end hardware
This is a live post from the session of Henk van der Valk at Tech Ed Israel 2010 .
This session has great tips and tricks for you in SSIS and is in English (just in case you're reading this blog and don't understand Hebrew :))
Checking for problems when test loading data:
Select * from sys.dm_os_wait_stats
Number 1 wait: pageiolatch_up
Solution: add more spindles
So - just add more hardware?
Number 2 pagelatch_up
Solution: add more database files
Configure 1 to 4 files per filegroup to get 200+ KB writes IOs.
Use Soft Numa - assign a specific port for it
Use money data type instead of integer
Use Fast Parse
New to SQL Server 2008 R2:
Support for 64+ cores
Enterprise class SSD (Solid State Disks)
Unisys SQL Server SSD Solution codename SQL PowerRack
Bulk Insert file in SSIS - how can you speed it up?
Use a conditional split + modulo to double the speed (increasing the throughput)
Reading from a table as fast as possible
Read the data from 3 sources (all are the same table) using the union all to connect the data inside a new table. Use maxdop 1 for each OLEDB source.
Table partitioning with hashing speeds up reading from both Disks and Memory. Tested for spindles. In SSD we get even better results.
Page lookup \ sec and IO Read bytes \ sec to see how many pages are being read and whether they are from disk or from memory. Perhaps your problem isn't in SQL Server, but rather in your connectivity.
Backup with compression in SSD - for better performance, backup to multiple files rather than just one. Same for restore - restore from multiple file and limit it to Maxdop 1. Do not oversize the log file.
Activity monitor - check for response time in SQL Server 2008. In SQL Server 2005, look at sys.dm_io_pending_io_requests and sys.dm_io_virtual_file_stats.
For more information, check out Mr. Van Der Valk's blog (link above) and the paper on Data Loading Performance Guide.