I have a large (~50 GB, ~300 million rows) tab-separated file that I want to import into a SQL Server table with columns: char(10), varchar(512), nvarchar(512), nvarchar(512).
It takes about a day to bulk import it using T-SQL, SSIS, or the C# SqlBulkCopy class.
Is there any faster way to load this data?
Or is there some condition slowing it down that I could remove or change?
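For reference, the destination table is roughly the following shape (the table and column names below are placeholders; only the column types are the real ones):

    CREATE TABLE dbo.ImportTarget   -- placeholder name
    (
        Col1 char(10),
        Col2 varchar(512),
        Col3 nvarchar(512),
        Col4 nvarchar(512)
    );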
If you are inserting into an existing table, drop all indexes prior to the import and re-create them after the import.
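For example (index and table names here are hypothetical), instead of dropping and re-creating you can disable nonclustered indexes and rebuild them afterwards, which keeps the index definitions in place:

    -- Disable a nonclustered index before the load (names are hypothetical)
    ALTER INDEX IX_ImportTarget_Col1 ON dbo.ImportTarget DISABLE;

    -- ... run the bulk load here ...

    -- Rebuild it once the load has finished
    ALTER INDEX IX_ImportTarget_Col1 ON dbo.ImportTarget REBUILD;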
If you are using SSIS, you can tweak the batch size (Rows per batch) and the maximum insert commit size on the destination.
Verify there is adequate memory on the server for such a large data load.
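As a quick sanity check (this query assumes SQL Server 2008 or later and VIEW SERVER STATE permission):

    -- How much physical memory the server has and how much is still available
    SELECT total_physical_memory_kb / 1024     AS total_physical_mb,
           available_physical_memory_kb / 1024 AS available_physical_mb,
           system_memory_state_desc
    FROM sys.dm_os_sys_memory;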
Perform the loading operation on the local server (copy file locally, don't load over the network).
Configure your destination database and transaction log auto-growth settings to a reasonable value, such as a few hundred MB per growth increment (the default is often 1 MB growth for the primary data file, .mdf). Growth operations are slow and expensive, so you want to minimize them.
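Something along these lines (database name, logical file names, and sizes are placeholders; pre-sizing the files to roughly the expected final size avoids growth during the load entirely):

    -- Pre-size the data and log files and use a larger growth increment
    ALTER DATABASE ImportDb
        MODIFY FILE (NAME = ImportDb_Data, SIZE = 60GB, FILEGROWTH = 512MB);
    ALTER DATABASE ImportDb
        MODIFY FILE (NAME = ImportDb_Log,  SIZE = 10GB, FILEGROWTH = 512MB);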
Make sure your data and log files are on fast disks, preferably on separate LUNs. Ideally you want your log file on a mirrored LUN separate from your data file (you may need to talk to your storage admin or hosting provider for options).
I have just spent the last few weeks fighting to optimize a very large load myself. The fastest way I found was bulk loading with BCP, as opposed to SSIS or the T-SQL BULK INSERT statement, but there are things you can do to tune either approach.
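The same knobs (batch size, table lock) apply whether you use bcp or the BULK INSERT statement; as a rough T-SQL sketch (table name, file path, and batch size are placeholders for your environment):

    -- Minimal BULK INSERT sketch (table name and file path are placeholders)
    BULK INSERT dbo.ImportTarget
    FROM 'D:\staging\bigfile.tsv'
    WITH (
        FIELDTERMINATOR = '\t',    -- tab-separated input
        ROWTERMINATOR   = '\n',
        TABLOCK,                   -- bulk-update table lock; needed for minimal logging
        BATCHSIZE       = 100000   -- commit in chunks instead of one huge transaction
    );

With TABLOCK on a heap with no indexes and the database in SIMPLE or BULK_LOGGED recovery, the load can be minimally logged, which makes a big difference at this size.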