I have defined a DataTable by adding typed columns. I expect the DataTable to range between 1 million and 3 million rows.
I am using Microsoft's TextFieldParser (because it supports multiple fixed-width formats, well, kind of, via a Peek method) to populate the rows of the DataTable.
I would like some operation that copies rows from the DataTable into a mirrored SQL table.
If I populate the entire DataTable and then use a SqlDataAdapter and SqlCommandBuilder to update the SQL table as mentioned here, I run out of memory.
How can I accomplish this?
A DataTable is not recommended for millions of rows; instead, you can create a single SqlCommand for the INSERT, make all table fields parameters, and execute that command in a loop as you read the file.
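A minimal sketch of that loop, assuming a destination table `MyTable` with columns `(Id, Name)` and a `TextFieldParser` named `parser` (both placeholders for your actual schema and reader):

```csharp
// Reuse one parameterized command; only the parameter values change per row.
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
    "INSERT INTO MyTable (Id, Name) VALUES (@Id, @Name)", conn))
{
    cmd.Parameters.Add("@Id", SqlDbType.Int);
    cmd.Parameters.Add("@Name", SqlDbType.NVarChar, 100);
    conn.Open();

    while (!parser.EndOfData)              // parser is your TextFieldParser
    {
        string[] fields = parser.ReadFields();
        cmd.Parameters["@Id"].Value = int.Parse(fields[0]);
        cmd.Parameters["@Name"].Value = fields[1];
        cmd.ExecuteNonQuery();             // one row per call; no DataTable held in memory
    }
}
```

This keeps memory flat regardless of file size, at the cost of one round trip per row; wrapping batches in a transaction can speed it up considerably.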
But if you have already written a lot of code around the DataTable, you can treat it as a buffer and use the following workaround:
1) as you read data rows, keep track of how many have been read so far
2) as soon as you reach 10K rows, flush them to SQL Server by calling SqlDataAdapter.Update(dataTable) (the adapter, not the DataTable, performs the write)
(you may treat this threshold as a buffer_size and put it in configuration to avoid hardcoding)
3) then clear all data from the DataTable by calling DataTable.Clear()
4) continue reading data from the file, and the loop repeats
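The steps above can be sketched as follows, assuming `table` is your typed DataTable, `parser` is the TextFieldParser, and `adapter` is a SqlDataAdapter whose InsertCommand has been generated (e.g. by SqlCommandBuilder); all three names are placeholders:

```csharp
// Buffered approach: accumulate rows, flush every bufferSize rows, then Clear().
const int bufferSize = 10000;              // would normally come from configuration

while (!parser.EndOfData)
{
    string[] fields = parser.ReadFields();
    table.Rows.Add(fields);                // typed columns convert the string values

    if (table.Rows.Count >= bufferSize)
    {
        adapter.Update(table);             // push the buffered rows to SQL Server
        table.Clear();                     // release the memory before reading on
    }
}

if (table.Rows.Count > 0)
    adapter.Update(table);                 // flush the final partial buffer
```

The final flush outside the loop matters: the last batch will almost never land exactly on the buffer boundary.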
In addition, it is hard to make general suggestions without knowing the structure of your file and the purpose of the application; I have only answered the question of how to avoid "out of memory" errors when working with very large datasets. I would also recommend considering every option that avoids using .NET DataTables as intermediate storage, and instead processing the file directly with the means SQL Server provides, such as:
1) BULK INSERT - http://msdn.microsoft.com/en-us/library/ms188365.aspx
2) bcp utility - http://msdn.microsoft.com/en-us/library/ms162802.aspx
3) MS SQL Integration Services - http://msdn.microsoft.com/en-us/library/ms141026.aspx
Bulk insert operations using any of these commands and tools can be initiated from .NET.
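For example, the SqlBulkCopy class in ADO.NET streams rows to the server far faster than row-by-row INSERTs and accepts a DataTable buffer directly; a sketch, with `"dbo.MyTable"` as a placeholder destination and `table` as your buffered DataTable:

```csharp
// SqlBulkCopy performs a bulk insert from .NET, no bcp or SSIS package required.
using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.MyTable";
    bulk.BatchSize = 10000;        // same buffering idea, but handled by the class
    bulk.WriteToServer(table);     // call each time your DataTable buffer fills
}
```

Combined with the Clear()-between-flushes pattern above, this is usually the fastest pure-.NET route for millions of rows.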