sqlbulkcopy mem. management

datatable memory sqlbulkcopy

Question

I'm using SQLBULKCOPY to copy some data-tables into a database table, however, because the size of the files I'm copying run sometimes in excess of 600mb, I keep running out of memory.

I'm hoping to get some advice about managing the table size before I commit it to the database so I can free up some memory to continue writing.

Here are some examples of my code (some columns and rows eliminated for simplicity)

            SqlBulkCopy sqlbulkCopy = new SqlBulkCopy(ServerConfiguration); //Define the Server Configuration
        System.IO.StreamReader rdr = new System.IO.StreamReader(fileName);

        Console.WriteLine("Counting number of lines...");
        Console.WriteLine("{0}, Contains: {1} Lines", fileName, countLines(fileName));

        DataTable dt = new DataTable();

        sqlbulkCopy.DestinationTableName = "[dbo].[buy.com]"; //You need to define the target table name where the data will be copied
        dt.Columns.Add("PROGRAMNAME");
        dt.Columns.Add("PROGRAMURL");
        dt.Columns.Add("CATALOGNAME");

        string inputLine = "";
        DataRow row; //Declare a row, which will be added to the above data table

        while ((inputLine = rdr.ReadLine()) != null) //Read while the line is not null
            {
                i = 0;
                string[] arr;

                Console.Write("\rWriting Line: {0}", k);
                arr = inputLine.Split('\t'); //splitting the line which was read by the stream reader object (tab delimited)
                row = dt.NewRow();
                row["PROGRAMNAME"] = arr[i++];
                row["PROGRAMURL"] = arr[i++];
                row["CATALOGNAME"] = arr[i++];
                row["LASTUPDATED"] = arr[i++];
                row["NAME"] = arr[i++];
                dt.Rows.Add(row);
                k++;
        }

        // Set the timeout, 600 secons (10 minutes) given table size--damn that's a lota hooch
        sqlbulkCopy.BulkCopyTimeout = 600;
        try
        {
            sqlbulkCopy.WriteToServer(dt);
        }
        catch (Exception e)
        {
            Console.WriteLine(e);
        }
        sqlbulkCopy.Close();//Release the resources
        dt.Dispose();

        Console.WriteLine("\nDB Table Written: \"{0}\" \n\n", sqlbulkCopy.DestinationTableName.ToString());

    }

I continued to have problems getting SQLBulkCopy to work, and I realized I needed to do more work on each record before it was entered into the database, so I developed a simple LinQ to Sql method to do record by record updates, so I could edit other information and create more record information as it was being run,

Problem: This method's been running pretty slow (even on Core i3 machine), any ideas on how to speed it up (threading?) -- on a single processor core, with 1gb of memory it crashes or takes sometimes 6-8 hours to write the same amount of data as one SQLBulkCopy that takes a few moments. It does manage memory better though.

            while ((inputLine = rdr.ReadLine()) != null) //Read while the line is not null
        {
            Console.Write("\rWriting Line: {0}", k);
            string[] arr;              
            arr = inputLine.Split('\t');

            /* items */
            if (fileName.Contains(",,"))
            {
                Item = Table(arr);
               table.tables.InsertOnSubmit(Item);

                /* Check to see if the item is in the db */
                bool exists = table.tables.Where(u => u.ProductID == Item.ProductID).Any();

                /* Commit */
                if (!exists)
                {
                    try
                    {
                        table.SubmitChanges();
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine(e);
                        // Make some adjustments.
                        // ...
                        // Try again.
                        table.SubmitChanges();
                    }
                }
            }

With helper method:

    public static class extensionMethods
{
    /// <summary>
    /// Method that provides the T-SQL EXISTS call for any IQueryable (thus extending Linq).
    /// </summary>
    /// <remarks>Returns whether or not the predicate conditions exists at least one time.</remarks>
    public static bool Exists<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate)
    {
        return source.Where(predicate).Any();
    }
}

Accepted Answer

Try specifying the BatchSize property to 1000 which will batch up the insert in a 1000 record batch rather than the whole lot. You can tweak this value to find what is optimal. I have used sqlbulkcopy for similar size data and it works well.


Popular Answer

Faced with the same issue, found that problem of OutOfMemory Exception was in DataTable.Rows maximum quantity limitations. Solved with recreating table, with maximum 500000 rows limit. Hope, my solution will be helpfull:

var myTable = new System.Data.DataTable();
myTable.Columns.Add("Guid", typeof(Guid));
myTable.Columns.Add("Name", typeof(string));

int counter = 0;

foreach (var row in rows)
{
    ++counter;

    if (counter < 500000)
    {
        myTable.Rows.Add(
            new object[]
            {
                row.Value.Guid,
                row.Value.Name
            });
    }
    else
    {
        using (var dbConnection = new SqlConnection("Source=localhost;..."))
        {
            dbConnection.Open();
            using (var s = new SqlBulkCopy(dbConnection))
            {
                s.DestinationTableName = "MyTable";

                foreach (var column in myTable.Columns)
                    s.ColumnMappings.Add(column.ToString(), column.ToString());

                try
                {
                    s.WriteToServer(myTable);
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.Message);
                }
                finally
                {
                    s.Close();
                }
            }
        }

        myTable = new System.Data.DataTable();
        myTable.Columns.Add("Guid", typeof(Guid));
        myTable.Columns.Add("Name", typeof(string));

        myTable.Rows.Add(
            new object[]
            {
                row.Value.Guid,
                row.Value.Name
            });

        counter = 0;

    }
}



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why