What is the proper way to insert millions of records from a csv file into a SQL Server database?

c# csv database sqlbulkcopy sql-server

Question

I have to insert 40 million records from a .csv file into a database and below is the process I followed.

  • Windows Service 1:

    • reading the CSV
    • validating the records
    • inserting the valid records into a success table (an intermediate table) using SqlBulkCopy (see the sketch after this list)
  • Windows Service 2:

    • getting 10,000 records from the success table at a time
    • running a foreach over those 10,000 records
    • sending each record to the database to be inserted into the main table
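
For reference, here is a minimal sketch of the SqlBulkCopy step in Windows Service 1. The staging table and column names are placeholders rather than the actual schema, and the DataTable is assumed to already hold the validated rows:

// Minimal sketch of the Service 1 bulk load, assuming a staging table
// named dbo.SuccessRecords and a DataTable filled with validated rows.
using System.Data;
using System.Data.SqlClient;

static void BulkInsertValidRecords(DataTable validRows, string connectionString)
{
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();

        using (var bulkCopy = new SqlBulkCopy(connection))
        {
            bulkCopy.DestinationTableName = "dbo.SuccessRecords"; // placeholder table name
            bulkCopy.BatchSize = 10000;   // rows sent to the server per batch
            bulkCopy.BulkCopyTimeout = 0; // no timeout for a very large load

            // Map DataTable columns to staging-table columns (placeholder names).
            bulkCopy.ColumnMappings.Add("Id", "Id");
            bulkCopy.ColumnMappings.Add("Name", "Name");

            bulkCopy.WriteToServer(validRows);
        }
    }
}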

Windows Service 1 takes about 30-40 minutes, but Windows Service 2 takes about 5 hours (at a minimum) to complete its task. I have a few options for speeding this up but cannot decide which is better, and I am open to suggestions.

  1. Creating 4 separate Windows services and processing 40,000 records simultaneously
  2. Using a job, in which we can use a while loop
  3. Calling the procedure asynchronously from the Windows service

My biggest doubt here is that the procedure uses transactions, and I am not sure async calls will work with that: my assumption is that a transaction locks the table while other processes need to work on it.
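
To illustrate option 3, below is a minimal sketch of calling an insert procedure asynchronously from C#. The procedure name and parameter are assumptions, not the actual schema. Note that async only changes how the client waits; whether concurrent calls block each other depends on the locks the procedure's own transaction takes on the server:

// Minimal sketch of calling an insert procedure asynchronously.
// "dbo.InsertMainRecord" and "@RecordId" are placeholder names.
using System.Data;
using System.Data.SqlClient;
using System.Threading.Tasks;

static async Task InsertRecordAsync(int recordId, string connectionString)
{
    using (var connection = new SqlConnection(connectionString))
    {
        await connection.OpenAsync();

        using (var command = new SqlCommand("dbo.InsertMainRecord", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@RecordId", recordId);

            // The procedure can manage its own transaction (BEGIN TRAN / COMMIT).
            // Awaiting here frees the calling thread while the call is in flight,
            // but it does not change what the transaction locks on the server.
            await command.ExecuteNonQueryAsync();
        }
    }
}

Several such calls can be awaited concurrently (for example with Task.WhenAll), but they will still queue behind whatever row or table locks each procedure call holds inside its transaction.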

Popular Answer

I think you're using the wrong tool for this job. C# apps might do the trick, but there is a much more powerful way to do this using SQL Server Integration Services (SSIS).

I am guessing here, but are these intermediate tables there to transform or validate the data, or maybe to break the import down into batches?

SSIS can do all of these things using its logging limits and SQL bulk import tools. I currently run hospital data imports of around 8,000,000 records each night, and it takes a matter of minutes, not hours.

A good read on how SQL Server deals with such large data inputs is this article.
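
If the import is rebuilt as an SSIS package, the existing Windows service can still act as the scheduler by launching the package with the dtexec utility. This is only a sketch; the package path below is a placeholder:

// Minimal sketch: running an existing SSIS package from a Windows service
// via the dtexec command-line utility. The package path is a placeholder.
using System.Diagnostics;

static int RunImportPackage()
{
    // Launch dtexec with a file-system package; /F points at the .dtsx file.
    var startInfo = new ProcessStartInfo
    {
        FileName = "dtexec",
        Arguments = "/F \"C:\\Packages\\ImportCsv.dtsx\"", // placeholder package path
        UseShellExecute = false
    };

    using (var process = Process.Start(startInfo))
    {
        process.WaitForExit();
        return process.ExitCode; // dtexec returns 0 when the package succeeds
    }
}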



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow