How to keep row order with SqlBulkCopy?

excel export import sqlbulkcopy sql-server

Question

I'm exporting data programatically from Excel to SQL Server 2005 using SqlBulkCopy. It works great, the only problem I have is that it doesn't preserve the row sequence i have in Excel file. I don't have a column to order by, I just want the records to be inserted in the same order they appear in the Excel Spreadsheet.

I can't modify the Excel file, and have to work with what I've got. Sorting by any of the existing columns will break the sequence.

Please help.

P.S. Ended up inserting ID column to the spreadsheet, looks like there's no way to keep the order during export/import

Accepted Answer

I don't think that row ordering is specified or guaranteed by SQL unless you use an "ORDER BY " clause.

From a post by Bill Vaughn (http://betav.com/blog/billva/2008/08/sql_server_indexing_tips_and_t.html):

Using Order By: Even when a table has a clustered index (which stores the data in physical order), SQL Server does not guarantee that rows will be returned in that (or any particular) order unless an ORDER BY clause is used.

Another link with info:

http://sqlblogcasts.com/blogs/simons/archive/2007/08/21/What-is-the-position-of-a-row--.aspx


Popular Answer

After lots of research it seems evident that there's no way to retain row order with the Bulk Insert command written as it is featured by Microsoft. You either have to add an ID column yourself directly into the import file, use a shell or other external script, or you do without. It seems it would be a needed (and easy) feature for Microsoft to add, but after more than a decade of nothing from them, it's not going to happen.

Yet I needed to preserve the actual record order in the import file after importing as higher up records would supersede those lower if a set column had the same value.

So I went a different route. My constraints were:

  • I couldn't change the source file at all. (and set a bad precedent!)
  • I couldn't use an external script. Too complicated. It had to be a simple T-Sql based solution, no CMD executions. This needed to go into a single procedure so it could be automated.

I liked the logic of using Powershell to create ordered insert statements for each row, then running in Sql. It was essentially queuing each record up for individual insert rather than BULK insert. Yes, it would work, but it would also be very slow. I often have files with 500K+ rows in them. I needed something FAST.

So I ran across XML. Bulk upload the file directly into a single XML variable. This would retain the order of the records as each is added to the XML. Then parse the XML variable and insert the results into a table, adding an identity column at the same time.

There is an assumption that the import file is a standard text file, with each record ending in a Line Feed (Char(13)+Char(10))

My approach has 2 steps:

  1. Execute the IMPORT SQL statement (using OPENROWSET), encapsulating each record with XML tags. Capture the results into an XML variable.

  2. Parse the variable by the XML tags into a table, adding an incrementing [ID] column.

    ---------------------------------
    Declare @X xml;
    ---------------------------------
    SELECT @X=Cast('<X>'+Replace([BulkColumn],Char(13)+Char(10),'</X><X>')+'</X>' as XML)
    FROM OPENROWSET (BULK N'\\FileServer\ImportFolder\ImportFile_20170120.csv',SINGLE_CLOB) T
    ---------------------------------
    SELECT [Record].[X].query('.').value('.','varchar(max)') [Record]
    ,ROW_NUMBER() OVER (ORDER BY (SELECT 100)) [ID]
    --Into #TEMP 
    FROM @X.nodes('X') [Record](X);
    ---------------------------------
    
    • The XML tags replace each Line Feed.

    • If the file ends with a Line Feed, this will cause a blank row to be added at the end. Simply delete the last row.

I wrote this into my procedure using dynamic sql so I could pass in the FileName and set the ID to begin at 1 or 0 (in case there's a header row).

I was able to run this against a file of 300K records in about 5 seconds.



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why