I have a situation where I am importing many rows of data from text files. The import process occurs using SqlBulkCopy and initially loads into a staging table.
I perform some validation on this data and would like to be able to report back to the user which line of the file is in error if validation fails. I was hoping simply to use ROW_NUMBER() along with the default ordering of a SELECT on my staging table, but several questions on SO suggest that, without an ORDER BY, the result order of a SELECT is undefined, so I can't rely on this.
What method would you suggest for retaining the original order of the input file? The input file format is TSV. Do I need to create a surrogate column indicating the file line?
Is there any way to report the result of the query in the order of the original insertions? Can I rely on that order, given that I used SqlBulkCopy to load the data en masse?
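One way to guarantee a recoverable line number is to stamp the physical file line onto each record client-side before handing rows to the bulk loader, so the staging table carries it as an ordinary column. A minimal Python sketch of the idea (the function name and TSV handling here are illustrative only; a real pipeline would do the equivalent in the .NET code that feeds SqlBulkCopy):

```python
import csv
import io

def add_line_numbers(tsv_text, has_header=True):
    """Return TSV text with a LineNumber column prepended.

    LineNumber is the physical 1-based line in the original file
    (so with a header, the first data row is line 2), letting a
    validation failure on a staged row be reported against the
    exact source line, regardless of how SQL Server orders rows.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for lineno, row in enumerate(reader, start=1):
        if has_header and lineno == 1:
            # Label the new column instead of numbering the header row.
            writer.writerow(["LineNumber"] + row)
        else:
            writer.writerow([str(lineno)] + row)
    return out.getvalue()

sample = "FileName\tFileType\nwmsetup\tlog\nbar\ttxt\n"
print(add_line_numbers(sample))
```

The staging table then just needs an extra INT column mapped to LineNumber, and any validation query can SELECT it directly.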
I found that I am able to preserve the source file order by using
BULK INSERT from a file, followed by adding an IDENTITY column.
Given a tab-delimited file,
C:\MyTable.txt, where I intentionally moved rows out of order:
FileName        FileType
wmsetup         log
bar             txt
wmsetup10       log
WMSysPr9        prx
WMSysPrx        prx
Wudf01000Inst   log
xpsp1hfm        log
_default        pif
0               log
002391_         tmp
005766_         tmp
I ran the following and preserved the text file order in SQL Server:
IF EXISTS (
    SELECT 1
    FROM sys.tables t
    INNER JOIN sys.schemas s ON s.schema_id = t.schema_id
    WHERE t.name = 'myTable' AND t.[type] = 'U' AND s.name = 'dbo'
)
    DROP TABLE myTable
GO

CREATE TABLE dbo.myTable (FileName VARCHAR(80), FileType VARCHAR(30))
GO

BULK INSERT myTable
FROM 'C:\MyTable.txt'
WITH
(
    firstrow = 2,
    fieldterminator = '\t',
    rowterminator = '\n'
)
GO

ALTER TABLE myTable ADD ID INT IDENTITY(1,1)
GO

SELECT * FROM myTable
GO
FileName        FileType ID
--------------- -------- -----------
wmsetup         log      1
bar             txt      2
wmsetup10       log      3
WMSysPr9        prx      4
WMSysPrx        prx      5
Wudf01000Inst   log      6
xpsp1hfm        log      7
_default        pif      8
0               log      9
002391_         tmp      10
005766_         tmp      11
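With the IDENTITY column in place, a validation query can report the original file line directly. A sketch, assuming the header occupies line 1 of the file (firstrow = 2), so the failing file line is ID + 1; the WHERE clause here is a placeholder for whatever real validation rule is being applied:

SELECT ID + 1 AS FileLine, FileName, FileType
FROM dbo.myTable
WHERE FileType NOT IN ('log', 'txt', 'prx', 'pif', 'tmp')  -- placeholder validation rule
ORDER BY ID;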