I am loading a big file (~25 GB, 400 million lines) into the SQL Server 2014 using SqlBulkCopy
.
My bulk copy size is 10k lines and timeout is 1 hour. The whole file is in a single transaction as I would like to store the entire file or roll back in case of failures. I have 3 indexes (one unique clustered index and other two are non clustered indexes) on this table.
Sometimes I get this exception:
System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action
1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error) at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync() at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket() at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer() at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj) at System.Data.SqlClient.SqlBulkCopy.RunParser(BulkCopySimpleResultSet bulkCopyHandler) at System.Data.SqlClient.SqlBulkCopy.CopyBatchesAsyncContinuedOnSuccess(BulkCopySimpleResultSet internalResults, String updateBulkCommandText, CancellationToken cts, TaskCompletionSource
1 source) at System.Data.SqlClient.SqlBulkCopy.CopyBatchesAsyncContinued(BulkCopySimpleResultSet internalResults, String updateBulkCommandText, CancellationToken cts, TaskCompletionSource1 source) at System.Data.SqlClient.SqlBulkCopy.CopyBatchesAsync(BulkCopySimpleResultSet internalResults, String updateBulkCommandText, CancellationToken cts, TaskCompletionSource
1 source) at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalRestContinuedAsync(BulkCopySimpleResultSet internalResults, CancellationToken cts, TaskCompletionSource1 source) at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalRestAsync(CancellationToken cts, TaskCompletionSource
1 source) at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternalAsync(CancellationToken ctoken) at System.Data.SqlClient.SqlBulkCopy.WriteRowSourceToServerAsync(Int32 columnCount, CancellationToken ctoken) at System.Data.SqlClient.SqlBulkCopy.WriteToServer(IDataReader reader)
In case of exception, I do a rollback of entire transaction but then I see one more exception while trying to rollback. I didn't understand why the SqlTransaction
is not in usable state as my code doesn't close the connection/transaction anywhere else.
System.InvalidOperationException: This SqlTransaction has completed; it is no longer usable.
I have debugged to some extent, in SSMS I see that there is a open transaction
select @@TRANCOUNT ==> this gives "0" (I didn't understand why)
dbcc opentran ==> this gives me a open transaction
Oldest active transaction:
SPID (server process ID): 58
UID (user ID) : -1
Name : user_transaction
LSN : (159:2843368:1)
Start time : Dec 30 2014 11:20:13:903PM
SID : 0x0105000000000005150000005d28f57fd53ad8354354e02ae9881a00
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
Then using this SQL script I figured out that it got stuck on the line
insert bulk dbo.#mytablename#
with status as running, timestamp exactly matching the time at which it threw the exception - 1 hour (bulk copy timeout). But I couldn't figure out why SQL Server got stuck here and why it is taking more time. Is there a way to debug this further?
SELECT
r.[session_id],
c.[client_net_address],
s.[host_name],
c.[connect_time],
[request_start_time] = s.[last_request_start_time],
[current_time] = CURRENT_TIMESTAMP,
r.[percent_complete],
[estimated_finish_time] = DATEADD
(
MILLISECOND,
r.[estimated_completion_time],
CURRENT_TIMESTAMP
),
current_command = SUBSTRING
(
t.[text],
r.[statement_start_offset]/2,
COALESCE(NULLIF(r.[statement_end_offset], -1)/2, 2147483647)
),
module = COALESCE(QUOTENAME(OBJECT_SCHEMA_NAME(t.[objectid], t.[dbid]))
+ '.' + QUOTENAME(OBJECT_NAME(t.[objectid], t.[dbid])), '<ad hoc>'),
[status] = UPPER(s.[status])
FROM
sys.dm_exec_connections AS c
INNER JOIN
sys.dm_exec_sessions AS s
ON c.session_id = s.session_id
LEFT OUTER JOIN
sys.dm_exec_requests AS r
ON r.[session_id] = s.[session_id]
OUTER APPLY
sys.dm_exec_sql_text(r.[sql_handle]) AS t
WHERE
c.session_id = 58;
It timed out because your bulk load will take over an hour to finish. As per MSDN, The timeout setting refers to
Number of seconds for the operation to complete before it times out.
It makes sense that a 25GB file may take over an hour to load into the database, so why not try setting a higher timeout value so it has enough time to complete?