I want to get data into a database as quickly as I can; it arrives continuously, at around 10,000 records a minute and rising. I currently use prepared insert statements, but I'm considering switching to the SqlBulkCopy class so I can import the data in larger batches.
The issue is that I'm not inserting into a single table; rather, the components of each data item are spread across several tables, and their identity values are used as foreign keys by other rows inserted at the same time. I know bulk copies aren't meant to support inserts this intricate, so I'm considering swapping my identity columns (bigints, in this case) for uniqueidentifier columns. Because I could then know the IDs before the insert, I wouldn't need anything like SCOPE_IDENTITY(), which is what currently stops me from using bulk copy, and I could run a few bulk copies per table instead.
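Roughly, this is the shape of what I have in mind (the table and column names here are just placeholders):

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

class BulkLoader
{
    // Assign uniqueidentifier keys on the client so parent and child rows
    // can be bulk copied separately while the FK values are already known.
    public static void LoadBatch(string connectionString,
        IEnumerable<(string customer, string[] items)> batch)
    {
        var orders = new DataTable();
        orders.Columns.Add("OrderId", typeof(Guid));
        orders.Columns.Add("Customer", typeof(string));

        var orderItems = new DataTable();
        orderItems.Columns.Add("OrderItemId", typeof(Guid));
        orderItems.Columns.Add("OrderId", typeof(Guid));   // FK known before the insert
        orderItems.Columns.Add("Description", typeof(string));

        foreach (var (customer, items) in batch)
        {
            var orderId = Guid.NewGuid();                   // key generated client-side
            orders.Rows.Add(orderId, customer);
            foreach (var item in items)
                orderItems.Rows.Add(Guid.NewGuid(), orderId, item);
        }

        using var conn = new SqlConnection(connectionString);
        conn.Open();
        using var tx = conn.BeginTransaction();

        // Parent table first, then children, so FK constraints (if enabled) are satisfied.
        using (var bcp = new SqlBulkCopy(conn, SqlBulkCopyOptions.Default, tx)
            { DestinationTableName = "dbo.Orders" })
            bcp.WriteToServer(orders);
        using (var bcp = new SqlBulkCopy(conn, SqlBulkCopyOptions.Default, tx)
            { DestinationTableName = "dbo.OrderItems" })
            bcp.WriteToServer(orderItems);

        tx.Commit();
    }
}
```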
Is this a workable option, or are there other problems I might run into? Alternatively, is there another way to insert data quickly while still using bigint identity columns?
It sounds like you want to switch from "SQL assigns the surrogate key (a bigint identity() column)" to "the data-preparation process creates a GUID surrogate key." In other words, the key would be assigned externally to SQL rather than internally. Given your volumes, if the data-generating process can assign the surrogate key, I'd go with that without hesitation.
The question then becomes whether you must use GUIDs, or whether your data-generating process could produce auto-incrementing integers instead. Building such a process so that it works consistently and never fails is hard (and is one of the reasons you pay for SQL Server), but it may be worth it for smaller, easier-to-read keys in the database.
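If you do go the "integers assigned outside SQL" route, one way to keep the generation reliable (my own suggestion, not something stated above) is to reserve blocks of values from a SEQUENCE object (SQL Server 2012+) via `sys.sp_sequence_get_range` and hand them out in memory. A minimal sketch, assuming a sequence named `dbo.RecordIdSeq` already exists:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

class IdBlock
{
    private long _next;
    private long _last;

    // Returns false when the reserved block is exhausted and a new one is needed.
    public bool TryGetNext(out long id)
    {
        if (_next > _last) { id = 0; return false; }
        id = _next++;
        return true;
    }

    // Reserve a contiguous range of bigint values from the sequence on the server.
    public static IdBlock Reserve(SqlConnection conn, int blockSize)
    {
        using var cmd = new SqlCommand("sys.sp_sequence_get_range", conn)
        {
            CommandType = CommandType.StoredProcedure
        };
        cmd.Parameters.AddWithValue("@sequence_name", "dbo.RecordIdSeq");
        cmd.Parameters.AddWithValue("@range_size", blockSize);
        var first = cmd.Parameters.Add("@range_first_value", SqlDbType.Variant);
        first.Direction = ParameterDirection.Output;
        cmd.ExecuteNonQuery();

        long start = Convert.ToInt64(first.Value);
        return new IdBlock { _next = start, _last = start + blockSize - 1 };
    }
}
```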
Using uniqueidentifier will certainly make things worse: the keys are wider, and random GUIDs cause page splits.
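If you do stick with uniqueidentifier, a common mitigation for the page-split side of this (my suggestion, not something the answer above prescribes) is to generate roughly sequential "COMB"-style GUIDs on the client, so new rows land near the end of the index; the wider 16-byte key is still a cost compared to an 8-byte bigint. A sketch:

```csharp
using System;

static class CombGuid
{
    public static Guid NewComb()
    {
        byte[] guidBytes = Guid.NewGuid().ToByteArray();

        // Milliseconds since an arbitrary epoch, packed big-endian into 6 bytes.
        long ms = (long)(DateTime.UtcNow - new DateTime(2020, 1, 1)).TotalMilliseconds;
        byte[] msBytes = BitConverter.GetBytes(ms);
        if (BitConverter.IsLittleEndian) Array.Reverse(msBytes);

        // Overwrite the last 6 bytes of the GUID, which SQL Server treats as the
        // most significant group when ordering uniqueidentifier values.
        Array.Copy(msBytes, 2, guidBytes, 10, 6);
        return new Guid(guidBytes);
    }
}
```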
If your load can be (or already is) batched, you have the following options:
(Peak row counts sometimes reach about 50k, and they're growing. We actually use a separate staging database to avoid writing to the transaction log twice.)
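A rough sketch of how the staged bulk load mentioned above can look; whether the staging table lives in the same database or a separate one is a deployment choice, and the table names (`Staging.Readings`, `dbo.Readings`) are made up for illustration:

```csharp
using System.Data;
using System.Data.SqlClient;

static class StagingLoad
{
    public static void Load(string connectionString, DataTable batch)
    {
        using var conn = new SqlConnection(connectionString);
        conn.Open();

        // 1. Fast batched load into the staging table (TableLock helps the server
        //    minimize logging under the right recovery model).
        using (var bcp = new SqlBulkCopy(conn, SqlBulkCopyOptions.TableLock, null)
            { DestinationTableName = "Staging.Readings", BatchSize = 10000 })
            bcp.WriteToServer(batch);

        // 2. One set-based move into the destination, then clear the staging table.
        using var cmd = new SqlCommand(
            @"INSERT INTO dbo.Readings (ReadingId, SensorId, Value, ReadAt)
              SELECT ReadingId, SensorId, Value, ReadAt FROM Staging.Readings;
              TRUNCATE TABLE Staging.Readings;", conn);
        cmd.ExecuteNonQuery();
    }
}
```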