Bulk-generating data in SQL Server

bulk bulkinsert sqlbulkcopy sql-server

Question

I need to insert fake data into the table fakeData, following this pseudocode:

foreach(t1.id in table1)
   foreach(t2.id in table2)
      foreach(t3.id in table3)
        INSERT INTO fakeData (t1.id, t2.id, t3.id, random(30,80))

where id is the primary key of each table.
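
For reference, the three nested loops above amount to a cross join, so one way to express the pseudocode in T-SQL is a single INSERT ... SELECT. This is only a minimal sketch: the temp tables and the column names t1_id, t2_id, t3_id and val are made up for illustration, and random(30,80) is assumed to mean an integer from 30 to 80 inclusive.

```sql
-- Hypothetical miniature stand-ins for table1, table2, table3 and fakeData.
CREATE TABLE #table1 (id int PRIMARY KEY);
CREATE TABLE #table2 (id int PRIMARY KEY);
CREATE TABLE #table3 (id int PRIMARY KEY);
CREATE TABLE #fakeData (t1_id int, t2_id int, t3_id int, val int);

INSERT INTO #table1 (id) VALUES (1), (2);
INSERT INTO #table2 (id) VALUES (10), (20), (30);
INSERT INTO #table3 (id) VALUES (100), (200);

-- One set-based statement replaces the three nested loops.
INSERT INTO #fakeData (t1_id, t2_id, t3_id, val)
SELECT t1.id, t2.id, t3.id,
       -- CHECKSUM(NEWID()) is re-evaluated for every row.
       -- The modulo yields -50..50, ABS makes it 0..50, +30 gives 30..80.
       ABS(CHECKSUM(NEWID()) % 51) + 30
FROM #table1 AS t1
CROSS JOIN #table2 AS t2
CROSS JOIN #table3 AS t3;

SELECT COUNT(*) AS rows_inserted FROM #fakeData;  -- 2 * 3 * 2 = 12
```

At the real table sizes a single statement like this would be one enormous transaction, so it would presumably have to be run in slices, for example one range of table1 ids at a time.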

I want to insert billions of records, so I need this to run as fast as possible. I'm not sure whether SQL or C# is the better way to insert the data into the database.

So this question has two parts: how do I run the pseudocode in SQL Server, and what is the best approach to do it reasonably quickly? (I haven't created any indexes yet.)

This may look like a duplicate of every other "fastest way to bulk insert" question. I believe it is different, because the data I'm inserting can be generated by SQL Server itself rather than imported from outside.

PS: I'm using SQL Server 2012.

Added information

This is a star schema. fakeData will be the fact table.

table2 is the date dimension: 7300 rows covering 20 years. table3 is the time dimension, with 96 rows. table1 is another dimension, with 100 million rows.
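
To put those sizes in perspective, here is a quick sketch of the row count a full cross join of the three dimensions would produce. It uses only the figures given above; the cast to bigint is needed because the product overflows a 32-bit int.

```sql
SELECT CAST(100000000 AS bigint)   -- table1 rows
     * 7300                        -- table2 (date dimension) rows
     * 96                          -- table3 (time dimension) rows
       AS full_cross_join_rows;    -- 70,080,000,000,000
```

That is roughly 70 trillion rows for the full 20 years, so in practice the load would have to be restricted (for example to a shorter date range) or spread over many batches.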

6/9/2015 8:26:18 PM

Popular Answer

OK, since nobody has actually shown how to handle the random values as well, I'll share what I've done so far. This is what I'm running right now, using the simple recovery model:

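BEGIN TRAN

DECLARE @x int = 1
WHILE @x <= 5000
BEGIN
    -- Cross join one row of lines with the selected dates and every time slot
    INSERT INTO dimSpeed
    SELECT T1.id AS T1ID, T2.DateValue AS T2ID, T3.TIME_ID AS T3ID,
           ABS(CHECKSUM(NEWID()) % 70) + 20   -- random integer from 20 to 89
    FROM lines T1
    CROSS JOIN dimDate T2
    CROSS JOIN dimTime T3
    WHERE T1.id = @x AND T2.DateValue > '1/1/2015' AND T2.DateValue < '1/1/2016'

    -- Commit every 100 iterations to keep each transaction small
    IF (@x % 100) = 0
    BEGIN
        COMMIT TRAN
        BEGIN TRAN
    END

    SET @x = @x + 1
END

COMMIT TRAN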

Where 5000 is the number of table1 (T1) rows I'm processing. Those 5000 take about 5 minutes to complete; at this pace, inserting all the data I need will take about 70 days. Clearly, a faster approach is needed.
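
The 70-day figure can be checked with a bit of arithmetic. The sketch below uses the numbers from this post (100 million table1 rows, 5000 ids per roughly 5 minutes) and assumes the throughput stays constant and that only the one-year date range from the loop is loaded. The variable names are made up for illustration.

```sql
-- Figures taken from the post; variable names are hypothetical.
DECLARE @table1_rows     bigint = 100000000;    -- rows in table1
DECLARE @ids_per_run     bigint = 5000;         -- table1 ids handled per run
DECLARE @minutes_per_run decimal(10, 2) = 5;    -- observed duration of one run

-- 20000 runs * 5 minutes = 100000 minutes, converted to days.
SELECT (@table1_rows / @ids_per_run) * @minutes_per_run / 60 / 24
       AS estimated_days;   -- about 69.4
```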

6/10/2015 4:03:29 AM

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow