I am working on a process that writes to SQL Server from a Spark/Scala application. It is generating multiple INSERT BULK statements per partition of data (as expected), with a batch size of 100K records.
As I monitor the transaction log, I can see it filling up, and I was hoping that with INSERT BULK it would not.
Can you please suggest how I can achieve a commit per batch?
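For reference, this is roughly the shape of the write I am doing. The server URL, table name, credentials, and input path below are placeholders, and I am assuming the Microsoft SQL Spark connector (`com.microsoft.sqlserver.jdbc.spark`), which is what produces INSERT BULK via the bulk copy API:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object SqlServerLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("SqlServerLoad").getOrCreate()

    // Source data; the path is a placeholder
    val df = spark.read.parquet("/data/input")

    // Each partition opens its own connection and pushes rows in batches;
    // with the SQL Spark connector these arrive as INSERT BULK operations.
    df.write
      .format("com.microsoft.sqlserver.jdbc.spark")
      .option("url", "jdbc:sqlserver://myserver:1433;databaseName=mydb") // placeholder
      .option("dbtable", "dbo.target_table")                             // placeholder
      .option("user", "loader")                                          // placeholder
      .option("password", sys.env("SQL_PASSWORD"))
      .option("batchsize", "100000") // 100K records per batch, as described above
      .mode(SaveMode.Append)
      .save()

    spark.stop()
  }
}
```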
When your database recovery model is set to FULL, bulk inserts are fully logged and the transaction log will continue to grow. You have two options:
Change the database recovery model to BULK_LOGGED while the data is loaded, then reset it to FULL afterwards
Modify your process to back up the transaction log after each batch is loaded (or every few batches), which truncates the inactive portion of the log and keeps it from growing
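The first option can be scripted around the load itself. A rough sketch using plain JDBC from Scala, where the server URL, database name, and credentials are placeholders:

```scala
import java.sql.DriverManager

object RecoveryModelSwitch {
  def main(args: Array[String]): Unit = {
    val url  = "jdbc:sqlserver://myserver:1433;databaseName=mydb" // placeholder
    val conn = DriverManager.getConnection(url, "loader", sys.env("SQL_PASSWORD"))
    val stmt = conn.createStatement()
    try {
      // Switch to minimally logged bulk operations for the duration of the load
      stmt.execute("ALTER DATABASE mydb SET RECOVERY BULK_LOGGED")

      // ... run the Spark bulk load here ...

    } finally {
      // Restore full logging once the load completes
      stmt.execute("ALTER DATABASE mydb SET RECOVERY FULL")
      stmt.close()
      conn.close()
    }
  }
}
```

Note that after switching back to FULL you should take a log (or full) backup to re-establish the log backup chain; point-in-time recovery is not available across the bulk-logged interval.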