Entity Framework Core 1.0 - How to seed a large amount of data

asp.net-core entity-framework-core seeding sqlbulkcopy

Question

I am using an ASP.NET Core 1.0 (previously known as ASP.NET 5) application with the core framework and Entity Framework Core 1.0 (previously known as Entity Framework 7).

In this blog https://blogs.msdn.microsoft.com/dotnet/2015/11/18/entity-framework-7-rc1-available the people in charge of development of EF Core 1.0 say that it is not recommended to use an ORM like entity for bulk inserting a large amount of data and they recommend a lower level approach such as SqlBulkCopy to do so using the database provider directly and skipping Entity Framework.

I have 120,000 records that I need to seed. At the moment I am using something like this:

private readonly MyDbContext _context;

public MyDbContextSeedData(MyDbContext context)
{
    _context = context;

}

public void EnsureSeedData(string seedPortsFilePath)
{
    SeedPorts(seedPortsFile);
}

private void SeedPorts(string seedPortsFilePath)
{
    if (!_context.Ports.Any())
    {
        var ports = PortsData.GetPorts(seedPortsFile);
        List<Port> listPorts = ports.ToList();
        // the following statement commented out did not make any improvement
        // _context.ChangeTracker.QueryTrackingBehavior = Microsoft.Data.Entity.QueryTrackingBehavior.NoTracking
        _context.AddRange(portsRange);
        _context.SaveChanges();
    }
}

and I am calling the EnsureSeedData(path) from my Startup.cs where I have an instance of MyDbContextSeedData

My findings and questions with this approach:

  1. It takes more than 30 minutes to insert all the records
  2. I cannot use something like System.Data.SqlClient.SqlBulkCopy because this class does not exist in the core framework. Any alternative?
  3. Entity Framework, behind the scenes and according to the logs, seems to be splitting the INSERT SQL statemens in chunks of 1999 values. It also seems to close the DB connection between insertions. I guess this is all managed in the most efficient way by Entity anyway..
  4. I could not find any example on the best practices to seed a database. Should I use something similar to my approach and seed from the code when the application starts (assuming I find a quicker way to do the inserts)? Or should I use some kind of Migration that executes a SQL script?

Any example on how to seed large amounts of data using ASP.NET Core framework dnxcore50 would be appreciated!

Popular Answer

I believe you are mistaken about System.Data.SqlClient.SqlBulkCopy:

https://github.com/dotnet/corefx/blob/master/src/System.Data.SqlClient/src/System/Data/SqlClient/SqlBulkCopy.cs

Do as they recommend, ditch entity framework (of whatever stripe) and use sql bulk copy if this is going to be an on going task. On the other hand, you say "seed" -- if this is a one time thing, just do it straight in the database, and then restore from backup for future "start from scratch" situations.



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why