How do I efficiently store all OpenStreetMap data in an indexed way?

linq-to-sql openstreetmap sqlbulkcopy sql-server-ce

Question

Note: While I target Windows Phone 7, it doesn't introduce anything besides a size restriction.

In an attempt to write a GPS / Routing / Map application for the Windows Phone 7, I'm trying to attempt to use OpenStreetMap for this and I want to get my data stored in a SQL Server Compact Edition database on my Windows Phone 7. This is giving me a lot of trouble so I'm getting clueless what the right way is...

Here is my progress:

  1. I've downloaded Belgium.osm.pbf, which contains all the Belgium OSM data in PBF format.

    Note that Belgium isn't that large, it's the country I live in so it seems as a good start.

    It would be nice if my database were near the size of that PBF file, because it is only 80 MB...

  2. Using Marc Gravell's protobuf-net, I now have written a parser which is giving me all the OSM data.

  3. On a first attempt I tried to just load it all in memory but that seems to be too big for my Windows Phone 7, as it results in a size > 512 MB. Then the idea was that I needed a database to store this information in, so it seems logic to store this in a SQL Server Compact Edition sdf file.

  4. Hence, I created the following DataContext and Tables in LINQ to SQL:

    public class RoutingContext : DataContext
    {
        public RoutingContext()
    #if WINDOWS_PHONE
            : base("Data Source = 'isostore:/RoutingDB.sdf'; Max Database Size = 1024; Max Buffer Size = 65536")
    #else
            : base("Data Source = './RoutingDB.sdf'; Max Database Size = 1024; Max Buffer Size = 65536")
    #endif
        {
    
        }
    
        public Table<Node> Nodes;
        public Table<Road> Roads;
        public Table<RoadNode> RoadNodes;
        public Table<NodeProperty> NodeProperties;
        public Table<RoadProperty> RoadProperties;
        public Table<StringData> Strings;
    }
    
    [Table]
    public class Node
    {
        [Column(IsPrimaryKey = true)]
        public int Id { get; set; }
    
        [Column()]
        public int Lon { get; set; }
    
        [Column()]
        public int Lat { get; set; }
    }
    
    [Table]
    public class NodeProperty
    {
        [Column()]
        public int NodeId { get; set; }
    
        [Column(DbType = "NVarChar(255) NOT NULL")]
        public int Key { get; set; }
    
        [Column(DbType = "NVarChar(255) NOT NULL")]
        public int Value { get; set; }
    }
    
    [Table]
    public class RoadProperty
    {
        [Column()]
        public int RoadId { get; set; }
    
        [Column(DbType = "NVarChar(255) NOT NULL")]
        public int Key { get; set; }
    
        [Column(DbType = "NVarChar(255) NOT NULL")]
        public int Value { get; set; }
    }
    
    [Table]
    public class Road
    {
        [Column(IsPrimaryKey = true)]
        public int Id { get; set; }
    }
    
    [Table]
    public class RoadNode
    {
        [Column()]
        public int RoadId { get; set; }
    
        [Column()]
        public int NodeId { get; set; }
    }
    
    [Table]
    public class StringData
    {
        [Column(IsPrimaryKey = true)]
        public int Id { get; set; }
    
        [Column(DbType = "NVarChar(255) NOT NULL")]
        public String String { get; set; }
    }
    
  5. First I went on the InsertOnSubmitTour() with a SubmitChanges() every now and then, but that apparently is way to slow as SubmitChanges() inserts row-per-row. So then I went to try SqlBulkCopy which apparently doesn't work for SQL Server Compact Edition, this made me end up with SqlCeBulkCopy which seems to be faster but still is slow.

There are two problems that I am experiencing with this solution:

  1. It is still pretty slow.

  2. The resulting size is many times bigger. Please note that Belgium.osm.pbf is only ~80 MB. The .sdf however appears to be ~592 MB, is there anything I can do about this?

So, here are my questions:

  1. Where did I completely go wrong? What should I do instead?

    I find it really weird that it's so hard to process a 80 MB file properly. Please also note that I'm doing all this computing on my computer at the moment and once it runs fair on the computer I'll try it on the Windows Phone 7.

  2. If there really is no handy LINQ solution, would it make sense to produce an indexed PBF?

    This however requires me to reinvent what a database could already provide to me.

  3. Would it make sense to increase the size on my computer, essentially making a writing a converter, then sent the ~592 MB .sdf database file to my phone?

    This seems to be a last resort that's in between option 1 and 2, but that doesn't make the application upload-able to MarketPlace as it's quite nasty to have to convert on the computer in advance and then somehow get it onto the phone.

Please note that I focus on question 1 and that the other questions are merely solutions if that is shown to be impossible, I'm just missing something that would make this go fluent but I have no idea...

Popular Answer

It makes sense to use a database for this. The size may be due to the compactness of the pbf file, also keep in mind that all data in SQL CE is unicode. Your question is unclear - what is slow? Also, you can try to Compact the database file after the import, it may shrink the file a bit. Depending on the resulting size, your .xap may still be small enough for MarketPlace. (As the .xap zips the sdf file as well)



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why