Top 10 Reasons Not to Populate SQL Tables Using Shape2SQL
Google a phrase like “how do I load spatial data into SQL Server?” and you will find lots of blog references to Shape2SQL. That is because it is a great piece of software. In fact, I have used it for years to populate spatial tables in Microsoft SQL Server. I suspect its popularity among bloggers is tied to its price, which is zero. Couple that with the wide availability of “free” (public domain) datasets of basic geographies like country and state/province boundaries in ESRI’s shapefile format — and you have the makings of a blog post on how to load a spatial table.
In reality, you will not likely be dealing with a lot of free spatial data in your data warehouse. For many kinds of organizations, custom point locations of facilities or incidents, custom boundaries of trading areas, farms, or distribution and delivery zones, or even licensed, high value data products like railway lines, ZIP Codes, and telecom rate and wire center areas define the norm in the corporate data warehouse. With this kind of data, the practice is not to “upload and forget.” Instead, it is upload the first time, and then manage over time.
It is in this kind of environment that you need desktop GIS software such as MapInfo Pro from Pitney Bowes or ArcGIS from ESRI. Regardless of which one you use, there are lots of reasons for relying heavily on them for managing spatial data. Because I happen to be more familiar with MapInfo Pro, I will orient my reasons toward that product:
- MapInfo Pro works both ways. Upload data into SQL Server; or download a spatial table to MapInfo Pro’s desktop environment — all, or part of it; as a linked snapshot or a live cache; 1 record or 1 million records.
- MapInfo Pro handles both ESRI “shapefiles” as well as MapInfo Pro’s own “tab” format. (It handles a lot of other formats,
- Easy Loader. It is a bulk uploader, analogous to Shape2SQL, except it is for MapInfo Pro “tab” files. Fast uploads, plus an intuitive interface.
- With SQL Server Management Studio, we can at best get a crude visual look at spatial objects in a table. Context is minimal. With MapInfo Pro, you can combine spatial data housed in SQL Server with other layers (points of interest, demographics, highways, administrative boundaries, etc), or even with Bing Maps! Look below for a comparison using a gerrymandered Congressional District in Texas. Context makes all the difference. Without it, you might think the boundary has a few drawing errors; with it, you realize the boundary is drawn as desired.
- In SQL Server Management Studio, the tools for inspecting individual polygons are limited. In MapInfo Pro, I can inspect individual nodes of a polygon.
- Editing a spatial object in SQL Server? If it is something simple like a point or ring/buffer — no problem. A complex polygon? Not going to happen. Try finding and editing a specific node in a Well Known Text representation of a polygon with 1,000s of nodes. In MapInfo Pro, you can shift a single node in a polygon with a click of a mouse. Verify your work, and if it is not right, then do it again.
- Using the Make Valid method on geography and geometry spatial data types, SQL Server can fix common problems such as self intersecting polygons. However, unless you can read and visualize the well known text format of a spatial value, it is nearly impossible to know just how SQL Server edited the polygon. In MapInfo Pro, you can fix the problem before loading the spatial object, or review the changes made by SQL Server.
- Make Valid in SQL Server fixes problems like self intersections, but it cannot detect or fix problems like gaps and overlaps. MapInfo Pro can.
- SQL Server can tell you that two polygons overlap, perhaps inadvertently. MapInfo Pro shows you exactly where and how they overlap.
- Zillions of data sets are available in “tab” format. Well, maybe not zillions. However, within your own organization, you are likely to find a lot of “tab” files. Even if your organization emphasizes ShapeFiles, MapInfo Pro can read and write this format.
I preach data governance and data quality all the time. If you are serious about treating spatial data as a vital component to your data warehouse, then you need to be using some sort of GIS software to manage your spatial data. You cannot simply “load and forget” the data.
SQL Server Management Studio “map” (top)
the same map in MapInfo Pro (bottom)