4 steps to unleash the valuable insights hidden in your (spatial) data
R for Spatial Data Science
These are exciting times for Data Scientists who want to incorporate location awareness in their workflow in R. R, an open source software environment for statistical computing and graphics, has been around since the early 1990s. And although geospatial analysis in R has a long-standing tradition, recent developments have made incorporating geographic information in R easier and more intuitive than ever before. In addition, both ArcGIS and FME provide bridges to R, creating unique added value for both Data Scientists and GIS specialists.
So what’s new in R to be so thrilled about?
Firstly, the package leaflet brought interactive mapping to R a few years ago. More recently, the mapview and tmap libraries, which build on top of leaflet, have further extended the options for interactive exploration on the map. Secondly, the recent release of sf, Simple Features for R, has caused a revolution in the R ecosystem with its simple and consistent approach to the management of spatial data in R.
In this blog I share four steps to incorporate the full potential of geography into your data analysis with R.
1. Reading and writing geographic data
It is possible to load spatial data directly into R to combine these with data from other sources. The package sf supports a wide range of geographic vector file formats (points, lines and polygons with attribute data attached): from CAD and Shapefiles to data stored in spatial databases like PostGIS. With the function st_read() you can also access data directly from online sources like a WFS server or the ArcGIS Living Atlas of the World.
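As a minimal sketch of reading vector data, the example below loads the North Carolina counties Shapefile that ships with the sf package. The commented-out WFS call uses a hypothetical server URL and layer name, purely to illustrate the shape of such a request.

```r
library(sf)

# Path to the demo Shapefile bundled with sf (North Carolina counties)
shp_path <- system.file("shape/nc.shp", package = "sf")

# Read the file into an sf object; quiet = TRUE suppresses the driver summary
nc <- st_read(shp_path, quiet = TRUE)

# Inspect the result: attribute columns plus one geometry column
print(class(nc))
print(nrow(nc))
print(st_geometry_type(nc, by_geometry = FALSE))

# Reading from a WFS server follows the same pattern.
# The URL and layer name below are hypothetical placeholders:
# parcels <- st_read("WFS:https://example.org/geoserver/wfs", layer = "demo:parcels")
```

The same st_read() call works for other formats supported by the underlying drivers; st_drivers() lists the ones available in your installation.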
sf links directly to three important geospatial libraries, to unlock their power for use in R: GDAL for reading and writing data, GEOS for geometrical operations and Proj.4 for projection conversions and datum transformations. Of course, it is also possible to use raster GIS data in your analysis with the R package raster.
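To see which versions of these libraries your own sf installation links to, and to try a projection conversion and a geometrical operation, something like the following should work. The target coordinate reference system (EPSG:3857, Web Mercator) is an arbitrary choice for illustration.

```r
library(sf)

# Versions of the external libraries (GEOS, GDAL, PROJ) linked by this sf build
versions <- sf_extSoftVersion()
print(versions)

# Demo data bundled with sf
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Projection conversion: reproject to Web Mercator (EPSG:3857)
nc_merc <- st_transform(nc, 3857)

# Geometrical operation on the projected geometries: one centroid per county
centroids <- st_centroid(st_geometry(nc_merc))
print(length(centroids))
```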
With both the sf and the raster package you can also export the results of your analysis in R directly to a geographic file format or spatial database for further use in other mature applications like the ArcGIS platform or FME. With ArcGIS Online you can easily share the results of your analysis with a broad audience. And you can use FME, the ETL tool for geospatial data, both at the beginning and at the end of your analysis to automate and manage the flow of data within your organization.
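Exporting can be sketched in the same way. The example below writes the demo counties to a GeoPackage in a temporary directory and reads it back. The file name and layer name are made up for this illustration, and GeoPackage is just one of several formats you could pick.

```r
library(sf)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Hypothetical output location: a GeoPackage in the session's temp directory
out_file <- file.path(tempdir(), "nc_counties.gpkg")

# Write the sf object; delete_dsn = TRUE replaces the file if it already exists
st_write(nc, out_file, layer = "counties", delete_dsn = TRUE, quiet = TRUE)

# Read the layer back to confirm the round trip
nc_back <- st_read(out_file, layer = "counties", quiet = TRUE)
print(nrow(nc_back))
```

The resulting file can then be opened in other applications that read GeoPackage.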
2. Spatial analysis and modeling
With R you have access to the full range of analytical capabilities you would expect from a traditional desktop GIS, from spatial subsetting and aggregation to buffer and overlay analysis and the modeling of spatial phenomena. This is where the revolutionary Simple Features approach comes into play: the sf package treats spatial data almost as an ‘ordinary’ R data.frame. Almost, but not entirely, because sf data frames have a ‘special’ geometry column. But because these sf objects also inherit from the class data.frame, you can apply many R functions directly to your spatial data.
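The following sketch illustrates a few of these operations on the demo data bundled with sf. The attribute threshold, the 25 km buffer distance and the choice of EPSG:32119 (a metric projection for North Carolina) are assumptions made for the example, not recommendations.

```r
library(sf)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# An sf object is also a data.frame, so ordinary subsetting works.
# The geometry column is kept automatically.
busy <- nc[nc$BIR74 > 10000, c("NAME", "BIR74")]

# Project to a CRS with metre units before working with distances
nc_proj <- st_transform(nc, 32119)

# Buffer analysis: a 25 km zone around the first county
zone <- st_buffer(nc_proj[1, ], dist = 25000)

# Spatial subsetting: counties that intersect the buffer zone
nearby <- nc_proj[zone, ]

# Aggregation: total births in the nearby counties, and their merged outline
total_births <- sum(nearby$BIR74)
outline <- st_union(nearby)

print(nearby$NAME)
print(total_births)
```

Note that indexing one sf object with another, as in nc_proj[zone, ], selects the features that intersect it by default.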
With R you can easily put your data on a map. There is no longer any need to switch back and forth between R and a separate GIS application to view your data in their spatial context. Both static and interactive maps can be created quickly with only a few lines of code. Traditionally R is very good at producing high quality, publication ready graphs and figures, and this is also true for static maps. And in recent years R has been extended with packages like leaflet, mapview, tmap and shiny, which allow you to create interactive web (mapping) applications to share your results. So, whether you want to create a thematic map or plot multiple layers on top of each other, R offers the tools to do so.
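A minimal sketch of both kinds of map, again using the demo data bundled with sf: the static map is written to a PDF in a temporary directory (the file name is made up for this example), and the interactive map is only built if the mapview package happens to be installed.

```r
library(sf)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Static thematic map: counties coloured by the BIR74 attribute,
# written to a PDF file in the temp directory
static_file <- file.path(tempdir(), "nc_births.pdf")
pdf(static_file)
plot(nc["BIR74"], main = "Births 1974-78")
dev.off()

# Interactive map, assuming mapview is installed.
# Printing `m` in an interactive session opens the map in the viewer.
if (requireNamespace("mapview", quietly = TRUE)) {
  m <- mapview::mapview(nc, zcol = "BIR74")
}
```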
R’s command line allows you to script complex spatial and statistical analysis in an efficient and completely reproducible manner. This reproducibility allows you to clarify and explain to stakeholders every step taken and every decision made in your analysis. And of course, it also facilitates the extension or modification of your analysis the moment new, updated or more accurate data become available.
Do you really want to unleash the valuable insights hidden in your data? Without geography you are nowhere! Whether you have already looked into R’s geospatial capabilities or you are completely new to the subject: feel free to contact me to investigate the possibilities.
Please comment: How do you use R? Standalone / in conjunction with ArcGIS / in conjunction with FME?