Plotting shapefiles with attributes using ggplot

Dear All,

I am trying to plot a shapfile (river network or streamlines) by filling the color from values in attribute table (e.g., Duration).
the head of the shapefile is

   readOGR("Duration.shp")
   OGR data source with driver: ESRI Shapefile 
    Source: "c:\Duration.shp", layer: "Duration"
     with 10291 features
      It has 15 fields
      Integer64 fields read as strings:  COMID_1 

head(shapefile)
       long      lat order piece id group      maxDef    Severity Duration   Magnitude
1 -2.979167 14.76667     1     1  0   0.1 0.005086238 4.447949231     3426 0.001203216
2 -2.979167 14.75417     2     1  0   0.1 0.005086238 4.447949231     3426 0.001203216
3 -2.900000 14.82500     1     1  1   1.1 0.025923798 20.47003689     3398 0.006718117
4 -2.937500 14.78750     2     1  1   1.1 0.025923798 20.47003689     3398 0.006718117
5 -2.945833 14.78750     3     1  1   1.1 0.025923798 20.47003689     3398 0.006718117
6 -2.945833 14.77083     4     1  1   1.1 0.025923798 20.47003689     3398 0.006718117

I tested the following using geom_polygon, line, and path but I didnt work.

ggplot() + **geom_polygon**(data = shp_df, aes(x = long, y = lat, group = group, fill =Duration)

Best regards,

Solomon

Hi Solomon,

Welcome.

I recommend trying the sf package, as in my opinion it simplifies a lot of the handling of maps.

For future reference, a reproducible example to work with would help to be certain, but try something like this:

library(ggplot2)
library(sf)

shp <- read_sf('Duration.shp')
ggplot(shp) + geom_sf(aes(colour = Duration))

or, if the rivers are polygons rather than lines:

ggplot(shp) + geom_sf(aes(fill = Duration))

Ron.

1 Like

Hi Ron,

Thanks for the tips. the rivers are lines and for every line there is avalue in the attribute table (e.g. Duration). i tried

shp <- read_sf('Duration.shp')
ggplot(shp) + geom_sf(aes(colour = Duration))

but it is not producing a map rather the values see image attached.

This is where a reproducible example would be helpful.

It looks like the values in the Duration column are being treated as a factor (discrete) rather than as (continuous) numeric values.

Questions:

  1. do you get a map if you just use ggplot(shp) + geom_sf()?
  2. what type of data is in the column Duration? (type shp and return, you should get an indication of the data type below the column name).

Ron.

Dear Ron,

Yes, I think the values were treated as a factor. I changed the values to numeric and it is working.

Thanks,

Solomon

Dear Ron,
the values in the file ( Duration) ranges from 0-3001 and i want to classfy them into 5 using somthing like breaks,
breaks=c(0, 2776, 3061, 3272, 3558, 4370)

Is there a way to add this in the following codes,

shp <- read_sf('Duration.shp')
ggplot(shp) + geom_sf(aes(colour = Duration))

Best regards,

Solomon

1 Like

Sorry not to have replied before, I've been a bit busy.

You could look at ?cut as that will split numbers into categories as you want.

However, you say the values go from 0-3001, but your screenshot from before showed values above 4000 and you are now saying you want a top category above 4000 too. I think you may not have converted from factor to numeric correctly. Can you show the code you've used?

I'd take a guess but I'm on a bus and would probably mess it up.

Ron.

1 Like

Right. No longer on a bus. So here's a guess which might still be messed up:

library(tidyverse)
library(sf)
shp <- read_sf('Duration.shp') %>%
              # Convert to numeric via character
              # (ie label of factor Duration converted to numeric,
              # I suspect that you did as.numeric(Duration) which would have
              # converted the level of Duration to a number)
       mutate(Duration = as.numeric(as.character(Duration)),
              # convert numberic values into a small number of caegories
              fDuration = cut(Duration, breaks = c(0, 2776, 3061, 3272, 3558, 4370)))
ggplot(shp) + geom_sf(aes(colour = fDuration))

nb, I haven't run this code. You should check the outputs to make sure they make sense and correspond to what you expect. Again, check out ?cut to understand what it does and what it can do if your requirements are different from what I have done.

And a reprex really would be helpful, it would help me and others to help you and creating one might help you to find and possibly fix the problems for yourself.

Ron.

Dear Ron,

Thanks for the help and the code. I found the cut option and I used a quantile method to class the data and compare the results from other method. Now, I have 3 Durations from 3 methods combined in shp

duration1= 0-6642

duration2=0-2089

duration3=0-4367

This is what I did

 a<-cut(shp$Duration1, breaks=data.frame(classIntervals(shp$Duration1, n=5,method="quantile")[2])[,1],include.lowest=T)

 f1= ggplot(shp) + geom_sf(aes(colour=a))

b<-cut(shp$Duration2, breaks=data.frame(classIntervals(shp$Duration2, n=5,method="quantile")[2])[,1],include.lowest=T)

f2= ggplot(shp) + geom_sf(aes(colour=b))

c<-cut(shp$Duration3, breaks=data.frame(classIntervals(shp$Duration3, n=5,method="quantile")[2])[,1],include.lowest=T)

f2= ggplot(shp) + geom_sf(aes(colour=c))

Now, I need to create a common legend and plot in one page. i tried grid_arrenge_shared_legend

grid_arrange_shared_legend(f1, f2, f3, ncol = 3, nrow = 1, position = "bottom")

But this is not creating a common legend, only display for the first plot (f1). any idea? thanks again for your help and time.

Best regards,

Solomon

I don't have data to play with, but if I did I'd be looking at using gather to make the make a long version of the data with Duration1, Duration2 and Duration3 (probably having used st_drop_geometry to drop the coordinate info) into a single column, then cut to categorise that whole column in some way before left_join(shp, .) back to shp ready to plot. Then facet_wrap to split into three graphs with one consistent legend when ggplotting.

I'm not clear on whether the quantiles should be aligned (eg same value defining the breaks for each Duration) or the quantiles should be aligned (eg this thing is in quantile A for Duration1, quantile B for Duration2 etc). This depends on what you want to show and talk about.

A reprex would make this easier.

Ron.

Hi Ron,

thansk for the help. now, i need to add some information to the shapefile from a netcdf file.
the plan is to compute trend and slope for each id and write to the attribute table by thier id (join in gis). the ID in the netcdf ID and shapefile similar. if writing to the shapefile is not easy i can do this in GIS and need the output in CSV.

the netcdf has two dimesions (time and netID) and one variable Qout (no lat and lon).

I have a netcdf file with 2 dimesions (time and netID);

  dimensions:
         time = UNLIMITED ; // (12784 currently)
         netid = 716862 ;

variables:
     float Qout(time, netid) 

the trend is based on the Mann-Kendall test.

  Rs=ts(s,start=1979)
  mk.test(Rs)
  sens.slope(Rs)

Please let me know if you have any idea.

Best regards,

Solomon

Hi Solomon,

I think you should create a new topic for this question, as it now seems quite far from where you started with this one. And a new topic will draw in people with the relevant interests/expertise.

I have little experience with netcdf files. Rasters, aren't they? I think you need to specify your question more clearly - how do the IDs relate to the shapefile, what do you expect to get out at the end (you had rivers, which are lines, and you want to attache some info derived from a netcdf - as points? as a summary along each river? etc. And, and this is very important, try to create a reproducible example with your code to date and some example data.

Writing to a shapefile is straightforward, see ?sf::st_write.

Ron.

Thanks Ron, I will create a new topic.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.