Python Geopandas package for shapefile management

Geopandas combines the ability of pandas & shapely for manipulating geographic data mainly vector data such as points, polylines, & polygons. Geopandas can read & write vector data in a variety of formats, including the ESRI shapefile format (.shp) & others such as KML, & GeoJSON.

Installing Geopandas package:

To install the latest version of GeoPandas, the conda package manager will be utilized. If you haven't already installed Anaconda on your PC , you can install it by following the previous post.


To start, add a new environment variable named geopandas_stable. Start ubuntu terminal & enter the following commands:

  
  conda create --name geopandas_stable 

Activate the geopandas_stable environment:

  
  conda activate geopandas_stable 

Add & set conda-forge channel:

  
  conda config --env --add channels conda-forge 
  conda config --env --set channel_priority strict 

Install Geopandas package with python-3:

  
  conda install python=3 geopandas 

Import the Geopandas library:

  
  import geopandas as gpd 

Check the Geopandas version:

  
  print(gpd.__version__) 

You have successfully installed Geopandas if you receive the version number. Finally, for reading & writing shapefiles, you can use any Python Notebook or IDE, such as Jupyter-Notebook & Spyder.

Reading & manipulating shapefile dataframes

Import the matplotlib library for visualization purposes:

  
  import matplotlib.pyplot as plt 
  from matplotlib.ticker import ScalarFormatter 

All of the shapefiles used in this tutorial have been compressed & saved at this URL

There are two methods to read the data:

  • You can download the compressed shapefiles from the above link to your local directory, extract it, & use this local directory to locate the path of the shapefiles
  • Or you can directly read the compressed shapefiles from the above site if you are connected to internate

Load the shapefile using gpd.read_file function:

  
  zipfile_basins = "https://yonsci.github.io/yon_academic//files/Geopandas_data/ETH_basins.zip" 
  ET_basin = gpd.read_file(zipfile_basins) 

Inspect the attribute table of the data by typing:

  
  ET_basin 

AREAPERIMETERRIVERBASINRIVERBAS_1BASINNAMElayerBasin_typegeometry
02.000141e+112836371.54554ABBAYAbby1POLYGON ((35.75669 12.67180, 35.75921 12.67317...
11.121677e+112050391.91265AWASHAwash1POLYGON ((39.52930 11.72403, 39.52891 11.72624...
24.755561e+09318650.13876AYSHAAysha1POLYGON ((42.30811 10.99144, 42.32126 10.99586...
37.572123e+102128499.78087BARO AKOBOBaro_akobo1POLYGON ((34.47938 10.83318, 34.48112 10.83473...
46.524619e+101228197.47343DENAKILDenakil1POLYGON ((39.37495 14.50856, 39.41100 14.52892...
51.722588e+112006366.8121312GENALE DAWAGenale_dawa11POLYGON ((41.13743 6.94134, 41.14177 6.93580, ...
65.868187e+09534670.62221MEREB GASHMereb11POLYGON ((37.80309 14.70250, 37.91271 14.88540...
77.969898e+101613036.10298OGADENOgaden1POLYGON ((43.31527 9.60498, 43.35539 9.58202, ...
87.871230e+101832234.3701110OMO GIBEOmo_gibe1POLYGON ((38.02335 8.93136, 38.03574 8.91034, ...
95.298484e+101381422.0161211RIFT VALLYRift_valley1POLYGON ((38.40912 8.39296, 38.41411 8.39297, ...
108.367317e+101468016.69532TEKEZETekeze1POLYGON ((39.40391 14.25409, 39.45916 14.23977...
112.017996e+112336905.978109WABI SHEBELEWabi_shebelle1POLYGON ((42.69533 9.58651, 42.69804 9.58040, ...

Instead of displaying the entire dataframe, I recommend displaying only the first & last 5 rows of the dataframe using the .head() & .tail() built-in functions. To display the first 5 rows of the dataframe, type:

  
  ET_basin.head() 

AREAPERIMETERRIVERBASINRIVERBAS_1BASINNAMElayerBasin_typegeometry
02.000141e+112836371.54554ABBAYAbby1POLYGON ((35.75669 12.67180, 35.75921 12.67317...
11.121677e+112050391.91265AWASHAwash1POLYGON ((39.52930 11.72403, 39.52891 11.72624...
24.755561e+09318650.13876AYSHAAysha1POLYGON ((42.30811 10.99144, 42.32126 10.99586...
37.572123e+102128499.78087BARO AKOBOBaro_akobo1POLYGON ((34.47938 10.83318, 34.48112 10.83473...
46.524619e+101228197.47343DENAKILDenakil1POLYGON ((39.37495 14.50856, 39.41100 14.52892...

Similarly, to display the last 5 rows of the dataframe, type:

  
  ET_basin.tail() 

AREAPERIMETERRIVERBASINRIVERBAS_1BASINNAMElayerBasin_typegeometry
77.969898e+101613036.10298OGADENOgaden1POLYGON ((43.31527 9.60498, 43.35539 9.58202, ...
87.871230e+101832234.3701110OMO GIBEOmo_gibe1POLYGON ((38.02335 8.93136, 38.03574 8.91034, ...
95.298484e+101381422.0161211RIFT VALLYRift_valley1POLYGON ((38.40912 8.39296, 38.41411 8.39297, ...
108.367317e+101468016.69532TEKEZETekeze1POLYGON ((39.40391 14.25409, 39.45916 14.23977...
112.017996e+112336905.978109WABI SHEBELEWabi_shebelle1POLYGON ((42.69533 9.58651, 42.69804 9.58040, ...

Alternatively, you can view a specific section of the dataframe by defining the row numbers, but keep in mind that Python counts starting from zero. For example, to display only the 3rd & 4th rows, you can type:

  
  ET_basin[3:5] 

AREAPERIMETERRIVERBASINRIVERBAS_1BASINNAMElayerBasin_typegeometry
37.572123e+102128499.78087BARO AKOBOBaro_akobo1POLYGON ((34.47938 10.83318, 34.48112 10.83473...
46.524619e+101228197.47343DENAKILDenakil1POLYGON ((39.37495 14.50856, 39.41100 14.52892...

To find out the size of the dataframe, type:

  
  ET_basin.shape 

Output: (12, 8)
As can be seen, the dataframe has 12 rows and 8 columns

To obtain the row numbers or length of the dataframe, type:

  
  len(ET_basin) 

Output: 12

To print the column names, type: 

  
  ET_basin.columns 

Index(['AREA', 'PERIMETER', 'RIVERBASIN', 'RIVERBAS_1', 'BASINNAME', 'layer', 'Basin_type', 'geometry'], dtype='object')

To list the column’s contents:

  
  ET_basin.BASINNAME.values 

array(['ABBAY', 'AWASH', 'AYSHA', 'BARO AKOBO', 'DENAKIL', 'GENALE DAWA', 'MEREB GASH', 'OGADEN', 'OMO GIBE', 'RIFT VALLY', 'TEKEZE', 'WABI SHEBELE'], dtype=object)

To know how the data is represented or stored in the dataframe, type:

  
  type(ET_basin) 

geopandas.geodataframe.GeoDataFrame In this example, each vector data is represented as GeoDataFrame

The geometry column is the most essential of the columns & it contains the x-y coordinate values of the vector data. For instance, you can look at the values using .geometry function:

  
  print(ET_basin.geometry) 

0     POLYGON ((35.75669 12.67180, 35.75921 12.67317... 
1     POLYGON ((39.52930 11.72403, 39.52891 11.72624...    
2     POLYGON ((42.30811 10.99144, 42.32126 10.99586...  
3     POLYGON ((34.47938 10.83318, 34.48112 10.83473... 
4     POLYGON ((39.37495 14.50856, 39.41100 14.52892...  
5     POLYGON ((41.13743 6.94134, 41.14177 6.93580, ...  
6     POLYGON ((37.80309 14.70250, 37.91271 14.88540...  
7     POLYGON ((43.31527 9.60498, 43.35539 9.58202, ...  
8     POLYGON ((38.02335 8.93136, 38.03574 8.91034, ... 
9     POLYGON ((38.40912 8.39296, 38.41411 8.39297, ...  
10    POLYGON ((39.40391 14.25409, 39.45916 14.23977...  
11    POLYGON ((42.69533 9.58651, 42.69804 9.58040, ...  
Name: geometry, dtype: geometry

You can use the following function to extract x-y coordinate values of a given polygon, In this case, the first polygon:

  
  def coord_lister(geom): 
  coords = list(geom.exterior.coords) 
  return(coords) 
  coordinates = ET_basin.geometry.apply(coord_lister)
  Abbay_coord = coordinates[0]

Let’s look at the first ten x-y coordinate values of the first polygon:

  
  Abbay_coord[:10] 

[(35.756690888078126, 12.671797097608444),
 (35.75921179031758, 12.673172165175462),
 (35.76174531616754, 12.674460456844615),
 (35.7643064966319, 12.675563955460834),
 (35.76690347742137, 12.676383429901467),
 (35.76954908459146, 12.676824219629998),
 (35.77224026615727, 12.676922403038853),
 (35.774962412059814, 12.676764711775403),
 (35.77771230490099, 12.676412056011529),
 (35.780480076318945, 12.675917364792335)]`

You can use the geom type function to get information about the geometry type (polygon, line, or point) of each data recorded in the dataframe:

  
  ET_basin.geom_type 

0     Polygon
1     Polygon
2     Polygon
3     Polygon
4     Polygon
5     Polygon
6     Polygon
7     Polygon
8     Polygon
9     Polygon
10    Polygon
11    Polygon
dtype: object As you can see, all of the vector types in the dataframe are polygons.

To view the spatial extent or coverage of the shapefile, use the total_bounds function:

  
  ET_basin.total_bounds 

array([33.00541327, 3.40510906, 47.99567693, 14.88539753])

The shapefile ranges from 33.00541327 to 47.99567693 longitudes & 3.40510906 to 14.88539753 latitudes

You may get a quick overview of the shape file by using the geometry values. For instance, to view the first data, you can use the code below:

  
  ET_basin.at[0,'geometry'] 

Shapefile plotting

Let’s now make a simple plot of the shapefile using plot() function:

  
  ET_basin.plot() 

Let’s colorize the map with the cmap function:

  
  ET_basin.plot(cmap ='jet') 

In place of jet, you can use one of the Matplotlib color options given below, or you can visit the following link to learn more about the colormaps:

'Accent', 'Accent_r', 'Blues', 'Blues_r', 'BrBG', 'BrBG_r', 'BuGn', 'BuGn_r', 'BuPu', 'BuPu_r', 'CMRmap', 'CMRmap_r', 'Dark2', 'Dark2_r', 'GnBu', 'GnBu_r', 'Greens', 'Greens_r', 'Greys', 'Greys_r', 'OrRd', 'OrRd_r', 'Oranges', 'Oranges_r', 'PRGn', 'PRGn_r', 'Paired', 'Paired_r', 'Pastel1', 'Pastel1_r', 'Pastel2', 'Pastel2_r', 'PiYG', 'PiYG_r', 'PuBu', 'PuBuGn', 'PuBuGn_r', 'PuBu_r', 'PuOr', 'PuOr_r', 'PuRd', 'PuRd_r', 'Purples', 'Purples_r', 'RdBu', 'RdBu_r', 'RdGy', 'RdGy_r', 'RdPu', 'RdPu_r', 'RdYlBu', 'RdYlBu_r', 'RdYlGn', 'RdYlGn_r', 'Reds', 'Reds_r', 'Set1', 'Set1_r', 'Set2', 'Set2_r', 'Set3', 'Set3_r', 'Spectral', 'Spectral_r', 'Wistia', 'Wistia_r', 'YlGn', 'YlGnBu', 'YlGnBu_r', 'YlGn_r', 'YlOrBr', 'YlOrBr_r', 'YlOrRd', 'YlOrRd_r', 'afmhot', 'afmhot_r', 'autumn', 'autumn_r', 'binary', 'binary_r', 'bone', 'bone_r', 'brg', 'brg_r', 'bwr', 'bwr_r', 'cividis', 'cividis_r', 'cool', 'cool_r', 'coolwarm', 'coolwarm_r', 'copper', 'copper_r', 'cubehelix', 'cubehelix_r', 'flag', 'flag_r', 'gist_earth', 'gist_earth_r', 'gist_gray', 'gist_gray_r', 'gist_heat', 'gist_heat_r', 'gist_ncar', 'gist_ncar_r', 'gist_rainbow', 'gist_rainbow_r', 'gist_stern', 'gist_stern_r', 'gist_yarg', 'gist_yarg_r', 'gnuplot', 'gnuplot2', 'gnuplot2_r', 'gnuplot_r', 'gray', 'gray_r', 'hot', 'hot_r', 'hsv', 'hsv_r', 'inferno', 'inferno_r', 'jet', 'jet_r', 'magma', 'magma_r', 'nipy_spectral', 'nipy_spectral_r', 'ocean', 'ocean_r', 'pink', 'pink_r', 'plasma', 'plasma_r', 'prism', 'prism_r', 'rainbow', 'rainbow_r', 'seismic', 'seismic_r', 'spring', 'spring_r', 'summer', 'summer_r', 'tab10', 'tab10_r', 'tab20', 'tab20_r', 'tab20b', 'tab20b_r', 'tab20c', 'tab20c_r', 'terrain', 'terrain_r', 'turbo', 'turbo_r', 'twilight', 'twilight_r', 'twilight_shifted', 'twilight_shifted_r', 'viridis', 'viridis_r', 'winter', 'winter_r'

You can colorize the map based on a specific column, in this case, let’s use the BASINNAME column:

  
  ET_basin.plot(cmap ='tab20c', column='BASINNAME') 

You can also use the figsize option to change the plot size:

  
  ET_basin.plot(cmap ='tab20c', column='BASINNAME', figsize=(6,6)) 

You can apply an edge color to each polygon in the shapefile using edgecolor function:

  
  ET_basin.plot(cmap ='tab20c', column='BASINNAME', figsize=(6,6), edgecolor='black') 

You may also put a legend in the plot:

  
  ET_basin.plot(cmap ='tab20c', column='BASINNAME', figsize=(9,7), edgecolor='black', legend=True) 

However, using the matplotlib functions is a more convenient way to make plots:

  
  fig, ax = plt.subplots(figsize=(9,7)) 
  ET_basin.plot(alpha=1.0, cmap ='tab20c', column='BASINNAME', edgecolor='black', legend=True, ax=ax ) 
  ax.set_title('Major river basins of Ethiopia', fontsize=25) 
  ax.set_xlabel("Longitude (Degrees)", fontsize=12) 
  plt.show() 
   

You can use the script underneath to add grids to the plot:

  
  fig, ax = plt.subplots(figsize=(9,7)) 
  ET_basin.plot(alpha=1.0, cmap ='tab20c', column='BASINNAME', edgecolor='black', legend=True, ax=ax ) 
  ax.set_title('Major river basins of Ethiopia', fontsize=25) 
  ax.set_axisbelow(True) 
  ax.yaxis.grid(color='gray', linestyle='dashdot') 
  ax.xaxis.grid(color='gray', linestyle='dashdot')   
  ax.set_xlabel("Longitude (Degrees)", fontsize=12) 
  ax.set_ylabel("Latitude (Degrees)", fontsize=12) 
  plt.show() 

Here, instead of dashdot you can use dotted, dashed, solid

You can remove the grids & boundary lines by using the code below:

  
  fig, ax = plt.subplots(figsize=(9,7)) 
  ET_basin.plot(alpha=1.0, cmap ='tab20c', column='BASINNAME', edgecolor='black', legend=True, ax=ax ) 
  ax.set_title('Major Basins of Ethiopia', fontsize=25) 
  ax.set_axis_off() 
  plt.show() 
   

You can also change the legend’s position:

  
  fig, ax = plt.subplots(figsize=(9,7)) 
  ET_basin.plot(alpha=1.0, cmap ='tab20c', column='BASINNAME', edgecolor='black', ax=ax, legend=True )
  ax.set_title('Major river basins of Ethiopia', fontsize=25)
  ax.set_axisbelow(True)
  ax.yaxis.grid(color='gray', linestyle='dashdot')
  ax.xaxis.grid(color='gray', linestyle='dashdot')
  ax.set_xlabel("Longitude (Degrees)", fontsize=12)
  ax.set_ylabel("Latitude (Degrees)", fontsize=12)
  leg = ax.get_legend()
  leg.set_bbox_to_anchor((1.25,1.01))
  plt.show()

Plots can be saved as image files using the plt. savefig() function:

  
  fig, ax = plt.subplots(figsize=(9,7)) 
  ET_basin.plot(alpha=1.0, cmap ='tab20c', column='BASINNAME', edgecolor='black', legend=True, ax=ax )
  ax.set_title('Major river basins of Ethiopia', fontsize=25)
  ax.set_axisbelow(True)
  ax.yaxis.grid(color='gray', linestyle='dashdot')
  ax.xaxis.grid(color='gray', linestyle='dashdot')
  ax.set_xlabel("Longitude (Degrees)", fontsize=12)
  ax.set_ylabel("Latitude (Degrees)", fontsize=12)
  plt.savefig('basins.jpeg', transparent=True,  bbox_inches='tight', dpi=800)
  plt.show()

Making subplots:

  
fig, ((ax1, ax2, ax3, ax4), (ax5, ax6, ax7, ax8), (ax9, ax10, ax11, ax12)) = plt.subplots(3, 4, figsize=(15,15)) 
fig.suptitle("Major Basin's in Ethiopia", fontsize=24)
  ET_basin.loc[ET_basin.BASINNAME == "ABBAY"].plot(ax=ax1, facecolor='Blue',  edgecolor='black')
ax1.set_title("ABBAY") 
  ET_basin.loc[ET_basin.BASINNAME == "AWASH"].plot(ax=ax2, facecolor='Blue',  edgecolor='black')
ax2.set_title("AWASH")
  ET_basin.loc[ET_basin.BASINNAME == "AYSHA"].plot(ax=ax3, facecolor='Blue',  edgecolor='black')
ax3.set_title("AYSHA")
  ET_basin.loc[ET_basin.BASINNAME == "BARO AKOBO"].plot(ax=ax4, facecolor='Blue',  edgecolor='black')
ax4.set_title("BARO AKOBO")
  ET_basin.loc[ET_basin.BASINNAME == "DENAKIL"].plot(ax=ax5, facecolor='Blue',  edgecolor='black')
ax5.set_title("DENAKIL")
  ET_basin.loc[ET_basin.BASINNAME == "GENALE DAWA"].plot(ax=ax6, facecolor='Blue',  edgecolor='black')
ax6.set_title("GENALE DAWA")
  ET_basin.loc[ET_basin.BASINNAME == "MEREB GASH"].plot(ax=ax7, facecolor='Blue',  edgecolor='black')
ax7.set_title("MEREB GASH")
  ET_basin.loc[ET_basin.BASINNAME == "OGADEN"].plot(ax=ax8, facecolor='Blue',  edgecolor='black')
ax8.set_title("OGADEN")
  ET_basin.loc[ET_basin.BASINNAME == "OMO GIBE"].plot(ax=ax9, facecolor='Blue',  edgecolor='black')
ax9.set_title("OMO GIBE")
  ET_basin.loc[ET_basin.BASINNAME == "RIFT VALLY"].plot(ax=ax10, facecolor='Blue',  edgecolor='black')
ax10.set_title("RIFT VALLY")
  ET_basin.loc[ET_basin.BASINNAME == "TEKEZE"].plot(ax=ax11, facecolor='Blue',  edgecolor='black')
ax11.set_title("TEKEZE")
  ET_basin.loc[ET_basin.BASINNAME == "WABI SHEBELE"].plot(ax=ax12, facecolor='Blue',  edgecolor='black')
ax12.set_title("WABI SHEBELE")
  plt.show()

Overlaying shapefiles

As an example, consider overlaying the ET_basin over the African continent shapefile, Let’s begin by importing the African continent shapefile:

  
  zipfile_Africa = "https://yonsci.github.io/yon_academic//files/Geopandas_data/Africa.zip" 
  AFRICA = gpd.read_file(zipfile_Africa) 

It is critical to remember that the coordinate systems of the two shapefiles must be identical. The .crs command can be used to verify the coordinate system of the two shapefiles:

  
  AFRICA.crs 

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
  
  ET_basin.crs 

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

You can see that they both use the WGS-84 geographic coordinate system (EPSG:4326)

Now, you can plot the overlaid map using the following code:

  
  fig, ax = plt.subplots(figsize=(6,6)) 
  AFRICA.plot(ax=ax, color ='None', edgecolor='black') 
  ET_basin.plot(ax=ax, cmap ='tab20c', column='BASINNAME') 
  ax.grid(True) 
  ax.set_title('Ethiopian basins overlaid on African continent', fontsize=15) 
  ax.set(xlabel="Longitude (Degrees)", ylabel="Latitude (Degrees)") 
  plt.show()

Shapefile re-projectioning

Some examples of widely used spatial reference systems:

Reference systems nameEPSG code
WGS-84(Geographic)EPSG:4326
WGS_UTM_37N (projected)EPSG:32637
Spherical Mercator (projected)EPSG:3857
World Equidistant cylindrical/sphericalEPSG:4088
World Equidistant cylindricalEPSG:4087
WGS84/ World MercatorEPSG:3395
WGS84/ World Pseudo MercatorEPSG:3857
World RobinsonEPSG:54030

If you need more information or want to access several reference coordinate systems, go to the following website.

Let’s re-project the ET_basin shapefile from WGS-84 geographic coordinate system to the Universal Transverse Mercator (UTM) projected coordinate system.

Re-projecting from EPSG:4326 to EPSG:32637:

  
  ET_basin_utm = ET_basin.to_crs({'init': 'epsg:32637'}) 
  Or simply type:
  ET_basin_utm = ET_basin.to_crs(32637) 

Let’s review the coordinate system of the modified shapefile:

  
  ET_basin_utm.crs 

<Projected CRS: EPSG:32637>
Name: WGS 84 / UTM zone 37N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: World - N hemisphere - 36°E to 42°E - by country
- bounds: (36.0, 0.0, 42.0, 84.0)
Coordinate Operation:
- name: UTM zone 37N
- method: Transverse Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

As you can see, it has been converted to the UTM projection.

You can also confirm by inspecting the shapefile boundaries:

  
  ET_basin_utm.total_bounds 

array([-161882.56221284, 376388.12499996, 1495282.8685069 , 1645935.62500012])

Let’s have a look at the new map under the projected UTM reference system:

  
  fig, ax = plt.subplots(figsize=(8, 8)) 
  ET_basin_utm.plot(cmap='flag_r', ax=ax, alpha=.5) 
  ax.set_title("Ethiopian basin map: UTM zone-37N projected reference system", fontsize=14)
  ax.set_xlabel("X Coordinates (meters)", fontsize=12)
  ax.set_ylabel("Y Coordinates (meters)", fontsize=12)
  for axis in [ax.xaxis, ax.yaxis]:
  formatter = ScalarFormatter()
  formatter.set_scientific(False)
  axis.set_major_formatter(formatter)

You can also override the existing CRS of a given shapefile by using the .set_crs function:

  
  ET_basin = ET_basin.set_crs(4088, allow_override=True) 

Shapefile Dissolving

Dissolving polygon (removes the interior geometry). For example, let’s join the basin’s within the ET_basin shapefile to generate the Ethiopian boundary map. To do that, you need to pick which columns you want to maintain & the criteria for dissolving the shapefile.

In this example, the Basin_type will be used as a criteria & we want to maintain the geometry field only. To do this, use the code below:

  
  ET_basin_selected = ET_basin[['Basin_type', 'geometry']] 

Then, use the .dissolve() function, which works based on a unique attribute value

  
  ET_dissolved = ET_basin_selected.dissolve(by='Basin_type') 

Let’s view the resulting geodataframe:

  
  ET_dissolved 

geometry
Basin_type
1POLYGON ((43.28204 4.71709, 43.25243 4.70156, ...

As you can see, the dataframe contains the geometry & Basin_type fields only.

Now we can plot & see the resulting boundary map:

  
  ET_dissolved.plot() 

Shapefile clipping

You may easily clip each polygon by selecting the desired column name & it’s unique value. In this situation, let’s use the BASINNAME as a column & (ABBAY) as a desired value from the ET_basin geodataframe.

To do that type the following:

  
  Abbay_basin = ET_basin.loc[ET_basin.BASINNAME == "ABBAY"] 

Now, let’s plot the clipped Abbay basin:

  
  fig, ax = plt.subplots(figsize=(6,6)) 
  Abbay_basin.plot(alpha=1.0, cmap ='tab20c', column='BASINNAME', edgecolor='black', legend=True, ax=ax ) 
  ax.set_title('Map of ABBAY basin', fontsize=25) 
  ax.grid(True) 
  plt.show() 

You can also use the intersection tool to clip data to a specified region, for instance, let’s load the major world river shapefile, & clip it to the Ethiopian region.

First, load the major world rivers:

  
  zip_World_river = "https://yonsci.github.io/yon_academic//files/Geopandas_data/world_rivers.zip" 
  World_river = gpd.read_file(zip_World_river) 

Let’s have a look at the loaded map:

  
  World_river.plot(figsize=(8,8)) 

Let’s use the ET_dissolved shapefile to clip the world rivers to the area of interest:

  
  Intersecting_rivers = gpd.overlay(ET_dissolved, World_river, how ='intersection', keep_geom_type=False) 

Let’s plot the clipped major rivers over Ethiopia:

  
  fig, ax = plt.subplots(figsize=(8, 8)) 
  ET_dissolved.plot(alpha=1, facecolor="none", edgecolor="black", zorder=10, ax=ax) 
  Intersecting_rivers.plot(ax=ax, legend=True) 
  ax.yaxis.grid(color='gray', linestyle='dashdot') 
  ax.xaxis.grid(color='gray', linestyle='dashdot') 
  ax.set_xlabel("Longitude (Degrees)", fontsize=12) 
  ax.set_ylabel("Latitude (Degrees)", fontsize=12) 
  ax.set_title('Major rivers of Ethiopia', fontsize=25) 
  plt.show()

Exporting shapefile

You can export the geopandas dataframe mailny to ESRI shapefile format (.shp) & GeoJSON (.json) file formats.

To export your geopandas dataframe as a shapefile:

  
  ET_basin_utm.to_file("Basins_ethio.shp") 

To export your geopandas dataframe as a GeoJSON file:

  
  ET_basin_utm.to_file("ET_basin_utm_js.json", driver="GeoJSON") 

Well done! 🥇🥇🥇 Good luck!

I hope this tutorial has given you the basics of geopandas package; if you find it useful in your work, please share your thoughts in the comments section!

Leave a Comment