# Libraries for general analysis
import numpy as np
import pandas as pd
# Libraries for geospatial data
import geopandas as gpd
import rioxarray as rioxr
# Libraries for plotting
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches # For creating custom legend
from matplotlib_scalebar.scalebar import ScaleBar # For adding scalebar
# Import Landsat data
= rioxr.open_rasterio('data/landsat8-2018-01-26-sb-simplified.nc')
landsat
# Import California fire perimeters
= gpd.read_file('data/thomas_fire_boundary.geojson') thomas_fire
About
The Thomas Fire, which burned over 280,000 acres in Ventura and Santa Barbara counties in December 2017, was one of California’s largest wildfires at the time. It caused widespread ecological damage, displaced communities, and left lasting environmental impacts.
Using NASA’s Landsat data and California Fire Perimeter data, I create a map that visualizes the extent of the Thomas Fire. In conjunction, I use Air Quality Index (AQI) data from the Environmental Protection Agency (EPA) to visualize the AQI surrounding the fire. Together, these visualizations showcase the impact that the Thomas Fire had on the community.
![]() |
![]() |
For the full analysis, see my GitHub repository.
Highlights
- Raster manipulation using
rioxarray
- Vector data manipulation using
GeoPandas
- False color imagery to highlight wildfire impact
- Data visualization with
matplotlib
Data
Landsat: I use a simplified collection of bands (red, green, blue, near-infrared and shortwave infrared) from the Landsat Collection 2 Level-2 atmosperically corrected surface reflectance data, collected by the Landsat 8 satellite. The data was retrieved from the Microsoft Planetary Computer data catalogue and pre-processed to remove data outside land and coarsen the spatial resolution. This data is intended for visualization and educational purposes only.
- Citation: Microsoft Planetary Computer data catalogue (2024), Landsat Collection 2 Level-2 (simplified) [Data set] Available from: https://planetarycomputer.microsoft.com/dataset/landsat-c2-l2. Access date: November 18, 2024.
Fire perimeters: I use California Fire Perimeter data from the State of California’s Data Catalog to subset to the Thomas Fire boundary. In this analysis, I will be using the file that I created from the full dataset. The dataset is updated annually and includes fire perimeters dating back to 1878.
- Citation: State of California Data Catalog (2024), California Fire Perimeters (all) [Data set] Available from: https://catalog.data.gov/dataset/california-fire-perimeters-all-b3436. Access date: November 18, 2024.
Air Quality Index (AQI): The EPA’s AirData tool has pre-generated files of data available for download. The files are updated twice per year: once in June to capture the complete data for the prior year and once in December to capture the data for the summer. AQI is calculated each day for each monitor for the Criteria Gases and PM10 and PM2.5. For this analysis, I use two files, one containing daily AQI data for 2017 and one for 2018.
- Citation: Environmental Protection Agency AirData (2024), Daily AQI by County [Data Set] Available from: https://www.epa.gov/outdoor-air-quality-data/download-daily-data. Access date: October 20, 2024.
Mapping the fire
Setup
To start, I set up my analysis by loading all necessary libraries and data files.
Prepare Data
Now, I need to prepare the Landsat data. For processing the Landsat data, I will be primarily working with the rioxarray
package. rioxarray
is an extension of xarray
that focuses on geospatial raster data. By loading the Landsat data in using rioxarray
, I load the file as an xarray.Dataset
, an object that includes both the raster data and the associated geospatial metadata, including CRS, affine transformations, and spatial coordinates.
# Look at the Landsat raster
landsat
<xarray.Dataset> Size: 25MB Dimensions: (band: 1, x: 870, y: 731) Coordinates: * band (band) int64 8B 1 * x (x) float64 7kB 1.213e+05 1.216e+05 ... 3.557e+05 3.559e+05 * y (y) float64 6kB 3.952e+06 3.952e+06 ... 3.756e+06 3.755e+06 spatial_ref int64 8B 0 Data variables: red (band, y, x) float64 5MB ... green (band, y, x) float64 5MB ... blue (band, y, x) float64 5MB ... nir08 (band, y, x) float64 5MB ... swir22 (band, y, x) float64 5MB ...
Taking a look at the Landsat raster, I notice that we have a dimension named band that contains only one layer.
Using squeeze()
and drop_vars()
, I drop this unnecessary band dimension and it’s associated coordinates, resulting in a simpler, 2-dimensional raster. This will make plotting easier down the line.
# Drop redundant band dimension
= landsat.squeeze().drop_vars('band') landsat
Now, the Landsat raster is ready to be plotted.
To accentuate the Thomas fire scar, I overlay the Landsat raster with a polygon representing the perimeter of the Thomas Fire. In my full analysis, I prepared the California fire perimeter data by creating a new geospatial object containing only the Thomas Fire perimeter and saved it as thomas_fire_boundary.geojson
. When manipulating the fire perimeter data, I used GeoPandas
. Building off of the pandas.DataFrame
, the core data structure in GeoPandas
is the geopandas.GeoDataFrame
, which can store geometry columns and perform spatial operations!
I knew that the California fire perimeter data contains the columns YEAR_ and FIRE_NAME, which were useful for subsetting to the 2017 Thomas Fire. Since geopandas.GeoDataFrames
are pandas.DataFrames
at their core, I used basic dataframe subsetting to pull out the area of interest.
For the purposes of this post, I have only included my subset file. For the full analysis, see my GitHub repository.
Plot the Landsat raster using false color and the Thomas Fire perimeter to highlight the extent of the burn
Remote sensing instruments collect data from wavelengths both within and outside of the visible spectrum. False color imagery uses these non-visible wavelengths to reveal unique aspects that may not be visible otherwise. False color imagery has a wide range of applications, including acting as a useful tool for monitoring wildfire impacts. By assigning infrared bands to visible colors, these images highlight vegetation health, burn severity, and the extent of fire scars.
In this case, I use false color imagery to highlight the 2017 Thomas Fire’s burn scar. I use Landsat’s shortwave infrared as red, near infrared as green, and green bands as blue to visualize the burn. Newly burned land reflects strongly in SWIR bands, making the burn scar appear red in my map. The bright green shows vegetation, as it reflects near infrared light very strongly.
To do so, I select the shortwave infrared, near infrared, and red variables, convert to array, and plot. By setting the parameter robust = True
in the imshow()
method, I adjust the display of the image to handle color scaling appropriately by ignoring the outlier RBG values caused by clouds. It fixes contrast issues that cause images to appear bright white.
Show code for the false color image
# Before two spatial object can interact, I must match the CRSs
= thomas_fire.to_crs(landsat.rio.crs)
thomas_fire assert thomas_fire.crs == landsat.rio.crs
# Create an object containing the aspect ratio for the landsat map
= landsat.rio.width / landsat.rio.height
ratio
# Initialize plot
= plt.subplots(figsize = (9, 9 * ratio)) # Update figure size and aspect
fig, ax
# Remove axis for cleaner map
'off')
ax.axis(
# Plot false color image highlighting the burn scar
'swir22', 'nir08', 'red']].to_array().plot.imshow(robust = True, ax = ax, zorder = 1)
landsat[[
# Add custom legend items for false color image bands
= mpatches.Patch(color = "#FA957D", label = 'Shortwave Infrared (SWIR) \n - Burned Area')
legend_swir = mpatches.Patch(color = "#62DF59", label = 'Near Infrared \n - Vegetation')
legend_nir
# Add legend
= [legend_swir, legend_nir], loc = 'upper right', fontsize = 10)
ax.legend(handles
# Add Thomas Fire perimeter
= ax,
thomas_fire.plot(ax = 'firebrick',
edgecolor = 'none',
color = 2,
linewidth = 2,
zorder = True)
legend
# Add fire perimeter label
= 291870, y = 3831700, # Position coordinates
ax.text(x = "Thomas Fire \n Perimeter", # Label text
s = 10,
fontsize = 'bold',
weight = 'firebrick',
color = dict(facecolor = 'white', edgecolor = 'firebrick', alpha = 0.8, pad = 4)) # Box behind text for visibility
bbox
# Add plot title
'2017 Thomas Fire (California) Burn Scar using False Color Imagery', fontsize = 16)
ax.set_title(
# Add scale bar
= ScaleBar(1, units='m', location='lower left', length_fraction=0.25, color='black')
scalebar
ax.add_artist(scalebar)
# Display the plot
plt.show()
This false color image uses the shortwave infrared and near infrared to easily visualize bare ground / burned areas (shown in red) and vegetation (shown in bright green).
Visualizing AQI
Now that I have a map highlighting the extent of the fire, I will make a supplimentary visualization showcasing the AQI surrounding the event. The U.S. Air Quality Index (AQI) is EPA’s tool for communicating about outdoor air quality and health. The AQI includes six categories, each corresponding to a range of index values. The higher the AQI value, the greater the level of air pollution and the greater the health concern. For example, an AQI value of 50 or below represents good air quality, while an AQI value over 300 represents hazardous air quality.
For this, I create a line plot showing both the daily AQI and the 5-day average in Santa Barbara County in 2017 and 2018.
Setup
First, I need to download the AQI data. I am accessing the data straight from its its ZIP file link, so I use the pd.read_csv
function with the compression='zip'
parameter added.
# Load in county level AQI data from 2017 and 2018
= pd.concat([pd.read_csv('https://aqs.epa.gov/aqsweb/airdata/daily_aqi_by_county_2017.zip',
aqi = 'zip'),
compression 'https://aqs.epa.gov/aqsweb/airdata/daily_aqi_by_county_2018.zip',
pd.read_csv(= 'zip')]) compression
Prepare data
Next, I will tidy the data frame by converting the column names to lower snake case, subsetting to Santa Barbara county, and changing the data type of the date column to datetime.
# Convert column names to lower snake case
= (aqi.columns
aqi.columns str.lower()
.str.replace(' ','_'))
.
# Make new data frame containing only AQI data for Santa Barbara County
= ((aqi[aqi['county_name'] == "Santa Barbara"])
aqi_sb = ['state_name', 'county_name', 'state_code', 'county_code'])
.drop(columns
)
# Convert date column to datetime object and set as the index
= pd.to_datetime(aqi_sb.date)
aqi_sb.date = aqi_sb.set_index('date') aqi_sb
Plot daily AQI against the 5-day average AQI from 2017-2018
Rolling averages make it easy to identify short-term trends by smoothing out daily fluctuations. pandas
makes calculating rolling averages very simple with the pandas.DataFrame.rolling()
method. Since I already have a datetime column in my dataframe that includes day, I can pass the argument window = '5D'
to .rolling()
to specify a 5-day window, that I can then chain mean()
to get the 5-day average!
# Add new column containing a rolling 5-day AQI average
'five_day_average'] = aqi_sb['aqi'].rolling(window = '5D').mean() aqi_sb[
Now, I am ready to create the AQI plot.
Show code for the AQI plot
# Plot daily AQI against 5-day average ----
# Initialize figure
= plt.subplots(figsize=(9,5))
fig, ax
# Add daily and 5-day average AQI
=ax, color = 'firebrick', zorder = 3)
aqi_sb.five_day_average.plot(ax=ax, color = 'cornflowerblue', zorder = 2)
aqi_sb.aqi.plot(ax
# Add AQI labels for unhealthy levels
150, 151.5, facecolor = "dimgrey", alpha = 0.8)
ax.axhspan(= pd.to_datetime('2017-01-10'), y = 155, s = 'Unhealthy', color = 'dimgrey')
ax.text(x 200, 202, facecolor = "dimgrey", alpha = 0.8)
ax.axhspan(= pd.to_datetime('2017-01-10'), y = 205, s = 'Very Unhealthy', color = 'dimgrey')
ax.text(x 300, 301, facecolor = "dimgrey", alpha = 0.8)
ax.axhspan(= pd.to_datetime('2017-01-10'), y = 305, s = 'Hazardous', color = 'dimgrey')
ax.text(x
# Update axis labels and title
'Date')
plt.xlabel('AQI')
plt.ylabel('AQI during the 2017 Thomas Fire in Santa Barbara County')
plt.title(
# Add legend
= ['5-day average AQI', 'Daily AQI'])
ax.legend(labels
# Add label indicating the Thomas Fire
= pd.to_datetime('2017-12-01'), color = 'dimgrey', linestyle = 'dashed')
ax.axvline(x = pd.to_datetime('2017-08-25'), y = 255, s = 'Thomas Fire', color = 'dimgrey')
ax.text(x
# Update grid lines
= 'y', linewidth = 0.2)
ax.grid(axis
plt.show()
Conclusion
For this analysis, I bring my newly aquired skills working with tabular and spatial data in Python together to visualize the impact of the 2017 Thomas Fire on Santa Barbara county. Using false color imagery, I highlight the burn scar. Then, I use daily AQI data to create a plot that accompanies the burn scar map.