{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Pandas"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[![colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/davemlz/spyndex/blob/main/docs/tutorials/pandas.ipynb)\n",
"![level3](https://raw.githubusercontent.com/davemlz/spyndex/main/docs/_static/level3.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After passing levels 1 and 2, you are ready to start this: Level 3 - `spyndex + pandas`!\n",
"\n",
"Remember to install `spyndex`!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install -U spyndex"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, let's start!\n",
"\n",
"First, import `spyndex` and `pandas`:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"import spyndex\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## `pandas.Series`\n",
"\n",
"We have all worked with `pandas`. Well, `spyndex` also works with `pandas` so you can continue using it! :)\n",
"\n",
"Let's use a `pandas.DataFrame` that is stored in the `spyndex` datasets: `spectral`:"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [],
"source": [
"df = spyndex.datasets.open(\"spectral\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each column of this dataset is the Surface Reflectance from Landsat 8 for 3 different classes. The samples were taken over Oporto:"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" SR_B1 | \n",
" SR_B2 | \n",
" SR_B3 | \n",
" SR_B4 | \n",
" SR_B5 | \n",
" SR_B6 | \n",
" SR_B7 | \n",
" ST_B10 | \n",
" class | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 0.089850 | \n",
" 0.100795 | \n",
" 0.132227 | \n",
" 0.165764 | \n",
" 0.269054 | \n",
" 0.306206 | \n",
" 0.251949 | \n",
" 297.328396 | \n",
" Urban | \n",
"
\n",
" \n",
" 1 | \n",
" 0.073859 | \n",
" 0.086990 | \n",
" 0.124404 | \n",
" 0.160979 | \n",
" 0.281264 | \n",
" 0.267596 | \n",
" 0.217917 | \n",
" 297.107934 | \n",
" Urban | \n",
"
\n",
" \n",
" 2 | \n",
" 0.072938 | \n",
" 0.086028 | \n",
" 0.120994 | \n",
" 0.140203 | \n",
" 0.284220 | \n",
" 0.258384 | \n",
" 0.200098 | \n",
" 297.436064 | \n",
" Urban | \n",
"
\n",
" \n",
" 3 | \n",
" 0.087733 | \n",
" 0.103916 | \n",
" 0.135981 | \n",
" 0.163976 | \n",
" 0.254479 | \n",
" 0.259580 | \n",
" 0.216735 | \n",
" 297.203638 | \n",
" Urban | \n",
"
\n",
" \n",
" 4 | \n",
" 0.090593 | \n",
" 0.109306 | \n",
" 0.150350 | \n",
" 0.181260 | \n",
" 0.269535 | \n",
" 0.273234 | \n",
" 0.219554 | \n",
" 297.097680 | \n",
" Urban | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" SR_B1 SR_B2 SR_B3 SR_B4 SR_B5 SR_B6 SR_B7 \\\n",
"0 0.089850 0.100795 0.132227 0.165764 0.269054 0.306206 0.251949 \n",
"1 0.073859 0.086990 0.124404 0.160979 0.281264 0.267596 0.217917 \n",
"2 0.072938 0.086028 0.120994 0.140203 0.284220 0.258384 0.200098 \n",
"3 0.087733 0.103916 0.135981 0.163976 0.254479 0.259580 0.216735 \n",
"4 0.090593 0.109306 0.150350 0.181260 0.269535 0.273234 0.219554 \n",
"\n",
" ST_B10 class \n",
"0 297.328396 Urban \n",
"1 297.107934 Urban \n",
"2 297.436064 Urban \n",
"3 297.203638 Urban \n",
"4 297.097680 Urban "
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here you can see the classes stored in the `class` column:"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array(['Urban', 'Water', 'Vegetation'], dtype=object)"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[\"class\"].unique()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each column of the data frame is a `pandas.Series` data type:"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"pandas.core.series.Series"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"type(df[\"SR_B2\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Well, we can use that to compute Spectral Indices with `spyndex`!\n",
"\n",
"Since we have `vegetation`, `water` and `urban` classes, let's compute 3 different indices, each one highlighting an specific class: `NDVI`, `NDWI` and `NDBI`:"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"NDVI: Normalized Difference Vegetation Index (attributes = ['bands', 'contributor', 'date_of_addition', 'formula', 'long_name', 'reference', 'short_name', 'type'])"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spyndex.indices.NDVI"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"NDWI: Normalized Difference Water Index (attributes = ['bands', 'contributor', 'date_of_addition', 'formula', 'long_name', 'reference', 'short_name', 'type'])"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spyndex.indices.NDWI"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"NDBI: Normalized Difference Built-Up Index (attributes = ['bands', 'contributor', 'date_of_addition', 'formula', 'long_name', 'reference', 'short_name', 'type'])"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spyndex.indices.NDBI"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What bands do we need?"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('N', 'R')"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spyndex.indices.NDVI.bands"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('G', 'N')"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spyndex.indices.NDWI.bands"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('S1', 'N')"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spyndex.indices.NDBI.bands"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Green, Red, NIR and SWIR1 bands... easy!"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"parameters = {\n",
" \"G\": df[\"SR_B3\"],\n",
" \"R\": df[\"SR_B4\"],\n",
" \"N\": df[\"SR_B5\"],\n",
" \"S1\": df[\"SR_B6\"],\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With our `dict` of parameters ready we can compute the indices!"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"idx = spyndex.computeIndex([\"NDVI\",\"NDWI\",\"NDBI\"],parameters)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And, what's the data type of the result?"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"idx type: \n"
]
}
],
"source": [
"print(f\"idx type: {type(idx)}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That's right! A `pandas.DataFrame`! Why? Because each computed spectral index is now a column (`pandas.Series`) of a new dataframe:"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" NDVI | \n",
" NDWI | \n",
" NDBI | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 0.237548 | \n",
" -0.340973 | \n",
" 0.064584 | \n",
"
\n",
" \n",
" 1 | \n",
" 0.271989 | \n",
" -0.386671 | \n",
" -0.024902 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.339326 | \n",
" -0.402815 | \n",
" -0.047615 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.216278 | \n",
" -0.303482 | \n",
" 0.009923 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.195821 | \n",
" -0.283852 | \n",
" 0.006815 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" NDVI NDWI NDBI\n",
"0 0.237548 -0.340973 0.064584\n",
"1 0.271989 -0.386671 -0.024902\n",
"2 0.339326 -0.402815 -0.047615\n",
"3 0.216278 -0.303482 0.009923\n",
"4 0.195821 -0.283852 0.006815"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"idx.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you want them diectly on the original dataframe as new columns, you just have to play with the code a little bit!"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [],
"source": [
"indicesToCompute = [\"NDVI\",\"NDWI\",\"NDBI\"]\n",
"df[indicesToCompute] = spyndex.computeIndex(indicesToCompute,parameters)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, if you check you original dataframe, you should have the new indices there!"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" SR_B1 | \n",
" SR_B2 | \n",
" SR_B3 | \n",
" SR_B4 | \n",
" SR_B5 | \n",
" SR_B6 | \n",
" SR_B7 | \n",
" ST_B10 | \n",
" class | \n",
" NDVI | \n",
" NDWI | \n",
" NDBI | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 0.089850 | \n",
" 0.100795 | \n",
" 0.132227 | \n",
" 0.165764 | \n",
" 0.269054 | \n",
" 0.306206 | \n",
" 0.251949 | \n",
" 297.328396 | \n",
" Urban | \n",
" 0.237548 | \n",
" -0.340973 | \n",
" 0.064584 | \n",
"
\n",
" \n",
" 1 | \n",
" 0.073859 | \n",
" 0.086990 | \n",
" 0.124404 | \n",
" 0.160979 | \n",
" 0.281264 | \n",
" 0.267596 | \n",
" 0.217917 | \n",
" 297.107934 | \n",
" Urban | \n",
" 0.271989 | \n",
" -0.386671 | \n",
" -0.024902 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.072938 | \n",
" 0.086028 | \n",
" 0.120994 | \n",
" 0.140203 | \n",
" 0.284220 | \n",
" 0.258384 | \n",
" 0.200098 | \n",
" 297.436064 | \n",
" Urban | \n",
" 0.339326 | \n",
" -0.402815 | \n",
" -0.047615 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.087733 | \n",
" 0.103916 | \n",
" 0.135981 | \n",
" 0.163976 | \n",
" 0.254479 | \n",
" 0.259580 | \n",
" 0.216735 | \n",
" 297.203638 | \n",
" Urban | \n",
" 0.216278 | \n",
" -0.303482 | \n",
" 0.009923 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.090593 | \n",
" 0.109306 | \n",
" 0.150350 | \n",
" 0.181260 | \n",
" 0.269535 | \n",
" 0.273234 | \n",
" 0.219554 | \n",
" 297.097680 | \n",
" Urban | \n",
" 0.195821 | \n",
" -0.283852 | \n",
" 0.006815 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" SR_B1 SR_B2 SR_B3 SR_B4 SR_B5 SR_B6 SR_B7 \\\n",
"0 0.089850 0.100795 0.132227 0.165764 0.269054 0.306206 0.251949 \n",
"1 0.073859 0.086990 0.124404 0.160979 0.281264 0.267596 0.217917 \n",
"2 0.072938 0.086028 0.120994 0.140203 0.284220 0.258384 0.200098 \n",
"3 0.087733 0.103916 0.135981 0.163976 0.254479 0.259580 0.216735 \n",
"4 0.090593 0.109306 0.150350 0.181260 0.269535 0.273234 0.219554 \n",
"\n",
" ST_B10 class NDVI NDWI NDBI \n",
"0 297.328396 Urban 0.237548 -0.340973 0.064584 \n",
"1 297.107934 Urban 0.271989 -0.386671 -0.024902 \n",
"2 297.436064 Urban 0.339326 -0.402815 -0.047615 \n",
"3 297.203638 Urban 0.216278 -0.303482 0.009923 \n",
"4 297.097680 Urban 0.195821 -0.283852 0.006815 "
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Beautiful! Right?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, just for the sake of life, let's make some visualizations!"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [],
"source": [
"import seaborn as sns\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define some colors for each one of the classes:"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
"colors = [\"#E33F62\",\"#3FDDE3\",\"#4CBA4B\"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, let's create a gorgeous pair grid!"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"