Tutorial 3: Data Visualizations#
This notebook will show you a few basic visualizations you can make with your own observation data.
We’ll do this with Pandas and Altair. Don’t worry if you’re not familiar with those tools; the goal is just to demonstrate the kinds of things you can do with your data.
[1]:
from datetime import datetime, timedelta
import altair as alt
import pandas as pd
from dateutil.relativedelta import relativedelta
from pyinaturalist import Observation, enable_logging, get_observations, pprint
from rich import print
enable_logging()
Observation data#
We’ll start with all of your own observation data from the last 3 years:
[2]:
# Replace with your own username
USERNAME = 'jkcook'
start_date = datetime.now() - timedelta(365 * 3)
response = get_observations(user_id=USERNAME, d1=start_date, page='all')
my_observations = Observation.from_json_list(response)
pprint(my_observations[:10])
ID        Taxon ID  Taxon                                                       Observed on   User    Location
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
30688807  78444     🌱 Species: Peritoma serrulata (Rocky Mountain beeplant)    Aug 12, 2019  jkcook  Johnston, IA, USA
30688955  47912     🌱 Species: Asclepias tuberosa (butterfly milkweed)         Aug 12, 2019  jkcook  Johnston, IA, USA
30689111  60251     🌱 Species: Verbena hastata (blue vervain)                  Aug 12, 2019  jkcook  Johnston, IA, USA
30689221  121968    🌽 Species: Andropogon gerardi (big bluestem)               Aug 12, 2019  jkcook  Johnston, IA, USA
30689306  121968    🌽 Species: Andropogon gerardi (big bluestem)               Aug 12, 2019  jkcook  Johnston, IA, USA
30689425  128701    🌱 Species: Desmanthus illinoensis (Illinois bundleflower)  Aug 12, 2019  jkcook  Johnston, IA, USA
30689463  121976    🌻 Species: Silphium laciniatum (compass plant)             Aug 12, 2019  jkcook  Johnston, IA, USA
30689506  136376    🌻 Species: Rudbeckia triloba (Brown-eyed Susan)            Aug 12, 2019  jkcook  Johnston, IA, USA
30689603  121976    🌻 Species: Silphium laciniatum (compass plant)             Aug 12, 2019  jkcook  Johnston, IA, USA
30689780  81594     🌾 Species: Elymus hystrix (bottlebrush grass)              Aug 12, 2019  jkcook  Johnston, IA, USA
Basic histogram#
Next, let’s make a simple histogram to show your observations over time.
Start by putting your observations into a DataFrame to make them easier to work with:
[3]:
source = pd.DataFrame([{'date': o.observed_on.isoformat()} for o in my_observations])
And then display it as a bar chart:
[4]:
(
    alt.Chart(source)
    .mark_bar()
    .properties(width=700, height=500)
    .encode(
        x='yearmonth(date):T',
        y=alt.Y(
            'count()',
            scale=alt.Scale(type='log'),
            axis=alt.Axis(title='Number of observations'),
        ),
    )
)
alt.Chart(...)
[4]:
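To make the aggregation behind this chart more concrete, here is a minimal standard-library sketch of what the `yearmonth(date)` and `count()` encodings roughly compute: one bar per calendar month, with its height equal to the number of observations in that month. The `sample_dates` list and the `count_by_month` helper are hypothetical names used only for illustration; they are not part of pyinaturalist or Altair.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample timestamps standing in for each observation's observed_on value
sample_dates = [
    datetime(2019, 8, 12, 9, 30),
    datetime(2019, 8, 12, 10, 5),
    datetime(2019, 8, 30, 17, 45),
    datetime(2019, 9, 2, 8, 15),
    datetime(2020, 8, 1, 12, 0),
]


def count_by_month(dates):
    """Count timestamps per year and month, keyed as 'YYYY-MM' and sorted by month."""
    counts = Counter(d.strftime('%Y-%m') for d in dates)
    return dict(sorted(counts.items()))


monthly_counts = count_by_month(sample_dates)
for month, n in monthly_counts.items():
    print(f'{month}: {n}')
```

Note that August 2019 and August 2020 stay in separate bins, since the month is grouped together with its year.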
Histogram by iconic taxon#
To show a bit more information, let’s break down the observations by category (iconic taxon):
[5]:
source = pd.DataFrame(
    [
        {'date': o.observed_on.isoformat(), 'iconic_taxon': o.taxon.iconic_taxon_name}
        for o in my_observations
    ]
)

(
    alt.Chart(source)
    .mark_bar()
    .properties(width=700, height=500)
    .encode(
        x='yearmonth(date):T',
        y=alt.Y(
            'count()',
            scale=alt.Scale(type='symlog'),
            axis=alt.Axis(title='Number of observations'),
        ),
        color='iconic_taxon',
    )
)
alt.Chart(...)
[5]:
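Adding `color='iconic_taxon'` splits each monthly bar into one segment per category. As a rough illustration of that grouping, the sketch below counts observations per month and per iconic taxon using only the standard library. The sample data and the `count_by_month_and_taxon` helper are assumptions made up for this example.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical (observed_on, iconic_taxon_name) pairs for illustration
sample_observations = [
    (datetime(2019, 8, 12), 'Plantae'),
    (datetime(2019, 8, 12), 'Plantae'),
    (datetime(2019, 8, 20), 'Insecta'),
    (datetime(2019, 9, 3), 'Aves'),
    (datetime(2019, 9, 5), 'Plantae'),
]


def count_by_month_and_taxon(observations):
    """Return {'YYYY-MM': {iconic_taxon: count}}, with months in sorted order."""
    counts = defaultdict(Counter)
    for observed_on, iconic_taxon in observations:
        counts[observed_on.strftime('%Y-%m')][iconic_taxon] += 1
    return {month: dict(taxa) for month, taxa in sorted(counts.items())}


stacked_counts = count_by_month_and_taxon(sample_observations)
for month, taxa in stacked_counts.items():
    print(month, taxa)
```

Each inner dictionary corresponds to one bar in the chart, and each of its entries to one colored segment of that bar.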
Observation map#
Next, we can show the observations on a map. Note: This example only shows observations in the United States.
First, get the coordinates for all your observations, skipping any that are missing location info:
[6]:
source = pd.DataFrame(
    [
        {
            'latitude': o.location[0],
            'longitude': o.location[1],
            'iconic_taxon': o.taxon.iconic_taxon_name,
        }
        for o in my_observations
        if o.location
    ]
)
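Since this map only covers the United States, you may want to drop coordinates that fall well outside it before plotting. The sketch below is one simple way to do that, assuming an approximate bounding box for the contiguous states. The box values and the `in_contiguous_us` helper are illustrative assumptions, not part of pyinaturalist or Altair; the box is deliberately coarse, so it also includes parts of Canada and Mexico and leaves out Alaska and Hawaii.

```python
# Approximate bounding box for the contiguous United States.
# These values are an assumption for illustration, not an exact border.
MIN_LAT, MAX_LAT = 24.5, 49.5
MIN_LON, MAX_LON = -125.0, -66.5


def in_contiguous_us(latitude, longitude):
    """Return True if the point lies inside the approximate bounding box."""
    return MIN_LAT <= latitude <= MAX_LAT and MIN_LON <= longitude <= MAX_LON


# Hypothetical sample rows with the same keys as the DataFrame records
sample_points = [
    {'latitude': 41.67, 'longitude': -93.70},   # near Johnston, IA
    {'latitude': 51.51, 'longitude': -0.13},    # London, outside the box
    {'latitude': 34.05, 'longitude': -118.24},  # Los Angeles
]

us_points = [
    p for p in sample_points
    if in_contiguous_us(p['latitude'], p['longitude'])
]
print(us_points)
```

The same condition could be added to the list comprehension that builds the DataFrame, next to the existing `if o.location` check.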
Then add the base layer. This example uses the us_10m dataset from vega-datasets:
[7]:
from vega_datasets import data
states = alt.topo_feature(data.us_10m.url, feature='states')
background = (
    alt.Chart(states)
    .mark_geoshape(fill='lightgray', stroke='white')
    .properties(width=850, height=500)
    .project('albersUsa')
)
And finally, add your observation locations:
[8]:
points = (
    alt.Chart(source)
    .mark_circle()
    .encode(
        longitude='longitude:Q',
        latitude='latitude:Q',
    )
)
# Show the combined background + points
background + points
alt.LayerChart(...)
[8]: