Tutorial 3: Data Visualizations¶
This notebook will show you a few basic visualizations you can make with your own observation data.
We’ll do this with Pandas and Altair. Don’t worry if you’re not familiar with those tools; the goal is just to demonstrate the kinds of things you can do with your data.
import altair as alt
import pandas as pd
from pyinaturalist import iNatClient, pprint
# enable_logging()
client = iNatClient()
Observation data¶
We’ll start with all of your own observation data:
# Replace with your own username
USERNAME = 'jkcook'
my_observations = client.observations.search(user_id=USERNAME).all()
pprint(my_observations[:5])
ID        Taxon ID  Taxon                                          Observed on   User    Location
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
30688807  1415100   Cleomella serrulata (Rocky Mountain Beeplant)  Aug 12, 2019  jkcook  Johnston, IA, USA
30688955  47912     Asclepias tuberosa (Butterfly Milkweed)        Aug 12, 2019  jkcook  Johnston, IA, USA
30689111  60251     Verbena hastata (Blue Vervain)                 Aug 12, 2019  jkcook  Johnston, IA, USA
30689221  121968    Andropogon gerardi (Big Bluestem)              Aug 12, 2019  jkcook  Johnston, IA, USA
30689306  121968    Andropogon gerardi (Big Bluestem)              Aug 12, 2019  jkcook  Johnston, IA, USA
Basic histogram¶
Next, let’s make a simple histogram to show your observations over time.
Start by putting your observations into a DataFrame to make them easier to work with:
# Skip any observations that have no observed date
source = pd.DataFrame(
    [{'date': o.observed_on.isoformat()} for o in my_observations if o.observed_on]
)
And then display it as a bar chart:
(
    alt.Chart(source)
    .mark_bar()
    .properties(width=700, height=500)
    .encode(
        x='yearmonth(date):T',
        y=alt.Y(
            'count()',
            scale=alt.Scale(type='log'),
            axis=alt.Axis(title='Number of observations'),
        ),
    )
)
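To make the `yearmonth(date)` and `count()` encodings more concrete, here is a rough plain-Python equivalent of the aggregation the chart performs. This is only a sketch: the sample dates are made up, and `count_by_month` is a hypothetical helper, not part of Altair or pyinaturalist.

```python
from collections import Counter
from datetime import date


def count_by_month(iso_dates):
    """Count ISO-formatted dates per (year, month), similar to yearmonth() + count()."""
    counts = Counter()
    for iso in iso_dates:
        d = date.fromisoformat(iso[:10])  # keep only the YYYY-MM-DD part
        counts[(d.year, d.month)] += 1
    return dict(sorted(counts.items()))


# Made-up sample dates, for illustration only
sample_dates = ['2019-08-12', '2019-08-12T10:30:00', '2019-09-01', '2020-01-15']
monthly_counts = count_by_month(sample_dates)
print(monthly_counts)
```

Each key corresponds to one bar on the x-axis, and each value to that bar's height.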
Histogram by iconic taxon¶
To show a bit more information, let’s break down the observations by category (iconic taxon):
source = pd.DataFrame(
    [
        {'date': o.observed_on.isoformat(), 'iconic_taxon': o.taxon.iconic_taxon_name}
        for o in my_observations
        if o.observed_on  # skip observations that have no observed date
    ]
)
(
    alt.Chart(source)
    .mark_bar()
    .properties(width=700, height=500)
    .encode(
        x='yearmonth(date):T',
        y=alt.Y(
            'count()',
            scale=alt.Scale(type='symlog'),
            axis=alt.Axis(title='Number of observations'),
        ),
        color='iconic_taxon',
    )
)
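Note that this chart uses a `symlog` scale where the previous one used `log`. A likely reason is that stacked bars start from a zero baseline, and a plain log scale cannot represent zero. The sketch below shows the idea, assuming the usual symmetric log transform sign(x) * log1p(|x| / c) with c = 1; the `symlog` function here is a hypothetical stand-in, not Altair's own implementation.

```python
import math


def symlog(x, constant=1.0):
    """Symmetric log transform: sign(x) * log1p(|x| / constant)."""
    return math.copysign(math.log1p(abs(x) / constant), x)


# Zero maps to zero, and larger counts are compressed as on a log scale
for count in [0, 1, 10, 100]:
    print(count, round(symlog(count), 3))
```

Unlike `math.log(0)`, which raises an error, `symlog(0)` is well defined, so months with small counts in one category still stack cleanly.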
Observation map¶
Next, we can show the observations on a map. Note: This example only shows observations in the United States.
First, get the coordinates for all your observations, skipping any that are missing location info:
source = pd.DataFrame(
    [
        {
            'latitude': o.location[0],
            'longitude': o.location[1],
            'iconic_taxon': o.taxon.iconic_taxon_name,
        }
        for o in my_observations
        if o.location
    ]
)
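Since the map in this example only covers the United States, you may want to drop coordinates elsewhere before plotting. Here is a minimal sketch, assuming a rough bounding box for the contiguous US; the bounds are approximate values chosen for illustration, `in_contiguous_us` is a hypothetical helper, and the box ignores Alaska and Hawaii.

```python
# Approximate bounding box for the contiguous US (assumed values, not exact borders)
LAT_RANGE = (24.5, 49.5)
LON_RANGE = (-125.0, -66.5)


def in_contiguous_us(latitude, longitude):
    """Return True if a point falls inside the rough bounding box above."""
    return (
        LAT_RANGE[0] <= latitude <= LAT_RANGE[1]
        and LON_RANGE[0] <= longitude <= LON_RANGE[1]
    )


# Made-up sample points as (latitude, longitude), for illustration only
sample_points = [(41.67, -93.70), (51.51, -0.13), (-33.87, 151.21)]
us_points = [p for p in sample_points if in_contiguous_us(*p)]
print(us_points)
```

The same check could be added as an extra condition next to `if o.location` when building the DataFrame.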
Then add the base layer. This example uses the us_10m dataset from vega-datasets:
from vega_datasets import data
states = alt.topo_feature(data.us_10m.url, feature='states')
background = (
    alt.Chart(states)
    .mark_geoshape(fill='lightgray', stroke='white')
    .properties(width=850, height=500)
    .project('albersUsa')
)
And finally, add your observation locations:
points = (
    alt.Chart(source)
    .mark_circle()
    .encode(
        longitude='longitude:Q',
        latitude='latitude:Q',
    )
)
# Show the combined background + points
background + points