Regional activity time series visualizations

This example shows how to create visualizations of iNaturalist activity over time in a given region. See https://www.inaturalist.org/places to find place IDs.

Visualizations are made using Altair, with the following metrics:
* Number of observations
* Number of taxa observed
* Number of observers
* Number of identifiers

[1]:
from datetime import datetime
from time import sleep

from dateutil.relativedelta import relativedelta
from IPython.display import Image
from typing import Any, BinaryIO, Dict, Iterable, List, Optional, Tuple

import altair as alt
import pandas as pd

from pyinaturalist.node_api import (
    get_observations,
    get_observation_histogram,
    get_observation_species_counts,
    get_observation_observers,
    get_observation_identifiers,
)
from pyinaturalist.request_params import ICONIC_TAXA, get_interval_ranges

# Adjustable values
PLACE_ID = 6
PLACE_NAME = 'Alaska'
YEAR = 2020

THROTTLING_DELAY = 1.0  # Time to wait in between subsequent requests

Observations per year

[2]:
observations_by_year = get_observation_histogram(
    place_id=PLACE_ID,
    interval='year',
    d1='2008-01-01',
    d2=f'{YEAR}-12-31',
    verifiable=True,
)
observations_by_year = pd.DataFrame([
    {'date': k, 'observations': v}
    for k, v in observations_by_year.items()
])

# Including the rendered image so the chart will display outside Jupyter, e.g. on GitHub's notebook viewer
Image('images/observations_by_year.png')
alt.Chart(observations_by_year).mark_bar().encode(x='year(date):T', y='observations:Q')
[2]:

Observations per month

[3]:
observations_by_month = get_observation_histogram(
    place_id=PLACE_ID,
    interval='month',
    d1=f'{YEAR}-01-01',
    d2=f'{YEAR}-12-31',
    verifiable=True,
)
observations_by_month = pd.DataFrame([
    {'metric': 'Observations', 'date': k, 'count': v}
    for k, v in observations_by_month.items()
])
Image('images/observations_by_month.png')
alt.Chart(observations_by_month).mark_bar().encode(x='month(date):T', y='count:Q')
[3]:

Histograms with custom metrics

The API does not have a histogram endpoint for taxa observed, observers, or identifiers, so we first need to determine our date ranges of interest, and then run one search per date range.

Here are a couple of helper functions to make this easier:

[4]:
def count_date_range_results(function, start_date, end_date):
    """Get the count of results for the given date range and search function"""
    # Running this search with per_page=0 will (quickly) return only a count of results, not complete results
    response = function(
        place_id=PLACE_ID,
        d1=start_date,
        d2=end_date,
        verifiable=True,
        per_page=0,
    )
    total_results = response['total_results']
    print(f'Total results for {start_date.strftime("%b")}: {total_results}')

    # Throttle between requests; no need to wait after the last month of the year
    if start_date.month != 12:
        sleep(THROTTLING_DELAY)
    return total_results


def get_monthly_counts(function, label):
    """Get the count of results per month for the given search function"""
    month_ranges = get_interval_ranges(datetime(YEAR, 1, 1), datetime(YEAR, 12, 31), 'monthly')
    counts_by_month = {
        start_date: count_date_range_results(function, start_date, end_date)
        for (start_date, end_date) in month_ranges
    }
    return pd.DataFrame([{'metric': label, 'date': k, 'count': v} for k, v in counts_by_month.items()])
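
The "one search per date range" pattern used by these helpers can be tried out without any network access. The following is a minimal sketch, not the library's implementation: `monthly_ranges` is a hypothetical stand-in that builds one `(start, end)` pair per calendar month (the shape the helper above unpacks from `get_interval_ranges`), and `fake_search` is a hypothetical search function that only mimics a response containing `total_results`.

```python
import calendar
from datetime import datetime


def monthly_ranges(year):
    """Return a (first_day, last_day) datetime pair for each month of `year`.

    Hypothetical helper for illustration only.
    """
    ranges = []
    for month in range(1, 13):
        # monthrange returns (weekday of the 1st, number of days in the month)
        last_day = calendar.monthrange(year, month)[1]
        ranges.append((datetime(year, month, 1), datetime(year, month, last_day)))
    return ranges


def fake_search(d1, d2, per_page=0, **params):
    """Hypothetical stand-in for an API search function.

    Instead of querying anything, it reports the number of days in the range
    as its 'total_results', and returns no result rows (as with per_page=0).
    """
    return {'total_results': (d2 - d1).days + 1, 'results': []}


# One "search" per month, keyed by the start date of each range
counts_by_month = {
    start: fake_search(d1=start, d2=end, per_page=0)['total_results']
    for start, end in monthly_ranges(2020)
}
```

Swapping `fake_search` for a real search function is what turns this into one request per month, which is why the real helper pauses between calls.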

Unique taxa observed per month

[5]:
taxa_by_month = get_monthly_counts(get_observation_species_counts, 'Taxa')
Image('images/taxa_by_month.png')
alt.Chart(taxa_by_month).mark_bar().encode(x='month(date):T', y='count:Q')
Total results for Jan: 184
Total results for Feb: 176
Total results for Mar: 318
Total results for Apr: 790
Total results for May: 1334
Total results for Jun: 1504
Total results for Jul: 1684
Total results for Aug: 1570
Total results for Sep: 1250
Total results for Oct: 639
Total results for Nov: 408
Total results for Dec: 550
[5]:

Observers per month

[6]:
observers_by_month = get_monthly_counts(get_observation_observers, 'Observers')
Image('images/observers_by_month.png')
alt.Chart(observers_by_month).mark_bar().encode(x='month(date):T', y='count:Q')
Total results for Jan: 36
Total results for Feb: 42
Total results for Mar: 71
Total results for Apr: 141
Total results for May: 361
Total results for Jun: 458
Total results for Jul: 530
Total results for Aug: 563
Total results for Sep: 404
Total results for Oct: 174
Total results for Nov: 86
Total results for Dec: 51
[6]:

Identifiers per month

[7]:
identifiers_by_month = get_monthly_counts(get_observation_identifiers, 'Identifiers')
Image('images/identifiers_by_month.png')
alt.Chart(identifiers_by_month).mark_bar().encode(x='month(date):T', y='count:Q')
Total results for Jan: 135
Total results for Feb: 152
Total results for Mar: 187
Total results for Apr: 349
Total results for May: 619
Total results for Jun: 602
Total results for Jul: 662
Total results for Aug: 616
Total results for Sep: 492
Total results for Oct: 314
Total results for Nov: 219
Total results for Dec: 208
[7]:

Combine all monthly metrics into one plot

[8]:
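# DataFrame.append() was removed in pandas 2.0; pd.concat() is the supported replacement
combined_results = pd.concat(
    [observations_by_month, taxa_by_month, observers_by_month, identifiers_by_month],
    ignore_index=True,
)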

Image('images/combined_activity_stats.png')
alt.Chart(
    combined_results,
    title=f'iNaturalist activity in {PLACE_NAME} ({YEAR})',
    width=750,
    height=500,
).mark_line().encode(
    alt.X('month(date):T', axis=alt.Axis(title="Month")),
    alt.Y('count:Q', axis=alt.Axis(title="Count")),
    color='metric',
    strokeDash='metric',
).configure_axis(
    labelFontSize=15,
    titleFontSize=20,
)
[8]:
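
The combined chart works because every monthly table shares the same long-format columns (`metric`, `date`, `count`), so stacking the rows is enough for `color='metric'` to split them into one line per metric. The sketch below illustrates that idea with plain Python records instead of DataFrames. It reuses the January and February counts printed above for taxa and observers, and the grouping step is only an assumed approximation of what the chart encoding does.

```python
from collections import defaultdict
from itertools import chain

# Two small long-format tables, using the Jan/Feb counts printed earlier
taxa = [
    {'metric': 'Taxa', 'date': '2020-01-01', 'count': 184},
    {'metric': 'Taxa', 'date': '2020-02-01', 'count': 176},
]
observers = [
    {'metric': 'Observers', 'date': '2020-01-01', 'count': 36},
    {'metric': 'Observers', 'date': '2020-02-01', 'count': 42},
]

# Stacking rows is all that "combining" means for long-format data
combined = list(chain(taxa, observers))

# Grouping by the 'metric' column recovers one series per metric,
# roughly what the color/strokeDash encodings do when drawing lines
series = defaultdict(list)
for row in combined:
    series[row['metric']].append((row['date'], row['count']))
```

Note that `observations_by_year` uses a different column layout (`date`, `observations`), so it could not be stacked with the monthly tables without renaming its columns first.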