Advanced Usage
This page describes some advanced features of pyinaturalist.
Authentication
See Authentication for details on using authenticated endpoints.
Pagination
Most endpoints support pagination, using the parameters:
page: Page number to get
per_page: Number of results to get per page
count_only=True: This is just a shortcut for per_page=0, which will return only the total number of results, not the results themselves.
The default and maximum per_page values vary by endpoint, but it’s 200 for most endpoints.
To get all pages of results and combine them into a single response, use page='all'.
Note that this replaces the get_all_*() functions from pyinaturalist<=0.12.
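Conceptually, page='all' amounts to requesting successive pages until every result has been collected, then merging them. The following is a minimal sketch of that idea, not pyinaturalist's actual implementation. The names fetch_page, fetch_all_pages, and FAKE_RECORDS are hypothetical, and the fake data exists only so the example runs offline.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a paginated endpoint: 450 fake records
FAKE_RECORDS = [{'id': i} for i in range(450)]


def fetch_page(page: int, per_page: int = 200) -> Dict:
    """Return one page of fake results in a response-like dict."""
    start = (page - 1) * per_page
    return {
        'results': FAKE_RECORDS[start:start + per_page],
        'total_results': len(FAKE_RECORDS),
        'page': page,
        'per_page': per_page,
    }


def fetch_all_pages(fetch: Callable[..., Dict], per_page: int = 200) -> Dict:
    """Request pages until all results are collected, then combine them."""
    results: List[Dict] = []
    page = 1
    while True:
        response = fetch(page=page, per_page=per_page)
        results.extend(response['results'])
        # Stop once everything is collected, or if a page comes back empty
        if not response['results'] or len(results) >= response['total_results']:
            break
        page += 1
    return {'results': results, 'total_results': len(results)}


combined = fetch_all_pages(fetch_page)
print(len(combined['results']))
```

With 450 records and 200 results per page, the loop makes three requests before returning the combined response.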
Sessions
If you want more control over how requests are sent, you can provide your own ClientSession object using the session argument for any API request function:
>>> from pyinaturalist import ClientSession
>>> session = ClientSession(...)
>>> request_function(..., session=session)
Caching
All API requests are cached by default. These expire in 30 minutes for most endpoints, and 1 day for some infrequently-changing data (like taxa and places). See requests-cache: Expiration for details on cache expiration behavior.
You can change this behavior using ClientSession. For example, to keep cached requests for 5 days:
>>> from datetime import timedelta
>>> from pyinaturalist import ClientSession, get_taxa
>>> session = ClientSession(expire_after=timedelta(days=5))
>>> get_taxa(q='warbler', locale=1, session=session)
To store the cache somewhere other than the default cache directory:
>>> session = ClientSession(cache_file='~/data/api_requests.db')
To manually clear the cache:
>>> session.cache.clear()
Or as a shortcut, without a session object:
>>> from pyinaturalist import clear_cache
>>> clear_cache()
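To illustrate the default expiration described at the start of this section (30 minutes for most endpoints, 1 day for taxa and places), here is a rough sketch of how a per-endpoint expiration lookup could work. This is not pyinaturalist's or requests-cache's actual code; the URL patterns and the get_expiration helper are assumptions made for illustration.

```python
from datetime import timedelta
from fnmatch import fnmatch

# Hypothetical URL patterns; the durations come from the text above.
# More specific patterns are listed first, with a catch-all at the end.
EXPIRATION_BY_PATTERN = {
    '*/taxa*': timedelta(days=1),
    '*/places*': timedelta(days=1),
    '*': timedelta(minutes=30),
}


def get_expiration(url: str) -> timedelta:
    """Return the expiration for the first pattern that matches the URL."""
    for pattern, expire_after in EXPIRATION_BY_PATTERN.items():
        if fnmatch(url, pattern):
            return expire_after
    raise ValueError(f'No pattern matched: {url}')


print(get_expiration('https://api.inaturalist.org/v1/taxa?q=warbler'))
print(get_expiration('https://api.inaturalist.org/v1/observations'))
```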
Timeouts
If you are seeing frequent timeouts (TimeoutError) due to iNat server problems or a slow internet connection, you can increase the timeout (default: 20 seconds):
>>> from pyinaturalist import ClientSession
>>> session = ClientSession(timeout=40)
Retries
Similarly, if you are seeing intermittent non-timeout errors due to server issues, you can adjust the number of times to retry failed requests (default: 5):
>>> from pyinaturalist import ClientSession
>>> session = ClientSession(retries=7)
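As a rough picture of what a retry count means, the sketch below retries a failing call a fixed number of times before giving up. ClientSession handles this for you; the call_with_retries and flaky_request names are hypothetical, and the way attempts are counted here (one initial attempt plus up to `retries` more) is an assumption that may differ from the library's behavior.

```python
import time
from typing import Callable


def call_with_retries(func: Callable[[], str], retries: int = 5, backoff: float = 0.0) -> str:
    """Call func, retrying up to `retries` times after a connection error."""
    attempt = 0
    while True:
        try:
            return func()
        except ConnectionError:
            if attempt == retries:
                raise  # out of retries: surface the last error
            # Wait a little longer after each failure (backoff=0 disables waiting)
            time.sleep(backoff * 2**attempt)
            attempt += 1


# Hypothetical flaky request that fails twice, then succeeds
attempts = {'count': 0}


def flaky_request() -> str:
    attempts['count'] += 1
    if attempts['count'] < 3:
        raise ConnectionError('temporary server error')
    return 'ok'


result = call_with_retries(flaky_request, retries=5)
print(result, attempts['count'])
```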
Rate Limiting
Rate limiting is applied to all requests so they stay within the rates specified by iNaturalist’s API Recommended Practices.
You can modify these rate limits using ClientSession.
For example, to reduce the rate to 50 requests per minute:
>>> from pyinaturalist import ClientSession, get_taxa
>>> session = ClientSession(per_minute=50)
>>> get_taxa(q='warbler', locale=1, session=session)
Float values also work, for example to slow requests down to fewer than one per second:
>>> session = ClientSession(per_second=0.5)
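To see what these settings mean in practice, the arithmetic below converts a rate limit into the average spacing between requests. The average_interval helper is hypothetical and only illustrates the numbers; an actual rate limiter may allow short bursts rather than spacing requests evenly.

```python
def average_interval(limit: float, period_seconds: float) -> float:
    """Average number of seconds between requests for a given rate limit."""
    if limit <= 0:
        raise ValueError('limit must be positive')
    return period_seconds / limit


# per_minute=50 averages out to one request every 1.2 seconds
per_minute_interval = average_interval(50, 60)

# per_second=0.5 averages out to one request every 2 seconds
per_second_interval = average_interval(0.5, 1)

print(per_minute_interval, per_second_interval)
```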
Distributed Application Rate Limiting
The default rate-limiting backend is thread-safe, and persistent across application restarts. If you have a larger application running from multiple processes, you will need an additional locking mechanism to make sure these processes don’t conflict with each other. This is available with FileLockSQLiteBucket, which can be passed as the session’s bucket_class:
>>> from pyinaturalist import ClientSession, FileLockSQLiteBucket
>>> session = ClientSession(bucket_class=FileLockSQLiteBucket)
This requires installing one additional dependency, py-filelock:
pip install filelock
Logging
You can configure logging for pyinaturalist using the standard Python logging module, for example with logging.basicConfig():
>>> import logging
>>> logging.basicConfig()
>>> logging.getLogger('pyinaturalist').setLevel('INFO')
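If you would rather not configure the root logger, one alternative is to attach a handler to the 'pyinaturalist' logger alone. This sketch uses only the standard logging module; the format string is an arbitrary choice for illustration, not a library default.

```python
import logging
import sys

# Configure only the 'pyinaturalist' logger, leaving the root logger untouched
logger = logging.getLogger('pyinaturalist')
logger.setLevel(logging.INFO)

# Send log records to stderr with a timestamped format (arbitrary example format)
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter('[%(asctime)s] %(levelname)s %(name)s: %(message)s'))
logger.addHandler(handler)

# Avoid duplicate output if the root logger also has handlers
logger.propagate = False
```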
For convenience, an enable_logging() function is included that will apply some recommended settings, including colorized output (if viewed in a terminal) and better traceback formatting, using the rich library.
>>> from pyinaturalist import enable_logging
>>> enable_logging()
Dry-run mode
While developing and testing, it can be useful to temporarily mock out HTTP requests, especially requests that add, modify, or delete real data. Pyinaturalist has some settings to make this easier.
Dry-run individual requests
All API request functions take an optional dry_run argument. When set to True, requests will not be sent but will be logged instead.
Note
You must enable at least INFO-level logging to see the logged request info.
>>> from pyinaturalist import get_taxa
>>> get_taxa(q='warbler', locale=1, dry_run=True)
{'results': [], 'total_results': 0}
[07-26 18:55:50] INFO Request: GET https://api.inaturalist.org/v1/taxa?q=warbler&locale=1
User-Agent: pyinaturalist/0.15.0
Accept: application/json
Dry-run all requests
To enable dry-run mode for all requests, set the DRY_RUN_ENABLED environment variable.
In Python:
>>> import os
>>> os.environ['DRY_RUN_ENABLED'] = 'true'
In a Unix shell:
export DRY_RUN_ENABLED=true
In Windows CMD:
set DRY_RUN_ENABLED="true"
In PowerShell:
$Env:DRY_RUN_ENABLED="true"
Dry-run only write requests
If you would like to send real GET requests but mock out any requests that modify data (POST, PUT, and DELETE), you can use the DRY_RUN_WRITE_ONLY variable instead.
In Python:
>>> import os
>>> os.environ['DRY_RUN_WRITE_ONLY'] = 'true'
In a Unix shell:
export DRY_RUN_WRITE_ONLY=true
In Windows CMD:
set DRY_RUN_WRITE_ONLY="true"
In PowerShell:
$Env:DRY_RUN_WRITE_ONLY="true"
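When testing, it can be handy to enable one of these variables for a single block of code and restore the previous state afterwards. The sketch below uses only the standard library; temp_env is a hypothetical helper, and it assumes the variable is checked when each request is made rather than once at import time.

```python
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temp_env(name: str, value: str) -> Iterator[None]:
    """Set an environment variable inside a `with` block, then restore it."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        # Restore the earlier value, or remove the variable if it was unset
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


with temp_env('DRY_RUN_WRITE_ONLY', 'true'):
    # API calls made here would see DRY_RUN_WRITE_ONLY=true
    value_inside = os.environ['DRY_RUN_WRITE_ONLY']

print(value_inside)
```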
User Agent
If you’re using the API as part of a project or application, it’s good practice to add that info to the user-agent. You can optionally set this on the session object used to make requests:
>>> from pyinaturalist import ClientSession
>>> session = ClientSession(user_agent='my_app/1.0.0')