Data Structure
Your data must be structured in a specific way to be used in the package.
Phenology Observation Data
Observation data consists of the following:
- doy: The Julian date (1-365) on which a specific phenological event happened.
- site_id: A site identifier for each doy observation.
- year: A year identifier for each doy observation.
These should be structured as columns in a pandas DataFrame, where every row is a single observation. For example, the built-in vaccinium dataset looks like this:
from pyPhenology import models, utils
observations, temp = utils.load_test_data(name='vaccinium')
observations.head()
species site_id year doy phenophase
0 vaccinium corymbosum 1 1991 100 371
1 vaccinium corymbosum 1 1991 100 371
2 vaccinium corymbosum 1 1991 104 371
3 vaccinium corymbosum 1 1998 106 371
4 vaccinium corymbosum 1 1998 106 371
There are extra columns here for the species and phenophase. Those will be ignored inside the pyPhenology package.
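If your own observations are not yet in this shape, a minimal sketch of building such a frame with pandas might look like the following. The values and the name my_observations are invented for illustration; only the column names doy, site_id, and year come from the description above.

```python
import pandas as pd

# Hypothetical records: one dict per observation (values are invented).
records = [
    {"site_id": 1, "year": 2001, "doy": 110},
    {"site_id": 1, "year": 2002, "doy": 104},
    {"site_id": 2, "year": 2001, "doy": 121},
]

# Every row is a single observation, every key becomes a column.
my_observations = pd.DataFrame(records)

# Check that the three required columns are present before going further.
required = {"doy", "site_id", "year"}
missing = required - set(my_observations.columns)
if missing:
    raise ValueError(f"missing required columns: {sorted(missing)}")

print(my_observations.head())
```

Extra columns (such as species) can stay in the frame, since the text above says they are ignored.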
Phenology Environmental Data
The current models only support daily mean temperature as a driver. Models require the daily temperature for every day of the winter and spring leading up to the phenophase event. Temperature data consists of the following:
- site_id: A site identifier for each location.
- year: The year of the temperature timeseries.
- temperature: The observed daily mean temperature in degrees Celsius.
- doy: The Julian date of the mean temperature.
These should be columns in a DataFrame, like the observations. The example vaccinium dataset has temperature observations:
temp.head()
site_id temperature year doy
0 1 -3.86 1989.0 0.0
1 1 -4.71 1989.0 1.0
2 1 -1.56 1989.0 2.0
3 1 -7.88 1989.0 3.0
4 1 -15.24 1989.0 4.0
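Because each observation needs a temperature series for its site and year, it can help to check that the two frames line up before fitting anything. The following is a rough sketch using a plain pandas merge on invented data. It is not a pyPhenology function, and it only checks that some temperature rows exist for each site and year, not that every day is covered.

```python
import pandas as pd

# Invented observations and temperatures, using the columns described above.
obs = pd.DataFrame({"site_id": [1, 2], "year": [2001, 2001], "doy": [110, 121]})
temp = pd.DataFrame({
    "site_id": [1, 1, 1],
    "year": [2001, 2001, 2001],
    "doy": [0, 1, 2],
    "temperature": [-3.9, -4.7, -1.6],
})

# Unique site/year combinations that have any temperature data.
temp_keys = temp[["site_id", "year"]].drop_duplicates()

# Left-join the observations against those keys; the indicator column marks
# observations with no matching temperature series as "left_only".
checked = obs.merge(temp_keys, on=["site_id", "year"], how="left", indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]

print(unmatched[["site_id", "year", "doy"]])
```

Here the observation at site 2 has no temperature data, so it is the only row reported.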
On the Julian Date
TODO Jan. 1 is 0, but prior dates of the same winter are negative numbers.
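Assuming the convention in the note above (Jan. 1 of the event year is 0, and earlier dates in the same winter are negative), a calendar date could be converted to a doy roughly as follows. The helper name doy_relative_to_year is hypothetical and is not part of pyPhenology.

```python
from datetime import date

def doy_relative_to_year(d, year):
    """Days between `d` and Jan. 1 of `year` (Jan. 1 -> 0, earlier dates -> negative)."""
    return (d - date(year, 1, 1)).days

print(doy_relative_to_year(date(1989, 1, 1), 1989))    # 0
print(doy_relative_to_year(date(1989, 1, 5), 1989))    # 4
print(doy_relative_to_year(date(1988, 12, 31), 1989))  # -1
print(doy_relative_to_year(date(1988, 11, 1), 1989))   # -61
```

Counting from a fixed Jan. 1 keeps the values for one winter and spring on a single continuous axis, even when the temperature series starts in the previous calendar year.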
Notes
- If you have only a single site, make a “dummy” site_id column set to 1 for both temperature and observation dataframes.
- If you have only a single year
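For the single-site case in the first note, adding the dummy column is a one-line assignment per frame. This is a small sketch on invented data; the frame names obs and temp are placeholders.

```python
import pandas as pd

# Invented single-site data that has no site_id column yet.
obs = pd.DataFrame({"year": [2001, 2002], "doy": [110, 104]})
temp = pd.DataFrame({
    "year": [2001, 2001],
    "doy": [0, 1],
    "temperature": [-3.9, -4.7],
})

# Give both frames the same constant site_id so their rows can be matched.
obs["site_id"] = 1
temp["site_id"] = 1

print(obs.head())
print(temp.head())
```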