pyPhenology.models.WeightedEnsemble

class pyPhenology.models.WeightedEnsemble(core_models)[source]

Fit an ensemble of many models with associated weights

This model combines multiple models into an ensemble whose predictions are the weighted average of the predictions from each core model. The weights are derived via "stacking" as described in Dormann et al. 2018. The steps are as follows:

  1. Subset the data into random training/testing sets.
  2. Fit each core model on the training set.
  3. Make predictions on the testing set.
  4. Find the weights which minimize RMSE of the testing set.
  5. Repeat steps 1-4 for a number of iterations (the iterations argument to fit()).
  6. Take the average weight for each model across all iterations as the final weight used in the ensemble. These weights sum to 1.
  7. Fit the core models a final time on the full dataset given to the fit() method. The parameters derived from this final fit are used to make predictions.
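The weight-finding step (step 4) can be sketched outside the library. The following is a minimal illustration, not pyPhenology's internal implementation: given each core model's predictions on the held-out set, it searches for non-negative weights summing to 1 that minimize the RMSE of the weighted-average prediction.

```python
import numpy as np
from scipy.optimize import minimize

def find_stacking_weights(test_predictions, observed):
    """Find ensemble weights minimizing RMSE on a held-out set.

    test_predictions : array of shape (n_models, n_obs), one row of
        held-out-set predictions per core model.
    observed : array of shape (n_obs,), the held-out observations.
    """
    n_models = test_predictions.shape[0]

    def rmse(w):
        # Weighted average of the per-model predictions
        ensemble_pred = w @ test_predictions
        return np.sqrt(np.mean((ensemble_pred - observed) ** 2))

    # Weights constrained to [0, 1] and summing to 1
    constraints = {'type': 'eq', 'fun': lambda w: w.sum() - 1.0}
    bounds = [(0.0, 1.0)] * n_models
    w0 = np.full(n_models, 1.0 / n_models)  # start from equal weights
    result = minimize(rmse, w0, bounds=bounds, constraints=constraints)
    return result.x
```

Averaging these weights over many random train/test splits (steps 5-6) then gives the final ensemble weights.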

Note that the core models must be passed in already initialized. They will be fit within the WeightedEnsemble model:

from pyPhenology import models, utils
observations, predictors = utils.load_test_data(name='vaccinium')

m1 = models.ThermalTime(parameters={'T':0})
m2 = models.ThermalTime(parameters={'T':5})
m3 = models.ThermalTime(parameters={'T':-5})
m4 = models.ThermalTime(parameters={'T':10})
m5 = models.Uniforc(parameters={'t1':1})
m6 = models.Uniforc(parameters={'t1':30})
m7 = models.Uniforc(parameters={'t1':60})

ensemble = models.WeightedEnsemble(core_models=[m1,m2,m3,m4,m5,m6,m7])
ensemble.fit(observations, predictors)
Notes:
Dormann, Carsten F., et al. 2018. Model averaging in ecology: a review of Bayesian, information-theoretic and tactical approaches for predictive inference. Ecological Monographs. https://doi.org/10.1002/ecm.1309
__init__(core_models)[source]

Weighted Ensemble model

core_models : list of pyPhenology models, or a saved model file

Methods

__init__(core_models) Weighted Ensemble model
ensemble_shape([shape]) Returns a tuple signifying the layers of submodels
fit(observations, predictors[, iterations, …]) Fit the underlying core models
get_params()
get_weights()
predict([to_predict, predictors, …]) Make predictions
save_params(filename[, overwrite]) Save model parameters
score([metric, doy_observed, to_predict, …]) Get the score on the dataset used for fitting (if fitting was done); otherwise set to_predict and predictors as used in model.predict().
fit(observations, predictors, iterations=10, held_out_percent=0.2, loss_function='rmse', method='DE', optimizer_params='practical', n_jobs=1, verbose=False, debug=False)[source]

Fit the underlying core models

Parameters:
observations : dataframe
pandas dataframe of phenology observations
predictors : dataframe
pandas dataframe of associated predictors
iterations : int
Number of stacking iterations to use.
held_out_percent : float
Percent of randomly held out data to use in each stacking iteration. Must be between 0 and 1.
n_jobs : int
number of parallel processes to use
kwargs :
Other arguments passed to core model fitting (e.g. optimizer methods)
predict(to_predict=None, predictors=None, aggregation='mean', n_jobs=1, **kwargs)[source]

Make predictions.

Predictions will be made using each core model, then combined into a final weighted-average prediction using the fitted weights.

Parameters:

See the core model description.

aggregation : str
Either 'weighted_mean' to get a single weighted prediction, or 'none' to get the predictions from all core models. With 'none' this returns a tuple of (weights, predictions).
n_jobs : int
number of parallel processes to use
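The relation between the two aggregation modes can be sketched with plain NumPy. The array shapes below are illustrative assumptions, not the library's exact return types: each row holds one core model's day-of-year predictions for two observations.

```python
import numpy as np

weights = np.array([0.6, 0.3, 0.1])       # fitted ensemble weights, sum to 1
predictions = np.array([[100., 105.],     # per-model DOY predictions,
                        [102., 107.],     # shape (n_models, n_obs)
                        [ 98., 103.]])

# aggregation='weighted_mean': one prediction per observation,
# the weight vector applied across the model axis
weighted_mean = weights @ predictions

# aggregation='none' conceptually returns the raw pieces instead:
weights_and_preds = (weights, predictions)
```

Keeping the raw (weights, predictions) tuple is useful for inspecting how much each core model contributes, or for characterizing the spread of the submodel predictions.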