pyPhenology.models.WeightedEnsemble¶

class pyPhenology.models.WeightedEnsemble(core_models)[source]¶

Fit an ensemble of many models with associated weights.
This model combines multiple models into an ensemble where predictions are the weighted average of the predictions from each model. The weights are derived via “stacking” as described in Dormann et al. 2018. The steps are as follows:
- Subset the data into random training/testing sets.
- Fit each core model on the training set.
- Make predictions on the testing set.
- Find the weights which minimize RMSE of the testing set.
- Repeat steps 1-4 for the specified number of iterations.
- Take the average weight for each model across all iterations as the final weight used in the ensemble. These weights will sum to 1.
- Fit the core models a final time on the full dataset given to the fit() method. Parameters derived from this final fit are used to make predictions.
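The weight-finding step above (minimizing test-set RMSE subject to weights summing to 1) can be sketched as a small constrained optimization. This is a minimal illustration with synthetic predictions, not the library's implementation; `find_weights` is a hypothetical helper:

```python
import numpy as np
from scipy.optimize import minimize

def find_weights(predictions, observed):
    """One stacking iteration: find non-negative weights summing to 1
    that minimize the RMSE of the weighted-average prediction.

    predictions : array of shape (n_models, n_samples)
    observed    : array of shape (n_samples,)
    """
    n_models = predictions.shape[0]

    def rmse(w):
        blended = w @ predictions  # weighted average across models
        return np.sqrt(np.mean((blended - observed) ** 2))

    # Start from equal weights; constrain to the probability simplex.
    result = minimize(
        rmse,
        x0=np.full(n_models, 1.0 / n_models),
        bounds=[(0.0, 1.0)] * n_models,
        constraints={'type': 'eq', 'fun': lambda w: w.sum() - 1.0},
    )
    return result.x

# Synthetic example: model 1 matches the observations exactly,
# model 2 is biased high by 20 days, so nearly all weight
# should land on model 1.
observed = np.array([100.0, 110.0, 120.0])
predictions = np.vstack([observed, observed + 20.0])
weights = find_weights(predictions, observed)
```

In the library itself this is repeated for each random train/test split and the weights are then averaged across iterations.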
Note that the core models must be passed initialized. They will be fit within the Weighted Ensemble model:
from pyPhenology import models, utils

observations, predictors = utils.load_test_data(name='vaccinium')

m1 = models.Thermaltime(parameters={'T': 0})
m2 = models.Thermaltime(parameters={'T': 5})
m3 = models.Thermaltime(parameters={'T': -5})
m4 = models.Thermaltime(parameters={'T': 10})
m5 = models.Uniforc(parameters={'t1': 1})
m6 = models.Uniforc(parameters={'t1': 30})
m7 = models.Uniforc(parameters={'t1': 60})

ensemble = models.WeightedEnsemble(core_models=[m1, m2, m3, m4, m5, m6, m7])
ensemble.fit(observations, predictors)
Notes:

Dormann, Carsten F., et al. 2018. Model averaging in ecology: a review of Bayesian, information-theoretic and tactical approaches for predictive inference. Ecological Monographs. https://doi.org/10.1002/ecm.1309
__init__(core_models)[source]¶

Weighted Ensemble model

Parameters:
    core_models : list of pyPhenology models, or a saved model file
Methods

__init__(core_models)
    Weighted Ensemble model
ensemble_shape([shape])
    Returns a tuple signifying the layers of submodels
fit(observations, predictors[, iterations, …])
    Fit the underlying core models
get_params()
get_weights()
predict([to_predict, predictors, …])
    Make predictions.
save_params(filename[, overwrite])
    Save model parameters
score([metric, doy_observed, to_predict, …])
    Get the scoring metric for fitted data. Gets the score on the dataset used for fitting (if fitting was done); otherwise set to_predict and predictors as used in model.predict().
fit(observations, predictors, iterations=10, held_out_percent=0.2, loss_function='rmse', method='DE', optimizer_params='practical', n_jobs=1, verbose=False, debug=False)[source]¶

Fit the underlying core models.
Parameters:
    observations : dataframe
        pandas dataframe of phenology observations
    predictors : dataframe
        pandas dataframe of associated predictors
    iterations : int
        Number of stacking iterations to use.
    held_out_percent : float
        Fraction of the data randomly held out in each stacking iteration. Must be between 0 and 1.
    n_jobs : int
        Number of parallel processes to use.
    kwargs :
        Other arguments passed to core model fitting (e.g. optimizer methods)
predict(to_predict=None, predictors=None, aggregation='mean', n_jobs=1, **kwargs)[source]¶

Make predictions.

Predictions are made with each core model, then combined into a final weighted average using the fitted weights.
Parameters:
    to_predict, predictors :
        See the core model description.
    aggregation : str
        Either 'weighted_mean' to get a normal prediction, or 'none' to get predictions from all models. With 'none' this returns a tuple of (weights, predictions).
    n_jobs : int
        Number of parallel processes to use.