Integrate main into branch #118

Merged
merged 15 commits into from
Sep 19, 2024
20 changes: 9 additions & 11 deletions docs/source/how_to_add_new_mixing.rst
@@ -1,10 +1,11 @@
How to add a new mixing method to Taweret
=========================================

Taweret is meant to be extensible and is willing to accept any mixing methods the communities develops.
Taweret is meant to be extensible and is willing to accept any mixing methods the community develops.
These notes serve as instructions on how you can add your mixing methods to the Taweret repository.
All mixing methods in Taweret must inherit from the base class in ``Taweret/core``.
To add a new mixing method (or model), you need to:

- Step 1: Fork the repository, and clone it

.. code-block:: bash
@@ -21,9 +22,9 @@ To add a new mixing method (or model), you need to:
def __init__(self, ...):
...

The ``BaseMixer`` is an abstract base class which has certain methods that need to be defined for its interpretation by the Python interpreter to succeed. Which methods, and their descriptions, can be found in the API documentation for the ``BaseMixer``
The ``BaseMixer`` is an abstract base class that declares the methods every concrete mixer must implement before the Python interpreter will allow instantiation. These methods, and their descriptions, can be found in the API documentation for ``BaseMixer``.
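As a sketch of this pattern, a new mixer subclasses the abstract base and fills in the required methods. The stand-in ``BaseMixer`` and the method bodies below are invented for illustration; consult the real ``Taweret/core`` base class for the actual abstract methods.

```python
from abc import ABC, abstractmethod


# Hypothetical stand-in for Taweret.core.base_mixer.BaseMixer;
# the real class defines more abstract methods (see its API docs).
class BaseMixer(ABC):
    @abstractmethod
    def set_prior(self, *args, **kwargs):
        ...

    @abstractmethod
    def train(self, *args, **kwargs):
        ...


class MyMixer(BaseMixer):
    """Minimal concrete mixer: defines every abstract method."""

    def __init__(self, models):
        self.models = models
        self._prior = None

    def set_prior(self, prior_dict):
        # Store the dictionary of prior distributions
        self._prior = prior_dict

    def train(self):
        # A real mixer would run its sampler here and return the chain
        return {"posterior": []}
```

Until every abstract method is defined, instantiating the subclass raises ``TypeError`` — that is the enforcement the text above refers to.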

- Step 3: Add unit tests for mixing method to the the pytest directory. To make sure the python interpreter sees the add modules, the first several lines of your test file shoud read
- Step 3: Add unit tests for your mixing method to the pytest directory. To make sure the Python interpreter sees the added modules, the first several lines of your test file should read

.. code-block:: python

@@ -37,14 +38,11 @@ The ``BaseMixer`` is an abstract base class which has certain methods that need
from Taweret.mix.<your_module> import *
import pytest

# All functions starting with `test_` will be register by pytest

- Step 4: You need to document your code well, following the examples you see in existing mxing methods, this includes type annotations and RST style code comments. The documentation generations should automatically identify your code
# All functions starting with `test_` will be registered by pytest
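A minimal test file following this convention might look like the sketch below; the ``./src`` path and the test body are assumptions to adapt to your module, and the real file would import from ``Taweret.mix.<your_module>`` as shown above.

```python
# Hypothetical test file, e.g. test/test_my_mixer.py.
# pytest collects every function whose name starts with `test_`.
import math
import os
import sys

# Make the in-repo package importable when running pytest from the repo root
sys.path.append(os.path.abspath("./src"))


def test_weights_sum_to_one():
    # Placeholder logic; replace with calls to your mixer's
    # evaluate_weights and assert its real invariants.
    w1, w2 = 0.3, 0.7
    assert math.isclose(w1 + w2, 1.0)
```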

- Step 5: Format your code using the ``autopep8`` code formatter. We recommend using the following command in the base directory of the repository

.. code-block:: bash
- Step 4: You need to document your code well, following the examples you see in existing mixing methods, which includes type annotations and RST style code comments. The documentation generation should automatically identify your code.

autopep8 --recursive --in-place --aggresive --aggresive .
- Step 5: Clean your code using the output of the ``flake8`` style guide tool. See the ``check`` tox task for one possible means to do this.

- Step 5: Create a pull request your addition to the `develop` branch. This should trigger a github action. Should the action fail, please try to diagnose the failure. Always make sure the test execute successfully, locally before opening a pull request
- Step 6: Create a pull request of your addition into the `develop` branch. This should trigger a GitHub action. Should the action fail, please try to diagnose the failure. Always make sure the tests execute successfully locally before opening a pull request.
5 changes: 3 additions & 2 deletions docs/source/installation.rst
@@ -11,9 +11,10 @@ If you prefer to use conda for your package management, you can still pip instal

Alternative Installation
------------------------
.. _repository: https://github.com/bandframework/Taweret.git

Alternatively, you can clone the `repository <https://github.com/bandframework/Taweret.git>`.
Open cloning, the dependencies for Taweret dependencies by running the command
Alternatively, you can clone the `repository`_ and install Taweret into your
Python environment in developer or editable mode from the clone by running

.. code-block:: bash

26 changes: 14 additions & 12 deletions docs/source/intro.rst
@@ -14,21 +14,21 @@ and may not be valid in certain sub-regions of the input domain. In practice, th
model is not contained in the set of candidate models. Thus, selecting a single model to describe the true phenomena \
across the entire input domain is inappropriate. As an alternative, one may elect to combine the information within \
the model set in some systematic manner. A common approach involves combining the individual \
mean predictions or predictive densities from the indivdual models using a linear combination or weighted average. \
mean predictions or predictive densities from the individual models using a linear combination or weighted average. \
The weights in this linear combination may or may not depend on the inputs. When the models under consideration \
exhibit varrying levels of predictive accuracy depending on the sub-region of the input domain, an input-dependent \
weighting scheme is more appropriate. A memeber of the class of input-dependent weighting schemes is \
exhibit varying levels of predictive accuracy depending on the sub-region of the input domain, an input-dependent \
weighting scheme is more appropriate. A member of the class of input-dependent weighting schemes is \
Bayesian Model Mixing (BMM). BMM is a data-driven technique which combines the predictions from a set of N candidate models in a \
Bayesian manner using input-dependent weights. Mixing can be performed using one of two strategies described below: \
(1) A two-step approach: Each model is fit prior to mixing. \
The weight functions are then learned conditional on the predictions from each model. \
(2) A joint analysis: When the models have unknown parameters, one could elect to perform calibration while simultaneously \
learning the weight functions.

Taweret is a python package which provides a variety of BMM methods. Each method combines the information across a set of N models \
in a Bayesian manner using an input-dependent weighting scheme. The BMM methods in Taweret are designed to esitmate the \
Taweret is a Python package which provides a variety of BMM methods. Each method combines the information across a set of N models \
in a Bayesian manner using an input-dependent weighting scheme. The BMM methods in Taweret are designed to estimate the \
true mean of the underlying system (mean-mixing) or the true predictive density of the underlying system (density-mixing). \
Selecting a mixing objectve (mean vs. density mixing) and associated method is problem dependent.
Selecting a mixing objective (mean vs. density mixing) and associated method is problem dependent.
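Schematically (the notation here is ours, not Taweret's API), the two objectives can be written as

.. math::
    \text{mean mixing:} \quad \mathbb{E}[Y \mid x] = \sum_{k=1}^{N} w_k(x)\, f_k(x)

.. math::
    \text{density mixing:} \quad p(y \mid x) = \sum_{k=1}^{N} w_k(x)\, p_k(y \mid x)

where :math:`f_k` and :math:`p_k` denote the mean prediction and predictive density of model :math:`k`, and the input-dependent weights :math:`w_k(x)` typically satisfy :math:`\sum_k w_k(x) = 1`.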

The typical workflow of Bayesian Model Mixing includes:

@@ -50,7 +50,7 @@ Models
^^^^^^
The user has to provide models that they would like to mix. Currently Taweret supports mixing of two \
or more models with a 1,...,p-dimensional input space (depending on the method of mixing chosen) and a single output. \
The models are required to have an "evaluate" a method which should return a mean and a standard deviation for each input parameter value.
The models are required to have an "evaluate" method that should return a mean and a standard deviation for each input parameter value.
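As a sketch of this contract, a wrapped model only needs an ``evaluate`` method returning a mean and standard deviation per input. The class below is invented for illustration; only the ``evaluate`` contract comes from the text above, not the exact Taweret API.

```python
import numpy as np


# Hypothetical model wrapper illustrating the required `evaluate` contract
class LinearModel:
    def __init__(self, slope, intercept, noise_sd=0.5):
        self.slope = slope
        self.intercept = intercept
        self.noise_sd = noise_sd  # assumed constant predictive uncertainty

    def evaluate(self, x):
        """Return (mean, standard deviation) at each input value."""
        x = np.asarray(x, dtype=float)
        mean = self.slope * x + self.intercept
        sd = np.full_like(mean, self.noise_sd)
        return mean, sd


model = LinearModel(2.0, 1.0)
mean, sd = model.evaluate([0.0, 1.0, 2.0])
```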

Mixing Method
^^^^^^^^^^^^^
@@ -81,13 +81,15 @@ p-dimensional input spaces.

Estimating the Weight Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Taweret provides a variety of BMM methods, each which utilize an input-dependent weighting scheme. \
.. _Jupyter Book: https://bandframework.github.io/Taweret/landing.html

Taweret provides a variety of BMM methods, each of which utilizes an input-dependent weighting scheme. \
The weighting scheme may vary substantially across the different methods. For example, Linear mixing \
defines the weights using a parametric model, while the Bayesian Trees approach uses a non-parametric model. \
Another weighting scheme involves precision weighting, as seen in Multivariate BMM. Hence, the exact estimation \
of the weight functions may differ substantially across the various BMM methods. Despite this, the estimation \
process in each method is facilitated using Bayesian principles. Examples of each method can be found in the \
Python notebooks (docs/source/notebooks) and under the Examples tab on this page. In these examples, BMM is \
`Jupyter Book`_ as well as the Python notebooks (``book/notebooks``) used to create the book. In these examples, BMM is \
applied to the SAMBA, Coleman, and Polynomial models.
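For intuition, an input-dependent two-model weight can be as simple as a logistic switch between sub-regions of the input domain. This exact form is our own example, not one of Taweret's mixing functions.

```python
import numpy as np


# Illustrative parametric weight for two-model linear mixing;
# Taweret's actual mixing functions have their own parameterizations.
def logistic_weight(x, loc, scale):
    """Weight of model 1; model 2 receives 1 - w."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp((x - loc) / scale))


x = np.linspace(0.0, 10.0, 5)
w1 = logistic_weight(x, loc=5.0, scale=1.0)  # model 1 dominates for x < 5
w2 = 1.0 - w1                                # model 2 dominates for x > 5
```

In a Bayesian analysis the parameters ``loc`` and ``scale`` would be given priors and learned from data, which is what "estimating the weight functions" refers to.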

Working with Multiple Models
@@ -96,16 +98,16 @@ Working with Multiple Models
**A Two-step approach**: \
In some cases, the models under consideration may have been previously calibrated. \
Consequently, the predictions from each model are easily ascertained across a new set of input locations. This calibration \
phase is the first step in the two-step process. The second step invloves mixing the predictions from each model \
phase is the first step in the two-step process. The second step involves mixing the predictions from each model \
to estimate the true system. Thus, conditional on the individual predictions across a set of inputs along with observational data, \
the weight functions are learned and the overall mean or predictive density of the underlying system is estimated in a Bayesian manner. \
Examples of this two-step analysis can be found in a variety of the notebooks provided in the Examples section.
Examples of this two-step analysis can be found in several of the aforementioned examples.


**Mixing and Calibration**: \

This joint analysis is advantageous because it enables each model to be calibrated predominantly based on the sub-regions \
of the domain where its predictions align well with the observational data. These sub-regions will be simultaneously identified \
by the weight functions. This should lead more reliable inference than then case where each model is calibrated individually and \
by the weight functions. This should lead to inference that is more reliable than for the case where each model is calibrated individually and \
thus forced to reflect a global fit to the data. For example, the joint analysis would avoid situations where a model is calibrated \
using experimental data that is outside its range of applicability. Examples of this joint analysis are provided for the Coleman models.
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -7,7 +7,7 @@ Taweret = ["tests/samba_results.txt", "tests/bart_bmm_test_data/2d_*.txt"]

[project]
name = "Taweret"
version = "1.0.2"
version = "1.1.0"
authors = [
{ name="Kevin Ingles", email="[email protected]"},
{ name="Dananjaya (Dan) Liyanage", email="[email protected]"},
4 changes: 3 additions & 1 deletion src/Taweret/__init__.py
@@ -1,4 +1,6 @@
__version__ = "1.0.1"
from importlib.metadata import version

__version__ = version("Taweret")
__author__ = "Kevin Ingles, Dananjaya (Dan) Liyanage, \
Alexandra Semposki, John Yannotty"
__credits__ = "The Ohio State University, Ohio University, \
10 changes: 4 additions & 6 deletions src/Taweret/core/base_mixer.py
@@ -202,7 +202,7 @@ def prior(self):
@abstractmethod
def set_prior(self):
'''
User must provide function that sets a member varibale called
User must provide function that sets a member variable called
``_prior``.
Dictionary of prior distributions. Format should be compatible with
sampler.
@@ -230,12 +230,10 @@ def train(self):
'''
Run sampler to learn parameters. Method should also create class
members that store the posterior and other diagnostic quantities
import for plotting
MAP values should also caluclate and set as member variable of
class
important for plotting. MAP values should also be calculated and stored as a member variable of the class.

Return:
-------
Returns:
--------
_posterior : np.ndarray
the mcmc chain return from sampler
'''
5 changes: 2 additions & 3 deletions src/Taweret/core/base_model.py
@@ -34,8 +34,7 @@ def evaluate(self, model_parameters):
@abstractmethod
def log_likelihood_elementwise(self):
r'''
Calculate log_likelihood for array of points given, and return with
array with same shape[0]
Calculate log_likelihood for array of points given

Returns:
--------
@@ -61,7 +60,7 @@ def log_likelihood_elementwise(
@abstractmethod
def set_prior(self):
'''
User must provide function that sets a member varibale called _prior.
User must provide function that sets a member variable called _prior.
Dictionary of prior distributions. Format should be compatible with
sampler.

54 changes: 28 additions & 26 deletions src/Taweret/mix/bivariate_linear.py
@@ -33,8 +33,8 @@ def __init__(self,
BMMcor: bool = False,
mean_mix: bool = False):
'''
Parameters
----------
Parameters:
-----------
models_dic : dictionary {'name1' : model1, 'name2' : model2}
Two models to mix, each must be derived from the base_model.
method : str
@@ -144,8 +144,8 @@ def evaluate(self,
'''
Evaluate the mixed model for given parameters at input values x

Parameters
----------
Parameters:
-----------
mixture_params : np.1darray
parameter values that fix the shape of mixing function
x : np.1darray
@@ -154,8 +154,8 @@
list of model parameter values for each model


Returns
---------
Returns:
--------
evaluation : np.2darray
the evaluation of the mixed model at input values x
Has the shape of len(x) x Number of observables in the model
@@ -222,15 +222,15 @@ def evaluate_weights(self,
'''
return the mixing function values at the input parameter values x

Parameters
----------
Parameters:
-----------
mixture_params : np.1darray
parameter values that fix the shape of mixing function
x : np.1darray
input parameter values

Returns
-------
Returns:
--------
weights : list[np.1darray, np.1darray]
weights for model 1 and model 2 at input values x

@@ -247,8 +247,8 @@ def predict(self,
'''
Evaluate posterior to make prediction at test points x.

Parameters
----------
Parameters:
-----------
x : np.1darray
input parameter values
CI : list
@@ -317,15 +317,16 @@ def predict_weights(self,
'''
Calculate posterior predictive distribution for first model weights

Parameters
----------
Parameters:
-----------
x : np.1darray
input parameter values
CI : list
confidence intervals
samples: np.ndarray
If samples are given use that instead of posterior\
for predictions.

Returns:
--------
posterior_weights : np.ndarray
@@ -375,8 +376,8 @@ def prior_predict(self,
'''
Evaluate prior to make prediction at test points x.

Parameters
----------
Parameters:
-----------
x : np.1darray
input parameter values
CI : list
@@ -410,8 +411,8 @@ def set_prior(self,
Set prior for the mixing function parameters.
Prior for the model parameters should be defined in each model.

Parameters:
-----------
Parameters
----------
bilby_prior_dic : bilby.core.prior.PriorDict
The keys should be named as following :
'<mix_func_name>_1', '<mix_func_name>_2', ...
@@ -451,8 +452,8 @@ def mix_loglikelihood(self,
"""
log likelihood of the mixed model given the mixing function parameters

Parameters
----------
Parameters:
-----------
mixture_params : np.1darray
parameter values that fix the shape of mixing function
model_params: list[model_1_params, model_2_params]
@@ -597,11 +598,12 @@ def train(self,
'''
Run sampler to learn parameters. Method should also create class
members that store the posterior and other diagnostic quantities
important for plotting
MAP values should also calculate and set as member variable of
class
important for plotting. It also finds the MAP values for each parameter and sets them as a class member for easy access.

Parameters:
----------
-----------

x_exp: np.1darray
Experimentally measured input values
@@ -622,8 +624,8 @@
If a previous training has been done, load that chain instead of
retraining.

Return:
-------
Returns:
--------
result : bilby posterior object
object returned by the bilby sampler
'''
2 changes: 1 addition & 1 deletion src/Taweret/mix/trees.py
@@ -256,7 +256,7 @@ def set_prior(
:param float base:
The base parameter in the tree prior.
:param float overallsd:
An initial estimate of the erorr standard deviation.
An initial estimate of the error standard deviation.
This value is used to calibrate the scale parameter in
variance prior.
:param float overallnu:
11 changes: 7 additions & 4 deletions src/Taweret/utils/utils.py
@@ -21,14 +21,17 @@


def normed_mvn_loglike(y, cov):
"""
r"""
Evaluate the multivariate-normal log-likelihood for difference vector `y`
and covariance matrix `cov`:

log_p = -1/2*[(y^T).(C^-1).y + log(det(C))] + const.
.. math::
log_p = -\frac{1}{2}[y^T C^{-1} y + \mathrm{log}(\mathrm{det}(C))]
+ const.

This likelihood IS NORMALIZED.
The normalization const = -n/2*log(2*pi), where n is the dimensionality.
The normalization const :math:`= -\frac{n}{2}\mathrm{log}(2\pi)`,
where :math:`n` is the dimensionality.

Arguments `y` and `cov` MUST be np.arrays with dtype == float64 and shapes
(n) and (n, n), respectively. These requirements are NOT CHECKED.
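Under the stated assumptions (float64 arrays of shapes ``(n)`` and ``(n, n)``), the normalized log-likelihood above can be sketched with a Cholesky factorization; Taweret's internal implementation may differ.

```python
import numpy as np


# Sketch of the normalized MVN log-likelihood described in the docstring:
# log_p = -1/2 [ y^T C^-1 y + log det(C) ] - n/2 log(2*pi)
def mvn_loglike(y, cov):
    y = np.asarray(y, dtype=np.float64)
    cov = np.asarray(cov, dtype=np.float64)
    n = y.shape[0]
    L = np.linalg.cholesky(cov)                # cov = L @ L.T
    alpha = np.linalg.solve(L, y)              # solves L @ alpha = y
    quad = alpha @ alpha                       # equals y^T C^-1 y
    logdet = 2.0 * np.sum(np.log(np.diag(L)))  # log det(C) from L's diagonal
    return -0.5 * (quad + logdet) - 0.5 * n * np.log(2.0 * np.pi)
```

The Cholesky route avoids explicitly inverting ``cov``, which is both faster and numerically safer than forming ``C^-1`` directly.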
@@ -274,7 +277,7 @@ def mixture_function(


def switchcos(g1, g2, g3, x):
"""Switchcos function in Alexandras Samba module
"""Switchcos function in Alexandra's Samba module
link https://github.com/asemposki/SAMBA/

Parameters: