Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add MontePythonReader and MontePythonSamples classes #14

Merged
merged 24 commits into from
Jul 10, 2019

Conversation

lukashergt
Copy link
Collaborator

  • Created a new file read/montepythonreader.py containing the class
    MontePythonReader which reads and processes MontePython chain files
    using montepython.analyze.

    • Extracts parameter names from MontePython's log.param file.
    • Removes burn-in and non-markovian points from the data.
    • Extracts tex names of the parameters.
    • Extracts parameter limits from MontePython's log.param file.
  • The new class MontePythonSamples in sampler.py calls
    MontePythonReader and turns the data into an MCMCSamples instance.

Changes to be committed:

  • modified: anesthetic/init.py
  • new file: anesthetic/read/montepythonreader.py
  • modified: anesthetic/samples.py

Checklist:

  • I have performed a self-review of my own code
  • My code is PEP8 compliant (flake8 anesthetic tests)
  • My code contains compliant docstrings (pydocstyle --convention=numpy anesthetic)
  • New and existing unit tests pass locally with my changes (python -m pytest)
  • I have added tests that prove my fix is effective or that my feature works

* Created a new file `read/montepythonreader.py` containing the class
  `MontePythonReader` which reads and processes MontePython chain files
  using `montepython.analyze`.
  - Extracts parameter names from MontePython's log.param file.
  - Removes burn-in and non-markovian points from the data.
  - Extracts tex names of the parameters.
  - Extracts parameter limits from MontePython's log.param file.

* The new class `MontePythonSamples` in sampler.py calls
  `MontePythonReader` and turns the data into an `MCMCSamples` instance.

Changes to be committed:
* modified:   anesthetic/__init__.py
* new file:   anesthetic/read/montepythonreader.py
* modified:   anesthetic/samples.py
@lukashergt lukashergt added the enhancement New feature or request label Jul 9, 2019
@codecov
Copy link

codecov bot commented Jul 9, 2019

Codecov Report

Merging #14 into master will increase coverage by 1.21%.
The diff coverage is 95.28%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master     #14      +/-   ##
=========================================
+ Coverage   84.48%   85.7%   +1.21%     
=========================================
  Files          14      15       +1     
  Lines         986    1070      +84     
=========================================
+ Hits          833     917      +84     
  Misses        153     153
Impacted Files Coverage Δ
anesthetic/read/chainreader.py 66.66% <ø> (ø)
anesthetic/read/multinestreader.py 100% <100%> (ø)
anesthetic/read/polychordreader.py 100% <100%> (ø)
anesthetic/read/getdistreader.py 84.31% <100%> (ø)
anesthetic/read/montepythonreader.py 100% <100%> (ø)
anesthetic/samples.py 87.34% <100%> (-0.08%) ⬇️
anesthetic/read/samplereader.py 79.16% <79.16%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 440218a...79933a6. Read the comment docs.

@williamjameshandley williamjameshandley added this to the 1.1 milestone Jul 9, 2019
@williamjameshandley
Copy link
Collaborator

I very much approve of a monte python reader.

Could you give me a bit more background on how a MontePythonSamples class differs from MCMCSamples? I would have thought that these were pretty much the same

@lukashergt
Copy link
Collaborator Author

I modelled MontePythonSamples to be very similar to MCMCSamples and potentially it could be incorporated in it, but then we'd need to decide on how to tell it to use the MontePythonReader instead of the GetDistReader.

@williamjameshandley
Copy link
Collaborator

My query is whether one needs an entire new class for MontePythonSamples. At the moment there are MCMCSamples, which covers getdist format (I believe that @Pablo-Lemos may have a cosmosis format, which if so should be offered as a PR when they have time), and NestedSamples, which covers MultiNest and PolyChord formatting.

I had envisaged that the role of the readers was to convert samples into one of these two classes to provide a unified 'inference interface'. NestedSamples are distinct from MCMCSamples, because they provide additional functionality such as evidence calculation and run-time replaying, but it's not clear to me what MontePython adds over GetDist in its samples which couldn't just be incorporated into the MCMCSamples class

@lukashergt
Copy link
Collaborator Author

The difference between MontePython and GetDist is similar to the difference between MultiNest and PolyChord I think. So, yes, we probably could have the MCMCSamples use the MontePythonReader.

I guess I am not sure how to make MCMCSamples differentiate between Montepython and GetDist.

@williamjameshandley
Copy link
Collaborator

williamjameshandley commented Jul 9, 2019

Ah, I see. OK. You'll have to implement an MCMCReader, similar to a NestedReader, which handles recognising whether samples are in montepython format or getdist format. Have a look at anesthetic/read/nested.py, which organises the different cases for MultiNestReader and PolyChordReader

MCMCSamples would then use MCMCReader, rather than GetDistReader

Lukas Hergt added 2 commits July 9, 2019 14:48
* Created file samplereader.py which replaces nested.py and takes over
  its job of inferring from the root which sampler was used and calls
  the appropriate reader out of the following list:
  - MultiNestReader
  - PolyChordReader
  - GetDistReader
  - MontePythonReader

* MontePythonReader was expanded by the properties `log_param_file`,
  `paramnames_file` and `log_file` which are used in `SampleReader` to
  determine whether the provided root is a MontePython root.

* Deferred the proper calls to `montepython.analyze` to the `_init`
  method in `MontePythonReader` (previously already done in `__init__`)
  such that an instance can be created and its properties used even if
  the provided root actually is not a MontePython root. This allows the
  necessary checks in `SampleReader` to be done.

* Removed `MontePythonSamples` as this is now handled by `MCMCSamples`.

Changes to be committed:
* modified:   anesthetic/__init__.py
* modified:   anesthetic/read/montepythonreader.py
* new file:   anesthetic/read/samplereader.py
* modified:   anesthetic/samples.py
@lukashergt
Copy link
Collaborator Author

Ok, I've managed to merge MontePythonSamples into MCMCSamples. I have replaced the NestedReader by a more general SampleReader which distinguishes between MultiNest, PolyChord, GetDist, and MontePython.

What do you think?

@williamjameshandley
Copy link
Collaborator

This looks good. I like how you're using montepython's native chain reading. Makes me wonder whether a similar strategy should be applied for GetDist.

Are you able to rename montepythonreader.py to just montepython.py, or does that introduce a name conflict?

Ideally you should also add a couple of tests to confirm that the code works on a variety of machines. I assume that we want to keep montepython as an optional dependency, so it might be worth implementing an 'optional install' for the .travis system, as is done for fgivenx

@williamjameshandley
Copy link
Collaborator

How essential is it to use MontePython's native functionality (in case people just have chains, but don't want to install montepython)?

@lukashergt
Copy link
Collaborator Author

  • montepythonreader.py needs to be named that way else we have a conflict with the module (took me quite some time to figure that one out, rather silent error...)

  • How would I test for different machines? And how do you do an 'optional install'?

  • Instead of passing the folder as root you can pass the chain files as root which will lead to GetDist reading the files. However then montepython's removal of burn-in and non-markovian points is not performed and no limits are provided.

@williamjameshandley
Copy link
Collaborator

montepythonreader.py needs to be named that way else we have a conflict with the module (took me quite some time to figure that one out, rather silent error...)

I thought as much. Would it be worth suffixing the other files with 'reader' for consistency?

How would I test for different machines? And how do you do an 'optional install'?

Link to fgivenx above should give you the .travis.yml file modifications. Basically you have two different requirements files that it chooses to read from, one 'minimal' and the other with all optional dependencies installed. To do this you'd need to create the second requirements file, and extend the env section of .travis.yml a-la fgivenx.

Instead of passing the folder as root you can pass the chain files as root which will lead to GetDist reading the files. However then montepython's removal of burn-in and non-markovian points is not performed and no limits are provided.

It's worth remarking that the getdist ones should also potentially have burn-in removed, but anesthetic does not do this.

@lukashergt
Copy link
Collaborator Author

Would it be worth suffixing the other files with 'reader' for consistency?

I think that would be good. I actually generally think it is good when the module name reflects the class name therein.

It's worth remarking that the getdist ones should also potentially have burn-in removed, but anesthetic does not do this.

What do you mean? Add a comment in the docs of MCMCSamples that no burn-in was removed for getdist samples?

@williamjameshandley
Copy link
Collaborator

What do you mean? Add a comment in the docs of MCMCSamples that no burn-in was removed for getdist samples?

That would be for another PR. For now you could raise an issue that our system doesn't provide an interface for removing burn in.

@lukashergt
Copy link
Collaborator Author

montepython unfortunately is not pip installable, so I think we shouldn't put it in the requirements.txt file, right? How is this to work with the .travis.yml file then?

.travis.yml Outdated Show resolved Hide resolved
.travis.yml Show resolved Hide resolved
anesthetic/read/montepythonreader.py Show resolved Hide resolved
tests/test_samples.py Outdated Show resolved Hide resolved
…not installed

* `MontePythonReader._init` raises an ImportError when montepython
  package is not installed.
  - `test_read_montepython` now xfails (instead of skipping) when
    montepython is not installed.

Changes to be committed:
* modified:   anesthetic/read/montepythonreader.py
* modified:   tests/test_samples.py
@lukashergt
Copy link
Collaborator Author

lukashergt commented Jul 10, 2019

That would be for another PR. For now you could raise an issue that our system doesn't provide an interface for removing burn in.

Then again, getdist only performs burn-in removal by discarding a certain fraction of the first steps. This would be easy enough to do with a pandas dataframe after the sample has been read with MCMCSamples...
Possibly all that is required here is warning about this in the documentation somewhere?

@williamjameshandley
Copy link
Collaborator

Then again, getdist only performs burn-in removal by discarding a certain fraction of the first steps. This would be easy enough to do with a pandas dataframe after the sample has been read with MCMCSamples... Possibly all that is required here is warning about this in the documentation somewhere?

Not if the chains come from multiple files which are then combined. If anesthetic labelled which chain the points had come from (not a bad idea anyway), then this would be easy enough to do, but it would be most easy if there were an additional keyword argument passed to MCMCSamples and the reader. Definitely worth raising as an issue (or two) to remind somebody to do this at some point.

mcmc = MCMCSamples(root='./tests/example_data/mp')
mcmc.plot_2d(['x0', 'x1', 'x2', 'x3'])
mcmc.plot_1d(['x0', 'x1', 'x2', 'x3'])

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the use of this decorator to implement the conditional passing -- not something I've used before.

@williamjameshandley williamjameshandley merged commit b53a707 into handley-lab:master Jul 10, 2019
@williamjameshandley
Copy link
Collaborator

Many thanks @lukashergt . This kind of PR is essential for increasing the usage of this package

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants