Enable gzip compression for reactionfile analysis (#280)
* Enable gzip compression for reactionfile analysis

If .gz extension is specified, gzip compression is
used.

* Update documentation
mlund authored May 17, 2020
1 parent f1dc8df commit a74d0eb
Showing 4 changed files with 42 additions and 24 deletions.
40 changes: 24 additions & 16 deletions docs/_docs/analysis.md
@@ -366,46 +366,54 @@ atomic species can be saved.

## Reaction Coordinate

This saves a given reaction coordinate (see Penalty Function in Energy) as a function of steps.
The output file has three columns with steps; the value of the reaction coordinate; and
the cummulative average of all preceding values.
This saves a given [reaction coordinate](energy.html#reaction-coordinates)
as a function of steps. The generated output `file` has three columns:

The folowing example prints the mass center $z$ coordinate of the first molecule
to disk every 100th steps:
1. step number
2. the value of the reaction coordinate
3. the cummulative average of all preceding values.

Optional [gzip compression](https://en.wikipedia.org/wiki/Gzip)
can be enabled by suffixing the filename with `.gz`, thereby reducing the output file size significantly.
The folowing example reports the mass center $z$ coordinate of the first molecule every 100th steps:

~~~ yaml
- reactioncoordinate:
{nstep: 100, file: cmz.dat, type: molecule, index: 0, property: com_z}
{nstep: 100, file: cmz.dat.gz, type: molecule, index: 0, property: com_z}
~~~

In the next example, the Angle between the principal molecular axis and the $xy$-plane
In the next example, the angle between the principal molecular axis and the $xy$-plane
is reported by diagonalising the gyration tensor to find the principal moments:

~~~ yaml
- reactioncoordinate:
{nstep: 100, file: angle.dat, type: molecule, index: 0, property: angle, dir: [0,0,1]}
{nstep: 100, file: angle.dat.gz, type: molecule, index: 0, property: angle, dir: [0,0,1]}
~~~

### Processing

In the example above we saved two properties as a function of steps. To join the two
files and generate the average angle as a function of _z_, the following python code
may be used:
In the above examples we stored two properties as a function of steps. To join the two
files and calculate the _average angle_ as a function of the mass center coordinate, _z_,
the following python code may be used:

~~~ python
import numpy as np
from scipy.stats import binned_statistic

def joinRC(xfile, yfile, bins):
x = np.loadtxt(xfile, usecols=[1])
y = np.loadtxt(yfile, usecols=[1])
means, edges, bins = binned_statistic(x,y,'mean',bins)
def joinRC(filename1, filename2, bins):
x = np.loadtxt(filename1, usecols=[1])
y = np.loadtxt(filename2, usecols=[1])
means, edges, bins = binned_statistic(x, y, 'mean', bins)
return (edges[:-1] + edges[1:]) / 2, means

cmz, angle = joinRC('cmz.dat', 'angle.dat', 100)
cmz, angle = joinRC('cmz.dat.gz', 'angle.dat.gz', 100)
np.diff(cmz) # --> cmz resolution; control w. `bins`
~~~

Note that Numpy automatically detects and decompresses `.gz` files.
Further, the command line tools `zcat`, `zless` etc. are useful for handling
compressed files.
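
The three-column layout and the `.gz` suffix rule can also be handled with the Python standard library alone. The following is a minimal sketch, not part of Faunus: the helper names `open_text`, `write_rc`, and `read_rc` are hypothetical, the sample values are made up, and numbering the first sample as step `nstep` is an assumption.

~~~ python
import gzip
import os
import tempfile

def open_text(filename, mode):
    """Open as gzip text if the name ends in .gz, else as a plain text file."""
    if filename.endswith(".gz"):
        return gzip.open(filename, mode + "t")
    return open(filename, mode)

def write_rc(filename, values, nstep):
    """Write step, value, and cumulative average, one sample per line."""
    total = 0.0
    with open_text(filename, "w") as f:
        for count, value in enumerate(values, start=1):
            total += value
            # assumed step numbering: first sample is written at step `nstep`
            f.write(f"{count * nstep} {value:.6f} {total / count:.6f}\n")

def read_rc(filename):
    """Return a list of (step, value, cumulative_average) tuples."""
    rows = []
    with open_text(filename, "r") as f:
        for line in f:
            step, value, average = line.split()
            rows.append((int(step), float(value), float(average)))
    return rows

# made-up samples, written to a temporary directory and read back
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cmz.dat.gz")
    write_rc(path, [1.0, 2.0, 6.0], nstep=100)
    rows = read_rc(path)
~~~

Since the same helpers fall back to plain text when the name lacks the `.gz` suffix, scripts written this way should work on both compressed and uncompressed output.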


## System Sanity

2 changes: 1 addition & 1 deletion docs/conf.py
@@ -15,7 +15,7 @@
# sys.path.insert(0, os.path.abspath('.'))

project = 'Faunus'
copyright = '2019, Mikael Lund'
copyright = '2020, Mikael Lund'
author = 'Mikael Lund'
source_suffix = ['.rst', '.md']
master_doc = 'index'
20 changes: 15 additions & 5 deletions src/analysis.cpp
@@ -451,24 +451,34 @@ void FileReactionCoordinate::_to_json(json &j) const {
}

void FileReactionCoordinate::_sample() {
if (file) {
if (*stream) {
double val = (*rc)();
avg += val;
file << fmt::format("{} {:.6f} {:.6f}\n", cnt * steps, val, avg.avg());
(*stream) << fmt::format("{} {:.6f} {:.6f}\n", cnt * steps, val, avg.avg());
}
}

FileReactionCoordinate::FileReactionCoordinate(const json &j, Space &spc) {
from_json(j);
name = "reactioncoordinate";
filename = MPI::prefix + j.at("file").get<std::string>();
file.open(filename); // output file
if (auto suffix = filename.substr(filename.find_last_of(".") + 1); suffix == "gz") {
faunus_logger->trace("{}: GZip compression enabled for {}", name, filename);
stream = std::make_unique<zstr::ofstream>(filename);
} else {
stream = std::make_unique<std::ofstream>(filename);
}
if (not*stream) {
throw std::runtime_error("could not open create "s + filename);
}
type = j.at("type").get<std::string>();
rc = ReactionCoordinate::createReactionCoordinate({{type, j}}, spc);
}

void FileReactionCoordinate::_to_disk() {
if (file)
file.flush(); // empty buffer
if (*stream) {
stream->flush(); // empty buffer
}
}

void WidomInsertion::_sample() {
4 changes: 2 additions & 2 deletions src/analysis.h
@@ -64,15 +64,15 @@ class FileReactionCoordinate : public Analysisbase {
private:
Average<double> avg;
std::string type, filename;
std::ofstream file;
std::unique_ptr<std::ostream> stream = nullptr;
std::shared_ptr<ReactionCoordinate::ReactionCoordinateBase> rc = nullptr;

void _to_json(json &j) const override;
void _sample() override;
void _to_disk() override;

public:
FileReactionCoordinate(const json &j, Space &spc);
FileReactionCoordinate(const json &, Space &);
};

/**