Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i.cca: use 0 based array indexing #3239

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 15 additions & 15 deletions imagery/i.cca/main.c
Original file line number Diff line number Diff line change
Expand Up @@ -156,10 +156,10 @@ int main(int argc, char *argv[])
cov[i] = G_alloc_matrix(bands, bands);
}

outbandmax = (CELL *)G_calloc(nclass, sizeof(CELL));
outbandmin = (CELL *)G_calloc(nclass, sizeof(CELL));
datafds = (int *)G_calloc(nclass, sizeof(int));
outfds = (int *)G_calloc(nclass, sizeof(int));
outbandmax = (CELL *)G_calloc(bands, sizeof(CELL));
outbandmin = (CELL *)G_calloc(bands, sizeof(CELL));
datafds = (int *)G_calloc(bands, sizeof(int));
outfds = (int *)G_calloc(bands, sizeof(int));

/*
Here is where the information regarding
Expand All @@ -169,13 +169,14 @@ int main(int argc, char *argv[])
*/

samptot = 0;
for (i = 1; i <= nclass; i++) {
nsamp[i] = sigs.sig[i - 1].npoints;
for (i = 0; i < nclass; i++) {
nsamp[i] = sigs.sig[i].npoints;
samptot += nsamp[i];
for (j = 1; j <= bands; j++) {
mu[i][j] = sigs.sig[i - 1].mean[j - 1];
for (k = 1; k <= j; k++)
cov[i][j][k] = cov[i][k][j] = sigs.sig[i - 1].var[j - 1][k - 1];
for (j = 0; j < bands; j++) {
mu[i][j] = sigs.sig[i].mean[j];
for (k = 0; k < j; k++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try

Suggested change
for (k = 0; k < j; k++) {
for (k = 0; k <= j; k++) {

because j is not the upper bound of k to be excluded but to be included, also with zero-based indexing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting my mistake!

Still even with this being fixed, output is not identical to the old version. Should it be identical? Suggestions how to test?

Copy link
Contributor

@metzm metzm Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functions in stats.c assume zero-based indices, thus I would expect different results in the sense that results with your PR (zero based indices) are correct whereas previous results (one based indices) were not correct, particularly because the indexing to cov[][][] did not change.

You would need some independent, trusted alternative for Canonical components analysis in order to decide which results are correct. Regarding the code, your PR seems like a bugfix, previous results are probably not correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for clarification, cov has been indexed previously in main.c as cov[i][j][k] which was wrong for one-based i, j, k because zero-based i, j, k are used in stats.c. With your PR, cov is consistently indexed with zero-based i, j, k, that could explain the differences.

cov[i][j][k] = cov[i][k][j] = sigs.sig[i].var[j][k];
}
}
}

Expand All @@ -200,14 +201,13 @@ int main(int argc, char *argv[])
}

/* open the cell maps */
for (i = 1; i <= bands; i++) {
for (i = 0; i < bands; i++) {
outbandmax[i] = (CELL)0;
outbandmin[i] = (CELL)0;

datafds[i] =
Rast_open_old(refs.file[i - 1].name, refs.file[i - 1].mapset);
datafds[i] = Rast_open_old(refs.file[i].name, refs.file[i].mapset);

sprintf(tempname, "%s.%d", out_opt->answer, i);
sprintf(tempname, "%s.%d", out_opt->answer, i + 1);
outfds[i] = Rast_open_c_new(tempname);
}

Expand All @@ -219,7 +219,7 @@ int main(int argc, char *argv[])
Rast_init_colors(&color_tbl);

/* close the cell maps */
for (i = 1; i <= bands; i++) {
for (i = 0; i < bands; i++) {
Rast_close(datafds[i]);
Rast_close(outfds[i]);

Expand Down
91 changes: 91 additions & 0 deletions imagery/i.cca/testsuite/test_i_cca.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
"""
Name: i.cca tests
Purpose: Test correctness of generated outputs

Author: Maris Nartiss
Copyright: (C) 2023 by Maris Nartiss and the GRASS Development Team
Licence: This program is free software under the GNU General Public
License (>=v2). Read the file COPYING that comes with GRASS
for details.
"""
from grass.script.core import tempname
neteler marked this conversation as resolved.
Show resolved Hide resolved

from grass.gunittest.case import TestCase
from grass.gunittest.main import test


class OutputMatchTest(TestCase):
"""
Compare values to output generated by pre-7.0 version of the module.
Comparison values were obtained with pre-ccmath rewrite version
664305e4b8de924c8dfb5385e8bf6c18fbf649be
"""

@classmethod
def setUpClass(cls):
"""Ensures expected computational region and generated data"""
cls.use_temp_region()
cls.runModule("g.region", raster="lsat7_2000_20")
cls.group_name = tempname(10)
cls.subgroup_name = "vis"
cls.runModule(
"i.group",
group=cls.group_name,
subgroup=cls.subgroup_name,
input="lsat7_2000_20,lsat7_2000_30,lsat7_2000_40",
)
Comment on lines +29 to +37
Copy link
Member

@neteler neteler Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cls.runModule("g.region", raster="lsat7_2000_20")
cls.group_name = tempname(10)
cls.subgroup_name = "vis"
cls.runModule(
"i.group",
group=cls.group_name,
subgroup=cls.subgroup_name,
input="lsat7_2000_20,lsat7_2000_30,lsat7_2000_40",
)
cls.runModule("g.region", raster="lsat7_2002_20")
cls.group_name = tempname(10)
cls.subgroup_name = "vis"
cls.runModule(
"i.group",
group=cls.group_name,
subgroup=cls.subgroup_name,
input="lsat7_2002_20,lsat7_2002_30,lsat7_2002_40",
)

There is an CI error: "b'ERROR: Raster map <lsat7_2000_20> not found\n'" - the available Landsat datasets are of 2002. Updated accordingly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will probably affect the expected results, @marisn?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Markus, I have had no time (and am not certain if my math level is up to task) to go over formulas from the original paper (or was it a technical document?) to create a synthetic dataset with known results as I did for the r.kappa module. This is the reason why this PR has not been merged as at the moment is possible only to test if module runs but not if it produces correct output.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, funny understood.
So I'd suggest to commit the data name change and see what happens next.

cls.rasters = []
cls.signatures = []

@classmethod
def tearDownClass(cls):
"""Remove the temporary region and generated data"""
cls.del_temp_region()
cls.runModule(
"g.remove", flags="f", type="group", name=cls.group_name, quiet=True
)
for r in cls.rasters:
cls.runModule("g.remove", flags="f", type="raster", name=r, quiet=True)
for s in cls.signatures:
cls.runModule("i.signatures", type="sig", remove=s, quiet=True)

def test_output_values(self):
"""Test correctness of i.cca output"""
sig_name = tempname(10)
self.assertModule(
"i.cluster",
classes=4,
group=self.group_name,
subgroup=self.subgroup_name,
signaturefile=sig_name,
quiet=True,
)
self.signatures.append(sig_name)
out_prefix = tempname(10)
self.assertModule(
"i.cca",
group=self.group_name,
subgroup=self.subgroup_name,
signature=sig_name,
output=out_prefix,
quiet=True,
)
self.assertRasterExists(f"{out_prefix}.1")
self.rasters.append(f"{out_prefix}.1")
self.assertRasterExists(f"{out_prefix}.2")
self.rasters.append(f"{out_prefix}.2")
self.assertRasterExists(f"{out_prefix}.3")
self.rasters.append(f"{out_prefix}.3")
self.assertRasterMinMax(
map=f"{out_prefix}.1", refmin=-10, refmax=9, msg="Wrong calculated value"
)
self.assertRasterMinMax(
map=f"{out_prefix}.2", refmin=-34, refmax=-3, msg="Wrong calculated value"
)
self.assertRasterMinMax(
map=f"{out_prefix}.3", refmin=-24, refmax=5, msg="Wrong calculated value"
)


if __name__ == "__main__":
test()
Loading