-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rework gabor filter and add log gabor filter #235
base: master
Are you sure you want to change the base?
Conversation
61196be
to
a30e697
Compare
Codecov Report
@@ Coverage Diff @@
## master #235 +/- ##
==========================================
+ Coverage 92.14% 92.45% +0.30%
==========================================
Files 12 14 +2
Lines 1642 1709 +67
==========================================
+ Hits 1513 1580 +67
Misses 129 129
Continue to review full report at Codecov.
|
a30e697
to
6812c7d
Compare
595eb00
to
b18229d
Compare
This commit introduces an lazy array interface for the Gabor filter, benchmark shows that it's 2-3x faster than the previous version. The documentation is heavly rewrote to explain this filter and all its parameters.
b18229d
to
7f29c9e
Compare
59d5bc8
to
a0f1e28
Compare
I'm marking this as ready as I don't plan to add any further major changes to this PR. I'm going to add phase congruency filter next, and then finally the FSIM quality index JuliaImages/ImageQualityIndexes.jl#25. I'll keep this PR pending until everything downstream is ready. ImagePhaseCongruency.jl is really a good reference but it has a strong MATLAB legacy, I believe we can get things faster and give it better composability by properly resolving and designing the function. |
a0f1e28
to
e0f182a
Compare
Benchmark shows that F32 on LogGabor is somewhat slow. I'll try to investigate it before merging this. BenchmarkGroup:
2-element BenchmarkTools.BenchmarkGroup:
tags: []
"Gabor" => 6-element BenchmarkTools.BenchmarkGroup:
tags: []
"Gabor_ComplexF32_19×19" => Trial(5.625 μs)
"Gabor_ComplexF32_256×256" => Trial(994.030 μs)
"Gabor_ComplexF32_512×512" => Trial(3.857 ms)
"Gabor_ComplexF64_19×19" => Trial(7.870 μs)
"Gabor_ComplexF64_256×256" => Trial(1.440 ms)
"Gabor_ComplexF64_512×512" => Trial(5.553 ms)
"LogGabor" => 6-element BenchmarkTools.BenchmarkGroup:
tags: []
"LogGabor_ComplexF64_19×19" => Trial(13.903 μs)
"LogGabor_ComplexF64_256×256" => Trial(2.422 ms)
"LogGabor_ComplexF64_512×512" => Trial(9.884 ms)
"LogGabor_ComplexF32_19×19" => Trial(68.269 μs)
"LogGabor_ComplexF32_256×256" => Trial(11.896 ms)
"LogGabor_ComplexF32_512×512" => Trial(51.113 ms) Writing benchmark isn't something pleasant, but when you get the report, oh that's a surprise! |
I am very pleased and impressed that this work is being done. Please help yourself to any code you think is useful in ImagePhaseCongruency I am sure much of it can be considerably improved and made more interoperable with the rest of the Images ecosystem. Another function from ImagePhaseCongruency that I think is deserving of being included in Images is Nate that I am now retired and find myself not doing very much coding these days. Indeed at the moment I am often off the grid for extended periods of time. Thus I am not in a position to provide code reviews but should you have any questions about my code I will be more than happy to try to answer them. In the meantime I will be cheering you on from the sidelines! |
if kern.normalize | ||
# normalize: from reference [1] of LogGabor | ||
# By changing division to multiplication gives about 5-10% performance boost | ||
x, y = (x, y) .* kern.freq_scale | ||
end | ||
ω = sqrt(x^2 + y^2) # this is faster than hypot(x, y) | ||
θ = atan(y, x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is often the case that we can pre-compute the frequency ω
and orientation θ
and then call kern[ω, θ]
. In this case, we might want to introduce a sphere index type to cache the computation and make the getindex
dispatch on it.
for p in SphereIndices(img)
# by doing this, we only need to compute the sphere index once
kern1[p] * kern2[p]
end
Because some kernels are defined orientation-invariant, they wouldn't need the θ
information at all. Thus each component of SphereIndex
must be computed lazily so that we don't compute θ
if we don't need it.
cc @Tokazama for a possible use case for JuliaArrays/SpatioTemporalTraits.jl#11. This is kind of the opposite of the CompositeSystem
you proposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've long wanted to be able to have a method for explicitly extracting spatial indices and this is a great practical example that motivates an implementation. We need some way of having a spatial system in the midst of color, time, etc. which users may end up creating a view of the array that splits up or permuted relevant dimensions. The CompositeSystem
proposal was an attempt to accommodate that, but if we did go in that direction we would still need to be able to extract a succinct representation of just the spatial system (e.g., SphericalSystem
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do already have coords_spatial
and ImageAxes.filter_space_axes
. I'm not a fan of color axes because having one destroys some of our most important abstractions (one array element = one pixel), but I recognize we may want to support such constructs slightly better than we do now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do already have
coords_spatial
andImageAxes.filter_space_axes
.
I thought those methods were only reliable if the type encoding the coordinate system is kept at the top level and it was a cartesian system, like we usually assume with AxisArray. Could we rely on that approach if the coordinate system was encoded in nested arrays or the spatial indices couldn't just be permuted or sliced (e.g., one dimension was for the radius and the other was for the azimuth)?
I'm not a fan of color axes because having one destroys some of our most important abstractions
I agree, but it would be good to play nice with the machine learning community on that one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can fix the nesting in the same way you've been doing it in ArrayInterface. Don't need a new trait that does the same thing, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yeah, we should provide better support for ML. Not by changing our algorithms, but by making it easier to represent their raw data in our framework. Making algorithms work for arrays that have a color channel is messy and error-prone, because there is no way to know how to interpret that channel unless it's annotated unambiguously. In which case, why not just change the representation to get rid of the color channel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if we wanted to facilitate what was proposed we could do something like this
SpatialIndices(x) = SpatialIndices(CoordinateSystem(x), x)
SpatialIndices(cs::SphericalSystem, x) = SphereIndices{SphereIndex}(cs, indices_spatial(x))
We still use existing traits (indices_spatial
), but we need more information to know which indices correspond to the radius, azimuth, and zenith so we can compose each SphereIndex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because some of the computation can be non-trivial, can this design handle the task "I only want to iterate over the first dimension of the grid so if possible, don't do the extra computation for the second dimension"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding everything correctly here, you have two unique sets of coordinate systems with corresponding spatial indices/coordinates.
I would think that's sufficient to have a function like this:
function map_spatial_indices(dest, dest_cs::CartesianSystem, src, src_cs::SphericalSystem)
...
end
That function certianly isn't pretty and it would be nice if we could make it work through getindex
.
Currently we don't have very native CUDA support for these two kernels, so the test set mainly serves as "okay, we can convert the lazy kernel array to CuArray by following the steps in tests and then use GPU to do multiplication and fft"
@peterkovesi Thanks for the information! I've made an initial implementation of the Profiling my initial implementation code shows that almost 80% of the computation is spent on the log Gabor filter, which is definitely a performance hotspot. That said, this PR is far from ready or satisfying. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very nice! One overall comment (multidimensional police reporting for duty, sir 😄): how about parametrizing this as in https://stackoverflow.com/a/25633868/1939814? You can of course rotate the coordinates before applying that formula, so it might make sense to also allow a rotation transformation.
The value `λ=2` should not be used in combination with phase offset `ψ=-π/2` or `ψ=π/2` | ||
because in these cases the Gabor function is sampled in its zero crossings. | ||
|
||
In order to prevent the occurence of undesired effects at the image borders, the wavelength |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to prevent the occurence of undesired effects at the image borders, the wavelength | |
In order to prevent the occurrence of undesired effects at the image borders, the wavelength |
because in these cases the Gabor function is sampled in its zero crossings. | ||
|
||
In order to prevent the occurence of undesired effects at the image borders, the wavelength | ||
value should be smaller than one fifth of the input image size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this specific number come from? Is this a categorical difference? I.e., do you get "undesired effects" for 4.9 but none for 5.1?
## `kernel_size::Dims{2}` or `kernel_axes::NTuple{2,<:AbstractUnitRange}` | ||
|
||
Specifies either the size or axes of the output kernel. The axes at each dimension will be | ||
`-r:r` if the size is odd. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's r
? And what happens in the even case?
## `orientation::Real`(θ∈[0, 2pi]): | ||
|
||
This parameter specifies the orientation of the normal to the parallel stripes of a Gabor | ||
function [3]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add "0 would indicate vertical (first-axis) stripes." (If that's correct.)
if kern.normalize | ||
# normalize: from reference [1] of LogGabor | ||
# By changing division to multiplication gives about 5-10% performance boost | ||
x, y = (x, y) .* kern.freq_scale | ||
end | ||
ω = sqrt(x^2 + y^2) # this is faster than hypot(x, y) | ||
θ = atan(y, x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do already have coords_spatial
and ImageAxes.filter_space_axes
. I'm not a fan of color axes because having one destroys some of our most important abstractions (one array element = one pixel), but I recognize we may want to support such constructs slightly better than we do now.
|
||
@inline Base.size(kern::Gabor) = map(length, kern.ax) | ||
@inline Base.axes(kern::Gabor) = kern.ax | ||
@inline function Base.getindex(kern::Gabor, x::Int, y::Int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we index in y, x
format?
xr = x*c + y*s | ||
yr = -x*s + y*c | ||
yr = γ * yr | ||
return exp(-(xr^2 + yr^2)/(2σ^2)) * cis(xr*λ_scaled + ψ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if we index this with out-of-bounds x
and y
? If lack of BoundsError
is deliberate, it should be documented.
What happened to this rework of Gabor function in ImageFiltering.jl? When I google this, it returns this stuff but not implemented in ImageFiltering.jl? |
The previous
Kernel.gabor
function has three drawbacks:Tuple{Matrix, Matrix}
as real and imaginary part; we should just returnMatrix{Complex{Float64}}
insteadbenchmark on gabor filter:
Docs Preview: https://juliaimages.org/ImageFiltering.jl/previews/PR235/demos
TODO:
The implementation of the log gabor filter heavily follows https://www.peterkovesi.com/matlabfns/PhaseCongruency/Docs/convexpl.html so I'm wondering if @peterkovesi could help review this.