Certain combinations of filter expressions may produce unexpected query results #25565

Open
Tracked by #25091
hiltontj opened this issue Nov 18, 2024 · 0 comments
hiltontj commented Nov 18, 2024

For now this is only a hunch, since there is no reproducer yet. My suspicion is that, given how the last cache handles predicates, a query containing multiple predicates on a single column will not be handled correctly.

For example,

SELECT * FROM last_cache('foo') WHERE bar = 'baz' OR bar = 'bop'

will be evaluated using only one of the bar = predicates, instead of both in combination.

The fix should be to combine the expressions in such a scenario into a single predicate:

SELECT * FROM last_cache('foo') WHERE bar IN ('baz', 'bop')
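To illustrate the proposed rewrite, here is a minimal sketch in Python. It is not the last cache's actual code: the function names and the (column, value) representation of predicates are hypothetical, and it assumes all the equality predicates passed in are joined by OR.

```python
def merge_or_equalities(predicates):
    """Group OR'ed `column = value` predicates by column.

    `predicates` is a list of (column, value) pairs, all assumed to be
    joined by OR (a hypothetical representation for illustration only).
    Returns a dict mapping each column to its list of distinct values,
    in first-seen order.
    """
    merged = {}
    for column, value in predicates:
        values = merged.setdefault(column, [])
        if value not in values:
            values.append(value)
    return merged


def render_where(merged):
    """Render the grouped predicates as a SQL WHERE fragment."""
    clauses = []
    for column, values in merged.items():
        quoted = ", ".join(f"'{v}'" for v in values)
        if len(values) == 1:
            # A single value stays a plain equality.
            clauses.append(f"{column} = {quoted}")
        else:
            # Several values on one column collapse into an IN list.
            clauses.append(f"{column} IN ({quoted})")
    return " OR ".join(clauses)


merged = merge_or_equalities([("bar", "baz"), ("bar", "bop")])
print(render_where(merged))
```

Grouping by column first means both values are kept for bar, rather than one predicate silently replacing the other.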

Other scenarios that need to be considered are those where multiple incompatible predicates are provided on the same column, for example,

SELECT * FROM last_cache('foo') WHERE bar = 'baz' AND bar = 'bop'
SELECT * FROM last_cache('foo') WHERE bar = 'baz' OR bar != 'baz'

These should either result in an error or be ignored, so that DataFusion can handle them.
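A rough sketch of how such combinations might be detected is shown below in Python. Everything here is hypothetical: the function name, the (column, op, value) tuples, and the assumption that all predicates share one connective (AND or OR). It only flags the two patterns from the examples above and does not decide between raising an error and deferring to DataFusion.

```python
def has_incompatible_predicates(connective, predicates):
    """Flag the two problem patterns on a single column.

    `connective` is "AND" or "OR" and applies to every predicate.
    `predicates` is a list of (column, op, value) tuples, where op is
    "=" or "!=" (a hypothetical representation for illustration only).
    """
    equals = {}
    not_equals = {}
    for column, op, value in predicates:
        target = equals if op == "=" else not_equals
        target.setdefault(column, set()).add(value)

    for column, values in equals.items():
        # bar = 'baz' AND bar = 'bop': two different values required at once.
        if connective == "AND" and len(values) > 1:
            return True
        # bar = 'baz' OR bar != 'baz': the same value is both required and excluded.
        if connective == "OR" and values & not_equals.get(column, set()):
            return True
    return False


print(has_incompatible_predicates("AND", [("bar", "=", "baz"), ("bar", "=", "bop")]))
print(has_incompatible_predicates("OR", [("bar", "=", "baz"), ("bar", "!=", "baz")]))
```

A caller could use a True result either to return an error or to skip predicate handling in the cache and leave the filter to DataFusion, per the two options above.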
