Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix namespaced TreeElements #696

Closed
wants to merge 4 commits into from
Closed

Conversation

lognaturel
Copy link
Member

Closes #695 and fixes some additional issues I discovered around path expressions with namespaced nodes. I considered breaking these up into multiple PRs but they feel easier to understand together. If any single commit feels riskier or harder to understand, I can separate it out.

What has been done to verify that this works as intended?

Added a test for each of the fixes, ran the whole test suite. Have NOT tried in Collect yet.

Why is this the best possible solution? Were any other approaches considered?

The first commit fixes an issue with XPath path expressions that have namespace prefixes in them. In some case, the namespace prefix was not being considered correctly. The history there is that https://github.com/getodk/javarosa/pull/153/files added support for namespaced instance elements. Either that was incomplete or later performance-related changes broke this (related to #192). I didn't see any alternatives here and it feels safe to me.

The second commit fixes the original issue. After deserialization, there was no record of the namespace prefix. I was initially confused because #155 added testing related to serialization as discussed at #153 (comment) but after closer inspection I realized we had a misunderstanding about which serialization. The test verifies serializing the instance to XML, not caching the form definition. I saw two options to fix this: serialize the namespace prefix or serialize a map from prefix to namespace. I went with serializing the namespace prefix for each node. There's potential for redundancy but I don't expect namespaced instance nodes to be very common and we can revisit if we think it's worth taking on extra complexity. This also feels safe to me.

After fixing the first issue, I went looking for other comparisons against TreeElement names that might cause issues. I noticed that there were problems around attributes.

In the third commit, I added testing for getAttribute. I also discovered an existing test with a bad assertion that would have caught the issue I was noticing. Things got a little awkward here. getAttribute gets called by parsing code which wants to get attributes according to well-known node names and namespaces. But it also gets called when evaluating XPath path expressions. In that case, the namespace prefix and the node name are passed in and the namespace is always null. I considered breaking up those two cases but I wasn't confident enough that I could identify exactly when each case would be needed. Even though it's not very pretty, it feels safer to me to keep the existing single method with two cases. As it is now, this feels low risk but maybe a bit higher risk than the others. Another thing I found out doing this work is that when attributes on instance elements are in the default namespace, they're set to an empty string rather than null. I couldn't find where that happened so I added an explicit || (Objects.equals(attribute.namespace, "") && namespace == null) case for that. Again, not the prettiest but it has the advantage of being explicit and relatively easy to verify.

Finally, it occurred to me to write a test to get the value of a namespaced attribute on an instance element. That was not working either. That's because the namespace prefix was not being saved on attribute TreeElements. I was tempted to get into some deeper refactoring here but didn't. I kept the existing method signatures and made my changes additive except for setBindAttribute which really feels like an internal implementation detail. I think there's some risk here that the fix is incomplete.

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

All intentional changes are bug fixes as described above and by the tests. I overall feel like this is additive and pretty low risk. That said, if people are using custom namespaces and attributes in their forms, I could imagine some changes in behavior or regressions. This also does touch some broadly-used things like bind attributes. There's also always a bit of risk when touching serialization.

Do we need any specific form for testing your changes? If so, please attach one.

See tests.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

No.

An alternative would be to serialize the namespace map. Currently this is only available at parse time so it would be a bigger change. We generally don't expect many namespaced instance elements so this is a pragmatic approach for now. If we find namespace usage takes off, we can reconsider.
There was a test for this but the assertion was wrong. Fixed that and added more cases. Ideally the attribute namespace would never be a blank string but I can't figure out where that is being set.
@lognaturel lognaturel requested a review from seadowg October 26, 2022 22:58
@lognaturel
Copy link
Member Author

We've decided to close this for now. After discussion and reflection, we're no longer seeing as much value to using namespaces in the instance. See XLSForm/pyxform#105 (comment) for related discussion.

@lognaturel lognaturel closed this Dec 15, 2022
@lognaturel lognaturel deleted the bind-ns branch December 15, 2022 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bind calculates do not work for nodesets with prefixed nodes for cached FormDef
1 participant