Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization Refactor #4162

Draft
wants to merge 40 commits into
base: master
Choose a base branch
from
Draft

Serialization Refactor #4162

wants to merge 40 commits into from

Conversation

antonydellavecchia
Copy link
Collaborator

Refactor serialization to use a new function called type_params which helps build a stack of type dependencies which will allow for serializing type information before the data.

lkastner
lkastner previously approved these changes Nov 20, 2024
@lgoettgens
Copy link
Member

This PR seems to have diverged a lot from current master due to other changes to the serialization code in the meantime. Could you try to resolve the conflicts / rebase this onto a more current master, so that one can more easily follow what's happening and CI runs again?

I am a bit afraid that this just continues to diverge even further in the future, which makes it harder to resolve conflicts once you want to merge it.

@lgoettgens lgoettgens dismissed lkastner’s stale review January 15, 2025 09:14

many changes since approval, and PR has conflicts. please re-review once this is ready to merge

@@ -670,7 +670,7 @@ function automorphism_group(K::SimplicialComplex; action=:on_vertices)
end

@doc raw"""
on_simplicial_complex(K::SimplicialComplex, g::PermGroupElem)
on_simplicial_complex(K::SimplicialComplex, g::PermGroupElem; action=:on_vertices)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this maybe a conflict resolution artifact? it does not match the signature of the function this docstring is attached to

return G
end)

install_GAP_deserialization(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ThomasBreuer I think I need your help here, and it also makes sense for you to understand what I am doing here anyway.

I am currently try to get the serialization refactor working for gap objects, and the first one that causes an issue is when we try and save subgroups of free groups

If you see above I have a new function called type_params and this handles when params need to be added to the type description for example saving a subgroup of a free group currently looks like this

{
  "_ns": {
    "Oscar": [
      "https://github.com/oscar-system/Oscar.jl",
      "1.3.0-DEV-0324e87a589a2c2c5b6f4b0db4634b14477250ae"
    ]
  },
  "_refs": {
    "fef8f069-d2db-4c88-9118-f2b7ba1c3999": {
      "_type": "GapObj",
      "data": {
        "GapType": "IsFreeGroup",
        "names": [
          "f1",
          "f2"
        ],
        "wfilt": "IsSyllableWordsFamily"
      }
    }
  },
  "_type": {
    "name": "GapObj",
    "params": "fef8f069-d2db-4c88-9118-f2b7ba1c3999"
  },
  "data": {
    "GapType": "IsFreeGroup",
    "gens": [
      [
        "1",
        "1"
      ]
    ]
  },
  "id": "ab56be5f-1bf6-4fec-a3d5-35cb59b6096b"
}

Note how there is no longer a freeGroup key in the data section and that the type information of the GapObj now has a params key.

The issue I am having now is that I need to pass the free group down, but this means I need two different function signatures.

But I am guessing these cannot share the same namespace in GAP?

please let me know if you have any questions about some of the changes here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@antonydellavecchia I am not sure whether I understand the aim of the proposed changes.
A subgroup of a full free group is best described in terms of generators relative to the full free group, hence the "old" approach first serializes the full free group and then the generators of the subgroup in question, and the deserialization first reconstructs the full free group and then the subgroup generators w.r.t. the full group.

The idea of the "new" approach seems to be that the function type_params shall describe the object(s) to be serialized together with the data of the given object.
The problem is that type_params for a GapObj has to deal with a lot of different GapObjs.
In the "old" serialization and deserialization of GAP objects, the approach using install_GAP_serialization and install_GAP_deserialization yields small Julia functions for different kinds of GAP objects. One could think of an analogous install_GAP_type_params in order to achieve this for type_params iin the "new" approach, but what is the advantage of using type_params, compared to the "old" (recursive) serialization approach?

Just for curiosity: How does the "new" approach handle objects that need more than one dependent object to be serialized?
(For example, the most natural way to serialize a map seems to be to serialize domain and codomain and information that defines the function in question.)
These dependent objects are stored in _refs, their keys in _refs shall be referenced somewhere in _type, but how does the deserialization code know which is which? In the "old" approach, the references to the keys in _refs are stored in fields inside data (like the freeGroup field), whose meanings are defined by the serialization.

Copy link
Collaborator Author

@antonydellavecchia antonydellavecchia Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that type params will do a shallow pass through the object and obtain all necessary type information for the object. This has many benefits, the main one being that this would allow us to put all type information at the top of the file and hence we would be able to read in all params for the object before we get to the data in a linear way.

here is another example of how a the new format looks for a MPolyAnyMap.
You can see that the params section can also have keys, i.e. we can have params that are dictionnaries if necessary
(this does not yet include the linear ordering)

{
  "_ns": {
    "Oscar": [
      "https://github.com/oscar-system/Oscar.jl",
      "1.3.0-DEV-98fc56d6f49a6eacecd525b01c0e543857608f58"
    ]
  },
  "_refs": {
    "bede6628-38bf-4803-88df-2ed7487fc8c8": {
      "_type": {
        "name": "MPolyRing",
        "params": {
          "_type": "QQField"
        }
      },
      "data": {
        "symbols": [
          "s",
          "t",
          "m"
        ]
      }
    },
    "fc49a9a8-91af-4b54-888b-58b4f2dd1958": {
      "_type": {
        "name": "MPolyRing",
        "params": {
          "_type": "QQField"
        }
      },
      "data": {
        "symbols": [
          "x",
          "y"
        ]
      }
    }
  },
  "_type": {
    "name": "MPolyAnyMap",
    "params": {
      "codomain": {
        "name": "MPolyRing",
        "params": "bede6628-38bf-4803-88df-2ed7487fc8c8"
      },
      "domain": {
        "name": "MPolyRing",
        "params": "fc49a9a8-91af-4b54-888b-58b4f2dd1958"
      }
    }
  },
  "data": {
    "images": [
      [
        [
          [
            "1",
            "1",
            "0"
          ],
          "1"
        ]
      ],
      [
        [
          [
            "0",
            "0",
            "1"
          ],
          "1"
        ]
      ]
    ]
  }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @antonydellavecchia .
Thus the rule is that it is recommended (or mandatory?) to "announce" the dependent objects via type_params.

Then the interface for the (de)serialization for a given object type T (here: GapObj) consists of

  • a save_object method with second argument T,
  • a load_object method with two or three arguments, the second argument being T,
  • a type_params method with one argument of type T, returning S, X where X is either nothing or an object of type S (a dictionary or a GapObj); in the latter case, there must be a three argument method for load_object that takes X as its third argument.

If this is correct then I propose a function install_GAP_type_params as sketched above for the case of GapObj. I can provide the code for that.

Copy link
Collaborator Author

@antonydellavecchia antonydellavecchia Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rule will become mandatory.

the load object third parameters will exactly be the second entry of the output from type_params

the output of type_params should be a tuple where the first entry is the type and the second entry are the params for that type, for example above type_params with return the type

PolyAnyMap, Dict(:codomain => QQ[:s, :t, :m], :domain => QQ[:x, :y])

for a group element g It would be some thing like

GroupElem, parent(g)

and yes exactly, if the second output of type_params is nothing, then that particular load_object method only needs two parameters

antonydellavecchia and others added 4 commits January 16, 2025 12:24
- introduce `install_GAP_type_params` that implements
  `type_params` methods based on GAP filters
- adjust code for `IsFreeGroup` and `IsSubgroupFpGroup`;
  now their tests work again
@ThomasBreuer
Copy link
Member

The tests for pc groups are still disabled. Here the problem is that load_object(s, Vector, (Tuple, [Int, Int, [Vector, Int]]), :comm_rels) does not work anymore.

@antonydellavecchia
Copy link
Collaborator Author

The tests for pc groups are still disabled. Here the problem is that load_object(s, Vector, (Tuple, [Int, Int, [Vector, Int]]), :comm_rels) does not work anymore.

Thanks for that. I've made the loading of vectors a little more intuitive.
Please see the changes I've just pushed to get the GAP object serialization working

@ThomasBreuer
Copy link
Member

ThomasBreuer commented Jan 20, 2025

Please see the changes I've just pushed to get the GAP object serialization working

Thanks @antonydellavecchia.
I am going to adjust the serialization for GAP pc groups.
Now the tests for GAP pc groups pass.

@@ -112,36 +93,27 @@ end

@register_serialization_type WeylGroup uses_id

type_params(W::WeylGroup) = WeylGroup, root_system(W)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stupid question maybe: It seems to me that the first return value of type_params is always the argument type. Is this a correct observation?
If yes, wouldn't it be possible to refactor that away into the calling function, so that type_params loses the first return value? Imo, this could make the code a bit more readable. If there is a reason for this, I am happy to learn about it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe there is a way to get a round it, but it was the only way I found I could get the new save_type_params to work. See main.jl

save_data_dict(s, :attrs) do
for attr in attrs_list(s, T)
has_attribute(obj, attr) && save_typed_object(s, get_attribute(obj, attr), attr)
save_type_params(s::SerializerState, obj::Any) = save_type_params(s, type_params(obj)...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
save_type_params(s::SerializerState, obj::Any) = save_type_params(s, type_params(obj)...)
save_type_params(s::SerializerState, obj::Any) = save_type_params(s, typeof(obj), type_params(obj)...)

I think this could do the trick

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes this is probably fine

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, a little trickier than expected, there is an issue with the containers.

I'll have to dig deeper, but I dont have time now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants