Add ToddCoxeterBand method #691
Update on the status of this pull request

File which I reference later: Band-Todd-Coxeter.pdf (note: this document contains examples with diagrams, which will hopefully make the material in this comment clearer!)

Most of the work here was done a while ago, so my memory is a bit rusty, but the material is still relevant, although it needs a bit more work before it is fully justified. I'm going to write out the work that led us here and explain what I'm still unsure about. The code written in this PR is a first attempt at answering the question:
For general presentations (not band presentations), we can use the Todd-Coxeter (TC) algorithm to answer this question. The rigorous backing for this that I have familiarised myself with is the work of @james-d-mitchell, @flsmith, @mariatsalakou and Tom Coleman, which describes coset tables as labelled digraphs (called "R-digraphs"). The node set of an R-digraph is a subset of A* (these are the "cosets" of coset tables), its edges are labelled by elements of A, and the restriction is that if two paths in the digraph lead to the same node, then the words labelling those paths must be equal in the presentation.

They explain that any TC-style algorithm, referred to as a congruence enumeration process, can be implemented as a combination of three basic digraph manipulations, each taking the digraph at step i, G(i), and outputting a new digraph at step i+1, G(i+1). The algorithm also keeps track of a set of coincidences K(i): a set of pairs of words which are known to be equal in the presentation, but which currently label different paths in G(i). (The idea is that all coincidences should eventually be eliminated, so that the R-digraph is consistent with the presentation it is computing.) The three steps are, roughly:
With some additional assumptions, the idea is that congruence enumeration processes for finite semigroups always terminate on a description of the semigroup. Our idea with @reiniscirpons and @MTWhyte was then to adapt this to work for bands too. How do we do this?

If we're given an alphabet A of size n and a set of relations R, we want to compute the smallest band which satisfies the relations R. The first thing we could try is to add "the band relations" to R, and then run the usual Todd-Coxeter algorithm on A and R. If, say, we have A = {x,y} and R = {xy=yx}, then the "band relations" (a list of relations which give a presentation for the free band on A) are {xx=x, yy=y, xyxy=xy, yxyx=yx}. So, running TC on <A, {xx=x, yy=y, xyxy=xy, yxyx=yx, xy=yx}> will give the result we're looking for. However, for n equal to 4 and above, the list of band relations becomes incredibly long. The idea is then to see whether we can tweak the TC algorithm so that we don't have to explicitly pass the band relations, but still get the right result.

My first idea (see section 2.1 of the attached doc) was to change the definition of TC1 so that it labels nodes with their free band canonical form. This can actually be generalised to work in any variety of semigroups, so let V be any variety (with the aim of later taking V = variety of bands). Suppose we have a function which, for any word w in A*, returns the shortlex-smallest word in A* which is equal to w in (the free object of) the variety V (call this its canonical form and denote it bar(w)). Then our new TC1 (call it TC1V) does something like:
This was the first thing we naturally tried to do when devising a band Todd-Coxeter (BTC) algorithm: only label cosets using some canonical form. In the attached document I prove that this is valid. More specifically:

Result from Section 2.1. Suppose we have a presentation <A,R> implicitly given in a variety V. Let R' be the union of the explicit relations R with the set of all implicit relations generating the free object in the variety V (remember, as mentioned before, this set of relations may be significantly larger than R, so we want to avoid computing R' and/or passing it as input to the Todd-Coxeter algorithm). Then, any sequence of TC1V, TC2 and TC3 has exactly the same properties as a congruence enumeration process for R': namely, these three steps output R'-digraphs when given R'-digraphs, and any sequence of applications eventually stabilises.

The next step in the original paper is to devise specific combinations of TC1, TC2 and TC3 (algorithms or strategies) which not only stabilise, but stabilise exactly at a description of the semigroup. It is shown that this always happens provided the algorithm satisfies three conditions, in which case we call it a proper congruence enumeration process. These conditions essentially require that everything happens as you would hope; they roughly say:
Switching now to the band-specific setting (variety V=B), we have a canonical labelling function for free band elements and a clever band step TC1B from above. I have managed to show that an algorithm similar to the well-known HLT strategy satisfies properness conditions 1 and 3, but I am not sure whether it also satisfies properness condition 2. The reason for this is as follows. We have a presentation <A,R>, and a larger set of relations R' which contains all of R as well as all the free band relations. The HLT approach guarantees that coincidences involving pairs in R are noticed for every coset, since these relations are passed explicitly. However, I am not sure this necessarily means all the implicit relations are also noticed (since they are never explicitly pushed through the cosets). This can be reduced to the following problem: given a canonical node w and a free band element u in canonical form, can we show that eventually either w is not a node or e(w,uu) = e(w,u)? I don't think we ever found a counterexample, but I haven't managed to show that the implicit relations will always end up being pushed implicitly so as to satisfy condition 2.

So, in section 2.2 of the attached document, I detail a modified HLT strategy where, as words v are added to the node set, we also push the relation vv=v quite a few times through all the nodes. This is a shoddy fix because it means that we actually end up explicitly treating band relations which we were hoping to keep implicit. However, this is still better than explicitly passing all the band relations from the outset. At the end of the attached document I seem to have been close to proving that, with these additional relations gradually becoming explicit, the process is indeed proper (I did this last summer so I'm not sure how far we are from a full proof).
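To make the condition e(w,uu) = e(w,u) concrete, here is a minimal Python sketch that checks it on a small hand-written table. It reads e(w,u) as the node reached from w along the path labelled u, which is an assumption about the notation, and the names `follow`, `idempotent_at`, `SEMILATTICE` and `CYCLE` are made up for this illustration (they are not functions from this PR). The toy table is the one we would hope to end up with for A = {x,y}, R = {xy=yx}.

```python
def follow(edges, node, word):
    """Return the node reached from `node` along `word`, or None if undefined."""
    for letter in word:
        node = edges.get(node, {}).get(letter)
        if node is None:
            return None
    return node


def idempotent_at(edges, node, u):
    """Check that the path u is defined at `node` and e(node, uu) = e(node, u)."""
    target = follow(edges, node, u)
    return target is not None and follow(edges, node, u + u) == target


def all_band_relations_hold(edges, words):
    """Check uu = u at every node, for every word u in `words`."""
    return all(idempotent_at(edges, n, u) for n in edges for u in words)


# Hypothetical complete table for <x, y | xy = yx> as a band; the node ""
# plays the role of the empty word.
SEMILATTICE = {
    "": {"x": "x", "y": "y"},
    "x": {"x": "x", "y": "xy"},
    "y": {"x": "xy", "y": "y"},
    "xy": {"x": "xy", "y": "xy"},
}

# The six elements of the free band on {x, y}.
FREE_BAND_WORDS = ["x", "y", "xy", "yx", "xyx", "yxy"]

# A table that is NOT a band: x and xx lead to different nodes from node 0.
CYCLE = {0: {"x": 1}, 1: {"x": 2}, 2: {"x": 1}}

if __name__ == "__main__":
    print(all_band_relations_hold(SEMILATTICE, FREE_BAND_WORDS))
    print(idempotent_at(CYCLE, 0, "x"))
```

A check like this only inspects a finished table; it says nothing about whether a strategy is guaranteed to reach such a table, which is exactly the open question about condition 2.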
This algorithm reflects the state of the document, in that we have a coset creation function which cleverly creates new table entries, with the relation pusher and coincidence processor functions relatively unchanged. The actual loop that implements the strategy currently contains, as explained, some extra steps which push band relations. Future work on this PR (likely not done by me since I am graduating and therefore my brain will be in decline) should aim to either prove that this shoddy approach is valid, then implement it in GAP and then C++ code, or, ideally, prove that the strategy is valid even if we never push implicit relations, and then implement that. There is also a Felsch analogue to be developed. Good luck!

Description of functions implemented in this PR
Functions inside the main feature,
Then as mentioned, |
Edit: for an update, see enormous comment below
This PR builds on @MTWhyte's PR #684 adding a ToddCoxeterBand method, also joint work with @reiniscirpons under @james-d-mitchell. This version is consistent with the properness proof we have written, the changes from 684 being:
- new_coset now checks pre-existence of words via paths (i.e. checks the value of tau(i, canon(wa)) rather than canon(wa) in words).
- words[n]words[n] = words[n].
- The canon function now ensures the canonical form is shortlex-minimal.

This is WIP until the output format is decided and tests and docs are written. This function should eventually be implemented in libsemigroups, preferably with an analogous Felsch implementation (the strategy here is broadly HLT).
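As an illustration of what "shortlex-minimal canonical form" means for free bands, here is a brute-force Python sketch. It is not the canon function from this PR: the names `split`, `equal_in_free_band` and `canon` below are chosen for the example only, and the search is exponential, so it is only usable on tiny words. Equality is tested with the standard recursive description of the free band word problem: two words are equal if and only if they have the same content, the same letter completing the content from the left and from the right, and equal prefixes and suffixes before those letters.

```python
from itertools import product


def split(w):
    """Split a non-empty word w into (prefix, a, b, suffix).

    prefix is the longest prefix missing one letter of w and a is the letter
    that follows it; suffix and b are the mirror images from the right.
    """
    k = len(set(w))
    seen = set()
    for i, c in enumerate(w):
        seen.add(c)
        if len(seen) == k:
            break
    prefix, a = w[:i], w[i]
    seen = set()
    for j in range(len(w) - 1, -1, -1):
        seen.add(w[j])
        if len(seen) == k:
            break
    b, suffix = w[j], w[j + 1:]
    return prefix, a, b, suffix


def equal_in_free_band(u, v):
    """Decide whether u and v represent the same element of the free band."""
    if set(u) != set(v):
        return False
    if not u:
        return True
    pu, au, bu, su = split(u)
    pv, av, bv, sv = split(v)
    return (au == av and bu == bv
            and equal_in_free_band(pu, pv)
            and equal_in_free_band(su, sv))


def canon(w):
    """Return the shortlex-least word equal to w in the free band (brute force)."""
    letters = sorted(set(w))
    for n in range(1, len(w) + 1):
        # product over sorted letters enumerates each length in lex order
        for cand in product(letters, repeat=n):
            c = "".join(cand)
            if equal_in_free_band(c, w):
                return c
    return w


if __name__ == "__main__":
    for word in ["xx", "xyxy", "yxyx", "xyx", "xyxyx"]:
        print(word, "->", canon(word))
```

Because candidates are tried by increasing length and then in lexicographic order, the first match is shortlex-minimal by construction; a real implementation would compute the form directly rather than search for it.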