-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: historicized membership data model #4542
Conversation
@@ -108,6 +108,22 @@ export class ContentFragmentResource extends BaseResource<ContentFragmentModel> | |||
}, | |||
}; | |||
} | |||
|
|||
async update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the update
method from the BaseResource
. Ideally, we don't want to have these update
methods on resources since it's just exposing the Sequelize API on resources (and we're actively trying to strangle the use of Sequelize API within resources)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in the meantime, the implementation was copypastad onto the resource classes that use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok; so afaiu:
- we always want to update resources;
- long term, the best is to have a clean (non-sequelized based) base method for update;
- in the meantime we basically havwe a pass-through sequelize;
So at some point we'll refactor all the updates. In what way is it cleaner to refactor the copy-pasted version in each resource, vs refactor the base_resource version + change the call sites?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So at some point we'll refactor all the updates. In what way is it cleaner to refactor the copy-pasted version in each resource, vs refactor the base_resource version + change the call sites?
I don't think we will. Having a generic "update" is not desirable as it is a departure from the resource pattern. What we want is to have resource-specific methods that may perform updates. For example "revoke()" instead of "update({role: 'revoked'}".
} | ||
if (userIds && !userIds.length) return []; | ||
|
||
// Get all the memberships matching the criteria. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we may be doing a bit of over-fetching if some members were revoked and re-invited several times. I believe this is fine since:
- the number of over-fetched stuff will be quite low
- it is not on a hot path
- it makes the code / querying a lot simpler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a hot path since it's called from getLatestMembershipOfUserInWorkspace
which is called from login ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the over-fetch is fine, but noting this is quite hot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually you don't get revoked memberships here do you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we do get the revoked memberships, it's this method's purpose vs getActiveMemberships.
The overfetching comes from getting the non-latest memberships, which IMO is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First batch of comments.
Need more time to review. Let's try to have WorkspaceType or LightWorkspaceType in all interfaces of ressources 🙏
Will require minimal additional gymnastic but will set us up for a much better world.
This is temporary as we'll have a workspace resource at some point. Once we get there we'll be fine but in the meantime we're going to have to either accept |
Agreed that we may have a big WorkspaceResource that spans Menbership management. But in the meantime I think it's a low cost change to Accept a LightWorkspaceType which has sId and id in these interface. We just need to call something like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey ! First pass, pushing coms now so you can check them, finishing the reveiw meanwhile
@@ -108,6 +108,22 @@ export class ContentFragmentResource extends BaseResource<ContentFragmentModel> | |||
}, | |||
}; | |||
} | |||
|
|||
async update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok; so afaiu:
- we always want to update resources;
- long term, the best is to have a clean (non-sequelized based) base method for update;
- in the meantime we basically havwe a pass-through sequelize;
So at some point we'll refactor all the updates. In what way is it cleaner to refactor the copy-pasted version in each resource, vs refactor the base_resource version + change the call sites?
); | ||
|
||
const where: WhereOptions<InferAttributes<MembershipModel>> = {}; | ||
if (roles) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: wouldn't that work?
const where: WhereOptions<InferAttributes<MembershipModel>> = {
role: roles,
userId: userIds ? { [Op.in]: userIds } : undefined,
workspaceId: workspace ? workspace.id : undefined,
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and also a similar version for the function above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it might, but I am not convinced that inline ternaries are bringing a lot of extra readability here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I understand;
I advocate for it because it turns 13 lines into 3, and while I agree that ternaries are hard to read in some cases e.g. when nested, flat ternaries are the equivalent of saying "hi" vs "good morning"; legit part of the language (that we commonly use) and we should learn to read them as fluently as as we learned to read the if
s
Clearly a nit, will approve anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright you flipped me :) changing
where.workspaceId = workspace.id; | ||
} | ||
|
||
if (!workspace && !userIds?.length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not check that also on the function above? or globally via the GetMembershipOptions
type (don't know if it is actually possible)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's partially done by the type system, but sadly the type cannot easily check that the array is non-empty (unless you force the caller to always cast the array as non-empty for non-literal cases.
Indeed I should add it in the other method 🙏
if (!workspace && !userIds?.length) { | ||
throw new Error("At least one of workspace or userIds must be provided."); | ||
} | ||
if (userIds && !userIds.length) return []; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh ok. But if we accept userIds = [] with a workspace (and return [] in that case), why not accept it without a workspace and return [] too, and then remove the line above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Workspace is either null or non-null, so it's easy to force it via type system
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thiink we want to have a getUserActiveMembership
function to use in places of:
- getLatestMembershipOfUserInWorkspace (maybe except the membership update endpoint but TBD)
getActiveMemberships
with[userId]
as argument
This will simplify many code places by removing the [0] derefs of getActiveMemberships
and more importantly all the places where we check the endDate
existence and logic (error prone yet security critical) outside of the Resource.
} | ||
|
||
if (!workspace && !userIds?.length) { | ||
throw new Error("At least one of workspace or userIds must be provided."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type system does not catch that given RequireAtLeastOne
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cannot check for array emptiness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(at least not without making it quite painful on the caller side with a bunch of manual casts)
for (const m of memberships) { | ||
const key = `${m.userId}__${m.workspaceId}`; | ||
const latest = latestMembershipByUserAndWorkspace.get(key); | ||
if (!latest || latest.startAt < m.startAt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we prioritize end = null first and startAt second?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically equivalent, since you cannot create a new membership if there is already an overlapping one. If you have an active membership (i.e non-ended), it is necessarily the one with the greatest startAt, so it simplifies the code to only check for startAt.
} | ||
if (userIds && !userIds.length) return []; | ||
|
||
// Get all the memberships matching the criteria. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a hot path since it's called from getLatestMembershipOfUserInWorkspace
which is called from login ?
} | ||
if (userIds && !userIds.length) return []; | ||
|
||
// Get all the memberships matching the criteria. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually you don't get revoked memberships here do you?
@@ -108,6 +108,22 @@ export class ContentFragmentResource extends BaseResource<ContentFragmentModel> | |||
}, | |||
}; | |||
} | |||
|
|||
async update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
front/lib/auth.ts
Outdated
userId: user.id, | ||
workspaceId: workspace.id, | ||
}, | ||
const memberships = await MembershipResource.getActiveMemberships({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be in favour of having a getActiveMenbership
function to avoid the [0] deref. We'll likely need it in other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
userId: user.id, | ||
workspaceId: membershipInvite.workspaceId, | ||
}, | ||
const m = await MembershipResource.getLatestMembershipOfUserInWorkspace({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be easier to have a getActiveUserMenbership
function that returns a membership only if it has an active one for the workspace. Will abstract the error prone endAt checks inside the Resource?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as per IRL, we need it to put users in the right flow. Refactored with isRevoked()
@@ -188,7 +187,10 @@ async function handleRegularSignupFlow( | |||
SSOEnforcedError | |||
> | |||
> { | |||
const activeMemberships = await getActiveMembershipsForUser(user.id); | |||
const activeMemberships = await MembershipResource.getActiveMemberships({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also use getActiveUserMenbership
kinda weird you use both getActiveMemberships
and getLatestMembershipOfUserInWorkspace
in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each case is different. Here we get all active memberships of a user (across all workspaces).
Other than the comments above generally LGTM. Will make another review once finalised given the security criticality |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another little batch (would say 2/3)
transaction, | ||
}); | ||
// Then, we only keep the latest membership for each (user, workspace). | ||
const latestMembershipByUserAndWorkspace = new Map< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took me a while to gather what the code did. here we could
return memberships
.reduce( (arr, v) => if !(arr.find( (a) => a.userId === v.userId && a.workpaceId === a.memberId)) arr.push(v) )
.map( (resource) => new MembershipResource(MembershipModel, resource.get())
leveraging the fact that membership is already ordered
this would remove the need for orderedResourcesFromModels
and for an intermediary map; wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your version is less performant with O(n^2) complexity and IMHO less readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the perf side, I thought o(n^2) would not be a problem here, might have been a bit fast in that assesment I reckon 👍
On the readability, I'm genuinely surprised, this can make up for a good 80/20 debate :) 🍺
I get that you don't like "reduce", but to the point of creating a custom 6-line function, then 11 lines to turn an array into a map, that you will then turn back into an array?
16 lines (omitting the map line that is common to both solutions), whereas a 1-line reduce that directly says "add to the array if it isn't already there"
Obv will not block approval for this of course, but glad to discuss that on friday 👍
throw new Error( | ||
`Unreachable: Found multiple latest memberships for user ${userId} in workspace ${workspace.id}` | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we may be doing a bit of over-fetching if some members were revoked and re-invited several times. I believe this is fine since:
the number of over-fetched stuff will be quite low
it is not on a hot path
it makes the code / querying a lot simpler
Re this comment above => is this still not on a hot path here? (or expected to be quick anyways)
just checking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be a hot path after all with userIds.length === 1 . However the over fetching would be ridiculously small in such cases, and non-existent most of the time.
if (endAt < membership.startAt) { | ||
throw new Error("endAt must be after startAt"); | ||
} | ||
if (membership.endAt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so not possible to update the endDate of a membership. Good for me since we probably don't have "future" endDates in code yet, so out of the PR scope (mentioning to confirm it's intentional)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No currently indeed, but would be relatively straight forward to add support for. Wanted to avoid introducing such capability without a clear need for it.
return this.toJSON(); | ||
} | ||
|
||
toJSON() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: basically toJSON
= the business defined json, and toListJSON = the JSON with all the raw db fields?
=> would go with toRawJSON instead of toListJSON
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toJson = SomethingType
toListJson = LightSomethingType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to remove them actually they don't make a ton of sense for this resource since we never really serialize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oooh so wasn't it toLightJSON? and not toListJSON ?
but yeah if we don't use it removing is best
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cargo culted from other resources, but I agree toLightJson wouldn't be bad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, this was a large one, thanks 🙏 Mostly good for me, cf a few coms split on 3 reviews
We'll be better with that, worth it.
front/poke/temporal/activities.ts
Outdated
workspaceId: workspace.id, | ||
}, | ||
const memberships = await MembershipResource.getLatestMemberships({ | ||
workspace: workspace, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove 1 workspace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cleaned up
@@ -25,105 +19,6 @@ const { DUST_DATA_SOURCES_BUCKET = "", SERVICE_ACCOUNT } = process.env; | |||
// `cli` takes an object type and a command as first two arguments and then a list of arguments. | |||
const workspace = async (command: string, args: parseArgs.ParsedArgs) => { | |||
switch (command) { | |||
case "find": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we delete find / add-user? is it sideways from the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of sideways yes. Some of the commands required refactoring for the new setup, and since all of these are dead code I figured I might just delete them so the total net diff is a little bit less green :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
indexes: [ | ||
{ fields: ["userId", "role"] }, | ||
{ fields: ["startAt"] }, | ||
{ fields: ["endAt"] }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the queries we use, shouldn't this be ["endAt", "startAt"]
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right (although in practice it's almost the same).
Will do in separate PR since this one has no migration (just moved this mode)
* make startAt non-nullable * make startAt non-nullable * membership resource * fix bug * more checks * misc BL fixes * migration to delete the revoked role * remove comment * review: use lightWorkspaceType only as interface * support tx everywhere * use resource for poke as well * enh: isRevoked * active membership of user in ws * add check to other function * remove serialization methods * cleanup * pr syntax --------- Co-authored-by: Henry Fontanier <[email protected]>
Description
fixes https://github.com/dust-tt/tasks/issues/548
In #4521, we added
startAt
andendAt
on thememberships
table. The idea is to stop using therevoked
role, and instead have a history of memberships so we can tell who was a member of what workspace at any given time.Previous PR added the fields, the backfill script, and the BL to automatically fill
startAt
on membership creation andendAt
on membership revoke. But everywhere in the app, we still use therevoked
role instead of checking the endDate, and we still update memberships in-place.This PR changes the memberships business logic to switch to historicized memberships, removing the need for the
revoked
status. We now have aMembershipResource
that follows the resource pattern and centralizes usage of the memberships table (with one exception, the Poké workspace deletion flow).Risk
Security critical
Quite risky as it touches the memberships.
Deploy Plan
revoked
status from the DB