Goal: Ability to clone document groups between projects efficiently and stay organized
Edit: Ensure self-hosted version can clone/copy CropWizard (or arbitrary doc groups) into a self-hosted instance.
Definitions
Source - the document group being cloned.
Destination - the project doing the cloning.
e.g. CropWizard (Source) is cloned into Industry Partner project (Destination).
No matter what, the Destination project will not have "full, unrestricted access to the source files." Destination users will only see source documents in search results; the files are not available for download. This protects the intellectual property of the source data. It can be made configurable with Authorized downloaders of source data.
3 Implementation methods
Deciding between final options:
Copy by reference - receive all upstream changes instantly.
vs
Full copy, detached from the source. Mutable so it can accept changes. Optimized for no duplicate files.
Cloning options considered
Clone by Pointer/Reference: Clone with no downstream control AT ALL. The Destination gets whatever the Source does. They won't even see the individual docs, just a summary such as "imported from CropWizard, 409,000 docs." The Source docs appear in /chat results but not in the /materials table (i.e. they can't export the CropWizard DB source PDFs with a single click).
Clone by full copy of the data: It's optimized so there's no duplication of files, but at its core it's just a full copy of everything, with complete ownership by the new project.
GitHub-style forking / merging: my favorite, and it wouldn't be too crazy to implement in theory. Customer adoption might be a challenge, especially with a crummy UI.
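To make the difference between the first two options concrete, here is a minimal in-memory sketch. It is not the project's implementation: `DocGroup`, `ReferenceClone`, and `FullCopy` are hypothetical names, and representing each file by a content hash (so a copy never duplicates file bytes) is an assumption about how "no duplicate files" could work.

```python
from dataclasses import dataclass, field


@dataclass
class DocGroup:
    """A Source document group: doc id -> content hash of the stored file."""
    name: str
    docs: dict[str, str] = field(default_factory=dict)


@dataclass
class ReferenceClone:
    """Clone by pointer: the Destination only holds a reference to the Source."""
    source: DocGroup

    def visible_docs(self) -> set[str]:
        # Resolved at query time, so every upstream add/delete shows up instantly.
        return set(self.source.docs)


@dataclass
class FullCopy:
    """Detached copy: the Destination owns its own doc list and can mutate it."""
    docs: dict[str, str]

    @classmethod
    def from_source(cls, source: DocGroup) -> "FullCopy":
        # Only ids and hashes are copied; file bytes stay in shared storage.
        return cls(docs=dict(source.docs))

    def visible_docs(self) -> set[str]:
        return set(self.docs)


source = DocGroup("CropWizard", {"a.pdf": "hash-a", "b.pdf": "hash-b"})
by_reference = ReferenceClone(source)
detached = FullCopy.from_source(source)

# Upstream changes after the clone was made.
del source.docs["a.pdf"]
source.docs["c.pdf"] = "hash-c"
```

After the upstream changes, `by_reference.visible_docs()` follows the Source, while `detached.visible_docs()` still reflects the state at clone time.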
Functions:
Destination projects can CREATE and REMOVE shared projects.
How do they know which projects can be shared? A table of available ones.
Features:
Destination projects can have multiple shared projects. They can delete them, too.
They have NO CONTROL over the source project. The destination sees all changes.
UI:
Table view with projects that are "available to be imported"
Some are "starred" so they show up first. Otherwise, users can search for ANY public project (until we have "unlisted" as a concept...)
Source deletes a doc --- how does the Destination retain that doc?
Source adds a doc --- how do we prevent Destination from getting the new one?
SQL field for private in doc_groups table.
SQL foreign key to doc_groups for subscribed_doc_groups in projects table.
Update Qdrant filtering to allow docGroups.
The search conditions are as follows:
* Main query: (course_name AND doc_groups) OR (public_doc_groups)
* if 'All Documents' enabled, then add filter to exclude disabled_doc_groups
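The search conditions above could be expressed as a Qdrant filter in its JSON form, which allows `must` / `should` / `must_not` clauses to be nested. This is a sketch under assumptions: the helper name `build_search_filter` is hypothetical, the payload fields are assumed to be `course_name` and `doc_groups`, and dropping the `doc_groups` match when 'All Documents' is enabled is one reading of the rule, not something the issue confirms.

```python
def build_search_filter(course_name, doc_groups, public_doc_groups,
                        disabled_doc_groups=(), all_documents=False):
    """Build (course_name AND doc_groups) OR (public_doc_groups) as a filter dict."""
    # Branch 1: documents owned by this project.
    own = [{"key": "course_name", "match": {"value": course_name}}]
    if not all_documents:
        # Restrict to the selected doc groups (assumed to be skipped for 'All Documents').
        own.append({"key": "doc_groups", "match": {"any": list(doc_groups)}})

    should = [{"must": own}]

    # Branch 2: documents from public (shared) doc groups.
    if public_doc_groups:
        should.append({"key": "doc_groups", "match": {"any": list(public_doc_groups)}})

    search_filter = {"should": should}

    # With 'All Documents' enabled, exclude anything in a disabled doc group.
    if all_documents and disabled_doc_groups:
        search_filter["must_not"] = [
            {"key": "doc_groups", "match": {"any": list(disabled_doc_groups)}}
        ]
    return search_filter


example = build_search_filter(
    "industry-partner", ["Manuals"], ["CropWizard"],
    disabled_doc_groups=["Drafts"], all_documents=True,
)
```

The resulting dictionary can be sent as the `filter` field of a search request. Because the top-level `must_not` applies to both branches, a disabled doc group is excluded even if it is also public.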
Shared Projects Table
Source project slug
Destination project slug
Excluded document groups
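A minimal sketch of the shared projects table, using SQLite from Python's standard library. The table name, column names, and helper functions are assumptions for illustration; storing the excluded document groups as a JSON list is likewise a choice made for this sketch, not the issue's schema.

```python
import json
import sqlite3


def init_db(conn):
    # One row per (Source, Destination) pair; excluded groups stored as JSON text.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS shared_projects (
            source_slug TEXT NOT NULL,
            destination_slug TEXT NOT NULL,
            excluded_doc_groups TEXT NOT NULL DEFAULT '[]',
            PRIMARY KEY (source_slug, destination_slug)
        )
    """)


def create_shared_project(conn, source_slug, destination_slug, excluded_doc_groups=()):
    # Re-creating an existing share overwrites its exclusion list.
    conn.execute(
        "INSERT OR REPLACE INTO shared_projects VALUES (?, ?, ?)",
        (source_slug, destination_slug, json.dumps(sorted(excluded_doc_groups))),
    )


def remove_shared_project(conn, source_slug, destination_slug):
    conn.execute(
        "DELETE FROM shared_projects WHERE source_slug = ? AND destination_slug = ?",
        (source_slug, destination_slug),
    )


def list_shared_projects(conn, destination_slug):
    # Returns (source_slug, excluded_doc_groups) pairs for one Destination.
    rows = conn.execute(
        "SELECT source_slug, excluded_doc_groups FROM shared_projects "
        "WHERE destination_slug = ? ORDER BY source_slug",
        (destination_slug,),
    ).fetchall()
    return [(source, json.loads(excluded)) for source, excluded in rows]


conn = sqlite3.connect(":memory:")
init_db(conn)
create_shared_project(conn, "cropwizard", "industry-partner", ["drafts"])
```

The composite primary key lets one Destination hold several shared projects while preventing the same Source from being attached twice.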