Add data from Cellar #298

Open
kris927b opened this issue Oct 25, 2024 · 2 comments

@kris927b
Collaborator

Cellar is a repository of European Union publications, managed by the Publications Office of the European Union.

What knowledge does Cellar contain?
EU legal knowledge
Information on EU Policy
Research & educational knowledge
Organizational view of the EU
Historical knowledge about the EU
Public procurement documents (soon)
Documents from other knowledge domains

Note: The EURLex data is already contained in Cellar, so it should either be filtered out of the Cellar data or the separate EURLex dataset should be removed.

@saattrupdan
Collaborator

Regarding the EURLex overlap: This will probably be handled automatically during deduplication anyway, I suppose?

@kris927b
Collaborator Author

Yeah, with deduplication this should not be a problem.
I guess the only reason to remove them beforehand would be to minimise preprocessing time?
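
For illustration, here is a minimal sketch of how exact-match deduplication could drop the EURLex overlap. It assumes documents are dicts with a `text` field and that duplicates differ only in casing and whitespace. The thread does not say which deduplication method the project uses, so the `normalize` and `deduplicate` helpers and the sample records are hypothetical.

```python
import hashlib


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial formatting
    # differences do not hide an otherwise identical document.
    return " ".join(text.lower().split())


def deduplicate(docs):
    # Keep the first document seen for each normalized-text hash.
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc["text"]).encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        unique.append(doc)
    return unique


# Hypothetical sample records: the first Cellar entry duplicates the EURLex one
# apart from an extra space.
eurlex = [{"source": "eurlex", "text": "Regulation (EU) 2016/679 of the European Parliament"}]
cellar = [
    {"source": "cellar", "text": "Regulation (EU)  2016/679 of the European Parliament"},
    {"source": "cellar", "text": "Annual report of the Publications Office"},
]

deduped = deduplicate(eurlex + cellar)
print([d["source"] for d in deduped])
```

Every Cellar document still has to be read and hashed before the duplicate is dropped, which is the preprocessing cost that filtering EURLex out up front would avoid.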
