createrepo_c silently parses bad metadata if the repository contains the same package (same pkgid, same NEVRA) multiple times #306

Open
dralley opened this issue Feb 16, 2022 · 1 comment

dralley commented Feb 16, 2022

The parsing API used by createrepo_c (at least as it is used in the examples) is prone to producing incorrect results in some scenarios where the metadata being parsed is itself incorrect. In this case, though, the parsed result is incorrect in a different way than the input was.

If pkgcb inserts a package object into a dictionary / hashmap type keyed by the pkgid, then the second occurrence of a duplicate pkgid replaces the first one.

But after that happens, newpkgcb is used with the other metadata files, and it gets the package from the dictionary / hashmap and adds files and changelog metadata to it. Because those files also list the package twice, the same object is filled in twice. The end result is that the dictionary / hashmap will contain only one of the original two packages listed in the metadata, but that package will contain extra copies of every file and changelog.
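
To make the failure mode concrete, here is a rough pure-Python simulation of the callback pattern. It does not import createrepo_c; the `Package` class, the `abc123` pkgid and the loops standing in for the parser are all made up for illustration, and only the `pkgcb` / `newpkgcb` names mirror the callbacks discussed above.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a parsed package object (not the createrepo_c type).
@dataclass
class Package:
    pkgid: str
    name: str
    files: list = field(default_factory=list)

packages = {}

def pkgcb(pkg):
    # Invoked once per package entry in the primary metadata.
    # A second entry with the same pkgid silently replaces the first.
    packages[pkg.pkgid] = pkg

def newpkgcb(pkgid, name, arch):
    # Invoked once per package entry in the other metadata files.
    # Both duplicate entries resolve to the same surviving object.
    return packages.get(pkgid)

# Simulated primary metadata: the same package is listed twice.
for _ in range(2):
    pkgcb(Package(pkgid="abc123", name="foo"))

# Simulated filelists metadata: also listed twice, one file each time.
for _ in range(2):
    pkg = newpkgcb("abc123", "foo", "x86_64")
    if pkg is not None:
        pkg.files.append("/usr/bin/foo")

print(len(packages), packages["abc123"].files)
# prints: 1 ['/usr/bin/foo', '/usr/bin/foo']
```

One package survives in the dictionary, but its file list holds the same path twice, which matches the behavior described above.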

As described here, this would likely also be true with the parse_main_metadata_together() API.

This issue is compounded because createrepo_c doesn't prevent such incorrect repositories from being created: #307

dralley commented Jun 27, 2022

The PackageIterator API allows you to avoid this and the old locate_and_load_xml() API allows you to configure what the behavior should be. The others are susceptible.
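
For callers stuck on one of the susceptible APIs, one possible workaround is to track duplicates in the callbacks themselves. The sketch below is again a pure-Python simulation with hypothetical names (`guarded_pkgcb`, `guarded_newpkgcb`, plain dicts instead of package objects), not createrepo_c code. It assumes that a `None` return from the new-package callback makes the parser skip that entry, and that keeping the first occurrence is the desired policy.

```python
seen = {}            # pkgid -> package data (first occurrence wins)
duplicates = set()   # pkgids listed more than once in the primary metadata
filled = set()       # pkgids whose extra metadata has already been attached

def guarded_pkgcb(pkg):
    # Record the duplicate instead of silently replacing the earlier entry.
    if pkg["pkgid"] in seen:
        duplicates.add(pkg["pkgid"])
        return
    seen[pkg["pkgid"]] = pkg

def guarded_newpkgcb(pkgid, name, arch):
    # Hand out each package only once so repeated entries are skipped.
    if pkgid in filled:
        return None
    filled.add(pkgid)
    return seen.get(pkgid)

# Simulated primary metadata with a duplicated package.
for _ in range(2):
    guarded_pkgcb({"pkgid": "abc123", "name": "foo", "files": []})

# Simulated filelists metadata with the same duplication.
for _ in range(2):
    pkg = guarded_newpkgcb("abc123", "foo", "x86_64")
    if pkg is not None:
        pkg["files"].append("/usr/bin/foo")

print(seen["abc123"]["files"], sorted(duplicates))
# prints: ['/usr/bin/foo'] ['abc123']
```

Whether to keep the first entry, keep the last, or abort on a duplicate is a policy decision; the point is only that the caller has to make it explicitly.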
