feature: metadata repository visualization #312

Open
lukpueh opened this issue Feb 14, 2025 · 8 comments

Comments

@lukpueh
Member

lukpueh commented Feb 14, 2025

TUF applications such as tuf-on-ci or RSTUF would benefit from a tool to produce a human readable / visual representation of the TUF metadata repository they drive, e.g. for auditing purposes or to review changes.

This issue is a place to discuss the different use cases and requirements for such a tool.

Useful links:

@dikshant182004

We can convert the parsed metadata into an internal data model that represents the trust relationships, version histories, and delegation trees. It will serve as the basis for both diffing (comparing versions) and visualization.
Then we can use a visualization library to map out the metadata relationships and changes over time, for example zoomable trees for delegation hierarchies and diff views that highlight changes between metadata versions. After that we can integrate this into existing pipelines (e.g., with tuf-on-ci or RSTUF) to generate updated, human-readable reports automatically when metadata changes, with export options available.
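
As a rough illustration of such a data model, the sketch below builds a delegation tree from already-parsed metadata. The `RoleNode` class, the `build_tree` function and the sample data are hypothetical names made up for this example; only the `signed.delegations.roles` layout (with `name`, `keyids`, `threshold`, `paths`) follows the TUF metadata format.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RoleNode:
    """One role in the delegation tree (illustrative model, not a TUF API)."""
    name: str
    threshold: int
    keyids: list[str]
    paths: list[str] = field(default_factory=list)
    children: list[RoleNode] = field(default_factory=list)


def build_tree(role: RoleNode, metadata_by_role: dict[str, dict],
               seen: frozenset[str] = frozenset()) -> RoleNode:
    """Attach the roles delegated by `role`, recursing into their metadata."""
    seen = seen | {role.name}
    signed = metadata_by_role.get(role.name, {}).get("signed", {})
    for d in (signed.get("delegations") or {}).get("roles", []):
        child = RoleNode(d["name"], d["threshold"], d["keyids"], d.get("paths", []))
        role.children.append(child)
        if child.name not in seen:  # do not loop forever on delegation cycles
            build_tree(child, metadata_by_role, seen)
    return role


# Made-up sample: "targets" delegates to "project-a", which delegates nothing.
sample_metadata = {
    "targets": {"signed": {"_type": "targets", "version": 3, "delegations": {
        "keys": {},
        "roles": [{"name": "project-a", "keyids": ["abc"], "threshold": 1,
                   "paths": ["project-a/*"], "terminating": True}],
    }}},
    "project-a": {"signed": {"_type": "targets", "version": 1}},
}
tree = build_tree(RoleNode("targets", 1, ["k1"]), sample_metadata)
```

A tree like this could feed either a diff (compare two trees) or a renderer, which is the split the comment describes.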

@trishankatdatadog
Member

@cedricvanrompay-datadog wrote TUF Explorer, and it seems to be archived now, but maybe others can fork and update it.

@Ayush-Vish

@lukpueh
I’m evaluating the best approach for visualizing TUF metadata repositories. Would Markdown files be a practical choice for structured visualization, or would leveraging JavaScript visualization libraries (such as D3.js, React-Flow, or Cytoscape.js) provide a more interactive and scalable solution?

@jku
Member

jku commented Feb 15, 2025

Some features I would like to see -- note that it's very likely that all of these cannot be achieved at once in a single project so consider this a wishlist:

  • operates on a published version of a TUF repository: so only requires the TUF json files
  • shows the user a website describing the TUF repository: https://tuf-repo-cdn.sigstore.dev/ is an extremely minimal example of this
  • is easy to deploy: This has to be something repositories can just drop into their publishing processes
  • more specifically, this should not need an active server side component running after the repository version is published (a TUF repo is just files on a webserver so this should be too):
    • there could be a process that is executed on every repository version publish: the result should be html+javascript that can be deployed as is
    • or maybe the whole thing can be client side html + javascript -- and we don't need a process executed during publishing? the html+js just reads current TUF json and vibes on that
  • likely should not store any data (should just visualize the TUF json at that moment) for the same ease of deployment
  • bonus feature: somehow allows showing non-standard data: see how https://tuf-repo-cdn.sigstore.dev/ shows a list of signers as github accounts... those are in custom fields in the metadata. I have no idea how to make this work but I would like to see it
  • bonus feature 2: is somehow usable from CLI tools. One issue we have is that we want repository visualization in the web... but we also need to do that in the command line for the folks using the signing tools (the signers should not trust a website, they should only trust their signing tools). One way I can see this working is that the CLI tool can use this visualization component locally and can open a browser on the local html+js files

I can try to get into the details of what data specifically in the TUF metadata is most important to show some time next week.

Would Markdown files be a practical choice for structured visualization, or would leveraging JavaScript visualization libraries (such as D3.js, React-Flow, or Cytoscape.js) provide a more interactive and scalable solution?

The way tuf-on-ci (tool that currently produces https://tuf-repo-cdn.sigstore.dev/) works is it creates a small markdown page when a new version of the repository is created and then uses pandoc to generate html from that, so this is possible. I admit that this design only happened because I know nothing of webdev.
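
To make the markdown route concrete, here is a minimal sketch that turns parsed root metadata into a small markdown page. It is not tuf-on-ci's actual code; `root_to_markdown` and the sample data are hypothetical, and only the `signed.version`, `signed.expires` and `signed.roles` fields come from the TUF metadata format.

```python
def root_to_markdown(root: dict) -> str:
    """Render a short markdown summary of the top-level roles in root metadata."""
    signed = root["signed"]
    lines = [
        f"# Repository state (root v{signed['version']})",
        "",
        f"Root expires: {signed['expires']}",
        "",
        "| Role | Threshold | Keys |",
        "| --- | --- | --- |",
    ]
    for name, role in signed["roles"].items():
        lines.append(f"| {name} | {role['threshold']} | {len(role['keyids'])} |")
    return "\n".join(lines) + "\n"


# Made-up sample metadata for illustration only.
example_root = {"signed": {
    "_type": "root",
    "version": 5,
    "expires": "2025-08-01T00:00:00Z",
    "roles": {
        "root": {"keyids": ["a", "b"], "threshold": 2},
        "timestamp": {"keyids": ["c"], "threshold": 1},
    },
}}
markdown_page = root_to_markdown(example_root)
```

The resulting text could then be handed to a converter such as pandoc, as described above, to produce the html.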

The end result should be something that works in a browser -- the wishlist above maybe guides the choice of tooling to create that: low deployment friction is likely a key factor. If we could just have static html + javascript that does the job... that would be great -- but I'm not even close to a web developer, so I'm not going to advise on details.

We can convert the parsed metadata into an internal data model that represents the trust relationships, version histories, and delegation trees. It will serve as the basis for both diffing (comparing versions) and visualization.

You likely cannot handle historical data (except for root metadata versions): Even if the repository maintains the old metadata you cannot really use that to produce old repository versions (because timestamp versions are not stored and they are not linked to root versions), you can only know what the current state of the repository is. The exception here is root metadata: In that case the key rotations for the top level roles could be an interesting thing to show historically

@dikshant182004

dikshant182004 commented Feb 15, 2025

@jku, after reading the wishlist above, I think a static, client-side web application built with JavaScript visualization libraries would be the best option. It offers interactive, scalable visualization (zooming, panning, clickable nodes), which matters for exploring complex TUF metadata, while deployment stays as simple as dropping static files into the publishing workflow.
By generating a static bundle of HTML, CSS, and JavaScript, we can build the visualization directly into the repository's publishing process, so no persistent server component is required.
A static HTML+JS solution can also be bundled with the CLI tools for CLI integration, and it can be designed to parse both standard TUF metadata and any additional custom fields (like the GitHub signers).

However, I was also considering Streamlit or FastAPI to build an interactive dashboard with minimal setup that could be integrated into the publishing process and viewed both as a web application and from the CLI. Rendering at runtime suits the fact that only the current version of the metadata is available, and a Python tool fits well with the repository code. The main problem is that it requires an active server.

We can also build a static web application using HTML, CSS, and JavaScript (with D3.js for visualization) and integrate a Python-based build step into the publishing process. This Python step would parse the TUF repository metadata and generate the static assets, ensuring that the final output is a self-contained visualization tool that requires no active server after deployment.

@DeshDeepakKant

@jku @lukpueh @dikshant182004 I think building a static web application using HTML, CSS, and JavaScript with D3.js for visualization is the best approach. It keeps things lightweight, easy to deploy, and fully client-side. By adding a Python-based build step, we can preprocess the TUF metadata, structure it properly, and generate static assets that the frontend can just read and display. This way, once the repository is published, the visualization works without needing an active server. Python can handle parsing, validation, and extracting custom fields, making it super flexible while keeping the frontend focused on rendering the data.

For deployment, the goal is to keep it as simple as possible—just a static bundle (HTML, CSS, JS) that repositories can include in their publishing workflow. Once a new version of the TUF repository is published, the visualization updates automatically, without needing any extra backend setup. If needed, we can integrate a Python script in the publishing process that pre-generates data, but at the end of the day, the whole thing should just be static files that can be hosted anywhere, like a CDN or simple web server.
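
A publish-time build step along these lines might be sketched as follows. Everything here is an assumption for illustration: the `render_page` function, the page template, and the `viewer.js` script name (standing in for whatever front-end code would read the embedded data) are hypothetical.

```python
import html
import json

# Hypothetical page template; "viewer.js" is a placeholder for the front end.
PAGE = """<!doctype html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body>
<h1>{title}</h1>
<script id="tuf-data" type="application/json">{data}</script>
<script src="viewer.js"></script>
</body>
</html>
"""


def render_page(title: str, metadata: dict) -> str:
    """Embed the parsed metadata as JSON in a static HTML page."""
    # Escape "</" so a string such as "</script>" inside the metadata
    # cannot end the script element early; "<\/" is still valid JSON.
    data = json.dumps(metadata, sort_keys=True).replace("</", "<\\/")
    return PAGE.format(title=html.escape(title), data=data)


# Made-up sample input for illustration only.
sample_input = {"root": {"signed": {"version": 1, "note": "</script>"}}}
static_page = render_page("Demo <repo>", sample_input)
```

Writing the returned string to an `index.html` next to the metadata files would keep the output a plain static file that any web server or CDN can host.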

Handling custom metadata fields (like GitHub signers) is something I’d really like to solve. A flexible mapping system could let repositories define how extra metadata should be displayed—so, for example, signers could be shown with links to their GitHub profiles. This would make it easy to support different repository structures while still ensuring the main TUF data is visualized in a standardized way.
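
One possible shape for such a mapping is a table of field names to render functions, sketched below. The field name `x-example-keyowner` is a placeholder invented for this example (real repositories define their own custom fields), and `describe_keys` is a hypothetical helper; only `signed.keys` and `keytype` come from the TUF root metadata format.

```python
from typing import Callable

# Hypothetical mapping: custom field name -> function producing a display value.
FIELD_RENDERERS: dict[str, Callable[[str], str]] = {
    "x-example-keyowner": lambda value: f"https://github.com/{value.lstrip('@')}",
}


def describe_keys(root: dict,
                  renderers: dict[str, Callable[[str], str]] = FIELD_RENDERERS
                  ) -> list[dict]:
    """List the keys in root metadata, adding rendered custom fields if present."""
    rows = []
    for keyid, key in root["signed"]["keys"].items():
        extras = {name: render(key[name])
                  for name, render in renderers.items() if name in key}
        rows.append({"keyid": keyid[:8], "keytype": key["keytype"], **extras})
    return rows


# Made-up sample: one key with the custom field, one without.
sample_root = {"signed": {"keys": {
    "0123456789abcdef": {"keytype": "ecdsa", "x-example-keyowner": "@alice"},
    "fedcba9876543210": {"keytype": "ed25519"},
}}}
key_rows = describe_keys(sample_root)
```

Keys without the custom field still appear, just without the extra column, so the standard TUF data is shown the same way for every repository.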

For CLI integration, I think the best approach is to bundle the visualization as a static file that CLI tools can open locally. That way, signers don’t need to trust an external website—they can just run a command, and it opens a local HTML file in their browser. Another cool option would be generating text-based visualizations (ASCII, SVG, or even JSON summaries) for those who prefer command-line output. This way, we cover both web-based users and CLI users, making the tool useful for everyone.
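
Both CLI ideas can be sketched with the standard library alone. In the example below, `render_tree`, `open_locally` and the sample delegation map are hypothetical; `webbrowser.open` and `Path.as_uri` are standard-library calls.

```python
import webbrowser
from pathlib import Path


def render_tree(root_role: str, children: dict[str, list[str]]) -> str:
    """Render a delegation hierarchy as a plain-text tree."""
    lines = [root_role]

    def walk(role: str, prefix: str, seen: frozenset[str]) -> None:
        kids = children.get(role, [])
        for i, kid in enumerate(kids):
            last = i == len(kids) - 1
            lines.append(prefix + ("└── " if last else "├── ") + kid)
            if kid not in seen:  # guard against delegation cycles
                walk(kid, prefix + ("    " if last else "│   "), seen | {kid})

    walk(root_role, "", frozenset({root_role}))
    return "\n".join(lines)


def open_locally(html_path: Path) -> bool:
    """Open a locally generated page in the default browser via a file:// URL."""
    return webbrowser.open(html_path.resolve().as_uri())


# Made-up delegation map: role name -> names of the roles it delegates to.
sample_delegations = {
    "targets": ["project-a", "project-b"],
    "project-a": ["project-a-docs"],
}
tree_text = render_tree("targets", sample_delegations)
```

With this sample, `tree_text` shows `project-a` and `project-b` under `targets`, with `project-a-docs` nested below `project-a`. A signing tool could print the text form directly and offer `open_locally` as an optional richer view.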

@DeshDeepakKant

@cedricvanrompay-datadog wrote TUF Explorer, and it seems to be archived now, but maybe others can fork and update it.

TUF Explorer could be really useful for our project, especially in how it makes TUF metadata easy to understand. Features like JSON formatting, key info display, and diffing are things we also need. I think we can take inspiration from its approach and adapt the good parts while keeping our focus on a fully static, client-side visualization using HTML, JavaScript, and D3.js. If it makes sense, we could even fork and update it to remove any backend dependencies. But even if we decide to build from scratch, studying its design and UI choices will help us create a cleaner, more intuitive tool.

@DeshDeepakKant

DeshDeepakKant commented Feb 15, 2025


This wishlist outlines a solid foundation, and I think we can turn it into a structured development roadmap checklist. By breaking it down into core functionality, deployment considerations, and bonus enhancements, we can prioritize key features while keeping the project manageable.

Development Roadmap Checklist

Core Features:

  • Process TUF JSON Directly – Ensure the tool operates on published TUF repositories without requiring extra processing.
  • Client-Side Rendering – Build a web-based UI that dynamically visualizes repository metadata using HTML, CSS, and JavaScript.
  • Seamless Deployment – Implement a zero-setup workflow that repositories can integrate into their publishing process.
  • Static Output (No Server Required) – Either:
    • Generate static HTML+JS at publish time, or
    • Load TUF JSON dynamically in a pure client-side approach.
  • Lightweight & Read-Only – No need for persistent storage; just visualize metadata as-is.

Bonus Features & Enhancements:

  • Custom Metadata Parsing – Support non-standard fields (e.g., GitHub accounts of signers).
  • CLI Compatibility – Enable local visualization via CLI tools, possibly by opening a local browser instance with the rendered HTML+JS.

Implementation Steps:

  1. Set Up Core Structure: Create a basic front-end framework to handle and parse TUF JSON.
  2. Metadata Visualization: Implement UI components for root, targets, snapshot, and timestamp metadata.
  3. Refine Deployment Process: Optimize static site generation or client-side rendering approach.
  4. Extend Functionality: Add support for custom metadata and explore CLI tool integration.

To make this actionable, we can break it down into smaller tasks, like setting up a basic index page, implementing visualization for each metadata type (root, targets, snapshot, timestamp), and figuring out how to handle custom fields.
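
As a starting point for the per-type tasks, the sketch below extracts a small summary from each of the four top-level metadata types. The `summarize` function and the sample data are hypothetical; the field names (`_type`, `version`, `expires`, `roles`, `targets`, `delegations`, `meta`) follow the TUF metadata format.

```python
def summarize(metadata: dict) -> dict:
    """Return the fields a viewer would most likely show for one metadata file."""
    signed = metadata["signed"]
    kind = signed["_type"]
    summary = {"type": kind, "version": signed["version"], "expires": signed["expires"]}
    if kind == "root":
        summary["thresholds"] = {name: role["threshold"]
                                 for name, role in signed["roles"].items()}
    elif kind == "targets":
        delegations = signed.get("delegations") or {}
        summary["target_count"] = len(signed.get("targets", {}))
        summary["delegated_roles"] = [r["name"] for r in delegations.get("roles", [])]
    elif kind in ("snapshot", "timestamp"):
        # Both list the metadata files they vouch for under "meta".
        summary["tracked_files"] = sorted(signed["meta"])
    return summary


# Made-up sample metadata for illustration only.
sample_snapshot = {"signed": {
    "_type": "snapshot",
    "version": 7,
    "expires": "2025-03-01T00:00:00Z",
    "meta": {"targets.json": {"version": 3}, "project-a.json": {"version": 1}},
}}
snapshot_summary = summarize(sample_snapshot)
```

Each UI component could then render one of these summaries, which keeps the parsing separate from the presentation.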

6 participants