Today's ontology meeting note #81

Open
samwaseda opened this issue Nov 28, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@samwaseda
Member

I (aka @samwaseda) showed an example notebook of how to include ontological types via type hinting:

@uniton_class
class Simulation:
    class Input:
        temperature: u(float | None, units="kelvin")

def do_something_with_temperature(temperature: Simulation.Input.temperature):
    ...

Perspective / What I want

A functioning notebook containing all the steps below:

  1. Create a workflow (Inputs: element, temperature. Outputs: energy, volume per atom)
  2. Execute the workflow
  3. Upload/store the data and/or workflow using PMDco (or any ontology standard)
  4. Download/retrieve the data and/or workflow for a given set of arguments (in this case element and temperature)
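
To make the four steps concrete, here is a rough plain-Python sketch of one full cycle. Everything in it is an assumption for illustration: the knowledge graph is just a list of (subject, predicate, object) tuples, the predicate names are made up rather than taken from PMDco, and the workflow returns placeholder numbers. In a real setup the store would presumably be an RDF graph and the lookup a SPARQL query.

```python
def workflow(element, temperature):
    """Dummy workflow; the returned numbers are placeholders, not physics."""
    return {"energy": 0.0, "volume_per_atom": temperature / 100}


def store(graph, inputs, outputs):
    """Append one run to the graph as (subject, predicate, object) triples."""
    run = f"run_{len({s for s, _, _ in graph})}"
    graph.extend((run, f"input:{k}", v) for k, v in inputs.items())
    graph.extend((run, f"output:{k}", v) for k, v in outputs.items())
    return run


def retrieve(graph, **inputs):
    """Return the outputs of every run whose inputs match all arguments."""
    results = []
    # dict.fromkeys keeps the runs in insertion order without duplicates
    for run in dict.fromkeys(s for s, _, _ in graph):
        facts = {p: o for s, p, o in graph if s == run}
        if all(facts.get(f"input:{k}") == v for k, v in inputs.items()):
            results.append(
                {
                    p.removeprefix("output:"): o
                    for p, o in facts.items()
                    if p.startswith("output:")
                }
            )
    return results


graph = []
for temperature in (300.0, 600.0):
    inputs = {"element": "Al", "temperature": temperature}
    store(graph, inputs, workflow(**inputs))  # steps 1-3

print(retrieve(graph, element="Al", temperature=300.0))  # step 4
```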

This is meant to mimic a typical scientist's routine: "Oh, I would like to know the volume of Al at 300 K. Can I find it somewhere?"

The same routine might not be applicable to Murnaghan or other techniques, but I really would like to see one full cycle somewhere.

Does this make sense?

@samwaseda samwaseda added the enhancement New feature or request label Nov 28, 2024
@samwaseda
Member Author

So I’ve got a few elements that could be put together:

  • Data class: dataclass and pydantic
  • Type hinting: uniton
  • Storing of data? Tara showed me a notebook but I lost the link. I guess atomRDF should be able to offer this functionality.
  • Data extraction: SPARQL (Sarath’s notebook gives a pretty complete picture)

@samwaseda
Member Author

I have now looked into atomRDF in more detail. I think the current class should be separated into:

  • Data extraction from the workflow - to be mostly taken over by uniton
  • Assignment of triples - Probably partly uniton, but we need a better strategy to read a workflow
  • Parsing of ASE structure - We should probably have stand-alone parsers
  • Storing and loading of the knowledge graph - rdflib

I’m not surprised to see it, but the notion of structure is very central to atomRDF. I would suggest treating ASE structures (at least for now) as THE structure object for us, and making sure that whoever comes with an ASE structure can obtain the full ontology.
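
As a rough idea of what a stand-alone parser could look like, the sketch below turns an ASE-like object into triples. It only calls `get_chemical_symbols()`, `get_positions()` and `get_cell()`, which exist on ASE's `Atoms`; the predicate names and the `FakeAtoms` stub are hypothetical, the cell values are arbitrary, and the function has not been tested against ASE itself.

```python
def structure_to_triples(structure, name="structure"):
    """Turn an ASE-like structure into (subject, predicate, object) triples.

    The predicate names are made up for illustration; a real parser would
    map them to ontology terms.
    """
    cell = [[float(x) for x in row] for row in structure.get_cell()]
    triples = [(name, "hasCell", cell)]
    symbols = structure.get_chemical_symbols()
    positions = structure.get_positions()
    for index, (symbol, position) in enumerate(zip(symbols, positions)):
        atom = f"{name}/atom_{index}"
        triples.append((name, "hasAtom", atom))
        triples.append((atom, "hasElement", symbol))
        triples.append((atom, "hasPosition", [float(x) for x in position]))
    return triples


class FakeAtoms:
    """Stub exposing the three methods used above (arbitrary values)."""

    def get_chemical_symbols(self):
        return ["Al"]

    def get_positions(self):
        return [[0.0, 0.0, 0.0]]

    def get_cell(self):
        return [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]


for triple in structure_to_triples(FakeAtoms()):
    print(triple)
```

Keeping the parser a plain function with no knowledge-graph dependency is what would make it reusable outside atomRDF.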

@srmnitc
Member

srmnitc commented Dec 2, 2024

@samwaseda, just sharing my thoughts on this:

  • Data extraction from the workflow - to be mostly taken over by Uniton: Agree. I think we can plug it in whenever it’s ready. However, if it’s about detailed extraction of all relevant info, I’m not so sure. That might go against the idea of Uniton being lightweight software. This is where we should involve the scientific data team. That said, I’d be happy to offload the data extraction parts from atomRDF when suitable alternatives are available.

  • Assignment of triples: This could partly be done by Uniton, but we need a better strategy for reading workflows—this is what atomRDF handles. The key here is that all triples are validated, mapped to the ontology, etc., ensuring the necessary level of standardization. Supporting a generic object, like a dataclass, would depend on resolving the previous point.

  • Parsing of ASE structures: Stand-alone parsers might be a good idea. That said, I’m not entirely sure I understand what’s meant here—there’s no real “parsing,” as the ASE structure is simply taken as input, isn’t it?

  • Storing and loading the knowledge graph - rdflib: This is how it’s done anyway; KnowledgeGraph is just a wrapper around it to handle specific inputs.
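
To illustrate the validation point, here is a small sketch of a wrapper that refuses triples whose predicate is not part of a known vocabulary. The `ValidatedGraph` class, its vocabulary format, and the predicate names are all hypothetical; this is not how atomRDF's `KnowledgeGraph` is implemented, only the general idea of checking triples before they enter the graph.

```python
class ValidatedGraph:
    """Hypothetical wrapper: accept a triple only if its predicate is known
    and its object has the type the vocabulary expects."""

    def __init__(self, vocabulary):
        # vocabulary maps predicate name -> expected Python type of the object
        self.vocabulary = dict(vocabulary)
        self.triples = []

    def add(self, subject, predicate, obj):
        if predicate not in self.vocabulary:
            raise ValueError(f"unknown predicate: {predicate!r}")
        expected = self.vocabulary[predicate]
        if not isinstance(obj, expected):
            raise TypeError(
                f"{predicate!r} expects {expected.__name__}, "
                f"got {type(obj).__name__}"
            )
        self.triples.append((subject, predicate, obj))


graph = ValidatedGraph({"hasTemperature": float, "hasElement": str})
graph.add("run_0", "hasElement", "Al")
graph.add("run_0", "hasTemperature", 300.0)

try:
    graph.add("run_0", "hasColour", "red")  # not in the vocabulary
except ValueError as error:
    print(error)
```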

Regarding a structure object: The point here is that atomRDF wasn’t developed as a general-purpose tool. It was designed to address a specific use case within the project. So, using ASE structures outside of atomRDF for our purposes seems perfectly fine to me.

Lastly, I’ve had to develop a lot of components because there’s a general lack of tools for interacting with ontologies. However, once we have robust tools that can take over parts of the job, I’d be happy to integrate them as needed.

2 participants