more clarity around the id field in the VC data model #973

Closed
andorsk opened this issue Nov 7, 2022 · 56 comments
Labels: conversation, pending close (Close if no objection within 7 days)

Comments

@andorsk

andorsk commented Nov 7, 2022

The example https://www.w3.org/TR/vc-data-model/#identifiers gives the following description for id:

The value of the id property MUST be a single URI. It is RECOMMENDED that the URI in the id be one which, if
dereferenced, results in a document containing machine-readable information about the id.

with an example http://example.edu/credentials/3732 which doesn't dereference into anything meaningful.

As an implementer, I'm struggling to figure out exactly what type of data the ID field should dereference to and what the recommendation is on it.

It would be helpful to provide more clarity on the data model about:

  1. What type of data should the dereferenced id generally contain.
  2. A working example of a dereferenced id

I would be happy to raise a PR on this, given some direction.

@bumblefudge

bumblefudge commented Nov 8, 2022

Note also this example of a URN (in this case a UUID) rather than a URL id prop. If you really wanted to get 🌶️ spicy 🌶️ you could even use an IPFS CID for that identifier, whether as a URN or as an ipfs:// URL, although that wouldn't really answer your question of what that [content-addressed] id should dereference to.

@andorsk
Author

andorsk commented Nov 8, 2022

yea..thanks @bumblefudge. To your point, if it just said: provide an ID for the VC, I wouldn't have raised this issue.

My 🌶️ take is that an id field makes sense, but the id field being used as a descriptor of the document to me makes less sense 🙇 . I would almost think you need to break this out into two fields:

  1. an id which is just a unique identifier, with a recommendation that the id be a DID (after all, they have known referable methods). But it could be a UUID, for example.
  2. an optional description field, which contains information about the VC itself, with the option to either reference the description (as a DID or URL) or put the description directly as a string (why not allow that?).

I don't know. Would love to hear if this is a reasonable position or I'm over thinking this.

@melvincarvalho

I may be wrong here, but from a simple reading of the text: In the example given there is some JSON and an HTTP URI

HTTP URI: http://example.edu/credentials/3732

json:

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://www.w3.org/2018/credentials/examples/v1"
  ],
  "id": "http://example.edu/credentials/3732",
  "type": ["VerifiableCredential", "UniversityDegreeCredential"],
  "issuer": "https://example.edu/issuers/565049",
  "issuanceDate": "2010-01-01T00:00:00Z",
  "credentialSubject": {
    "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "degree": {
      "type": "BachelorDegree",
      "name": "Bachelor of Science and Arts"
    }
  }
}

Would it make sense to just return the JSON above from that HTTP URI?

@aljones15

aljones15 commented Nov 8, 2022 via email

@andorsk
Author

andorsk commented Nov 8, 2022

@melvincarvalho thanks for the thoughts, but I agree with @aljones15 that it shouldn't be dereferencing to the actual document. I would suggest a change to the language of the document, and still possibly a new description field.

The language to me should be probably something like this:

Instead of:

The value of the id property MUST be a single URI. It is RECOMMENDED that the URI in the id be one which, if
dereferenced, results in a document containing machine-readable information about the id.

a suggestion would be something along the lines of this:

The value of the id property MUST be a single URI. It is RECOMMENDED that the URI should represent a globally unique identifier specific to the credential.

Unless there's some enforceable property, I'm not sure about the block location ( in the case of a ledger VC ) being an appropriate recommendation. I have a few reasons, but the main one is that it drives an inconsistency in the language when you're talking about what an ID means on ledger vs. off ledger.

Either way, even if it SHOULD reference to a block on a ledger, the current state of the specifications does not make it clear that's the preference and it should be updated IMO.
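
As a rough illustration of what "MUST be a single URI" can and cannot check, here is a minimal Python sketch. The helper name `is_absolute_uri` and the sample UUID are hypothetical, and the check is purely syntactic: it only confirms that a value has a URI scheme and says nothing about whether the identifier dereferences to anything.

```python
from urllib.parse import urlparse


def is_absolute_uri(value) -> bool:
    """Return True if value is a string with a URI scheme (http, urn, did, ...)."""
    if not isinstance(value, str) or not value:
        return False
    if any(ch.isspace() for ch in value):
        return False
    scheme = urlparse(value).scheme
    # Require a scheme and at least one character after "scheme:".
    return bool(scheme) and len(value) > len(scheme) + 1


candidates = [
    "http://example.edu/credentials/3732",            # HTTP URL from the spec example
    "urn:uuid:3978344f-8596-4c3a-a978-8fcaba3903c5",  # hypothetical UUID URN
    "did:example:ebfeb1f712ebc6f1c276e12ec21",        # DID from the spec example
    "3732",                                           # bare string, no scheme
]

for candidate in candidates:
    print(candidate, is_absolute_uri(candidate))
```

All three identifier styles mentioned in this thread pass the same check, which is one way to see that the URI requirement alone does not imply any particular dereferencing behaviour.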

@RieksJ

RieksJ commented Nov 22, 2022

@andorsk nice question. The standard says that an id property "is intended to unambiguously refer to an object, such as a person, product, or organization." The standard does not provide any ground whatsoever for recommending that it should be dereferenceable to some document/description, nor does it provide an example of why it might be useful.

I propose to remove this sentence in its entirety.

There is something else that would be useful though. Considering that there is a difference between 'dereferencing' and 'resolving' (converting an identifier to a descriptive document vs. using that same identifier to learn which entity it is actually referring to), there is a need for guidance on the latter which is currently not provided.

I consider this a serious omission, specifically when an id is being used as the subject identifier in one of the VC's claims, as it leaves verifiers (or anyone else for that matter) clueless about which entity a particular claim was made. Assuming that whoever controls the subject identifier (which in the case of DIDs is easy to establish) is in fact also the subject of that identifier (and hence of the claim), comes with serious problems.

Perhaps we should specify ways that would enable verifiers to identify (and possibly authenticate) the entities to which identifiers refer, e.g., as proposed in #760.

@andorsk
Author

andorsk commented Nov 22, 2022

@RieksJ I think this is a good point. I will need to think about #760 and #959 in more detail, but for the scope of this issue, I agree with removing the dereferencing language w.r.t. the id field and swapping it out for something like "is intended to unambiguously refer to an object, such as a person, product, or organization" or something similar.

I also think a description or purpose field still could be useful. Any thoughts there?

@RieksJ

RieksJ commented Nov 23, 2022

I would say that every property in a VC should serve a specifically stated purpose, i.e., serve an explicitly stated objective.

For description, I can see there is merit, but not a generic purpose for having it. Adding it might induce a risk that different people will use it for different purposes, which might result in weird behaviours (as the issuers had something different in mind than the verifiers assumed).

For purpose, that's pretty much the same. As VCs are merely a set of signed claims, I do not see how an issuer might state a purpose in a generic way that actually has some effects in practice. So I'm not in favor of that.

A specifically stated purpose for which there is currently no support is the identification (and authentication) of the entity that is the subject of identifiers specified in the id fields of claims.

@mwherman2000

mwherman2000 commented Nov 28, 2022

UPDATED: I believe (the value of) an id field should be interpreted as a unique reference or identifier to a concrete something (aka subject): a person, an organization, a business document (purchase order, invoice, etc.), an education credential, a car, a boat, a house, a software module, a deployed instance of a software module, etc.

Everything else in a decentralized identifier-based software system is addressed by dereferencing or resolving the decentralized identifier to obtain something else (e.g. a DID Document, Revocation List entry, Service Endpoint addresses (via a second level of indirection through the DID Document)).

There are 2 id fields (typically) in a VC. From the VC spec https://www.w3.org/TR/vc-data-model/#identifiers ...

The first identifier is for the verifiable credential and uses an HTTP-based URL. The second identifier is for the subject of the verifiable credential (the thing the claims are about) and uses a decentralized identifier, also known as a DID.

@RieksJ

RieksJ commented Nov 29, 2022

@mwherman2000: I do not think that the id field in a VC exists. There are multiple ones, of which only one represents the VC. Others represent, e.g., subjects of claims in the VC, or different entities. I also do not think that a VC is an instantiation of a real-life object, because a VC is not a representation of a class (or: another abstraction) by a concrete individual (that is) an element (or: illustrative) of that class.

What I do think is that there is not only a link between (the value of) any id field and its subject (i.e., the entity to which this value refers), but also that there is an equally important link between this value and its author (i.e., the party that has put the value into the id field), because it is that party that has assigned the value of the id-field to that entity. The DID spec recognizes this by saying that the controller of a DID gets to decide which entity is the subject of that DID.

I also think that parties other than the author SHOULD NOT assume that they know how to dereference the value of an id field, unless they have put some effort in finding out how (and/or verifying that) the author governs the semantics of the id-fields that it authors. For DIDs, there's currently no guidance whatsoever (see issue w3c/did-core#837).

For an id-field that is meant to identify the subject of a claim in a VC, there is also no guidance that other parties can rely on. There is lots of talk about 'holder binding', see e.g., #789, #882, #923, w3c/vc-imp-guide#70, #959, #960, w3c/vc-imp-guide#69. There are also discussions on adding roles such as issuee (#942), all of which can be resolved by defining proper mechanisms that verifiers can use to determine which entity is the subject that the author of an id field meant to refer to by the value of that id field, and of which authors can state which is appropriate for a verifier to use in a particular case.

@mwherman2000

mwherman2000 commented Nov 29, 2022

I also do not think that a VC is an instantiation of a real-life object, because a VC is not a representation of a class (or: another abstraction) by a concrete individual (that is) an element (or: illustrative) of that class.

Perhaps, "real-life" object is too strong an adjective. Perhaps "concrete" object would be better. The main point is that (the value of) an id field is associated with or names the actual "thing" (aka subject) ...not its agent, not the service endpoint of its agent, not a VC, etc.

@TallTed
Member

TallTed commented Nov 29, 2022

Perhaps, "real-life" object is too strong an adjective. Perhaps "concrete" object would be better. The main point is that (the value of) an id field is associated with or names the actual "thing" (aka subject) ...not its agent, not the service endpoint of its agent, not a VC, etc.

The word most often used (at least, in the W3C ecosphere) for this actual "thing" or "concrete" object or "real-life" object is entity. Some do use concept or thing for the same purpose. There are years' worth of reading on the philosophical underpinnings of how and why these different words came to be used for the same (or very similar) things.

The value of an id field identifies a specific entity, not an abstraction nor relative of that entity, as intended by the, cough, entity that populated that field — though the type of their chosen specific entity may itself be conceptual, an agent, a service endpoint, a VC, or any other (sub-)class of identifiable "thing" in the universe.

@David-Chadwick
Contributor

David-Chadwick commented Nov 29, 2022

There is a bug in the current DM in that the id field in the VC is not actually the id of the verifiable credential, but is the id of the credential. This was brought home to me during the JFF Plugfest. I had always believed that the id was equivalent to the serial number of a PKC and was unique for each VC (which should be true if it was the id of the VC). But in the plugfest people were issuing multiple verifiable credentials for the same credential and keeping the id constant, because the only difference was the validity time of the cryptographic proof. The credential remained the same and therefore kept the same id.
If we want to have an id for a verifiable credential, then it must be part of the proof property or part of the JWT, and not part of the credential.

@jandrieu
Contributor

There is a bug in the current DM in that the id field in the VC is not actually the id of the verifiable credential,

This is incorrect, but probably because there are multiple identifiers.

{
  "@context" : "https://w3id.org/credentials/v1",
  "id" : "did:ex:id1",
  "credentialSubject" : {
    "id" : "did:ex:id2",
    "hasCredential" : {
      "id" : "did:ex:id3",
      "credentialType" : "bachelor of science"
    }
  }
}

In that example

  • did:ex:id1 is the identifier of the verifiable credential. Full stop.
  • did:ex:id2 is the identifier, in this vc, of a subject of the VC
  • did:ex:id3 is the identifier, of the credential that subject has earned, as memorialized by this VC
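
One way to make the three identifiers concrete is to walk the example and print each "id" next to the path of the object that carries it. The following is only a sketch: `collect_ids` is a hypothetical helper, the path notation is an arbitrary choice, and the code treats the document as plain JSON rather than running any JSON-LD processing.

```python
import json

doc = json.loads("""
{
  "@context": "https://w3id.org/credentials/v1",
  "id": "did:ex:id1",
  "credentialSubject": {
    "id": "did:ex:id2",
    "hasCredential": {
      "id": "did:ex:id3",
      "credentialType": "bachelor of science"
    }
  }
}
""")


def collect_ids(node, path="$"):
    """Yield (path, id) for every JSON object that carries an "id" property."""
    if isinstance(node, dict):
        if "id" in node:
            yield path, node["id"]
        for key, value in node.items():
            if key != "id":
                yield from collect_ids(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from collect_ids(item, f"{path}[{index}]")


for path, identifier in collect_ids(doc):
    print(f"{path}: {identifier}")
```

Each "id" is reported against the object in which it appears: the top-level object, the credentialSubject object, and the nested hasCredential object.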

The problem is one of understanding the data model of claims, not that of the VC.

We need to better explain this, for sure, but bad data modeling is going to always be a thing. Better data modeling resolves this problem, without needing to change the VCDM at all.

I think @David-Chadwick you are imagining that a VC (with its ID) contains a fully formed credential (with a separate id). However, if you look at Figure 5 in Section 3 https://www.w3.org/TR/vc-data-model/#credentials, a VC contains metadata, claims, and proofs. It doesn't contain a credential. Rather the credential becomes verifiable because there is a proof. So to the notion of "credential" as defined by VCs, it is not a separate thing with its own identifier. (Although you can model such a separate credential as in my example.)

The problem, of course, is that many communities, including educators, see "credential" as a well-known and specific thing. By which they mean the degree or certification earned. That "credential" is not the same as "credential" as defined in the VCDM. It's an unavoidable name collision that we have to overcome by better examples and explanations of that distinction.

@David-Chadwick
Contributor

@jandrieu

"did:ex:id1 is the identifier of the verifiable credential. Full stop."

No, it's not. Full stop. It's the identifier of the credential.
It is part of the metadata of the credential, as figure 5 clearly shows. Thank you for pointing that out. It is not metadata of the verifiable credential, as you wrongly assert.

Each proof that makes the credential into a verifiable credential will have its own parameters and metadata, such as validity time (of the signature - which is very different from validity time of the credential) and an id (as in the serial number of a PKI). When the same credential is turned into a verifiable credential at different times, then the VCs will have different ids and different validity times as they are different objects. But the embedded credential will have the same validity time and identifier.

@jandrieu
Contributor

jandrieu commented Nov 29, 2022

@David-Chadwick Why do you think there is a "credential" separate from the "verifiable credential"? I'm honestly curious where that notion is based.

If you look at example 4 https://www.w3.org/TR/vc-data-model/#example-usage-of-the-id-property, you'll see that the Credential, Verifiable Credential (with proof), and the Verifiable Credential (As JWT) all use the same identifier. Because there is no separate credential inside a VC. A credential is transformed into a verifiable credential by adding a proof.

This construct was created in recognition of the usefulness of JSON-LD credentials that don't have proofs. It was not intended, and has never been expressed or documented, to my knowledge, as a credential being a separate thing within the Verifiable Credential.

We did enable exactly that pattern in the Learning and Employment Record (LER) Wrapper. https://www.t3networkhub.org/resources/public-specification-for-learning-and-employment-record-ler-wrapper-and-wallet The id of the VC is most definitely NOT the id of the wrapped credential.

You say

But the embedded credential will have the same validity time and identifier.

There is no embedded credential; there are only claims expressed in JSON-LD. In those claims you may state that there is a credential that has an identifier and has been granted to the subject. You don't have to have such an identifier, but you can.

If you follow the JSON-LD data model, you might discern that the "id" of an object is the identifier of the object in which the property appears. The top-level identifier of a VC is always the identifier of that data object, i.e., of the VC.

If you want to state the identifier for a credential expressed in a VC, that is done in the claims, in the "credentialSubject" property, probably using something like the pattern I already described.

If you treat the top-level "id" property as anything other than the identifier of the VC, you would break JSON-LD semantics.

@David-Chadwick
Contributor

David-Chadwick commented Nov 30, 2022

@jandrieu

"Why do you think there is a "credential" separate from the "verifiable credential"? I'm honestly curious where that notion is based."

Because we agreed during DM1.1 that if you take a credential, put different proofs on it, JWT or JSON-LD, then verify each VC and remove its proof, you will end up with the same credential that you started with.
Therefore it follows that credentials and verifiable credentials are different entities, and have their own lifetimes and metadata. Thus the id of the VC must be different from the id of the C. Consequently some of the existing mapping rules for JWT are wrong (including the date/time mappings).
I think this is a major discussion item that we should have at a VC WG meeting (very soon!).
I don't think this is a JSON-LD issue, but rather a conceptual one. Namely, is the VC object a new and separate object from a C object (even though the former was created from the latter, they are not identical)?
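
The round trip described here (add different proofs, then strip them, and recover the same credential) can be sketched in Python as follows. This is an illustration under assumptions: `add_proof` and `strip_proof` are hypothetical helpers, the proof objects are placeholders rather than real signatures, and only embedded proofs are modelled, not JWTs.

```python
import copy

credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "id": "http://example.edu/credentials/3732",
    "type": ["VerifiableCredential", "UniversityDegreeCredential"],
    "issuer": "https://example.edu/issuers/565049",
    "issuanceDate": "2010-01-01T00:00:00Z",
    "credentialSubject": {"id": "did:example:ebfeb1f712ebc6f1c276e12ec21"},
}


def add_proof(cred: dict, proof: dict) -> dict:
    """Return a new object: a copy of the credential plus an embedded proof."""
    vc = copy.deepcopy(cred)
    vc["proof"] = copy.deepcopy(proof)
    return vc


def strip_proof(vc: dict) -> dict:
    """Return a copy of the VC with its top-level proof removed."""
    return {key: copy.deepcopy(value) for key, value in vc.items() if key != "proof"}


# Placeholder proofs created at different times (illustrative values only).
vc_a = add_proof(credential, {"type": "RsaSignature2018", "created": "2022-11-01T00:00:00Z"})
vc_b = add_proof(credential, {"type": "RsaSignature2018", "created": "2022-11-30T00:00:00Z"})

print(vc_a == vc_b)                                          # False: the proofs differ
print(strip_proof(vc_a) == strip_proof(vc_b) == credential)  # True: same credential
print(vc_a["id"] == vc_b["id"])                              # True: same top-level id
```

The sketch shows only the mechanics. Whether the shared top-level id names the credential or each verifiable credential is exactly the question being debated in this thread.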

@mwherman2000

mwherman2000 commented Nov 30, 2022

I believe I concur with @jandrieu. If we cast the credentialSubject as the "inner credential" or "business credential", then the entire credential (if it contains a proof) is the verifiable credential. If we consider the verifiable credential minus the "inner credential", this is the envelope containing the "inner credential".

Further, two envelopes can be used to encase 2 copies (one each) of the same "inner credential" (e.g. a purchase order or invoice). This results in two different verifiable credentials (with 2 different "outer ids") but the same "inner credential" - each copy with the same "inner id".

Here's an interesting tutorial that illustrates this concept: https://www.youtube.com/watch?v=kM30pd3w8qE&list=PLU-rWqHm5p445PbGKoc9dnlsYcWZ8X9VX&index=1

@David-Chadwick
Contributor

@mwherman2000 Your argument applies equally well to whatever construct is the inner credential. You appear to want the credential subject property to be the inner credential, whereas I want the credential object to be the inner credential. The difference of course lies in whether the credential metadata properties issuanceDate, type, @context etc. are part of the inner credential or not. Joe appears to be saying they are not, whilst I am saying they are.

@mwherman2000

mwherman2000 commented Dec 1, 2022

@David-Chadwick I'm not exactly following your terminology. Can you elaborate? ...perhaps with an example? Here's one example I can offer as a starting point...

Sample Verifiable Credential

{
  "id": "did:color:verifiable:red",
  "@context": [
    "https://www.w3.org/2018/credentials/v1"
  ],
  "type": [ "VerifiableCredential", "Color" ],
  "issuer": "did:org:111-222-333",
  "issuanceDate": "2017-01-12T00:00:00Z",
  "expires": "2017-04-22T00:00:00Z",
  "credentialSubject": {
    "id": "did:color:red",
    "claims": {
        "red": "255",
        "green": "0",
        "blue": "0"
    }
  },
  "proof": {
    "type": "RsaSignature2018",
    "created": "2017-01-12T21:19:10Z",
    "proofPurpose": "assertionMethod",
    "verificationMethod": "https://example.com/issuers/keys/1",
    "jws": "eyJhbGciOiJSUzI1NiIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..TCYt5XsITJX1CxPCT8yAV-TVkIEq_PbChOMqsLfRoPsnsgw5WEuts01mq-pQy7UJiN5mgRxD-WUcX16dUEMGlv50aqzpqh4Qktb3rk-BuQy72IFLOqV0G_zS245-kronKb78cPN25DGlcTwLtjPAYuNzVBAh4vGHSrQyHUdBBPM"
  }
}

Using my terminology, the "inner credential" or "business credential" or "payload" is...

{
    "id": "did:color:red",
    "claims": {
        "red": "255",
        "green": "0",
        "blue": "0"
    }
}

The Verifiable Credential Envelope is...

{
  "id": "did:color:verifiable:red",
  "@context": [
    "https://www.w3.org/2018/credentials/v1"
  ],
  "type": [ "VerifiableCredential", "Color" ],
  "issuer": "did:org:111-222-333",
  "issuanceDate": "2017-01-12T00:00:00Z",
  "expires": "2017-04-22T00:00:00Z",
  "credentialSubject": {

  },
  "proof": {
    "type": "RsaSignature2018",
    "created": "2017-01-12T21:19:10Z",
    "proofPurpose": "assertionMethod",
    "verificationMethod": "https://example.com/issuers/keys/1",
    "jws": "eyJhbGciOiJSUzI1NiIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..TCYt5XsITJX1CxPCT8yAV-TVkIEq_PbChOMqsLfRoPsnsgw5WEuts01mq-pQy7UJiN5mgRxD-WUcX16dUEMGlv50aqzpqh4Qktb3rk-BuQy72IFLOqV0G_zS245-kronKb78cPN25DGlcTwLtjPAYuNzVBAh4vGHSrQyHUdBBPM"
  }
}

@mwherman2000

mwherman2000 commented Dec 1, 2022

Alternatively, following the Structured Credential model more closely (https://www.youtube.com/watch?v=kM30pd3w8qE&list=PLU-rWqHm5p445PbGKoc9dnlsYcWZ8X9VX&index=1), the envelope and proof can be separated:

Verifiable Credential Envelope

{
  "id": "did:color:verifiable:red",
  "@context": [
    "https://www.w3.org/2018/credentials/v1"
  ],
  "type": [ "VerifiableCredential", "Color" ],
  "issuer": "did:org:111-222-333",
  "issuanceDate": "2017-01-12T00:00:00Z",
  "expires": "2017-04-22T00:00:00Z",
  "credentialSubject": {

  },
  "proof": {
 
  }
}

Verifiable Credential Proof

{
    "type": "RsaSignature2018",
    "created": "2017-01-12T21:19:10Z",
    "proofPurpose": "assertionMethod",
    "verificationMethod": "https://example.com/issuers/keys/1",
    "jws": "eyJhbGciOiJSUzI1NiIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..TCYt5XsITJX1CxPCT8yAV-TVkIEq_PbChOMqsLfRoPsnsgw5WEuts01mq-pQy7UJiN5mgRxD-WUcX16dUEMGlv50aqzpqh4Qktb3rk-BuQy72IFLOqV0G_zS245-kronKb78cPN25DGlcTwLtjPAYuNzVBAh4vGHSrQyHUdBBPM"
}
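
For readers who prefer code to JSON fragments, the three-way split can be sketched in Python. The helper names `split_vc` and `join_vc` are hypothetical, the "jws" value is a placeholder rather than the signature from the sample, and the split is purely structural: nothing here verifies a proof.

```python
import copy


def split_vc(vc: dict):
    """Split a VC into (envelope, inner credential, proof)."""
    envelope = copy.deepcopy(vc)
    inner = envelope.get("credentialSubject", {})
    proof = envelope.get("proof", {})
    # Leave empty placeholders in the envelope, as in the JSON fragments.
    envelope["credentialSubject"] = {}
    envelope["proof"] = {}
    return envelope, inner, proof


def join_vc(envelope: dict, inner: dict, proof: dict) -> dict:
    """Reassemble a VC from the three parts produced by split_vc."""
    vc = copy.deepcopy(envelope)
    vc["credentialSubject"] = copy.deepcopy(inner)
    vc["proof"] = copy.deepcopy(proof)
    return vc


sample_vc = {
    "id": "did:color:verifiable:red",
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiableCredential", "Color"],
    "issuer": "did:org:111-222-333",
    "issuanceDate": "2017-01-12T00:00:00Z",
    "expires": "2017-04-22T00:00:00Z",
    "credentialSubject": {
        "id": "did:color:red",
        "claims": {"red": "255", "green": "0", "blue": "0"},
    },
    "proof": {
        "type": "RsaSignature2018",
        "created": "2017-01-12T21:19:10Z",
        "jws": "placeholder-signature",  # placeholder, not a real signature
    },
}

envelope, inner, proof = split_vc(sample_vc)
print(envelope["id"])  # the "outer id"
print(inner["id"])     # the "inner id"
print(join_vc(envelope, inner, proof) == sample_vc)
```

Two envelopes with different outer ids could wrap copies of the same `inner` object, which is the situation described earlier with two VCs sharing one "inner credential".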

@David-Chadwick
Contributor

@mwherman2000 Thank you for your example. Here are my comments:

  1. Your model only works for JSON-LD proofs and not for JWT proofs as there is no proof property in the latter
  2. issuanceDate is not the date of issuing of the VC, but is the date that the credential was issued. So it is metadata of the credential and not of the verifiableCredential/proof
  3. the VC (proof) needs to have its own date of issuance, which you have in your example with "created"
  4. the same applies to the expiry date of the verifiableCredential, since no crypto lasts for ever. So your proof must have an "expires" property as well as a "created" property, which currently it does not have. The actual credential may or may not have an expiry date (e.g. degree certificate)
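
To illustrate points 2 to 4, here is a minimal Python sketch that keeps the credential's validity window separate from the proof's. Several things are assumptions made for illustration: the proof-level "expires" property does not exist in the example being discussed, the rule that both windows must hold is one possible policy rather than anything the VCDM defines, and the helper names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_utc(ts: str) -> datetime:
    """Parse timestamps of the form 2017-01-12T00:00:00Z as UTC datetimes."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def in_window(start: str, end: Optional[str], at: datetime) -> bool:
    """True if `at` is at or after start and, when an end is given, before it."""
    if at < parse_utc(start):
        return False
    return end is None or at < parse_utc(end)


def usable_at(vc: dict, at: datetime) -> bool:
    """Assumed policy: the credential window and the proof window must both hold."""
    proof = vc["proof"]
    credential_ok = in_window(vc["issuanceDate"], vc.get("expires"), at)
    proof_ok = in_window(proof["created"], proof.get("expires"), at)
    return credential_ok and proof_ok


vc = {
    "issuanceDate": "2017-01-12T00:00:00Z",
    "expires": "2017-04-22T00:00:00Z",
    "proof": {
        "created": "2017-01-12T21:19:10Z",
        "expires": "2017-02-12T00:00:00Z",  # hypothetical proof expiry
    },
}

print(usable_at(vc, parse_utc("2017-02-01T00:00:00Z")))  # inside both windows
print(usable_at(vc, parse_utc("2017-03-01T00:00:00Z")))  # credential valid, proof expired
```

A credential with no expiry of its own (such as a degree certificate) would simply omit the top-level "expires", in which case only the proof window limits its use under this assumed policy.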

So now we come to the tricky bits
i) is type the type of the credential or verifiable credential? I would argue it is the credential. The type of verifiable credential may be JWT proofed credential, or JSON-LD proofed credential
ii) is id the id of the credential or verifiable credential? Given that all the other properties have been shown to be properties of the credential and not the verifiable credential, then the id must also be the id of the credential. The id of the verifiable credential must be added to the proof object as an extra parameter. (Note that the JFF plugfest has already used this interpretation, since multiple sequential VCs created from the same underlying credential have all been given the same id.)

@mwherman2000

mwherman2000 commented Dec 1, 2022

@David-Chadwick can you mark up some of the examples to more precisely illustrate your points? For now, let's limit the scope to JSON/JSON-LD based VCs.
Some responses:
i) re: type. I agree that it is the type of the "inner credential" (let's agree to stay with one set of terminology ...the terms used in the examples)
i) re: "The type of verifiable credential may be JWT proofed credential, or JSON-LD proofed credential". I have not seen the "type" used to specify "JWT proofed credential, or JSON-LD proofed credential" in any existing examples. Usually, I've interpreted the following pattern to say this JSON thing is a VC ...with a (verifiable credential) subtype (e.g. Color):
"type": [ "VerifiableCredential", "Color" ]
...putting the above 2 points together: the subtype is the type of the "inner credential" and the "type" property says "this is the type designation for the VC that embeds the "inner credential"".
ii) which "id" are you referring to? ...the "id" in the "inner credential" is the "id" for the "inner credential". ...the "id" in the "envelope" is the "id" for the verifiable credential.
ii) the "proof" is the proof for the VC (i.e. the envelope and embedded "inner credential") because the "id" is embedded inside the VC envelope. See the examples above.

@David-Chadwick
Contributor

the type of proofing is already specified (for the JSON-LD proofs). It's "RsaSignature2018" in your example. So I am happy that this aspect is already covered.
there is only one 'id' in your examples so it should be obvious which one I am referring to.
What you are calling the envelope I am referring to as the metadata. Either way this is information about the credential and not about the proof.
"the "proof" is the proof for the VC". No, the proof is for the credential. credential + Proof = verifiable credential

@mwherman2000

Screenshot_20221201-114124

Here's a screenshot of my post from above @David-Chadwick. There are two "id" properties in my example.

@David-Chadwick
Contributor

That is weird. Because this is what I see in git and have copied below

{
  "id": "did:color:verifiable:red",
  "@context": [
    "https://www.w3.org/2018/credentials/v1"
  ],
  "type": [ "VerifiableCredential", "Color" ],
  "issuer": "did:org:111-222-333",
  "issuanceDate": "2017-01-12T00:00:00Z",
  "expires": "2017-04-22T00:00:00Z",
  "credentialSubject": {

  },
  "proof": {
 
  }
}

As you can see there is only one "id" above.
In your screenshot you also have credentialSubject.id - this is clearly the id of the credentialSubject, whereas the first "id" is the one we are debating. Is it the id of the credential or of the verifiable credential?

@mwherman2000

mwherman2000 commented Dec 2, 2022

That is weird. Because this is what I see in git and have copied below

@David-Chadwick It's best to read my entire posts from the beginning to the end; else you will lose/miss the context. What you've shown above is an example of a VC Envelope ...it is also correct.

Reread:

@mwherman2000

mwherman2000 commented Dec 2, 2022

In your screenshot you also have credentialSubject.id - this is clearly the id of the credentialSubject, whereas the first "id" is the one we are debating. Is it the id of the credential or of the verifiable credential?

The value of the first "id" is the identifier for the particular VC (shown in its entirety at the top of this post: #973 (comment)).

The second "id" (in the same VC definition in the referenced post) is the identifier for the "inner credential" or "business credential" or "payload" ...also known as the "credentialSubject" "id".

NOTE: I think it's worth mentioning that the JSON text we're talking about is a textual serialization of a VC ...a technical JSON-based textual serialization of a particular VC. So terms (property names) like "credentialSubject" are technical/implementation terms chosen at the time that the JSON serialization for a VC was agreed upon. "credentialSubject" is not the term I would use when I'm talking to someone (an architect or developer) using the King's English ;-) ...I use terms like "inner credential" or "business credential" or "payload". I hope this finally clarifies things.

@David-Chadwick
Contributor

This is where we disagree. I assert it is the id of the credential, and it has been used as such by the JFF Plugfest. Multiple VCs have been created from this credential, all with the same "id", but clearly each VC is different and a separate object. The VC's own "id" is equivalent to the serial number of an X.509 PKC and should be in the proof section.

@andrewhughes3000

Although not precisely the same, ISO 18013-5 has a similar concept to what David described. There are data elements and a separate Mobile Security Object comprised of salted hashes, keys and other data integrity stuff. The "driving license" credential is the data elements including DL# and issuance/expiry dates. The MSO has its own issuance/expiry dates and is re-issued as needed independent of the DL expiry and DL#.

@David-Chadwick
Contributor

@andrewhughes3000 Glad I am not on my own with my mental model. The current VCDM half supports this model and half doesn't as it takes metadata about the credential and then treats some of it as metadata about the VC.

@dlongley
Contributor

dlongley commented Dec 3, 2022

I do think that's a useful model and concept for some issuers to employ -- but I think it should be done internally. In other words, it seems like that model would be better served via some internal ID / reference ID rather than exposing that information in the VC itself. It seems like a leakage of implementation details to me.

I imagine it would further complicate the VCDM when considering any number of metadata items that might then need duplication, e.g., credential status -- do we now need both credential status and verifiable credential status tracking? I think that whether or not certain elements of the VC are present or not (proof 1, proof 2, selectively disclosed fields, etc.) should not each result in different identifiers being assigned to the "new object" (each possible combination constituting a "new object").

Rather, I think we're always talking about the same object from the perspective of any party outside of the issuer (i.e., they cannot know the details of how the issuer is implemented). It's just a question of whether or not that object is verifiable, verifiable with proof type 1, or proof type N, whether fields X are revealed or not and so on. I think if there is important metadata for external (or internal) proofs, it should be present in the proof section -- and I'd expect there to be different metadata for each proof for VCs with multiple proofs (which is already true today).

@dlongley
Contributor

dlongley commented Dec 3, 2022

As another point of reference -- there's a discussion in the W3C CCG VC-API group about having internal credential references of a slightly different sort: w3c-ccg/vc-api#126 -- particularly for cases where credential IDs should not be used at all (potentially for increased privacy cases). The point is that there are other needs for "reference IDs" to credentials that aren't expressed in the credential itself (for various reasons).

@David-Chadwick
Contributor

@dlongley Sorry but I disagree with you. The issue comes down to this:

  • can or does the credential exist without a proof? (e.g. could an issuer issue a W3C credential i.e. a VC without a proof? or could a wallet (trusted by the verifier) strip off the proof and just give the credential to a verifier?)
  • if so, what is the metadata of the credential? This metadata must be independent of the proof.

@bumblefudge

bumblefudge commented Dec 5, 2022

  • can or does the credential exist without a proof? (e.g. could an issuer issue a W3C credential i.e. a VC without a proof?

I think many in this thread are operating on different definitions of this "credential", so maybe we should tease out those differences. Maybe instead of debating the existence we should step back a little and state what we think a "Credential" does or can do. Can a credential be "issued" or does the issuer/holder/verifier model only make sense for a specific verifiable credential? If the latter, what are the differences between how a C and how a VC "exist" 😅

or could a wallet (trusted by the verifier) strip off the proof and just give the credential to a verifier?)

Here I was assuming a credential is a kind of platonic ideal that only exists inside the issuer's records or "before" the VC, and that the closest anyone else could come to "recreating" that pre-existence was the verifier. I was thinking of a credential as fundamentally local and non-portable, as it were (a record in a system), and a VC a kind of "export format"-- that might just be a bias from the use-cases I think of as the "real" use-cases for VCs?

To put it more bluntly, if the wallet (or a 4th party) can reconstitute the Credential, I feel like we're entering a different definition of the Credential than I thought I had before! I thought the whole point of VCs was that verifiers and 4th parties can recreate something that is almost the credential, but an approximation or guess. My humanities-brain is buzzing with Proustian Madeleines and Wellsian Rosebuds.

  • if so, what is the metadata of the credential? This metadata must be independent of the proof.

Going back to @dlongley 's example of credentialStatus, I feel like we get into spicy waters. Maybe some properties can exist in both credential and VC alike, but StatusList2022 feels like it was written specifically for the statuses of VCs only, and using StatusList2022 to track the status of the credential instead breaks that whole mental model. So that seems like a piece of metadata that the credential can't have, to me at least-- two VCs sharing one credentialStatus feels wrong to me.

Sidenote, maybe a tracking issue on StatusList2022 should be opened to make this assumption more explicit if the concept of "credential" is here to stay and survives the process of discernment I seem to be plus-oneing

@David-Chadwick
Contributor

credentialStatus is clearly mis-named as it is the verifiableCredentialStatus and appertains only to VCs that can be revoked. Short lived ephemeral VCs do not have this property, neither do credentials.
I think it is very easy to differentiate between credential metadata and verifiable credential metadata.
VC metadata applies to everything to do with the cryptographic proofing (and revoking)
Credential metadata applies to everything to do with the credential.
Now it may be possible to carry some of the credential's metadata over into the VC's metadata, but in so doing the VC metadata properties should have different names to the C's metadata and there should be a bi-directional algorithm that defines this conversion (similar to what is done when a JWK proof is created).

As to what a credential is - these are the statements made by some entity (the issuer) about an entity (the subject) to an entity (the issuee). When they have been cryptographically protected they become a verifiable credential.

@bumblefudge

Hehe, stating clearly that you're confident of your mental model isn't going to get us to the differences :D

Here's one question that can hopefully tease some out: is it a design goal (or a requirement) that Credentials be 100% reconstructable/roundtrippable from VCs if all the necessary additional resources like @context files and schemas have been dereferenced? My jokes about Proust and Citizen Kane were roundabout ways of asking this. Your reference to "bidirectional" made me think you are taking this as self-evident but I'm not sure everyone agrees to that requirement, necessarily?

@David-Chadwick
Contributor

We already have one example of where the bi-directional transformation isn't possible - converting issuanceDate into JWT nbf claim and back again. Therefore the rules need to state how this should be handled. e.g. convert issuanceDate into the nearest nbf, keep the issuanceDate in the credential, then throw the nbf away after validating the signature.
Once we have all agreed on the model, then producing the rules should be relatively straightforward.
But currently we have not agreed on the (mental) model.
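A minimal sketch of why the issuanceDate/nbf round trip can lose information, assuming nbf is encoded as whole seconds since the epoch (a common JWT convention, not something this thread confirms). The helper names `issuance_date_to_nbf` and `nbf_to_issuance_date` are hypothetical, and the sample timestamp is made up for illustration.

```python
from datetime import datetime, timezone


def issuance_date_to_nbf(issuance_date: str) -> int:
    """Map an ISO 8601 issuanceDate to whole seconds since the epoch (assumed nbf encoding)."""
    moment = datetime.fromisoformat(issuance_date)
    # int() drops the fractional second; the original UTC offset is not stored either.
    return int(moment.timestamp())


def nbf_to_issuance_date(nbf: int) -> str:
    """Map an nbf value back to an ISO 8601 string, which can only be expressed in UTC."""
    return datetime.fromtimestamp(nbf, tz=timezone.utc).isoformat()


# Hypothetical issuanceDate with a fractional second and a non-UTC offset.
original = "2022-12-05T10:00:00.500+01:00"
nbf = issuance_date_to_nbf(original)
restored = nbf_to_issuance_date(nbf)

print(nbf)       # whole seconds only
print(restored)  # same instant minus the half second, rendered in UTC
```

This is why the approach described above keeps issuanceDate in the credential and treats nbf as a derived value that is discarded after signature validation.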

@jandrieu
Contributor

jandrieu commented Dec 5, 2022

credentialStatus is clearly mis-named as it is the verifiableCredentialStatus and appertains only to VCs that can be revoked. Short lived ephemeral VCs do not have this property, neither do credentials. I think it is very easy to differentiate between credential metadata and verifiable credential metadata. VC metadata applies to everything to do with the cryptographic proofing (and revoking) Credential metadata applies to everything to do with the credential. Now it may be possible to carry some of the credential's metadata over into the VC's metadata, but in so doing the VC metadata properties should have different names to the C's metadata and there should be a bi-directional algorithm that defines this conversion (similar to what is done when a JWK proof is created).

As to what a credential is - these are the statements made by some entity (the issuer) about an entity (the subject) to an entity (the issuee). When they have been cryptographically protected they become a verifiable credential.

David, I think credentialStatus is not misnamed so much as you have settled into a different conceptual model than was used to write the specification. I think yours is a coherent model. That is, you're not wrong when discussing your sense of what a credential is. It's just that your notion of a credential isn't the same as the one that drove the VC work.

There is only a single "credential" (as defined in the spec) in a Verifiable Credential. It is not a separate digital object, it is the part of the verifiable object that contains the claims.

That is, VCs are not a composition of a credential and a verifiable credential; VCs don't wrap wholly formed credentials (as defined by the spec). Rather, VCs are a transmutation of a credential as expressed by a set of claims. Credentials become VCs when proofs are added. They do not retain a separate existence, identifier, or other metadata. The credential BECOMES the VC. As such, property names like credentialSubject and credentialStatus refer to the exact same credential: the set of claims. There's no separate verifiableCredentialSubject or verifiableCredentialStatus because there is only one credential and they have the same value and meaning with or without the proof. In fact, in most issuance pipelines the credential is generated in a separate step from signing. In that step, the credential is the "proto-" VC, which is passed to the signing software to actually issue the VC.
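The two-step pipeline described above can be sketched roughly as follows. This is an illustration only: the HMAC is a stand-in for a real proof suite, `HypotheticalHmacProof`, `canonical_bytes`, and `add_proof` are made-up names, and sorted-key JSON is assumed in place of a real canonicalization algorithm.

```python
import copy
import hashlib
import hmac
import json


def canonical_bytes(document: dict) -> bytes:
    """Stand-in canonicalization: sorted-key compact JSON (an assumption, not a real suite)."""
    return json.dumps(document, sort_keys=True, separators=(",", ":")).encode("utf-8")


def add_proof(credential: dict, key: bytes) -> dict:
    """Return the same set of claims with a proof attached; nothing is wrapped or renamed."""
    verifiable = copy.deepcopy(credential)
    verifiable["proof"] = {
        "type": "HypotheticalHmacProof",  # placeholder proof type
        "proofValue": hmac.new(key, canonical_bytes(credential), hashlib.sha256).hexdigest(),
    }
    return verifiable


# Step 1: the "proto-" VC, i.e. the credential before signing.
credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiableCredential"],
    "issuer": "https://example.edu/issuers/14",
    "issuanceDate": "2022-12-05T10:00:00Z",
    "credentialSubject": {"id": "did:example:123", "degree": "Bachelor of Science"},
}

# Step 2: the signing step turns that same object into a verifiable credential.
verifiable_credential = add_proof(credential, b"demo-key")
```

In this sketch the only difference between the two objects is the `proof` member, so properties such as credentialSubject mean the same thing before and after signing.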

You can wrap credentials as we do in the LER Wrapper, where the credential is an exogenously created thing (like a transcript, degree, etc., in any format) that is literally wrapped by a VC. So, it's possible to memorialize the kind of credential you want, within the claims of a VC. But that is not making the VC that other credential; it is making a VC that represents that credential.

I think this is where the fundamental disconnect is.

In the mental model behind VCs, "credentials" are sets of claims that become "Verifiable Credentials" when a proof is added. The credential is the Verifiable Credential. As such, credentials and VCs are simply statements by an issuer about a subject. Without the proof, they are not verifiable, so they are called "credentials". With a proof, they are verifiable and, hence, called "Verifiable Credentials".

Domains that produce credentials (such as education) already have (and will continue to innovate about) domain-defined notions of what a credential is. In many cases, the "credential" is the physical object, in others, the "credential" is the attainment earned, like a Bachelor's degree. There's a rich and complicated semantic dance about whether the credential is the abstract thing represented, e.g., by the sheepskin, or whether the sheepskin itself is the credential.

With VCs, there is no technical ambiguity: VCs are a serialized set of claims with some form of integrity proof. What is in the VC is a set of statements, represented as claims. Those statements are the credential whose authenticity, authorship, and timeliness are "verifiable".

We can't get away from the essential ambiguity with real-world credentials when people attempt to model them as VCs, but at least within the specification and our own work, there is a clear definition. Perhaps misunderstood by many, but still a definition within the context of Verifiable Credentials.

@mwherman2000

mwherman2000 commented Dec 6, 2022

Credentials become VCs when proofs are added.

@jandrieu If I read this statement literally, it is indeed unfortunate for regular people because it says that any particular set of claims (only) is not considered a Credential. It implies a Credential is only a Credential if it has a bunch of VC decorations surrounding the set of claims.

It says the following is not a credential (which is contrary to everyday usage).

image

@David-Chadwick
Contributor

@jandrieu Thank you for pointing out the differences in our mental models. But I am having difficulty understanding your mental model when you write

Credentials become VCs when proofs are added. They do not retain a separate existence, identifier, or other metadata. The credential BECOMES the VC

Your first sentence is exactly the same as my mental model. Your second sentence is where we diverge and I have difficulty comprehending it. In our implementation the credential does have its own existence. The RP receives the VP from the wallet, passes it to our backend verifier service, which verifies the VP and VC and returns the credential to the RP for it to process at the application level. The proof has gone. The RP is not interested in it. (If the verification fails the RP is given nothing except an error code because we cannot believe anything that the VP/VC says.) So I would be interested to learn how your implementation implements your mental model.
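A rough sketch of the backend flow described above, in which the relying party gets back either the credential without its proof or only an error code. The HMAC check is a stand-in for real proof verification, and `verify_and_strip`, `sign_for_demo`, the `invalid_proof` error code, and the sample claims are all hypothetical.

```python
import hashlib
import hmac
import json


def _digest(claims: dict, key: bytes) -> str:
    """Stand-in proof value: HMAC over sorted-key JSON (an assumption, not a real proof suite)."""
    payload = json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def sign_for_demo(claims: dict, key: bytes) -> dict:
    """Attach a demo proof so the verifier below has something to check."""
    return {**claims, "proof": {"proofValue": _digest(claims, key)}}


def verify_and_strip(verifiable_credential: dict, key: bytes) -> dict:
    """Return {'credential': ...} without the proof on success, or only an error code."""
    claims = {k: v for k, v in verifiable_credential.items() if k != "proof"}
    presented = str(verifiable_credential.get("proof", {}).get("proofValue", ""))
    if not hmac.compare_digest(_digest(claims, key), presented):
        # Nothing from an unverified VC is passed on to the relying party.
        return {"error": "invalid_proof"}
    return {"credential": claims}


key = b"demo-key"
vc = sign_for_demo({"issuer": "https://example.edu/issuers/14",
                    "credentialSubject": {"degree": "Bachelor of Science"}}, key)

accepted = verify_and_strip(vc, key)

tampered = dict(vc, issuer="https://attacker.example")
rejected = verify_and_strip(tampered, key)
```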

@TallTed
Member

TallTed commented Dec 6, 2022

[@jandrieu] Credentials become VCs when proofs are added.

[@mwherman2000] If I read this statement literally, it is indeed unfortunate for regular people because it says that any particular set of claims (only) is not considered a Credential. It implies a Credential is only a Credential if it has a bunch of VC decorations surrounding the set of claims.

Apparently you read a different English than I do, one that is neither US, UK, nor CA-localized, and one which I am fairly confident no-one else on this thread shares.

Fortunately for you, the rest of us, and all sorts of "regular people", @jandrieu said "[Non-Verifiable] Credentials become [Verifiable Credentials] when proofs are added."

In other words, [Non-Verifiable] Credentials are [Non-Verifiable] Credentials before/until they get a Proof (the "Verifiable Credential decorations", as you call them), at which point they become Verifiable Credentials.

@AFlowOfCode

I was also hoping for some clarity on the top-level verifiableCredential.id property as mentioned in the first post. In my case it seems to contradict the herd privacy provided by a status list's bit array approach.

If the id is a URI specific to one particular VC and a server sees a call to that URI followed by a call to a status list it seems to me those 2 calls could justifiably be correlated to a verifier checking on a particular holder. I was under the impression that was intended to be avoided by the use of the status list format. Therefore since the ID property is optional I have removed it.
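To make the contrast concrete, here is a minimal sketch of a bit-array status check, where the verifier downloads one list shared by many credentials and looks up a single index locally. It assumes a GZIP-compressed, base64url-encoded bitstring with the leftmost bit of each byte as the lowest index; the function names, list size, and index are illustrative, and the status list specification should be consulted for the exact encoding.

```python
import base64
import gzip


def encode_status_list(bits: bytes) -> str:
    """Compress and encode the whole bitstring (assumed GZIP + base64url encoding)."""
    return base64.urlsafe_b64encode(gzip.compress(bits)).decode("ascii")


def is_bit_set(encoded_list: str, index: int) -> bool:
    """Check one credential's status bit locally, after fetching the shared list."""
    raw = gzip.decompress(base64.urlsafe_b64decode(encoded_list))
    byte_position, bit_offset = divmod(index, 8)
    # Assumption: the leftmost bit of each byte holds the lowest index.
    return bool(raw[byte_position] & (0x80 >> bit_offset))


# Issuer side: a list covering 131072 credentials, with one hypothetical index revoked.
bits = bytearray(16 * 1024)
revoked_index = 94567
byte_position, bit_offset = divmod(revoked_index, 8)
bits[byte_position] |= 0x80 >> bit_offset
encoded_list = encode_status_list(bytes(bits))

# Verifier side: the request for the list looks the same whichever index is checked.
print(is_bit_set(encoded_list, revoked_index))
print(is_bit_set(encoded_list, revoked_index - 1))
```

The server hosting the list learns only that someone fetched it, whereas dereferencing a credential-specific id URL reveals which credential is being checked, which is the correlation risk raised above.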

After reading through this issue I still do not understand why I would want a URI ID for a specific VC that was issued. It would make more sense to me if it identified a type of credential of which multiple holder-specific copies could be issued, but then again that seems to be covered by credentialSchema.

I can see why an issuer may want to assign a unique ID if it is maintaining knowledge of VCs it has issued, just like it would do with any database record, but I don't quite understand why that needs to be a public URI.

@brentzundel
Member

If the id is a URI specific to one particular VC and a server sees a call to that URI followed by a call to a status list it seems to me those 2 calls could justifiably be correlated to a verifier checking on a particular holder. I was under the impression that was intended to be avoided by the use of the status list format. Therefore since the ID property is optional I have removed it.

this concern is precisely the reason the verifiableCredential.id property is not required.

@AFlowOfCode

If the id is a URI specific to one particular VC and a server sees a call to that URI followed by a call to a status list it seems to me those 2 calls could justifiably be correlated to a verifier checking on a particular holder. I was under the impression that was intended to be avoided by the use of the status list format. Therefore since the ID property is optional I have removed it.

this concern is precisely the reason the verifiableCredential.id property is not required.

It's certainly reasoning that argues against using it, but not reasoning that contributes to understanding a proper use case. Yet if it's indeed precisely the reason, may I humbly suggest describing said reasoning explicitly in the VC Data Model spec?

Nowhere does it mention that a URI verifiableCredential.id should be omitted if implementing a status list since they are inherently in conflict (by increasing correlatability) when simultaneously implemented in this way. Only general correlation maxims are stated as a reason against its use, but this is a concrete example that could potentially be overlooked or at minimum increase confusion about a URI ID's use case (as it has for me). I did actually consider using the StatusList approach for this ID property as well, yet in the end I saw no useful reason to bother.

Nevertheless, it is beside the point of my contribution to this issue. Though that specific reasoning may be obvious to implementers and spec authors (even if it remains unstated in the spec), it does not clarify the questions brought up in this issue. It rather adds fuel to the logic behind the necessity of an issue which attempts to clarify appropriate usage of this property in the first place.

Personally I still have no understanding of why it would ever need to be a URI identifying a unique credential. But since one can easily exercise the "not required" nature of the property, it becomes something of a throwaway subject.

@RieksJ

RieksJ commented Apr 5, 2023

If the id is a URI specific to one particular VC and a server sees a call to that URI followed by a call to a status list it seems to me those 2 calls could justifiably be correlated to a verifier checking on a particular holder. I was under the impression that was intended to be avoided by the use of the status list format. Therefore since the ID property is optional I have removed it.

That's only if you assume that every claim in the VC is about the same entity and that entity is holding the VC. When I read the VCDM, it (explicitly) states that neither of these assumptions is generally true. That's why a credential-id is not the same as a subject-id.

@AFlowOfCode

@RieksJ Can you link to the explanation you reference, if you can recall where to find it? It could be helpful & possibly move the conversation toward clearing up some confusion about the positive cases in which you do want to use the credential ID.

If this property is specifically limited to cases such as you describe, then naturally it would make sense to state that the credential ID is actually not meant for cases where this "assumption" as you call it is known to be true. Perhaps this is where my personal lack of clarity stems from - not realizing this credential ID property never was intended to have a use case in such a single-holder context at all.

In my case I'm not assuming anything. As an implementer I can tell you that "every claim in the VC is about the same entity and that entity is holding the VC" in 100% of the cases in the system I'm working on. It's both a logical necessity and a hard requirement stemming from the type of credential the system deals with (similar to a driver's license). My line of inquiry is directly informed by the real-world context of a production system currently issuing VCs in the wild, not a casual theoretical reading of the specification while making assumptions about who might end up using it.

@RieksJ

RieksJ commented Apr 6, 2023

@AFlowOfCode : Sure. It can be found in several places:

  1. The definition of 'issuer' implies this;
  2. The definition of credential (in the terminology section) has the phrase "The claims in a credential can be about different subjects."
  3. The two notes just below figure 6 are also illustrative: one is an example of a credential containing claims for unrelated subjects, the other says that a credential may not contain claims for the subject to which it was issued.

My idea of an assumption in this issue is any constraint in a particular context that does not hold in the general case, as described by the VCDM text. Examples of such constraints are "a credential holds exactly one claim", "all claims in a credential have the same subject" (which could be combined with: "the subject of a credential must be the holder"), etcetera. And it is obvious to me that in use-cases where particular constraints apply, you can make use of that and take appropriate shortcuts that would not work in the general case.

Where I have a problem is where you contrast your 'line of inquiry' with the VCDM spec, saying that you are doing the real practical work and VCDM is a casual theoretical reading. If that is your take, I suggest you either go away and not get mixed up in making contributions to 'casual theoretical reading', or make concrete proposals to improve its use, which starts by actually reading it (which I guess you haven't done because you would have easily found the references you asked me to provide), understanding it, and using the stuff as specified.

@AFlowOfCode

@RieksJ I think you misunderstood my point, as that is not my take at all. I wanted to clarify that I am implementing a concrete situation against the spec (therefore I'm not making any assumptions about my use case) vs trying to argue in theory (which might very well involve assumptions about general use cases). In other words it was to clarify my approach to the topic, not contrast it against yours or anyone else's.

I have found that the explanation of use cases for this property as a URI is confusing or lacking and only have sought clarification. Naturally my emphasis is from my specific situation, because it's the one that matters to me right now. If it weren't for a desire to understand the intention here I wouldn't have bothered showing up at all.

I do believe you have been helpful in that I can now disregard the credential ID as simply unintended for the type of VCs I'm using. What you've cited is not something I ever disagreed with in any way, as I have always known it is possible that a VC can contain claims about multiple subjects, just as it is possible that it doesn't. None of that, other than perhaps its general nature as being optional, states that the URI credential ID is typically only intended for this type of VC (unless of course you don't care about facilitating correlatability).

I have read through the spec (last time was in Dec 2022, so I'm not sure what's been added in any updates) and I do not believe this is clear. I'm sorry but I just don't. Sometimes things are not clear even after having read something, it does happen to us lesser beings sometimes and I do offer my sincerest apologies for that. I never would have bothered seeking out this issue and attempted to gain more clarity on the intention behind this property if I thought it was clear. Perhaps you can at least grant me that?

In any case I believe I've finally gotten the clarity I came here for, thus my inquiry is at an end. Weathering some defensiveness and/or condescension is a common coin to pay with when approaching the elites sometimes, but worth the price to be sure I'm not overlooking anything important. Thanks for your contributions and please don't feel obligated to throw any more of your time away on such petty matters.

@msporny
Member

msporny commented Apr 19, 2023

Feels like we might be able to add a diagram in an appendix about the "id"?

@brentzundel added the "pending close" label (close if no objection within 7 days) on Apr 19, 2023
@iherman
Member

iherman commented Apr 19, 2023

The issue was discussed in a meeting on 2023-04-19

  • no resolutions were taken

2.3. more clarity around the id field in the VC data model (issue vc-data-model#973)

See github issue vc-data-model#973.

Orie Steele: This is also related to those pictures, the graphical representations of what a credential looks like and whether a proof is related to it.
… I think this is another case where pictures could help a lot. It means something to have a credential that has an ID -- and it means a lot when you're joining a graph with other graphs which is a thing you do when you process the core data model.
… When you don't give it an ID, you make it basically impossible to do a join on that property. That makes it difficult to visualize what you're trying to do and to get an identifier for that kind of thing in an application-specific way. This all impacts app developers, data analysis, a number of things; it's important.
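A small sketch of the join being described, using plain dictionaries in place of real RDF graphs. The `merge_by_id` helper and the `reviewedBy` property are hypothetical; the identifier is the example URL from the data model quoted at the top of this issue.

```python
def merge_by_id(*graphs):
    """Merge nodes that share an 'id'; nodes without one cannot be joined with anything."""
    merged = {}
    unjoinable = []
    for graph in graphs:
        for node in graph:
            node_id = node.get("id")
            if node_id is None:
                unjoinable.append(dict(node))
            else:
                merged.setdefault(node_id, {}).update(node)
    return merged, unjoinable


credential_graph = [
    {"id": "http://example.edu/credentials/3732", "type": "VerifiableCredential"},
    {"type": "VerifiableCredential"},  # a credential issued without an id
]
application_graph = [
    # Hypothetical application data that refers to the credential by its id.
    {"id": "http://example.edu/credentials/3732", "reviewedBy": "registrar"},
]

merged, unjoinable = merge_by_id(credential_graph, application_graph)
```

Here the credential with an id picks up the application's statement about it, while the one without an id stays isolated.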

Orie Steele: +1 to graph visualization.

Manu Sporny: Thinking about what Orie said in the other issue and here. We have these tabs in the different types of securing mechanisms -- if we added another tab for graphical visualization I imagine that might address some of the concerns you're raising.

Orie Steele: I can provide some code for that if there is desire.

Manu Sporny: The fallback to that would be to hand-create some examples and put them in the appendix, for visualizing the graphs, mental models, etc. I'm struggling to understand what we can concretely do to address the issues other than put this into the minutes so that, given enough spare time, people can work on creating those things. They will require a decent bit of time to get into a form that communicates what we want.

Ivan Herman: I must admit that whenever I work with RDF I am thinking about graphs and visuals myself, very much a +1 to what Manu says. The little problem that relates to the previous issue is that the tools that are around to create more proper visualization of these RDF graphs, to the best of my knowledge, are pretty poor. They certainly cannot handle data sets but only graphs.
… The additional visual glue that is necessary is usually missing, and that's of course a problem.
… That means that I'm afraid that they will have to be made by hand using Google Draw or whatever.
… It's a small red flag in the direction of what Manu is saying -- maybe not systematically but maybe for some of the bigger examples doing it.
… One other comment -- this issue went in a lot of different directions, we might want to close it because it's a conversation anyway, but close with the additional agreement that we will do something with visualization or something.

Manu Sporny: +1 to Ivan's suggested path forward.

Brent Zundel: So raising an issue to add visualizations as a concrete way to solve this and close it.
… Can anyone open the new issue and link it to this one?

Orie Steele: I can do that.

Brent Zundel: I'm going to mark this one pending close.
… Thank you.
… I'm going to resist the temptation to keep talking about issues, even though I really want to, because we're past the time that was allotted for that.
… So, moving onto work item status and PRs.

@brentzundel
Member

No objections raised since being marked pending close. Closing.
