DDI 2.5 OtherMat and FileDesc #9489

Closed
guinslym opened this issue Mar 30, 2023 · 3 comments

@guinslym

guinslym commented Mar 30, 2023

What steps does it take to reproduce the issue?
I exported the dataset metadata as DDI.

  • When does this issue occur?
    The values of two metadata tags differ from what is in the DDI standard:
    page 35: <otherMat> <notes> (level/type/subject)
    page 33: <fileDscr> <notes> (level/type/subject)

  • What happens?
    In DDI, <notes> has attributes, but in Dataverse their content is different from what is in the DDI standard:
    <notes level="file" type="DATAVERSE:CONTENTTYPE" subject="Content/MIME Type">text/csv</notes>

  • What did you expect to happen?
    My question: what are the definitions of the type, level and subject attributes within <otherMat> <notes> and <fileDscr> <notes>? The values generated by Dataverse differ from the DDI standard.

Which version of Dataverse are you using?
Latest

Any related open or closed issues to this bug report?
No

Screenshots:
image

@landreev
Contributor

Thank you for the report!
I just made a pull request (#9484) fixing a whole bunch of invalid/schema-violating content in the dataset metadata section (<stdyDscr>) of our DDI exports. I haven't touched anything in the <fileDscr> or <otherMat> sections there, but let me try and see if I can address the problems above in that same pull request as well.

@landreev
Contributor

landreev commented Mar 30, 2023

Hi,
I may have misunderstood your report/question, sorry. I thought you were saying that the notes in our <fileDscr> and <otherMat> sections were violating the DDI 2.5 schema... But I just double-checked, and they appear to be valid - at least according to the schema. The validator tool I like to use (https://cmv.cessda.eu/#!validation) does not complain about these sections in our DDI records.

So could you please clarify what you mean by "different than what is on the DDI standard"? Do you mean, different from the attribute values in some example provided by the DDI project?
Please keep in mind that I'm a developer at the Dataverse project, but not really an expert on the DDI standard. So I may be missing something obvious.

But reading the part of the schema (https://ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd) that describes what the attributes of a <notes> element should look like:

<xs:complexType name="notesType" mixed="true">
   <xs:complexContent>
      <xs:extension base="tableAndTextType">
         <xs:attribute name="type" type="xs:string"/>
         <xs:attribute name="subject" type="xs:string"/>
         <xs:attribute name="level" type="xs:string"/>
         <xs:attribute name="resp" type="xs:string"/>
         <xs:attribute name="sdatrefs" type="xs:IDREFS"/>
         <xs:attribute name="parent" type="xs:IDREFS" use="optional"/>
         <xs:attribute name="sameNote" type="xs:IDREF" use="optional"/>
      </xs:extension>
   </xs:complexContent>
</xs:complexType>

it appears to be saying that the 3 attributes we are using - level, type and subject - are allowed and that their values can be arbitrary strings. So the values we are using are something we came up with to specify what each notes value represents.
So, <notes level="file" type="DATAVERSE:CONTENTTYPE" subject="Content/MIME Type">text/csv</notes> is our way of saying "DDI schema does not define a standard place to specify the mime type of a file in a dataset, but it does allow arbitrary text notes; so we are going to use this type of notes specifically for mime types".
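For anyone consuming these exports, here is a minimal sketch of how the convention described above could be read back out with Python's standard library. It is an illustration, not Dataverse code: the `ddi:codebook:2_5` namespace, the sample fragment (including the `ID`, the `<labl>` value and the `level` on `<otherMat>`) and the `content_types` helper are all assumptions; only the `<notes>` line is taken from this issue.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for DDI Codebook 2.5 documents.
DDI_NS = {"ddi": "ddi:codebook:2_5"}

# Hypothetical, trimmed-down fragment; only the <notes> line comes from the issue.
SAMPLE = """<codeBook xmlns="ddi:codebook:2_5">
  <otherMat ID="f1" level="datafile">
    <labl>data.csv</labl>
    <notes level="file" type="DATAVERSE:CONTENTTYPE" subject="Content/MIME Type">text/csv</notes>
  </otherMat>
</codeBook>"""


def content_types(ddi_xml: str) -> dict:
    """Map each <otherMat> label to the MIME type stored in its content-type note."""
    root = ET.fromstring(ddi_xml)
    result = {}
    for other_mat in root.iterfind(".//ddi:otherMat", DDI_NS):
        label = other_mat.findtext("ddi:labl", default="", namespaces=DDI_NS)
        for note in other_mat.iterfind("ddi:notes", DDI_NS):
            # The type attribute is a free-form string; this value is the Dataverse convention.
            if note.get("type") == "DATAVERSE:CONTENTTYPE":
                result[label] = (note.text or "").strip()
    return result


print(content_types(SAMPLE))
```

Because the schema treats `type` as a plain `xs:string`, a consumer has to match on the exact string Dataverse chose; nothing in the DDI schema itself identifies the note as a MIME type.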

One thing I'm seeing in the schema's explanation text for the level attribute is that it suggests using level=datafile in notes that describe file materials:

The "level" attribute is used to clarify the relationship of the other materials to 
components of the study. Suggested values for level include specifications of 
the item level to which the element applies: e.g., level= data; level=datafile; 
level=studydsc; level=study. The URI attribute need not be used in every case; 
it is intended for capturing references to other materials separate from the 
codebook itself. In Section 5, Other Material is recursively defined.

- but I'm not sure if this really means that level=file that we are using is necessarily wrong either (?).
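To see where an export departs from the suggested values quoted above, a small check like the following could help. It is only a sketch: the schema lists these values as examples ("e.g."), so anything this reports is a deviation from the suggestions, not a schema violation. The `unusual_levels` helper and the sample fragment are hypothetical.

```python
import xml.etree.ElementTree as ET

# Example values quoted from the schema documentation; suggestions, not an enumeration.
SUGGESTED_LEVELS = {"data", "datafile", "studydsc", "study"}


def unusual_levels(ddi_xml: str) -> list:
    """Return the sorted level values that are not among the suggested examples."""
    root = ET.fromstring(ddi_xml)
    found = set()
    for element in root.iter():
        level = element.get("level")
        if level is not None and level not in SUGGESTED_LEVELS:
            found.add(level)
    return sorted(found)


# Hypothetical fragment built around the <notes> line from this issue.
FRAGMENT = (
    '<otherMat level="datafile">'
    '<notes level="file" type="DATAVERSE:CONTENTTYPE" '
    'subject="Content/MIME Type">text/csv</notes>'
    "</otherMat>"
)

print(unusual_levels(FRAGMENT))
```

Run against the fragment, this flags `file`, which is the value in question here; whether that should be changed to `datafile` is a project decision rather than something the schema enforces.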

Once again - it's quite likely I'm not aware of something important here, so please bear with me. :)

@landreev
Contributor

I'm going to close this one quietly.
