Internationalization - Facet Category/Facet Label #5207

Closed
JayanthyChengan opened this issue Oct 17, 2018 · 16 comments

Comments

@JayanthyChengan
Contributor

JayanthyChengan commented Oct 17, 2018

I noticed that FacetCategory and FacetLabel remain in English and do not toggle with the selected language.

So I thought of adding those terms as key=value pairs in the bundle property files and using them as shown below in search-include-fragment.xhtml:

https://github.com/IQSS/dataverse/blob/develop/src/main/webapp/search-include-fragment.xhtml#L177

<h:outputText value="#{facetCategory.friendlyName}" styleClass="facetCategoryName"/>

modify to

<h:outputText value="#{bundle[facetCategory.friendlyName]}" styleClass="facetCategoryName"/>

Example:

FacetCategory:

#search-include-fragment.xhtml bundle[facetCategory.friendlyName]
Dataverse\u0020Category=Catégorie Dataverse
Publication\u0020Date=Date de publication
Author-Name=Nom \u2014 Auteur
Subject=Sujet
Deposit\u0020Date=Date de dépôt
File\u0020Type=Type de fichier
File\u0020Tag=Libellé de fichier
Access=Accès
Keyword-Term=Mot-clé \u2014 Terme
Author\u0020Affiliation=Affiliation de l'auteur
Language=Langue
Kind\u0020of\u0020Data=Type de données
Publication\u0020Status=Statut de publication

FacetLabel:

Researcher=Chercheur
Research\u0020Project=Projet de recherche
Journal=Revue
Organizations\u0020or\u0020Institutions=Organisation ou établissement
Teaching\u0020Course=Cours
Uncategorized=Sans catégorie
Research\u0020Group=Groupe de recherche
Laboratory=Laboratoire
Agricultural\u0020Sciences=Sciences de l'agriculture
Arts\u0020and\u0020Humanities=Arts et sciences humaines
Astronomy\u0020and\u0020Astrophysics=Astronomie et astrophysique
Business\u0020and\u0020Management=Affaires et gestion
Chemistry=Chimie
Earth\u0020and\u0020Environmental\u0020Sciences=Sciences de la terre et de l'environnement
Engineering=Génie
Medicine,\u0020Health\u0020and\u0020Life\u0020Sciences=Médecine, santé et sciences de la vie
Computer\u0020and\u0020Information\u0020Science=Informatique et science de l'information
Law=Droit
Mathematical\u0020Sciences=Sciences mathématiques
Physics=Physique
Social\u0020Sciences=Sciences sociales
Other=Autre
Published=Published
Unpublished=Non publiée
Draft=Version provisoire
In\u0020Review=En révision
Deaccessioned=Retirée
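
The lookup proposed above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Dataverse implementation: the class `FacetBundleLookup` and its methods are hypothetical, and the small in-memory bundle stands in for a file such as Bundle_fr.properties. It shows that a key written with an escaped space matches a friendly name containing a normal space, and that a fallback keeps untranslated names readable.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.MissingResourceException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical helper, for illustration only.
class FacetBundleLookup {

    // Returns the translation for a facet name, or the name itself when the
    // bundle has no entry (new fields, free-text values, missing translations).
    static String translate(ResourceBundle bundle, String facetName) {
        try {
            return bundle.getString(facetName);
        } catch (MissingResourceException e) {
            return facetName;
        }
    }

    // Builds a tiny in-memory bundle; a real deployment would load the
    // language-specific properties file instead.
    static ResourceBundle sampleFrenchBundle() {
        // The doubled backslash keeps the escape literal in the Java string,
        // so the properties parser is the one that turns it into a space.
        String props = "Publication\\u0020Date=Date de publication\n"
                     + "Subject=Sujet\n";
        try {
            return new PropertyResourceBundle(new StringReader(props));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle fr = sampleFrenchBundle();
        System.out.println(translate(fr, "Publication Date"));
        System.out.println(translate(fr, "Some Unlisted Facet"));
    }
}
```

Without a fallback of this kind, any facet name that is missing from the bundle would need to be handled separately in the page.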

@scolapasta: please advise

[Screenshot: 2018-10-17, 11:31:18 AM]

Thanks

@pdurbin
Member

pdurbin commented Oct 17, 2018

@4tikhonov mentioned the need to translate facets in last week's community call: https://groups.google.com/d/msg/dataverse-community/71kuJ6TdUIg/1NtdQPEcBAAJ

I had asked him to leave a comment on #4684 but now that this issue exists, we should have the conversation here instead.

@4tikhonov
Contributor

Hi @pdurbin and @JayanthyChengan, this is exactly what I asked about during the community call. We have already requested translations of the Solr schema into all languages from DataverseEU community members.

@scolapasta
Contributor

For facet categories, I would think it would be straightforward to take what Solr returns and use that as a key into a bundle file. Did that not work?

For facet labels:
I think the issue is that these are "values", and we store the values in Solr so that they are searchable (we don't translate free-text fields, for example).

So I'm not sure what is best here. We could store the "key" in Solr and then translate it with the bundles, but I'm not sure whether it would be weird for some of the metadata to be translated and some not.

I think we need to decide what behavior we want before we decide how to implement it.
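
One way to picture the trade-off for facet labels is to keep the indexed value and the displayed label separate. The sketch below is a hedged illustration, not existing Dataverse code: `FacetLabelView`, its fields, and the Solr field name `subject_ss` used in the usage note are assumptions. The filter query is always built from the stored value, while only the display label goes through a translation map; values with no entry (free text, for example) are shown unchanged.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical view object, for illustration only.
class FacetLabelView {
    final String solrValue;    // value as indexed; used for filter queries
    final String displayLabel; // what the user sees in the facet list

    private FacetLabelView(String solrValue, String displayLabel) {
        this.solrValue = solrValue;
        this.displayLabel = displayLabel;
    }

    // Translates only values present in the map; everything else is left as-is.
    static FacetLabelView of(String solrValue, Map<String, String> translations) {
        String label = translations.getOrDefault(solrValue, solrValue);
        return new FacetLabelView(solrValue, label);
    }

    // The query never uses the translated label, so searching keeps working.
    String filterQuery(String solrField) {
        return solrField + ":\"" + solrValue + "\"";
    }

    public static void main(String[] args) {
        Map<String, String> fr = new HashMap<>();
        fr.put("Chemistry", "Chimie");
        FacetLabelView view = FacetLabelView.of("Chemistry", fr);
        System.out.println(view.displayLabel);
        System.out.println(view.filterQuery("subject_ss"));
    }
}
```

This does not settle whether partially translated metadata is desirable; it only shows that translating at display time need not change what is stored or searched.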

@4tikhonov
Contributor

The biggest problem is maintaining all properties in all languages, both in the bundle files and in Solr. We're considering converting all properties to RDF, which a tool would check after every Dataverse update; it would list all new or untranslated properties, along with provenance information showing who is responsible for each language.
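
Independently of the RDF idea, the core check such a tool would perform can be sketched with plain properties files. This is a minimal, hypothetical example (the class `MissingTranslationReport` is not part of Dataverse, and provenance tracking is left out): it reports every key that exists in the base bundle but not in a translated one.

```java
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical checker, for illustration only.
class MissingTranslationReport {

    // Keys present in the base bundle but absent from the translation,
    // returned in sorted order so reports are stable between runs.
    static Set<String> missingKeys(Properties base, Properties translated) {
        Set<String> missing = new TreeSet<>(base.stringPropertyNames());
        missing.removeAll(translated.stringPropertyNames());
        return missing;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("Subject", "Subject");
        base.setProperty("Language", "Language");

        Properties french = new Properties();
        french.setProperty("Subject", "Sujet");

        System.out.println(missingKeys(base, french));
    }
}
```

Running a check like this after each upgrade would surface new keys before users see untranslated text.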

@JayanthyChengan
Contributor Author

JayanthyChengan commented Nov 14, 2018

@scolapasta
We would like the interface to display "facetCategory.friendlyName" and "facetLabel.name" in search-include-fragment.xhtml so that they toggle along with the language selection.

This is what we implemented at https://dataverse.scholarsportal.info/ , where key=value pairs are added to the bundle property files:

https://github.com/scholarsportal/SP-dataverse/blob/SP_v4.8.6.1/src/main/java/Bundle_fr.properties#L1958

https://github.com/scholarsportal/SP-dataverse/blob/SP_v4.8.6.1/src/main/java/Bundle_fr.properties#L2108

@scolapasta
Contributor

@JayanthyChengan @juancorr there are two issues related to facets and internationalization: this one and #5623. I'd like to suggest that we close one and keep the discussion in the other. (I arbitrarily chose this one, but am fine with switching.) Is that OK? (If we do, we should connect the other PR to this one.)

That said, I have thoughts on both PRs. We discussed them in a meeting here (tech hours), and while I'm not yet sure what our final guidance might be, we have some thoughts.

  1. We shouldn't add facet categories and labels to the generic bundle.properties, as they are already localized in the bundles for the metadata blocks (see @juancorr's PR for how he handles that).
  2. However, if we do that, we'll also need to figure out which block a facet comes from, since the facets are dynamic and not necessarily always from the citation block.
  3. Translating the categories makes complete sense; the labels are more problematic since they represent the data. @JayanthyChengan, I noticed you translated the ones that are controlled vocabulary, but even then I'm not sure that's the best approach. On the SP Dataverse, if you look at a dataset that has one of those values, you'll see that the value is still in English (and for searching in Solr, the value is in English).
  4. For the compound fields, I understand that "Autor nôm" doesn't make sense, but do we know whether using a ":" would work in all languages?

Please add to the discussion so we can figure out how to move forward. My suggestion is that we start with a variation of @juancorr's PR (modified to support facets from any metadata block), so that we get the facet categories first.
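
Point 2 above (finding the right block for a dynamic facet) could be approached by searching the metadata block bundles in turn. The following is a rough sketch under assumptions: `BlockBundleResolver` is hypothetical, the key pattern `datasetfieldtype.<name>.title` is assumed for illustration, and each block bundle is represented as a plain `Properties` object.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical resolver, for illustration only.
class BlockBundleResolver {

    // Assumed key pattern for a field's display title in a block bundle.
    static String titleKey(String fieldName) {
        return "datasetfieldtype." + fieldName + ".title";
    }

    // Returns the first non-empty translation found in any block bundle,
    // or the fallback (for example the English display name) otherwise.
    static String resolve(String fieldName, String fallback, List<Properties> blockBundles) {
        String key = titleKey(fieldName);
        for (Properties bundle : blockBundles) {
            String value = bundle.getProperty(key);
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Properties citation = new Properties();
        citation.setProperty(titleKey("subject"), "Sujet");
        Properties geospatial = new Properties();
        geospatial.setProperty(titleKey("country"), "Pays");

        List<Properties> bundles = Arrays.asList(citation, geospatial);
        System.out.println(resolve("country", "Country", bundles));
        System.out.println(resolve("unknownField", "Unknown Field", bundles));
    }
}
```

If each field already knows which metadata block it belongs to, selecting that block's bundle directly would avoid the loop; the search is shown only because it needs no extra information.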

@scolapasta
Contributor

@JayanthyChengan @juancorr I went ahead and consolidated the issues - but feel free to let us know if you prefer it the other way around.

Could you please provide comments on my suggestion above?

@JayanthyChengan
Contributor Author

JayanthyChengan commented Mar 25, 2019

@scolapasta : I am fine with continuing the discussion in this issue.

So, as a first step, I was looking only at translating the facet categories.

In the current code, the facet category for any dataset field from a metadata block is already translated properly in the interface:

    for (DatasetFieldType datasetField : datasetFields) {
        String solrFieldNameForDataset = datasetField.getSolrField().getNameFacetable();
        String friendlyName = datasetField.getDisplayName(); // emphasized in the original comment
        if (solrFieldNameForDataset != null && facetField.getName().endsWith(datasetField.getTmpNullFieldTypeIdentifier())) {
            // give it the non-friendly name so we remember to update the reference data script for datasets
            facetCategory.setName(facetField.getName());
        } else if (solrFieldNameForDataset != null && facetField.getName().equals(solrFieldNameForDataset)) {
            if (friendlyName != null && !friendlyName.isEmpty()) {
                facetCategory.setFriendlyName(friendlyName); // emphasized in the original comment
                // stop examining available dataset fields. we found a match
                break;
            }
        }
        datasetfieldFriendlyNamesBySolrField.put(datasetField.getSolrField().getNameFacetable(), friendlyName);
    }

But some facet fields are added to the solrQuery at https://github.com/scholarsportal/dataverse/blob/develop/src/main/java/edu/harvard/iq/dataverse/search/SearchServiceBean.java#L227 , and I guess they are not specific to any metadata block.

Please check the staticSearchField entries.

Can we add those to bundle.properties? Correct me if I am wrong.

@scolapasta
Contributor

@JayanthyChengan Yes, for the static ones, having them in the main bundle is fine.

@juancorr

Hi @JayanthyChengan and @scolapasta,

Sorry for my late response. I am comfortable with @JayanthyChengan's solution. In our e-cienciaDatos tests, we put all Solr entries in a new properties file (we called it solr), but this solution has some problems:

  • It is not easy to upgrade to future versions.
  • The solr properties file duplicates entries found in other files (citation, main bundle, ...).
  • All Solr entries are still in the same file; there are no separate entries for subjects, languages, ...

You can see the current e-cienciaDatos development environment at http://oaimadrono.uned.es:8080/dataverse/madrono .
The code and properties file are:

@scolapasta, I think that the ":" solution for translating compound fields might not be valid for all languages. Another simple solution could be to ignore the right part (Autor instead of Autor: nôm).

@JayanthyChengan, I think that you are right. We have updated the block of code that you pointed out in the previous comment:

@JayanthyChengan
Contributor Author

JayanthyChengan commented Mar 28, 2019

@scolapasta Please check pull request #5697, where I tried translating both FacetCategoryName and FacetLabels from the metadata block property files and added the staticSearchField entries to Bundle.properties.

The staticSearchField entries come from SearchFields.java, so I added all of those fields to Bundle.properties.

I haven't handled compound names in facetCategory. Please review my code and let me know your opinion. Thanks

@JayanthyChengan
Contributor Author

@scolapasta - is there any update on PR #5697? Thanks

pdurbin added this to the 4.16 milestone Oct 12, 2019

@mhvezina
Contributor

@scolapasta, @JayanthyChengan, @juancorr
A suggestion regarding compound names in facetCategory:

Why not use the > symbol, which suggests a logical hierarchical sequence and is often used in breadcrumb navigation? Would that work for all (or at least a majority of) languages?

e.g.:

Auteur > Nom
Auteur > Affiliation
Renseignements sur la subvention > Organisme subventionnaire 
Renseignements sur la subvention > Numéro de la subvention

Author > Name
Author > Affiliation
Grant Information > Grant Agency 
Grant Information > Grant Number

Autor > Nombre
Autor > Afiliación
Información de la subvención > Agencia subvencionadora
Información de la subvención > Número de subvención
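
The separator idea can be sketched as a small formatting helper. This is an illustration only: `CompoundFacetName` is a hypothetical class, and the separator is passed in so that it could itself come from a language bundle if ">" turned out not to suit every language. Passing an empty child name also covers the earlier suggestion of showing only the left part.

```java
// Hypothetical formatter, for illustration only.
class CompoundFacetName {

    // Joins the translated parent and child names with the given separator.
    // When there is no child name, only the parent is shown.
    static String format(String parent, String child, String separator) {
        if (child == null || child.isEmpty()) {
            return parent;
        }
        return parent + separator + child;
    }

    public static void main(String[] args) {
        System.out.println(format("Auteur", "Nom", " > "));
        System.out.println(format("Autor", "", " > "));
    }
}
```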

@djbrooke
Contributor

Hi @mhvezina, I closed this because it was marked as delivered in 4.16. If there's still specific work to be done, can you create a new issue for it? Thank you for all of your work on internationalization!

@mhvezina
Contributor

sorry about that @djbrooke. I have created a new issue: #6573

Should I delete the former comment made on the #5697 closed issue?

@djbrooke
Contributor

No need to delete it, and thanks for creating the new issue so quickly!
