Add grouping feature by uid attribute #6

Open
@wadahiro commented Oct 2, 2023

I added a groupByEnabled option that groups rows by the uid attribute. When it is enabled, CSV records that span multiple rows per account are aggregated by the uid attribute and read as a JSON string.

Motivation

In our experience with IDM projects, CSV files imported into the IDM system often contain multiple rows per account. For example, consider the following CSV:

id;name;dept;title
1;john;abc;engineer
1;john;efg;manager
2;jack;abc;manager

In this example, the first row means that john belongs to the abc department as an engineer, and the second row means that john also belongs to the efg department as a manager.

The current CSV connector cannot handle this kind of data, where one account spans multiple rows. For this use case, we have so far had to create a custom BulkAction to read the CSV files and update midPoint user data.
We would be happy if the CSV connector natively supported such use cases, so I have created this pull request.

How does this option work?

When groupByEnabled is enabled and the executeQuery method is called, the connector groups rows by the value of the Uid and maps each group, as a JSON string, to a special attribute, __RAW_JSON__. For example, for john in the CSV above, the following JSON string is mapped. Note that the other attributes are mapped from the first row of each group.

[
  {
    "name": "john",
    "id": "1",
    "dept": "abc",
    "title": "engineer"
  },
  {
    "name": "john",
    "id": "1",
    "dept": "efg",
    "title": "manager"
  }
]
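
To illustrate the grouping step described above, here is a minimal, self-contained sketch in plain Java. It is not the code in this pull request: the class name GroupByUidSketch and the methods groupByUid and quote are hypothetical, the CSV parsing is a naive split that ignores quoted fields, the JSON is built by hand, and keys are emitted in header order (the connector's actual key order may differ, as the example above suggests).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the connector's implementation.
class GroupByUidSketch {

    // Wrap a value in double quotes, escaping characters JSON requires to be escaped.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Returns uid -> JSON array string holding every row that shares that uid.
    // The first line is treated as the header; rows keep their file order.
    static Map<String, String> groupByUid(List<String> lines, String delimiter, String uidColumn) {
        String[] header = lines.get(0).split(Pattern.quote(delimiter), -1);
        int uidIndex = Arrays.asList(header).indexOf(uidColumn);
        if (uidIndex < 0) {
            throw new IllegalArgumentException("No such column: " + uidColumn);
        }

        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] cells = line.split(Pattern.quote(delimiter), -1);
            if (line.isEmpty() || cells.length <= uidIndex) {
                continue; // skip blank or truncated rows
            }
            // Serialize one row as a JSON object of column name -> cell value.
            StringBuilder obj = new StringBuilder("{");
            for (int i = 0; i < header.length; i++) {
                if (i > 0) obj.append(",");
                String value = i < cells.length ? cells[i] : "";
                obj.append(quote(header[i])).append(":").append(quote(value));
            }
            obj.append("}");
            grouped.computeIfAbsent(cells[uidIndex], k -> new ArrayList<>()).add(obj.toString());
        }

        // Join each group's row objects into a single JSON array string.
        Map<String, String> result = new LinkedHashMap<>();
        grouped.forEach((uid, rows) -> result.put(uid, "[" + String.join(",", rows) + "]"));
        return result;
    }

    public static void main(String[] args) {
        List<String> csv = Arrays.asList(
                "id;name;dept;title",
                "1;john;abc;engineer",
                "1;john;efg;manager",
                "2;jack;abc;manager");
        groupByUid(csv, ";", "id").forEach((uid, json) -> System.out.println(uid + " -> " + json));
    }
}
```

With the sample CSV from the Motivation section, uid 1 maps to an array of two objects and uid 2 to an array of one, which corresponds to the value an inbound mapping would receive in __RAW_JSON__.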

Example inbound mapping on midPoint side

Here is an example inbound mapping in a CSV resource definition. It searches for an OrgType by the value of the dept column in the CSV and assigns that organization. The value of the title column is added as a subtype of the assignment to express which position in the organization is being assigned.

            <attribute>
                <c:ref>ri:__RAW_JSON__</c:ref>
                <inbound>
                    <expression>
                        <script>
                            <code>
                                import groovy.json.JsonSlurper
                                import com.evolveum.midpoint.xml.ns._public.common.common_3.*
                                
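                                // input is the __RAW_JSON__ string: a JSON array with one object per grouped CSV row
                                json = new JsonSlurper().parseText(input)
                                // Pair each row's org (looked up by dept name) with its title; drop rows whose org is not found
                                assignments = json.collect { [midpoint.searchObjectByName(OrgType.class, it.dept), it.title] }
                                    .findAll { it[0] != null }
                                    .collect {
                                        ref = new ObjectReferenceType()
                                        ref.setOid(it[0].oid)
                                        ref.setType(OrgType.COMPLEX_TYPE)

                                        assignment = new AssignmentType()
                                        assignment.setTargetRef(ref)
                                        // Marker subtype, matched by the set condition of the target below
                                        assignment.subtype("dept")
                                        // Position held in the organization, taken from the title column
                                        assignment.subtype(it[1])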
                                        return assignment
                                    }
                                
                                log.info("created assignments: {}", assignments)
                                
                                return assignments
                            </code>
                        </script>
                    </expression>
                    <target>
                        <path>$focus/assignment</path>
                        <set>
                            <condition>
                                <script>
                                    <code>
                                        assignment?.subtype?.contains('dept')
                                    </code>
                                </script>
                            </condition>
                        </set>
                    </target>
                </inbound>
            </attribute>
