Evidence Link Creation in the OnDemandDruidExhaust Job Data Product #29
Shakthieshwari
started this conversation in
Enhancement
Replies: 2 comments 6 replies
-
@sowmya-dixit @anandp504 Any update on this? Please help us out. Thanks
-
@sowmya-dixit @anandp504 Any update on this? Please help us out. If required, we can connect as well. Thanks
-
Hi Team,
As part of the 5.1 release, we have this story: https://project-sunbird.atlassian.net/browse/OB-70
The ML feature has resources called observation, survey, and projects, in which users can upload evidence/attachments (files) that are stored in Azure cloud storage. In the Program Dashboard CSV, we want to send these attachments as links. Currently, the ML data pipeline generates the link and stores it in Druid.
Now, to support multi-cloud storage (Azure, AWS, GCP, Oracle), we are planning to create the link dynamically in the OnDemandDruidExhaust data product itself, based on the cloud storage specified in the config file of the Druid query.
Sample link template: https://{{azure_storage_account}}.blob.core.windows.net/{{azure_container_name}}/survey/631041ecd58d74000aec9e7f/bc374bdf-8c59-4036-b1c0-b1db471da3f1/e768b582-2b61-4913-a8db-dccc650c767e/fix csv report.png
Example: https://sunbirdstagingpublic.blob.core.windows.net/samiksha/survey/631041ecd58d74000aec9e7f/bc374bdf-8c59-4036-b1c0-b1db471da3f1/e768b582-2b61-4913-a8db-dccc650c767e/fix csv report.png
Approach:
In our ML Druid datasource, we will store only the FileSourcePath. The Scala data product will generate the link and write it into the CSV used by the Program Dashboard.
To do this, we need to modify the Scala data product so that it reads the Druid query from the config and builds the evidence link from the FileSourcePath stored in the ML Druid datasource.
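A minimal sketch of how the link could be assembled from a storage base URL and a FileSourcePath. The object name `EvidenceLink`, the `build` method, and the assumption that FileSourcePath holds the part of the path after the container name are all hypothetical; percent-encoding each segment is also an assumption, added here because the sample file name contains spaces.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of the existing data product.
object EvidenceLink {

  // Percent-encode one path segment. URLEncoder produces form encoding,
  // which writes spaces as "+", so those are mapped back to "%20".
  private def encodeSegment(segment: String): String =
    URLEncoder.encode(segment, StandardCharsets.UTF_8.name()).replace("+", "%20")

  // Join the storage base URL and the file source path into one link.
  // Assumes fileSourcePath is relative to the container/bucket root.
  def build(baseUrl: String, fileSourcePath: String): String = {
    val encodedPath = fileSourcePath
      .split("/")
      .filter(_.nonEmpty)      // tolerate leading or doubled slashes
      .map(encodeSegment)
      .mkString("/")
    s"${baseUrl.stripSuffix("/")}/$encodedPath"
  }
}
```

For the sample above, `EvidenceLink.build("https://sunbirdstagingpublic.blob.core.windows.net/samiksha", "survey/.../fix csv report.png")` would yield the same link with the file name written as `fix%20csv%20report.png`.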
Sample config for multi-cloud storage support (the proposed addition is the trailing "cloud_storage" object):
{"id":"ml-obs-question-detail-exhaust","labels":{"questionName":"Question","user_districtName":"Declared District","evidences":"Evidences","questionResponseLabel":"Question_response_label","solutionExternalId":"Observation ID","school_code":"Declared School ID","user_type":"User Type","role_title":"User Sub Type","minScore":"Question score","programName":"Program Name","questionExternalId":"Question_external_id","organisation_name":"Organisation Name","user_boardName":"Declared Board","createdBy":"UUID","remarks":"Remarks","user_blockName":"Declared Block","solutionName":"Observation Name","user_schoolName":"Declared School Name","programExternalId":"Program ID","user_stateName":"Declared State","observationSubmissionId":"observation_submission_id","districtName":"District observed","blockName":"Block observed","schoolName":"School observed","schoolExternalId":"ID of school observed"},"dateRange":{"interval":{"startDate":"1901-01-01","endDate":"2101-01-01"},"granularity":"all","intervalSlider":0},"metrics":[{"metric":"total_content_plays_on_portal","label":"total_content_plays_on_portal","druidQuery":{"intervals":"1901-01-01T00:00+00:00/2101-01-01T00:00:00+00:00","dataSource":"sl-observation","columns":["createdBy","user_type","role_title","user_stateName","user_districtName","user_blockName","school_code","user_schoolName","user_boardName","organisation_name","programName","programExternalId","solutionName","solutionExternalId","districtName","blockName","schoolName","schoolExternalId","observationSubmissionId","questionExternalId","questionName","questionResponseLabel","minScore","evidences","remarks"],"queryType":"scan"}}],"output":[{"zip":false,"label":"","dims":["date"],"fileParameters":["id","dims"],"metrics":["createdBy","user_type","role_title","user_stateName","user_districtName","user_blockName","school_code","user_schoolName","user_boardName","organisation_name","programName","programExternalId","solutionName","solutionExternalId","districtName","blockName","schoolName","schoolExternalId","observationSubmissionId","questionExternalId","questionName","questionResponseLabel","minScore","evidences","remarks"],"type":"csv"}],"sort":["UUID","Program ID","Observation ID","observation_submission_id","Question_external_id"],"queryType":"scan","cloud_storage":{"type":"S3(Azure,GCP,Oracle)","storage_account":"xyz","bucket_name(container_name)":"abc","base_url":"http://s3-REGION-.amazonaws.com/BUCKET-NAME/KEY"}}
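One way the "cloud_storage" block might be turned into a base URL inside the data product is sketched below. The case class, its field names, and the `StorageBaseUrl.resolve` helper are hypothetical. The Azure pattern follows the sample link in this post; the GCP pattern is the public `storage.googleapis.com/<bucket>` form. For other providers the sketch assumes an explicit `base_url` is supplied, since the sample config carries no region or namespace from which to derive one.

```scala
// Hypothetical model of the proposed "cloud_storage" config block.
final case class CloudStorage(
  storageType: String,            // "type" in the config, e.g. "Azure", "S3", "GCP", "Oracle"
  storageAccount: String,         // "storage_account"
  bucketName: String,             // bucket or container name
  baseUrl: Option[String] = None  // "base_url", when given explicitly
)

object StorageBaseUrl {

  // Returns Right(base URL) or Left(reason) when no URL can be derived.
  def resolve(cs: CloudStorage): Either[String, String] =
    cs.baseUrl.map(_.stripSuffix("/")) match {
      // An explicit base_url always wins.
      case Some(url) => Right(url)
      case None =>
        cs.storageType.trim.toLowerCase match {
          case "azure" =>
            Right(s"https://${cs.storageAccount}.blob.core.windows.net/${cs.bucketName}")
          case "gcp" =>
            Right(s"https://storage.googleapis.com/${cs.bucketName}")
          case other =>
            // S3 and Oracle URLs need a region (and namespace) not present in the sample config.
            Left(s"base_url is required for storage type '$other'")
        }
    }
}
```

Returning an `Either` keeps a misconfigured storage type from silently producing a broken link in the CSV; the job could log the `Left` message and leave the evidence column as the raw FileSourcePath.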
@sowmya-dixit @anandp504 Please let us know whether we can make this enhancement to the OnDemandDruidExhaust data product job as part of the 5.1 release.
Cc: @aishwaryashikshalokam @Ashwiniev95 @Prateek-slokam @aks30 @kiranharidas187 @vijiurs @snehangsude
We would appreciate a response at the earliest.
Thanks