MMS process for ingestion, matchup, MMD generation
martin-boettcher edited this page Aug 8, 2011 · 7 revisions
This page describes the initial process of ingestion, matchup computation and complete MMD generation. The process shall follow "option 2c" with parallelisation of all three parts on the Eddie cluster, but with a final centralised database.
hosts: thetis, Eddie frontend, Eddie nodes
database: mmdb1 on thetis located in /home/v1mbottc/mms/db
backups: /exports/nas/exports/cse/geos/scratch/gc/sst-cci/db-backups
archive: /exports/nas/exports/cse/geos/scratch/gc/sst-cci/archive
The process is split into several phases:
- MD ingestion + removal of duplicates
- matchup computation
- timeandlocation file extraction + referenceflag updates
- small datasets ingestion and coincidences
- staging (per year or month)
- satellite data ingestion and coincidences (per month)
- ARC1+2 (+ NWP?) + reingestion (per month)
- NWP + ARC3 + reingestion (per month)
- MMD generation (per month)
- de-staging of satellite data
The first part of the first phase ingests the MD files of ATSR, METOP, SEVIRI and AVHRR for 1991 to 2010.
host: thetis
configuration: /home/v1mbottc/mms/requests/md/mms-config-ingest-md.properties
tool: mmsingest.sh
trace: /home/v1mbottc/mms/log/ingestion-md.out
Execution:
ssh v1mbottc@thetis
cd mms/requests/md
./ingest-md.sh
Ingestion performed 2011-08-04 and 2011-08-05, runtime 7.5 h
The second part of this step removes duplicates from the newly ingested MD records of ATSR_MD, METOP, SEVIRI and AVHRR_MD for 1991 to 2010.
host: thetis
configuration: /home/v1mbottc/mms/requests/md/mms-config-ingest-md.properties
tool: mmsmatchup.sh
trace: /home/v1mbottc/mms/log/duplicates-md.out
Execution:
ssh v1mbottc@thetis
cd mms/requests/md
./duplicates-md.sh
Duplicates removal performed 2011-08-05 to 2011-08-08, runtime 56.5 h
Number of observations per sensor and year before ("with dup") and after ("no dup") removal of duplicates. The two count columns are shown side by side in a single table:
mmdb1=# select sensor, extract(year from time), count(*) from mm_observation group by sensor, extract(year from time);
  sensor  | date_part | with dup | no dup
----------+-----------+----------+---------
 atsr_md  |      1991 |    62914 |   52020
 atsr_md  |      1992 |   163707 |  134722
 atsr_md  |      1993 |   180238 |  147686
 atsr_md  |      1994 |   165508 |  133346
 atsr_md  |      1995 |   311393 |  246445
 atsr_md  |      1996 |   185769 |  147516
 atsr_md  |      1997 |   209854 |  166802
 atsr_md  |      1998 |   165407 |  129360
 atsr_md  |      1999 |   184064 |  142504
 atsr_md  |      2000 |   187846 |  148694
 atsr_md  |      2001 |   186961 |  134737
 atsr_md  |      2002 |   303944 |  218581
 atsr_md  |      2003 |   349071 |  245611
 atsr_md  |      2004 |   274494 |  187291
 atsr_md  |      2005 |   345834 |  221007
 atsr_md  |      2006 |   397214 |  242365
 atsr_md  |      2007 |   387729 |  239529
 atsr_md  |      2008 |   455368 |  241556
 atsr_md  |      2009 |   475167 |  254349
 atsr_md  |      2010 |   495443 |  257905
 atsr_md  |      2011 |       71 |      56
 avhrr_md |      1991 |    90507 |   90243
 avhrr_md |      1992 |   156354 |  155688
 avhrr_md |      1993 |   171920 |  170942
 avhrr_md |      1994 |   172637 |  170734
 avhrr_md |      1995 |   223063 |  221911
 avhrr_md |      1996 |   260955 |  260678
 avhrr_md |      1997 |   212109 |  211721
 avhrr_md |      1998 |   243354 |  242893
 avhrr_md |      1999 |   310978 |  310493
 avhrr_md |      2000 |   148986 |  147348
 avhrr_md |      2001 |   456065 |  453347
 avhrr_md |      2002 |   573236 |  567718
 avhrr_md |      2003 |   768107 |  762882
 avhrr_md |      2004 |   431492 |  429072
 avhrr_md |      2005 |  1006129 |  972769
 avhrr_md |      2006 |  1064924 | 1031641
 avhrr_md |      2007 |   580919 |  580458
 metop    |      2007 |  1302899 |  280204
 metop    |      2008 |  2794348 |  546279
 metop    |      2009 |  3012068 |  545244
 metop    |      2010 |  2892698 |  590147
 metop    |      2011 |        7 |       2
 seviri   |      2009 |      273 |     158
 seviri   |      2010 |  1889573 | 1331489
(45 rows)
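To make the effect of the duplicate removal easier to compare across sensors, the counts can be turned into a removed fraction. The following is only a sketch, not part of the MMS tooling. It copies four rows from the table above (the latest full year per sensor), and the names `COUNTS`, `removed_fraction` and `summarise` are hypothetical helpers introduced for illustration.

```python
# (sensor, year) -> (count with duplicates, count after removal).
# Values are copied from the table above; one row per sensor.
COUNTS = {
    ("atsr_md", 2010): (495443, 257905),
    ("avhrr_md", 2007): (580919, 580458),
    ("metop", 2010): (2892698, 590147),
    ("seviri", 2010): (1889573, 1331489),
}


def removed_fraction(with_dup, no_dup):
    """Return the fraction of records dropped as duplicates."""
    return (with_dup - no_dup) / with_dup


def summarise(counts):
    """Map each (sensor, year) key to its removed fraction."""
    return {key: removed_fraction(*pair) for key, pair in counts.items()}


if __name__ == "__main__":
    for (sensor, year), frac in sorted(summarise(COUNTS).items()):
        print(f"{sensor:9s} {year}  {100 * frac:5.1f}% removed")
```

For these rows, metop loses roughly 80% of its records, atsr_md just under half, seviri about 30%, and avhrr_md less than 0.1%.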