MMS process for ingestion, matchup, MMD generation

martin-boettcher edited this page Aug 8, 2011 · 7 revisions

This page describes the initial process of ingestion, matchup computation and complete MMD generation. The process follows "option 2c": all three parts are parallelised on the Eddie cluster, but the final database is centralised.

hosts: thetis, Eddie frontend, Eddie nodes
database: mmdb1 on thetis located in /home/v1mbottc/mms/db
backups: /exports/nas/exports/cse/geos/scratch/gc/sst-cci/db-backups
archive: /exports/nas/exports/cse/geos/scratch/gc/sst-cci/archive

The process is split into several phases:

  1. MD ingestion + removal of duplicates
  2. matchup computation
  3. timeandlocation file extraction + referenceflag updates
  4. small datasets ingestion and coincidences
  5. staging (per year or month)
  6. satellite data ingestion and coincidences (per month)
  7. ARC1+2 (+ NWP?) + reingestion (per month)
  8. NWP + ARC3 + reingestion (per month)
  9. MMD generation (per month)
  10. de-staging of satellite data
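
The per-month phases (6 to 9) lend themselves to being split into independent work units. The following is only a minimal sketch of how such units could be enumerated. It assumes the 1991 to 2010 range of the MD ingestion also applies to the monthly phases, and that each month passes through phases 6 to 9 in order; the names `PER_MONTH_PHASES`, `months` and `work_units` are hypothetical and not part of the MMS tools.

```python
# Phase numbers that the list above marks as "(per month)".
PER_MONTH_PHASES = (6, 7, 8, 9)


def months(first_year, last_year):
    """Yield (year, month) pairs for every month of the inclusive year range."""
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            yield year, month


def work_units(first_year, last_year):
    """Return (year, month, phase) tuples, ordered by month and then by phase."""
    return [
        (year, month, phase)
        for year, month in months(first_year, last_year)
        for phase in PER_MONTH_PHASES
    ]


# Assumption: same year range as the MD ingestion (1991 to 2010).
units = work_units(1991, 2010)
print(len(units), "work units, first:", units[0], "last:", units[-1])
```

Units of different months do not depend on each other in this sketch, which is what would allow them to be distributed over cluster nodes.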

MD ingestion + removal of duplicates

This step ingests MD files of ATSR, METOP, SEVIRI and AVHRR for 1991 to 2010.

host: thetis
configuration: /home/v1mbottc/mms/requests/md/mms-config-ingest-md.properties
tool: mmsingest.sh
trace: /home/v1mbottc/mms/log/ingestion-md.out

Execution:

ssh v1mbottc@thetis
cd mms/requests/md
./ingest-md.sh

Ingestion performed 2011-08-04 and 2011-08-05, runtime 7.5 h

The second part of this step removes duplicates from the newly ingested MD records of ATSR_MD, METOP, SEVIRI and AVHRR_MD for 1991 to 2010.

host: thetis
configuration: /home/v1mbottc/mms/requests/md/mms-config-ingest-md.properties
tool: mmsmatchup.sh
trace: /home/v1mbottc/mms/log/duplicates-md.out

Execution:

ssh v1mbottc@thetis
cd mms/requests/md
./duplicates-md.sh

Duplicates removal performed 2011-08-05 to 2011-08-08, runtime 56.5 h

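Number of observations per sensor and year before and after removal of duplicates. The query below returns a single count column; the table shows the counts from before ("with dup") and after ("no dup") the removal side by side: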

mmdb1=# select sensor, extract(year from time), count(*) from mm_observation group by sensor, extract(year from time);
  sensor  | date_part | with dup |  no dup
----------+-----------+----------+--------
 atsr_md  |      1991 |   62914  |   52020
 atsr_md  |      1992 |  163707  |  134722
 atsr_md  |      1993 |  180238  |  147686
 atsr_md  |      1994 |  165508  |  133346
 atsr_md  |      1995 |  311393  |  246445
 atsr_md  |      1996 |  185769  |  147516
 atsr_md  |      1997 |  209854  |  166802
 atsr_md  |      1998 |  165407  |  129360
 atsr_md  |      1999 |  184064  |  142504
 atsr_md  |      2000 |  187846  |  148694
 atsr_md  |      2001 |  186961  |  134737
 atsr_md  |      2002 |  303944  |  218581
 atsr_md  |      2003 |  349071  |  245611
 atsr_md  |      2004 |  274494  |  187291
 atsr_md  |      2005 |  345834  |  221007
 atsr_md  |      2006 |  397214  |  242365
 atsr_md  |      2007 |  387729  |  239529
 atsr_md  |      2008 |  455368  |  241556
 atsr_md  |      2009 |  475167  |  254349
 atsr_md  |      2010 |  495443  |  257905
 atsr_md  |      2011 |      71  |      56
 avhrr_md |      1991 |   90507  |   90243
 avhrr_md |      1992 |  156354  |  155688
 avhrr_md |      1993 |  171920  |  170942
 avhrr_md |      1994 |  172637  |  170734
 avhrr_md |      1995 |  223063  |  221911
 avhrr_md |      1996 |  260955  |  260678
 avhrr_md |      1997 |  212109  |  211721
 avhrr_md |      1998 |  243354  |  242893
 avhrr_md |      1999 |  310978  |  310493
 avhrr_md |      2000 |  148986  |  147348
 avhrr_md |      2001 |  456065  |  453347
 avhrr_md |      2002 |  573236  |  567718
 avhrr_md |      2003 |  768107  |  762882
 avhrr_md |      2004 |  431492  |  429072
 avhrr_md |      2005 | 1006129  |  972769
 avhrr_md |      2006 | 1064924  | 1031641
 avhrr_md |      2007 |  580919  |  580458
 metop    |      2007 | 1302899  |  280204
 metop    |      2008 | 2794348  |  546279
 metop    |      2009 | 3012068  |  545244
 metop    |      2010 | 2892698  |  590147
 metop    |      2011 |       7  |       2
 seviri   |      2009 |     273  |     158
 seviri   |      2010 | 1889573  | 1331489
(45 rows)
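
To put these counts into perspective, the share of records dropped per sensor and year can be derived from the two columns. The snippet below is an illustrative sketch only: it copies four rows from the table above, and the names `COUNTS` and `removed_fraction` are hypothetical helpers, not part of the MMS tools.

```python
# (with dup, no dup) for a few rows copied from the table above.
COUNTS = {
    ("atsr_md", 2010): (495443, 257905),
    ("avhrr_md", 2007): (580919, 580458),
    ("metop", 2008): (2794348, 546279),
    ("seviri", 2010): (1889573, 1331489),
}


def removed_fraction(with_dup, no_dup):
    """Fraction of the ingested records that the duplicate removal dropped."""
    return (with_dup - no_dup) / with_dup


for (sensor, year), (with_dup, no_dup) in sorted(COUNTS.items()):
    share = 100 * removed_fraction(with_dup, no_dup)
    print(f"{sensor:8s} {year}  {share:5.1f} % removed")
```

For these rows the effect differs strongly between sensors: metop loses roughly four fifths of its records in 2008, while avhrr_md is almost unchanged in 2007.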