$Id$
==============================================================================
Tasks immediately before a new release (must run in the following order)
==============================================================================
0. Match NetCDF error codes. See "Conform with netCDF" below.
1. Make sure build is successful
* build and test under the same directory as source
- autoreconf
- configure (with command-line options: --enable-strict
--enable-coverage
--disable-cxx
--disable-fortran
CPPFLAGS=-funsigned-char
and their combinations)
For C only, run address sanitizer build (gcc and clang) by adding
--disable-fortran \
CFLAGS="-O0 -Wall -fsanitize=address -fno-omit-frame-pointer" \
CXXFLAGS="-O0 -Wall -fsanitize=address -fno-omit-frame-pointer"
- make check
- run "make check" with valgrind by setting the variables shown below on the
"make check" command line. As of PnetCDF 1.8.0, "make check" passes without
any warnings or errors from valgrind. (See note below.)
* when built with --enable-shared, use commands:
make check \
TESTSEQRUN="libtool --mode=execute valgrind --quiet --leak-check=full" \
TESTMPIRUN="mpiexec -n NP libtool --mode=execute valgrind --quiet --leak-check=full"
* when built without shared libraries, use commands:
make check \
TESTSEQRUN="valgrind --quiet --leak-check=full" \
TESTMPIRUN="mpiexec -n NP valgrind --quiet --leak-check=full"
- make ptests
- run "make ptests" with valgrind.
* For using valgrind, set TESTSEQRUN and TESTMPIRUN as described above.
- make distcheck
- make -s V=1 LIBTOOLFLAGS=--silent distcheck DISTCHECK_CONFIGURE_FLAGS="--silent --enable-shared"
- check Building Binary Packages Using DESTDIR (automake chapter 2.2.10)
make DESTDIR=$HOME/inst install
This creates a folder under $HOME/inst with name equal to $prefix.
- test if file system type prefix added to file name is acceptable.
For example, "ufs:" added to file name. Note this ROMIO convention may
not be portable, so it is not added to the regular test programs.
* build and test on different platforms with various compilers:
- login.mcs.anl.gov : Intel, NAG Fortran, PGI, Solaris Studio compilers
- edison.nersc.gov : Cray, Intel compilers
- cetus.alcf.anl.gov : IBM BGQ, XL compilers, cross-compiling, Big Endian
- k.aics.riken.jp : Fujitsu compilers, cross-compiling, Big Endian
* build benchmarks/FLASH-IO separately from PnetCDF (it has its own build
script, i.e. configure.ac)
* build netCDF latest release against PnetCDF
2. Set the release version
In file "configure.ac", variable $PNETCDF_VERSION indicates the version.
It is automatically generated from the 2nd argument of AC_INIT call.
Revise that string to the right release version. For example,
AC_INIT([parallel-netcdf],[1.8.1],[[email protected]])
^^^^^
3. Update file "RELEASE_NOTES".
Copy the contents of file "sneak_peek" into file "RELEASE_NOTES".
Add the release version and date at the top of file "RELEASE_NOTES".
Clear up file "sneak_peek" (reset all items to none)
4. Update the release date (skip this step for 1.9.0 and later)
The following describes how to set the release date for 1.8.1 and prior.
For more info, see the section "Setting PnetCDF software release date" below.
Run command "svn commit" to get the svn property \$LastChangedDate\$
updated in file "configure.in". That string will be used as the official
release date.
svn commit -m "set release date of version 1.6.1 to today"
5. Create an SVN tag.
Run command below to duplicate the current trunk to a new tag:
svn copy https://svn.mcs.anl.gov/repos/parallel-netcdf/trunk \
https://svn.mcs.anl.gov/repos/parallel-netcdf/tags/v1-8-1 \
-m "Tagging release version 1.8.1"
6. Generate a new "configure" file with the release date by running command
"autoreconf" which will generate file "configure" to be included in the
release (but not to be saved in the SVN repo).
* Must use a bug-fixed autoconf of version 2.69 or higher to run
"autoreconf". See README.Fujitsu for the bug in autoconf 2.69 and patch.
7. Create all Makefiles.
Run command "./configure" to create all Makefiles, so in the next step we
can run command "make dist" to create the tar balls.
8. Create tar balls.
Run command "make dist" to produce two tar balls for PnetCDF release:
parallel-netcdf-$(PNETCDF_VERSION).tar.gz
parallel-netcdf-$(PNETCDF_VERSION).tar.bz2
(Version 1.9.0 and later will only create .gz file.)
9. Generate SHA1 checksums
Run command:
openssl sha1 parallel-netcdf-1.6.1.tar.gz
openssl sha1 parallel-netcdf-1.6.1.tar.bz2
Example command-line output:
SHA1(parallel-netcdf-1.6.1.tar.gz)= 495d42f0a41abbd09d276262dce0f7c1c535968a
=====================================================
Convention of setting version numbers.
http://semver.org/
=====================================================
Given a version number MAJOR.MINOR.PATCH, increment the:
1. MAJOR version when you make incompatible API changes,
2. MINOR version when you add functionality in a backwards-compatible
manner, and
3. PATCH version when you make backwards-compatible bug fixes.
Additional labels for pre-release and build metadata are available as
extensions to the MAJOR.MINOR.PATCH format.
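The three increment rules above can be sketched in C as follows. This is only an illustration: the type and function names (semver, change_kind, next_version) are made up for this note and are not part of PnetCDF. Resetting the lower-order fields to zero follows the semver.org specification.

```c
#include <assert.h>

/* Kind of change made since the previous release (illustrative names). */
typedef enum { CHANGE_PATCH, CHANGE_MINOR, CHANGE_MAJOR } change_kind;

typedef struct { int major, minor, patch; } semver;

/* Return the next version number for the given kind of change. */
semver next_version(semver v, change_kind kind)
{
    assert(v.major >= 0 && v.minor >= 0 && v.patch >= 0);
    switch (kind) {
        case CHANGE_MAJOR:            /* incompatible API change */
            v.major++; v.minor = 0; v.patch = 0;
            break;
        case CHANGE_MINOR:            /* backwards-compatible new functionality */
            v.minor++; v.patch = 0;
            break;
        case CHANGE_PATCH:            /* backwards-compatible bug fix */
            v.patch++;
            break;
    }
    return v;
}
```

For example, a bug-fix release after 1.8.1 would be 1.8.2, while a release that adds APIs without breaking existing ones would be 1.9.0.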
(For shared library versioning)
For libtool ABI versioning rules see:
http://www.gnu.org/software/libtool/manual/libtool.html#Updating-version-info
Update the version information only immediately before a public release.
In configure.ac, change/set variable ABIVERSION to the new version.
Here is a set of rules to help you update your library version information:
1. Start with version information of '0:0:0' for each libtool library.
2. Update the version information only immediately before a public release
of your software. More frequent updates are unnecessary, and only
guarantee that the current interface number gets larger faster.
3. If the library source code has changed at all since the last update, then
increment revision ('c:r:a' becomes 'c:r+1:a').
4. If any interfaces have been added, removed, or changed since the last
update, increment current, and set revision to 0.
5. If any interfaces have been added since the last public release, then
increment age.
6. If any interfaces have been removed or changed since the last public
release, then set age to 0.
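Rules 3 to 6 above are applied in order, which can be sketched as a small C function. The names (abi_version, abi_update) and the three flag arguments are hypothetical and only serve to illustrate the order of the updates; the actual ABIVERSION string in configure.ac is still edited by hand.

```c
#include <assert.h>

/* A libtool version triple 'current:revision:age'. */
typedef struct { int current, revision, age; } abi_version;

/* Apply libtool rules 3-6 in order.
 *   source_changed      any source change since the last update
 *   added               interfaces were added
 *   removed_or_changed  interfaces were removed or changed */
abi_version abi_update(abi_version v, int source_changed, int added,
                       int removed_or_changed)
{
    assert(v.current >= 0 && v.revision >= 0 && v.age >= 0);
    if (source_changed)                 /* rule 3 */
        v.revision++;
    if (added || removed_or_changed) {  /* rule 4 */
        v.current++;
        v.revision = 0;
    }
    if (added)                          /* rule 5 */
        v.age++;
    if (removed_or_changed)             /* rule 6 */
        v.age = 0;
    return v;
}
```

For example, starting from 3:2:1, a release that only fixes bugs gives 3:3:1, a release that adds interfaces gives 4:0:2, and a release that removes an interface gives 4:0:0.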
=====================================================
Note on adding new MPI compiler candidates
=====================================================
- In configure.ac, check the following variables
CANDIDATE_MPICC
CANDIDATE_MPICXX
CANDIDATE_MPIF77
CANDIDATE_MPIF90
To make configure command automatically detect MPI compilers,
add new MPI compiler names to the list of the above variables.
=====================================================
Note on autotools version used for software development
=====================================================
- Starting from 1.9.0, in order to support building shared libraries,
the following minimum versions are required.
* autoconf 2.69
* automake 1.15
* libtool 2.4.6
Even these fairly recent versions can still fail to build on some platforms.
See the Section "Hacking autoconf and automake" below for patches to
customize autoconf and automake.
- Prior to 1.8.1, configure.in was developed based on autoconf v2.59.
I, Wei-keng, tend to test it using v2.59, in case PnetCDF users have
an autoconf as old as 2.59.
However, 2.59 can generate a buggy configure file that fails to process
libraries from the linker command line, for example when running on Carver
@NERSC.
For official releases of PnetCDF, I use autoconf 2.69 with the patch for
Fujitsu compilers to generate file "configure". See README.Fujitsu for the
bug in autoconf 2.69 and patch.
=====================================================
Working on configure.in or configure.ac
=====================================================
- Debugging: change the value of variable _DEBUG to yes
This will print some messages for debugging purpose.
=====================================================
Conform with netCDF
=====================================================
- PnetCDF uses the same values as netCDF for the following constants
data types: nc_type
file open/create modes
error codes
(in principle, all constants should conform with netCDF)
- Make sure file pnetcdf.h.in is updated with the latest netCDF header, by
building the latest netCDF with --enable-pnetcdf option.
- Similarly, check Fortran error codes defined as parameters in
src/libf/pnetcdf.inc.in or src/binding/f77/pnetcdf.inc.in
src/libf90/nf90_constants.f90 or src/binding/f90/nf90_constants.f90
src/libf90/nfmpi_constants.f90.in or src/binding/f90/nfmpi_constants.f90.in
- PnetCDF error codes start from -201
- PnetCDF error codes for inconsistent header data start from -250
- PnetCDF developers are reminded to periodically sync the following with
the netCDF release.
1. Error codes (pnetcdf.h vs. netcdf.h)
2. Error strings (error.c vs. netcdf-c/libdispatch/derror.c)
3. src/drivers/include/utf8proc.h vs. netcdf-c/include/utf8proc.h
4. src/drivers/common/utf8proc.c vs. netcdf-c/libdispatch/utf8proc.c
5. src/drivers/common/utf8proc_data.h vs. netcdf-c/libdispatch/utf8proc_data.h
==========================================================================
Note on netCDF text APIs and variables of external data type NC_CHAR
==========================================================================
All netCDF external data types are considered numerical data types, except for
NC_CHAR. Numerical data types can be converted to different numerical data
types. However, no numerical data type is allowed to be converted to NC_CHAR
and vice versa. Given these limitations, note the following.
For attribute APIs, only the text API, ncmpi_put_att_text(), can create/write
attributes in NC_CHAR data type. Note that ncmpi_put_att_text() does not take
an argument of external data type like other attribute APIs, because the
attribute to be created/written will be of NC_CHAR type. For reading, only the
text API, ncmpi_get_att_text(), can read attributes of NC_CHAR type from a
file; otherwise the NC_ECHAR error code is returned. Non-text APIs are not
allowed to put/get attributes of NC_CHAR type.
For variable get/put APIs, only text APIs, for example nc_put_vara_text(), can
read/write a variable defined as NC_CHAR type. Trying to use non-text APIs to
read/write an NC_CHAR variable results in the NC_ECHAR error code being
returned. Text APIs never return the NC_ERANGE error code.
In netCDF, NC_CHAR is designed purely for storing text data. NC_CHAR is
considered by netCDF as an unsigned 8-bit integer, but it is not meant to store
a numerical value. Other netCDF external numerical data types and their
numerical meanings:
NC_BYTE is considered as a signed 1-byte integer
NC_UBYTE is considered as an unsigned 1-byte integer
NC_SHORT is considered as a signed 2-byte integer
NC_USHORT is considered as an unsigned 2-byte integer
NC_INT is considered as a signed 4-byte integer
NC_UINT is considered as an unsigned 4-byte integer
NC_INT64 is considered as a signed 8-byte integer
NC_UINT64 is considered as an unsigned 8-byte integer
NC_FLOAT is considered as a signed 4-byte floating point
NC_DOUBLE is considered as a signed 8-byte double precision floating point
All external data types, their byte sizes, sign-ness, and numerical ranges are
independent from the systems running PnetCDF/netCDF.
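The text/non-text rule described in this section can be summarized by the sketch below. It is not the PnetCDF implementation: the enum values and the result codes (XT_*, SK_NOERR, SK_ECHAR) are hypothetical stand-ins for nc_type, NC_NOERR, and NC_ECHAR, whose real values are defined in pnetcdf.h.

```c
#include <assert.h>

/* Hypothetical stand-ins for the nc_type constants in pnetcdf.h. */
typedef enum {
    XT_CHAR, XT_BYTE, XT_UBYTE, XT_SHORT, XT_USHORT, XT_INT, XT_UINT,
    XT_INT64, XT_UINT64, XT_FLOAT, XT_DOUBLE
} sketch_xtype;

/* Hypothetical result codes standing in for NC_NOERR and NC_ECHAR. */
enum { SK_NOERR = 0, SK_ECHAR = 1 };

/* Check whether an API call may access data of external type xtype.
 * is_text_api is nonzero for text APIs such as ncmpi_get_att_text(). */
int check_char_conversion(sketch_xtype xtype, int is_text_api)
{
    int is_char = (xtype == XT_CHAR);
    assert(xtype <= XT_DOUBLE);
    /* Conversion between text and numerical types is never allowed,
     * in either direction. */
    if (is_char != (is_text_api != 0))
        return SK_ECHAR;
    return SK_NOERR;
}
```

Only the two matching combinations pass: a text API on NC_CHAR data, or a non-text API on numerical data.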
===============================================================
config.guess, config.sub, install-sh in directory scripts
===============================================================
Copy config.guess, config.sub, and install-sh from GNU libtool.
http://www.gnu.org/software/libtool/
Or get the latest version by running the following commands.
wget -O config.guess 'http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD'
wget -O config.sub 'http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD'
===============================================================
Extra files used for idea development, not for release
===============================================================
This file "DEVELOPER_NOTES".
src/lib (1.8.1 and prior) src/drivers/ncmpio/Legacy (1.9.0 and later):
varnx.m4
varnx_api_header.m4
varn_api_header.m4
swap.c
src/libf (1.8.1 and prior) src/binding/f77 (1.9.0 and later):
varn_external.m4
varnx_external.m4
createffiles
src/libf90 (1.8.1 and prior) src/binding/f90 (1.9.0 and later):
varn_interface.m4
varnx_interface.m4
=====================================================
Note on creating new PnetCDF APIs
=====================================================
If a new PnetCDF API is created, please check and do the following.
1. In src/lib (1.8.1 and prior) src/include (1.9.0 and later)
* add the declaration of new APIs in "pnetcdf.h.in"
For example,
int ncmpi_inq_num_rec_vars(int ncid, int *nump);
* If the new APIs have Fortran counterparts, their C declarations should
be placed outside of the comment block marked with:
/* Begin Skip Prototypes for Fortran binding */
and
/* End Skip Prototypes for Fortran binding */
Otherwise, their declarations should be inside of the block.
2. In src/libf (1.8.1 and prior) src/binding/f77 (1.9.0 and later)
* The new APIs should be declared in "pnetcdf.inc.in"
* add their C-to-Fortran file name in "Makefile.in"
For example, API nfmpi_put_att() corresponds to file put_attf.c
* Some APIs have arguments that need special treatment in file "defs".
For example, for the second argument of the put_vard API, varid, the
Fortran value is the C value + 1:
'put_vard' => '2',
'put_vard-2' => 'in:OffsetIndexIn',
3. In src/libf90 (1.8.1 and prior) src/binding/f90 (1.9.0 and later)
* The new APIs should be declared in "visibility.f90" and/or "api.f90.in"
* their definition should be added in the right files. For example,
nfmpi_inq_striping() is declared in api.f90.in
nf90mpi_inq_striping() is declared in visibility.f90
nf90mpi_inq_striping() is defined in file.f90 by
calling nfmpi_inq_striping().
4. In src/libcxx (1.8.1 and prior) src/binding/cxx (1.9.0 and later)
* The new APIs should be declared and added to the right files. For example,
getRecVarCount() is declared in ncmpiGroup.h and defined in ncmpiGroup.cpp.
=====================================================
Note on PnetCDF internal function and variable name convention
=====================================================
* internal vs. external data type
Internal data types refer to the data types of user's I/O buffers in memory.
External data types refer to the NC data type, i.e. nc_type, netCDF
variable's data type stored in the netCDF files. External types are fixed,
independent from the system platforms.
* Internal data types used in PnetCDF are: char, signed char, unsigned char,
short, unsigned short, int, unsigned int, long, float, double, long long, and
unsigned long long. Their corresponding function name substrings are "text",
"schar", "uchar", "short", "ushort", "int", "uint", "long", "float",
"double", "longlong", and "ulonglong".
* External data types are of nc_type. For CDF-1 and 2, they are NC_CHAR,
NC_BYTE, NC_SHORT, NC_INT, NC_LONG (is aliased to NC_INT), NC_FLOAT, and
NC_DOUBLE. For CDF-5, additional types are NC_UBYTE, NC_USHORT, NC_UINT,
NC_INT64, and NC_UINT64.
* When creating an internal function, after 1.7.0, we use the names of
nc_type to represent the external data types and the above convention for
internal ones. For example, ncmpix_getn_NC_SHORT_longlong(). In this case,
we can clearly see that this function reads n elements of a netCDF
variable of NC_SHORT external data type from a file into a user buffer of
internal long long type.
* Function argument name convention: we use "xtype" for the argument of
external data type and "itype" for internal data types. Thus, xtype must be
of nc_type defined in pnetcdf.h and itype be either of an internal data type,
or an MPI derived data type.
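As an illustration of the naming convention, the sketch below composes an internal function name from an operation, an external type name, and a C type name. The helpers itype_substring() and make_internal_name() are hypothetical and exist only for this note; the table of substrings is taken from the list of internal data types above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* C type name -> function-name substring, as listed in the note above. */
static const struct { const char *ctype; const char *substr; } itype_names[] = {
    {"char", "text"},           {"signed char", "schar"},
    {"unsigned char", "uchar"}, {"short", "short"},
    {"unsigned short", "ushort"}, {"int", "int"},
    {"unsigned int", "uint"},   {"long", "long"},
    {"float", "float"},         {"double", "double"},
    {"long long", "longlong"},  {"unsigned long long", "ulonglong"}
};

/* Return the name substring for a C type name, or NULL if it is unknown. */
const char *itype_substring(const char *ctype)
{
    size_t i;
    assert(ctype != NULL);
    for (i = 0; i < sizeof(itype_names) / sizeof(itype_names[0]); i++)
        if (strcmp(ctype, itype_names[i].ctype) == 0)
            return itype_names[i].substr;
    return NULL;
}

/* Build "ncmpix_<op>_<xtype>_<itype>" into buf.
 * Returns 0 on success, -1 on an unknown C type or a too-small buffer. */
int make_internal_name(char *buf, size_t len, const char *op,
                       const char *xtype, const char *ctype)
{
    const char *itype = itype_substring(ctype);
    int n;
    if (itype == NULL) return -1;
    n = snprintf(buf, len, "ncmpix_%s_%s_%s", op, xtype, itype);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Calling make_internal_name(buf, sizeof(buf), "getn", "NC_SHORT", "long long") fills buf with the example name ncmpix_getn_NC_SHORT_longlong.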
=====================================================
Note on debugging
=====================================================
1. Enabling the debugging option (--enable-debug) at configure time traces the
usage of malloc and detects whether there is any malloc residue. All PnetCDF
development should call NCI_Malloc, NCI_Calloc, NCI_Realloc, and NCI_Free,
instead of malloc, calloc, realloc, and free. When adding a new test or
example program, please add a check for any malloc residue at the end. This
makes sure PnetCDF properly frees all memory allocated internally. The
code fragment is something like below.
    /* check if there is any PnetCDF internal malloc residue */
    /* (rank is this process's rank in MPI_COMM_WORLD) */
    MPI_Offset malloc_size, sum_size;
    int err = ncmpi_inq_malloc_size(&malloc_size);
    if (err == NC_NOERR) {
        MPI_Reduce(&malloc_size, &sum_size, 1, MPI_OFFSET, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0 && sum_size > 0)
            /* cast to long long to match the %lld conversion */
            printf("heap memory allocated by PnetCDF internally has %lld bytes yet to be freed\n",
                   (long long)sum_size);
    }
=====================================================
Note on adding a new error code
=====================================================
PnetCDF error codes start at -201 and the error codes for header/argument
inconsistency of any kind start at -250.
1. If the new error code is not related to data inconsistency, add the code to
the end before -250
2. If the new error code is related to data inconsistency, add it to the end
of -250 section and make sure the number is between NC_EMULTIDEFINE_FIRST
and NC_EMULTIDEFINE_LAST in pnetcdf.h.in. Update NC_EMULTIDEFINE_LAST if
necessary.
3. Again, check Fortran error codes defined as parameters in
src/libf/pnetcdf.inc.in or src/binding/f77/pnetcdf.inc.in
src/libf90/nf90_constants.f90 or src/binding/f90/nf90_constants.f90
src/libf90/nfmpi_constants.f90.in or src/binding/f90/nfmpi_constants.f90.in
4. The error message should be kept less than 80 characters, because this is
the string length used in Fortran API nfmpi_strerror().
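Two of the checks above (item 2 and item 4) can be sketched as below. The macro values are assumptions for illustration only: -250 is the documented start of the inconsistency section, while the end value -269 is hypothetical; the real bounds are NC_EMULTIDEFINE_FIRST and NC_EMULTIDEFINE_LAST in pnetcdf.h.in. The function names are also made up for this note.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_EMULTIDEFINE_FIRST (-250)  /* start of the inconsistency section */
#define SKETCH_EMULTIDEFINE_LAST  (-269)  /* hypothetical end of that section */
#define SKETCH_MAX_MSG_LEN        80      /* Fortran error string length */

/* Nonzero if err lies inside the inconsistency section. Error codes are
 * negative, so FIRST is numerically larger than LAST. */
int in_multidefine_range(int err)
{
    return err <= SKETCH_EMULTIDEFINE_FIRST && err >= SKETCH_EMULTIDEFINE_LAST;
}

/* Nonzero if an error message is shorter than 80 characters. */
int message_fits(const char *msg)
{
    assert(msg != NULL);
    return strlen(msg) < (size_t)SKETCH_MAX_MSG_LEN;
}
```

A new inconsistency code that falls outside the range means NC_EMULTIDEFINE_LAST has to be updated, as item 2 says.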
=====================================================
Note on setting the last modify time of files to avoid rebuild
=====================================================
Under directory src/utils/ncmpigen, the following files are pre-built and
provided in the releases as is. They are no longer required to be built by
PnetCDF users since 1.5.0.
ncmpigenyy.c
ncmpigentab.c
ncmpigentab.h
Their dependencies are
ncmpigenyy.c: ncmpigen.l
ncmpigentab.c ncmpigentab.h: ncmpigen.y ncmpigenyy.c ncmpigen.h
When their source files are modified, these files must be regenerated.
However, when using command "svn commit" to commit the changes to svn repo,
the modify time will be reset by svn, which may cause make to regenerate them
again. To make the last modify time of these 3 files newer than their source
files, one solution is to run two separate commit commands.
First, run "svn commit" for the source files.
Second, run "svn del" to delete these 3 files.
Third, run "svn add" to add these 3 files.
Fourth, run "svn commit" for these 3 files.
=====================================================
Some tricks for wiki trac format
=====================================================
Escape character is !
For example, ncmpi__enddef will be shown with "enddef" underlined, so add !
before the double underscores to disable the formatting, e.g. ncmpi!__enddef
Example of using it on the command line:
svn commit -m "add \!__func\!__ to error message" error.c error.h
=====================================================
Trace MPI communication calls and I/O calls
=====================================================
Add the following C define macros to the end of DEFS variable in macros.make
to enable tracing.
-DPNETCDF_TRACE_MPI_COMM
-DPNETCDF_TRACE_MPI_IO
For 1.9.0 and later, run the following commands to enable this feature.
make clean
make CFLAGS="-DPNETCDF_TRACE_MPI_COMM -DPNETCDF_TRACE_MPI_IO"
Example outputs from stdout:
TRACE-MPI-COMM: FILE mpincio.c FUNC ncmpiio_create() LINE 244 calling MPI_Bcast()
TRACE-MPI-IO: FILE mpincio.c FUNC ncmpiio_create() LINE 261 calling MPI_File_open()
TRACE-MPI-COMM: FILE nc.c FUNC NC_begins() LINE 429 calling MPI_Allreduce()
TRACE-MPI-COMM: FILE nc.c FUNC NC_check_header() LINE 111 calling MPI_Bcast()
TRACE-MPI-COMM: FILE nc.c FUNC NC_check_header() LINE 119 calling MPI_Bcast()
TRACE-MPI-COMM: FILE nc.c FUNC NC_check_header() LINE 130 calling MPI_Allreduce()
TRACE-MPI-IO: FILE nc.c FUNC write_NC() LINE 834 calling MPI_File_write_at()
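The format of the trace lines above can be reproduced with a small helper like the one below. This is a sketch only: format_trace() is a hypothetical function written for this note, and it does not show how the PNETCDF_TRACE_MPI_COMM and PNETCDF_TRACE_MPI_IO macros are actually implemented in the PnetCDF source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format one trace line in the style of the example output.
 * kind is "COMM" or "IO". Returns 0 on success, -1 if buf is too small. */
int format_trace(char *buf, size_t len, const char *kind, const char *file,
                 const char *func, int line, const char *mpi_call)
{
    int n;
    assert(buf != NULL && len > 0);
    assert(strcmp(kind, "COMM") == 0 || strcmp(kind, "IO") == 0);
    n = snprintf(buf, len, "TRACE-MPI-%s: FILE %s FUNC %s() LINE %d calling %s()",
                 kind, file, func, line, mpi_call);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

In real code the file, function, and line arguments would typically come from __FILE__, __func__, and __LINE__ at the call site.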
The program test/testcases/profile.c can be used to print the sequence of MPI
calls for various PnetCDF calls. It can be helpful to find unnecessary MPI
calls under safe or not-safe mode. For example, when safe mode is disabled and
profile is run on one MPI process with option "-v", the stdout is shown below.
0: ---- after ncmpi_create
0: ---- before ncmpi_enddef() ----
TRACE-MPI-COMM: FILE nc.c FUNC NC_begins() LINE 414 calling MPI_Bcast()
TRACE-MPI-COMM: FILE nc.c FUNC NC_check_header() LINE 111 calling MPI_Bcast()
TRACE-MPI-IO: FILE nc.c FUNC write_NC() LINE 770 calling MPI_File_write_at()
0: ---- before ncmpi_redef() ----
0: ---- before ncmpi_enddef() ----
TRACE-MPI-COMM: FILE nc.c FUNC NC_begins() LINE 414 calling MPI_Bcast()
TRACE-MPI-COMM: FILE nc.c FUNC NC_check_header() LINE 111 calling MPI_Bcast()
TRACE-MPI-IO: FILE nc.c FUNC write_NC() LINE 770 calling MPI_File_write_at()
0: ---- before ncmpi_iput_varn_int() ----
0: ---- before ncmpi_iput_varn_int() ----
0: ---- before ncmpi_iput_varn_int() ----
0: ---- before ncmpi_wait_all() ----
TRACE-MPI-COMM: FILE nonblocking.c FUNC ncmpii_wait() LINE 680 calling MPI_Allreduce()
TRACE-MPI-IO: FILE vard.c FUNC ncmpii_file_set_view() LINE 88 calling MPI_File_set_view()
TRACE-MPI-IO: FILE nonblocking.c FUNC ncmpii_mgetput() LINE 1899 calling MPI_File_write_at_all()
TRACE-MPI-COMM: FILE nc.c FUNC ncmpii_sync_numrecs() LINE 619 calling MPI_Allreduce()
0: ---- before ncmpi_iput_varn_int() ----
0: ---- before ncmpi_iget_varn_int() ----
0: ---- before ncmpi_iget_varn_int() ----
0: ---- before ncmpi_iget_varn_int() ----
0: ---- before ncmpi_wait_all() ----
TRACE-MPI-COMM: FILE nonblocking.c FUNC ncmpii_wait() LINE 680 calling MPI_Allreduce()
TRACE-MPI-IO: FILE vard.c FUNC ncmpii_file_set_view() LINE 88 calling MPI_File_set_view()
TRACE-MPI-IO: FILE nonblocking.c FUNC ncmpii_mgetput() LINE 1899 calling MPI_File_write_at_all()
TRACE-MPI-COMM: FILE nc.c FUNC ncmpii_sync_numrecs() LINE 619 calling MPI_Allreduce()
TRACE-MPI-IO: FILE vard.c FUNC ncmpii_file_set_view() LINE 88 calling MPI_File_set_view()
TRACE-MPI-IO: FILE nonblocking.c FUNC ncmpii_mgetput() LINE 1875 calling MPI_File_read_at_all()
0: ---- before ncmpi_close() ----
TRACE-MPI-IO: FILE mpincio.c FUNC ncmpiio_close() LINE 410 calling MPI_File_close()
==============================================================================
Mirror SVN repo to Github
==============================================================================
The PnetCDF svn repo is currently mirrored to GitHub daily under the project
named Parallel NetCDF: https://github.com/Parallel-NetCDF/PnetCDF
Below are the commands for creating a mirror on GitHub:
cd /path/to/your/git/clone/of/parallel-netcdf
git svn init -s https://svn.mcs.anl.gov/repos/parallel-netcdf parallel-netcdf
cd parallel-netcdf
git svn fetch >& fetch.log & # this will take long to get all commits
Log into your github account and create a new repo, named "parallel-netcdf"
Set the remote repo URL @ Github
git remote add origin https://github.com/wkliao/parallel-netcdf.git
git push -u origin master
Create a cron job to sync svn repo to Github using a script file named, say
cron_sync.sh. Its contents are:
#!/bin/bash
cd /path/to/your/git/clone/of/parallel-netcdf
git svn rebase
git push origin master
Push a new PnetCDF tag branch to github:
git checkout -b 1-8-0pre1 remotes/origin/tags/v1-8-0pre1
git push -u origin 1-8-0pre1
==============================================================================
Setting up Travis CI for PnetCDF @ Github
==============================================================================
Instructions are based on https://docs.travis-ci.com/user/getting-started/
Sign in to Travis CI with your GitHub account
Click (Your Name) -> Accounts -> Sync account
(This should show a new repo just created.)
Click the repo to enable Travis CI and configure "settings" tab
(Travis only runs a build on the commits you push after adding the repository
to Travis. Note: If your project already has a .travis.yml file, you need to
push another commit to trigger a build.)
==============================================================================
Coverity Scan + Travis CI Integration
==============================================================================
Follow the instructions at https://scan.coverity.com/travis_ci
The modeling file ./coverity_model.c is used by Coverity Scan in the Analysis
Setting.
Change the branch for scan:
Under "coverity_scan:", change "branch_pattern:" to the branch name
==============================================================================
Testing with valgrind
==============================================================================
Valgrind checks for memory leaks (e.g. command: valgrind --leak-check=full -q)
and also reports uninitialized I/O buffers used in MPI_File_write. The latter
warning message can appear when PnetCDF detects an out-of-range data type
conversion, error code NC_ERANGE. Note that PnetCDF continues writing the data
out even if only part of the user buffer causes NC_ERANGE. In the internal
buffer allocated for type conversion, the buffer locations corresponding to
those out-of-range data elements will not be initialized. Therefore, you should
expect to see many warning messages of this kind when running "make check", as
our test programs intentionally test whether NC_ERANGE can be returned properly.
Another issue reported by valgrind occurs when the file header expands, causing
the data section to be moved to later file offsets. PnetCDF calls MPI read and
write to move one variable at a time. The ideal case is to write the actual read
size reported by MPI_Get_count in the MPI write call. This is in case some
variables are defined but have never been written. However, in some MPICH
versions MPI_Get_count fails to report the correct value due to an internal
error that fails to initialize the MPI_Status object. Therefore, the solution
can be either to explicitly initialize the status object to zeros, or to just
use the same read request amount for write. Note that the latter will write the
variables that have not been written before. For now we adopt the former
option. See comments about MPI_Get_count() in function move_file_block() of
src/drivers/ncmpio/mpincio.c
When data sieving is enabled in MPI-IO (default in ROMIO), its
read-modify-write operation can also cause valgrind to complain about an
uninitialized I/O buffer. In particular, when a new file is created and PnetCDF
fixed-size variable alignment is enabled, the read in read-modify-write may
read no data at all, resulting in a data sieving buffer containing uninitialized
data. To eliminate this possibility, set the hint environment variable to
disable alignment, e.g. PNETCDF_HINTS="nc_var_align_size=1;nc_header_align_size=1".
When using MPICH 3.2 with bug #2332 fixed, running "make check" and
"make ptests" through valgrind should complete without any complaints. See MPICH
ticket #2332 at https://trac.mpich.org/projects/mpich/ticket/2332.
==============================================================================
Note on using clang and gprof together
==============================================================================
% clang --version
clang version 3.4.2 (tags/RELEASE_34/dot2-final)
Target: x86_64-redhat-linux-gnu
Thread model: posix
% gprof --version
GNU gprof version 2.20.51.0.2-5.43.el6 20100205
Based on BSD gprof, copyright 1983 Regents of the University of California.
This program is free software. This program has absolutely no warranty.
The problem is reported in https://llvm.org/bugs/show_bug.cgi?id=14713
When PnetCDF is compiled using a clang-based MPI compiler with compile options
"-g -O2 -pg", programs receive signal SIGSEGV, Segmentation fault.
However, when compiled with -g -O0 -pg, programs receive no error.
Workaround: use "-pg -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer"
==============================================================================
Note on including config.h
==============================================================================
When doing a VPATH build, remember to pass the C compiler a -I. option. Even if
you use #include "config.h", the preprocessor searches only the source
directory, not the build directory. Thus, we should use #include <config.h>
instead of #include "config.h". In addition, use -I. -I$(srcdir) in
Makefile.in.
Autoconf manual suggests it is a good habit to use angle brackets, because in
the rare case when the source directory contains another config.h, the build
directory should be searched first.
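A source file following this advice would start as sketched below. The HAVE_CONFIG_H guard is the usual autoconf idiom (the macro is passed on the compiler command line when a config header is used), and PACKAGE_VERSION is a macro that autoconf normally writes into config.h. The function package_version() is hypothetical and is only there to show config.h being used.

```c
/* Angle brackets: with -I. on the command line, the build directory is
 * searched first, which is what a VPATH build needs. */
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

#include <stddef.h>

/* Return the version string recorded by configure, or "unknown" when this
 * file is compiled without a config.h. */
const char *package_version(void)
{
#ifdef PACKAGE_VERSION
    return PACKAGE_VERSION;
#else
    return "unknown";
#endif
}
```

The matching compile line in Makefile.in would carry -I. -I$(srcdir), in that order, so that the generated config.h in the build directory takes precedence.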
==============================================================================
Note on using NAG Fortran compiler
==============================================================================
A type conversion that causes an NC_ERANGE error will also cause a core dump
with an error message of "Arithmetic exception". See the discussion thread in
http://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg13327.html
The solution is to enable the "erange-fill" option at configure time (the
default in PnetCDF). The implementation of this option has also been provided
to netCDF, but there it must be activated manually. See
https://github.com/Unidata/netcdf-c/pull/319#issuecomment-256982471
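The idea behind range-checked conversion with a fill value can be sketched as
below. This is only an illustration of the technique, not the PnetCDF
implementation: the function name and the EX_ constants are hypothetical
stand-ins, and the real error code and fill values come from pnetcdf.h.

```c
#include <limits.h>

/* Stand-in constants for this sketch only. */
#define EX_NOERR      0
#define EX_ERANGE     (-60)
#define EX_FILL_SHORT ((short)-32767)

/* Convert a double to a short. The range is tested BEFORE the cast, so the
 * out-of-range conversion that can raise an arithmetic exception is never
 * executed. Out-of-range input (including NaN) stores the fill value and
 * reports EX_ERANGE. */
int ex_double_to_short(double in, short *out)
{
    if (!(in >= (double)SHRT_MIN && in <= (double)SHRT_MAX)) {
        *out = EX_FILL_SHORT;
        return EX_ERANGE;
    }
    *out = (short)in;
    return EX_NOERR;
}
```

A caller would check the return value and continue: the error is still
reported, but the program no longer aborts inside the conversion itself.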
==============================================================================
Note on M4 flags
==============================================================================
M4 has a nice feature called synclines that inserts #line directives into its
output, so compilers can report error locations in the m4 files instead of in
the derived C/Fortran files. To enable this feature, developers add
M4FLAGS=-s to the configure command line. Note that synclines currently only
take effect for C files. There are still some issues to be resolved for
Fortran files.
Developers are also warned that when m4 macro functions are used, the line
numbers reported are the locations where the functions are invoked, not the
lines inside the functions.
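The effect of a syncline on the C compiler can be seen in the small sketch
below. The file name example.m4, the line number 120, and the function names
are made up for illustration. The #line directive has the same form that
"m4 -s" writes into its output.

```c
/* The directive below tells the C compiler that the next line of this file
 * originated from line 120 of example.m4 (a hypothetical file name). Any
 * diagnostic for the code that follows is reported against that location. */
#line 120 "example.m4"
int ex_reported_line(void) { return __LINE__; }
const char *ex_reported_file(void) { return __FILE__; }
```

Here ex_reported_line() evaluates __LINE__ on the line immediately after the
directive, and ex_reported_file() yields the m4 file name rather than the name
of the derived C file.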
==============================================================================
NC error code precedence
==============================================================================
The precedence of reporting error codes depends on the seriousness of the
error. The guideline is as follows. The most serious errors are related to
ncid, such as NC_EBADID, NC_EPERM, and NC_EINDEFINE. The next are related to
varid, such as NC_ENOTVAR. These two types of errors make it difficult for
PnetCDF programs to continue, even if the request is changed to NULL on the
processes that produce an error.
For put att APIs:
NC_EBADID, NC_EPERM, NC_ENOTVAR, NC_EBADNAME, NC_EBADTYPE, NC_ECHAR,
NC_EINVAL, NC_ENOTINDEFINE, NC_ERANGE
For get att APIs:
NC_EBADID, NC_ENOTVAR, NC_EBADNAME, NC_ENOTATT, NC_ECHAR, NC_EINVAL,
NC_ERANGE
For put/get variable APIs:
NC_EBADID, NC_EPERM, NC_EINDEFINE, NC_ENOTVAR, NC_ECHAR, NC_EINVALCOORDS,
NC_EEDGE, NC_ESTRIDE, NC_EINVAL, NC_ERANGE
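One way to apply such an ordering is sketched below for the put/get variable
list. It is a minimal illustration, not PnetCDF code: the function and array
names are hypothetical, and error codes are represented by their names as
strings so that the sketch does not depend on pnetcdf.h.

```c
#include <stddef.h>
#include <string.h>

/* Precedence for put/get variable APIs, most serious first, copied from the
 * list above. */
static const char *const ex_putget_var_precedence[] = {
    "NC_EBADID", "NC_EPERM", "NC_EINDEFINE", "NC_ENOTVAR", "NC_ECHAR",
    "NC_EINVALCOORDS", "NC_EEDGE", "NC_ESTRIDE", "NC_EINVAL", "NC_ERANGE"
};

/* Given the names of all errors detected for one request, return the one
 * that should be reported, or NULL when no listed error was detected. */
const char *ex_error_to_report(const char *const detected[], size_t ndetected)
{
    size_t nprec = sizeof(ex_putget_var_precedence) /
                   sizeof(ex_putget_var_precedence[0]);
    size_t i, j;

    for (i = 0; i < nprec; i++)          /* walk from most to least serious */
        for (j = 0; j < ndetected; j++)
            if (strcmp(ex_putget_var_precedence[i], detected[j]) == 0)
                return ex_putget_var_precedence[i];
    return NULL;
}
```

For a request that triggers NC_ERANGE, NC_ENOTVAR, and NC_EEDGE at the same
time, this helper selects NC_ENOTVAR, because the varid-related error sits
earlier in the list.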
==============================================================================
Setting PnetCDF software release date
==============================================================================
Prior to version 1.8.1, the release date was obtained from the SVN keyword
LastChangedDate set in file configure.in. It is used to produce two variables:
PNETCDF_RELEASE_DATE and PNETCDF_RELEASE_DATE2. These two variables are used
by all man pages, pnetcdf.h, pnetcdf_version, and pnetcdf-config. We used the
keyword value set by SVN as the release date. The assumption is that updating
PACKAGE_VERSION in configure.in is the last step before making a release, i.e.
the last modification date of file configure.in is the latest among all files.
This approach can cause a problem when using git-svn to clone the source code,
as one must set git smudge/clean filters to reproduce the effect of SVN
keywords. In addition, it makes more sense to use the date when the command
"make dist" is run to create the tarball.
Starting from 1.9.0, the release date is set to the date when the command
"make dist" is run. See the makefile target "dist-hook" in
./src/include/Makefile.am for an approach that uses the sed command to do
string substitution for DIST_DATE.
==============================================================================
Hacking autoconf and automake
==============================================================================
Autoconf 2.69 has a problem checking the version of the Fujitsu Fortran
compiler, which causes errors when building PnetCDF at configure time.
The fix and patch have been discussed and provided in the link below. PnetCDF
users/developers are encouraged to apply the patch to their autoconf utility.
Subject: Support Fujitsu in _AC_PROG_FC_V
https://lists.gnu.org/archive/html/autoconf-patches/2013-11/msg00001.html
diff -r autoconf-2.69/lib/autoconf/fortran.m4 Origin/autoconf-2.69/lib/autoconf/fortran.m4
581,582d580
< # Fujitsu accepts --verbose and passes it to the linker, which doesn't yield
< # the desired result. Therefore test for -### before testing for --verbose.
590c588
< for ac_verb in -v -verbose -V -\#\#\# --verbose; do
---
> for ac_verb in -v -verbose --verbose -V -\#\#\#; do
On some systems, the Fortran preprocessor flag is not "-D". For example, IBM
XL Fortran compilers use the flag "-WF,-D". However, autoconf 2.69 hard-codes
"-DHAVE_CONFIG_H" into the pre-defined variable "DEFS", which is used in C,
C++, F77, and F90 compile commands. The hack below adds a pre-defined
variable, FC_DEFS, and replaces DEFS with it for Fortran-based compile
commands.
diff -r autoconf-2.69/lib/autoconf/general.m4 Origin/autoconf-2.69/lib/autoconf/general.m4
1411d1410
< AC_SUBST([FC_DEFS])dnl
diff -r autoconf-2.69/lib/autoconf/status.m4 Origin/autoconf-2.69/lib/autoconf/status.m4
1269,1274d1268
< if test "x$FC_DEFINE" != "x-D" ; then
< ac_fc_define_sed_str="s/-D/${FC_DEFINE}/g"
< FC_DEFS=`echo $DEFS | sed $ac_fc_define_sed_str`
< else
< FC_DEFS=$DEFS
< fi
diff -r automake-1.15/bin/automake.in Origin/automake-1.15/bin/automake.in
625,633d624
< my @fpplike_flags =
< qw{
< $(FC_DEFS)
< $(DEFAULT_INCLUDES)
< $(INCLUDES)
< $(AM_CPPFLAGS)
< $(CPPFLAGS)
< };
<
888c879
< 'compile' => "\$(FC) @fpplike_flags \$(AM_FCFLAGS) \$(FCFLAGS)",
---
> 'compile' => "\$(FC) @cpplike_flags \$(AM_FCFLAGS) \$(FCFLAGS)",
920c911
< 'compile' => "\$(F77) @fpplike_flags \$(AM_FFLAGS) \$(FFLAGS)",
---
> 'compile' => "\$(F77) @cpplike_flags \$(AM_FFLAGS) \$(FFLAGS)",