revise documentation #53


Merged 4 commits on Sep 7, 2024
48 changes: 22 additions & 26 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,49 +1,45 @@
# PnetCDF python
# PnetCDF for Python
![](https://img.shields.io/badge/python-v3.9-blue)
![](https://img.shields.io/badge/tests%20passed-49-brightgreen)
![](https://readthedocs.org/projects/pnetcdf-python/badge/?version=latest)

PnetCDF-python is a Python interface to
[PnetCDF](https://parallel-netcdf.github.io/), a high-performance parallel I/O
library for accessing netCDF files.
This package allows Python users to access netCDF data using the rich ecosystem
of Python's scientific computing libraries, making it a valuable tool for
applications that require parallel access to netCDF files.
PnetCDF-Python is a Python interface to
[PnetCDF](https://parallel-netcdf.github.io/), a high-performance I/O library
for accessing netCDF files in parallel. It enables MPI-based parallel
Python programs to achieve scalable I/O performance.

### Software Dependencies
* Python 3.9 or above
* MPI libraries
* PnetCDF [C library](https://github.com/Parallel-netCDF/PnetCDF), built with shared libraries.
* Python library [mpi4py](https://mpi4py.readthedocs.io/en/stable/install.html)
* Python library [numpy](http://www.numpy.org/)
* Python 3.9 or later.
* [numpy](http://www.numpy.org/) Python package.
* MPI C library and Python package, [mpi4py](https://mpi4py.readthedocs.io/en/stable/install.html).
* [PnetCDF C library](https://github.com/Parallel-netCDF/PnetCDF), built with shared libraries.

### Developer Installation
* Clone this GitHub repository
* Make sure the above dependent software are installed.
* In addition, [Cython](http://cython.org/), [packaging](https://pypi.org/project/packaging/), [setuptools>=65](https://pypi.org/project/setuptools/) and [wheel](https://pypi.org/project/wheel/) are required for developer installation.
* Set the environment variable `PNETCDF_DIR` to PnetCDF's installation path.
* Make sure utility program `pnetcdf-config` is available in `$PNETCDF_DIR/bin`.
* Run command below to install.
* Required software for developer installation:
+ All of the software dependencies listed above and, additionally,
+ [Cython](http://cython.org/), [packaging](https://pypi.org/project/packaging/), [setuptools>=65](https://pypi.org/project/setuptools/) and [wheel](https://pypi.org/project/wheel/).
* Commands to install.
```
CC=/path/to/mpicc PNETCDF_DIR=/path/to/pnetcdf/dir pip install --no-build-isolation -e .
export CC=/path/to/mpicc
export PNETCDF_DIR=/path/to/pnetcdf/dir
pip install --no-build-isolation -e .
```
* Testing
+ Run command `"make check"` to test all the programs available in folders
["test/"](./test) and ["examples/"](./examples) in parallel on 4 MPI
processes.
+ In addition, command `"make ptests"` runs the same tests using 3, 4, and 8
MPI processes.
* Testing -- Command `"make check"` tests all the programs available in folders
["test/"](./test) and ["examples/"](./examples).
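
Before running the install commands, it can help to confirm that `PNETCDF_DIR` points at a usable PnetCDF installation. The following is a minimal sketch, not part of the package: the helper name `find_pnetcdf_config` is hypothetical, and it only checks that an executable `pnetcdf-config` exists under `$PNETCDF_DIR/bin`.

```python
import os


def find_pnetcdf_config(env):
    """Return the path to pnetcdf-config under env['PNETCDF_DIR'], or None.

    Hypothetical helper for illustration; it is not provided by PnetCDF-Python.
    """
    pnetcdf_dir = env.get("PNETCDF_DIR")
    if not pnetcdf_dir:
        return None
    candidate = os.path.join(pnetcdf_dir, "bin", "pnetcdf-config")
    # The utility must exist and be executable by the current user.
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None


path = find_pnetcdf_config(os.environ)
print(path or "pnetcdf-config not found; check PNETCDF_DIR")
```

If the check prints the "not found" message, setting `PNETCDF_DIR` to the PnetCDF installation prefix is likely the first thing to verify.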

### Additional Resources
* [Example python programs](./examples#pnetcdf-python-examples) available in
folder [./examples](./examples).
* PnetCDF-python [User Guide](https://pnetcdf-python.readthedocs.io/en/latest)
* [Data objects](docs/pnetcdf_objects.md) in PnetCDF python programming
* [Comparison](docs/nc4_vs_pnetcdf.md) of NetCDF4-python and PnetCDF-python
* [PnetCDF project home page](https://parallel-netcdf.github.io)
* [PnetCDF repository of C/Fortran library](https://github.com/Parallel-NetCDF/PnetCDF)
* [PnetCDF C/Fortran library repository](https://github.com/Parallel-NetCDF/PnetCDF)

### Developer Team
* Youjia Li <<youjia@northwestern.edu>>
* Wei-keng Liao <<wkliao@northwestern.edu>> (Principle Investigator)
* Wei-keng Liao <<wkliao@northwestern.edu>>

### Acknowledgements
Ongoing development and maintenance of PnetCDF-python is supported by the U.S.
41 changes: 41 additions & 0 deletions docs/copyright.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
===================
Copyright Statement
===================

::

Copyright (c) 2024 Northwestern University and Argonne National
Laboratory All rights reserved.

Portions of this software were developed by the Unidata Program at the
University Corporation for Atmospheric Research.

Access and use of this software shall impose the following obligations and
understandings on the user. The user is granted the right, without any fee
or cost, to use, copy, modify, alter, enhance and distribute this software,
and any derivative works thereof, and its supporting documentation for any
purpose whatsoever, provided that this entire notice appears in all copies
of the software, derivative works and supporting documentation. Further,
Northwestern University and Argonne National Laboratory request that the
user credit Northwestern University and Argonne National Laboratory in any
publications that result from the use of this software or in any product
that includes this software. The names Northwestern University and Argonne
National Laboratory, however, may not be used in any advertising or
publicity to endorse or promote any products or commercial entity unless
specific written permission is obtained from Northwestern University and
Argonne National Laboratory. The user also understands that Northwestern
University and Argonne National Laboratory are not obligated to provide the
user with any support, consulting, training or assistance of any kind with
regard to the use, operation and performance of this software nor to
provide the user with any updates, revisions, new versions or "bug fixes."

THIS SOFTWARE IS PROVIDED BY NORTHWESTERN UNIVERSITY AND ARGONNE NATIONAL
LABORATORY "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NORTHWESTERN
UNIVERSITY AND ARGONNE NATIONAL LABORATORY BE LIABLE FOR ANY SPECIAL,
INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE ACCESS,
USE OR PERFORMANCE OF THIS SOFTWARE.

218 changes: 217 additions & 1 deletion docs/nc4_vs_pnetcdf.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,14 @@
# Difference between NetCDF4-python and PnetCDF-python
# Comparison between PnetCDF-Python and NetCDF4-Python

Programming with [NetCDF4-Python](http://unidata.github.io/netcdf4-python/) and
[PnetCDF-Python](https://pnetcdf-python.readthedocs.io) is very similar.
The sections below list some of the differences, including file format support
and operational modes.

* [Supported File Formats](#supported-file-formats)
* [Differences in Python Programming](#differences-in-python-programming)
* [Define Mode and Data Mode](#define-mode-and-data-mode)
* [Collective and Independent I/O Mode](#collective-and-independent-io-mode)
* [Blocking vs. Nonblocking APIs](#blocking-vs-nonblocking-apis)

---
@@ -59,6 +66,215 @@
| ... ||
| # close file<br>f.close() | ditto NetCDF4 |

---
## Define Mode and Data Mode

In PnetCDF, an opened file is in either define mode or data mode. Switching
between the modes is done by explicitly calling `"pnetcdf.File.enddef()"` and
`"pnetcdf.File.redef()"`. NetCDF4-Python has no such mode switching
requirement. PnetCDF enforces this requirement to ensure metadata consistency
across all the MPI processes and to keep the overhead of metadata
synchronization small.

* Define mode
+ When the constructor of Python class `"pnetcdf.File()"` is called to create
a new file, the file is automatically put in define mode. While in define
mode, the Python program can create new dimensions, i.e. instances of class
`"pnetcdf.Dimension"`, new variables, i.e. instances of class
`"pnetcdf.Variable"`, and netCDF attributes. The metadata of these data
objects can only be modified while the file is in define mode.
+ When an existing file is opened, it is automatically put in data mode. To
add or modify the metadata, a Python program must first call
`"pnetcdf.File.redef()"`.

* Data mode
+ Once the creation or modification of metadata is complete, the python
program must call `"pnetcdf.File.enddef()"` to leave the define mode and
enter the data mode.
+ While an open file is in data mode, the python program can make read and
write requests to the variables that have been created.
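
The mode rules above can be summarized as a small state machine. The sketch below is a toy model in plain Python, not PnetCDF-Python code: the class `ToyFile` and the exception `ModeError` are hypothetical names, and the checks only mirror the rules described in this section.

```python
class ModeError(RuntimeError):
    """Raised when an operation is attempted in the wrong mode."""


class ToyFile:
    """Toy model of the define/data mode rules; not the pnetcdf.File class."""

    def __init__(self, create):
        # A newly created file starts in define mode; an opened one in data mode.
        self.in_define_mode = create
        self.variables = {}

    def def_var(self, name):
        if not self.in_define_mode:
            raise ModeError("def_var() requires define mode; call redef() first")
        self.variables[name] = []

    def enddef(self):
        if not self.in_define_mode:
            raise ModeError("enddef() called while already in data mode")
        self.in_define_mode = False

    def redef(self):
        if self.in_define_mode:
            raise ModeError("redef() called while already in define mode")
        self.in_define_mode = True

    def write(self, name, value):
        if self.in_define_mode:
            raise ModeError("write() requires data mode; call enddef() first")
        self.variables[name].append(value)


f = ToyFile(create=True)     # new file: starts in define mode
f.def_var("grid")
f.enddef()                   # switch to data mode
f.write("grid", 42)
f.redef()                    # back to define mode to add metadata
f.def_var("temperature")
f.enddef()
f.write("temperature", 1.5)
```

In this model, defining a variable on a file opened with `create=False` raises `ModeError` until `redef()` is called, which mirrors the rule for existing files.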

<ul>
<li> A PnetCDF-Python example shows switching between define and data modes
after creating a new file.</li>
<li> <details>
<summary>Example code fragment (click to expand)</summary>

```python
from mpi4py import MPI
import pnetcdf
...
# Create the file
f = pnetcdf.File(filename, 'w', "NC_64BIT_DATA", MPI.COMM_WORLD)
...
# Define dimensions
dim_y = f.def_dim("Y", 16)
dim_x = f.def_dim("X", 32)

# Define a 2D variable of integer type
var = f.def_var("grid", pnetcdf.NC_INT, (dim_y, dim_x))

# Add an attribute of string type to the variable
var.str_att_name = "example attribute"

# Exit the define mode
f.enddef()

# Write to a subarray of the variable, var
var[4:8, 20:24] = buf

# Re-enter the define mode
f.redef()

# Define a new 2D variable of float type
var_flt = f.def_var("temperature", pnetcdf.NC_FLOAT, (dim_y, dim_x))

# Exit the define mode
f.enddef()

# Write to a subarray of the variable, var_flt
var_flt[0:4, 16:20] = buf_flt

# Close the file
f.close()
```
</details></li>

<li> An example shows switching between define and data modes after opening an existing file.
</li>
<li> <details>
<summary>Example code fragment (click to expand)</summary>

```python
from mpi4py import MPI
import pnetcdf
...
# Open an existing file in read-write mode ('r+'), so that metadata can be added
f = pnetcdf.File(filename, 'r+', comm=MPI.COMM_WORLD)
...
# Get the Python handle of the variable named 'grid', a 2D variable of integer type
var = f.variables['grid']

# Read the variable's attribute named "str_att_name"
str_att = var.str_att_name

# Read a subarray of the variable, var
r_buf = var[4:8, 20:24]

# Get the handles of the existing dimensions
dim_y = f.dimensions['Y']
dim_x = f.dimensions['X']

# Re-enter the define mode
f.redef()

# Define a new 2D variable of double type
var_dbl = f.def_var("precipitation", pnetcdf.NC_DOUBLE, (dim_y, dim_x))

# Add an attribute of string type to the variable
var_dbl.unit = "mm/s"

# Exit the define mode
f.enddef()

# Write to a subarray of the variable, var_dbl
var_dbl[0:4, 16:20] = buf_dbl

# Close the file
f.close()
```
</details></li>
</ul>


---
## Collective and Independent I/O Mode

The terminology of collective and independent I/O comes from the MPI standard. A
collective I/O function call requires all the MPI processes opening the same
file to participate. On the other hand, an independent I/O function can be
called by an MPI process independently from others.

For metadata I/O, both PnetCDF and NetCDF4 require the function calls to be
collective.

* Mode Switch Mechanism
+ PnetCDF-Python -- when a file is in the data mode, it can be put into
either collective or independent I/O mode. The default mode is collective
I/O mode. Switching to and exiting from the independent I/O mode is done
by explicitly calling `"pnetcdf.File.begin_indep()"` and
`"pnetcdf.File.end_indep()"`.

+ NetCDF4-Python -- switching between collective and independent modes is
done on a per-variable basis, by explicitly calling
`"Variable.set_collective()"` before accessing the variable.
For more information, see
[NetCDF4-Python User Guide on Parallel I/O](https://unidata.github.io/netcdf4-python/#parallel-io)
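
The difference in granularity between the two mechanisms can be sketched with a toy model. This is plain Python for illustration only and performs no I/O: the classes `FileLevelMode` and `PerVariableMode` are hypothetical, and the per-variable default of independent access is an assumption based on the NetCDF4-Python user guide.

```python
class FileLevelMode:
    """Toy model of the PnetCDF-Python style: one switch for the whole file."""

    def __init__(self, var_names):
        self.var_names = list(var_names)
        self.collective = True  # collective is the default in data mode

    def begin_indep(self):
        self.collective = False

    def end_indep(self):
        self.collective = True

    def modes(self):
        mode = "collective" if self.collective else "independent"
        return {name: mode for name in self.var_names}


class PerVariableMode:
    """Toy model of the NetCDF4-Python style: one switch per variable."""

    def __init__(self, var_names):
        # Assumption: independent access is the default for each variable.
        self.collective = {name: False for name in var_names}

    def set_collective(self, name, flag):
        self.collective[name] = bool(flag)

    def modes(self):
        return {name: "collective" if flag else "independent"
                for name, flag in self.collective.items()}


names = ["var_flt", "var_dbl"]

file_level = FileLevelMode(names)
file_level.begin_indep()                  # every variable is now independent

per_var = PerVariableMode(names)
per_var.set_collective("var_flt", True)   # only var_flt becomes collective
```

With the file-level switch, a single call changes the mode for all variables at once; with the per-variable switch, each variable must be changed individually.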

<ul>
<li> A PnetCDF-Python example shows switching between collective and
independent I/O modes.</li>
<li> <details>
<summary>Example code fragment (click to expand)</summary>

```python
from mpi4py import MPI
import pnetcdf
...
# Create the file
f = pnetcdf.File(filename, 'w', "NC_64BIT_DATA", MPI.COMM_WORLD)
...
# Metadata operations to define dimensions and variables
...
# Exit the define mode (by default, in the collective I/O mode)
f.enddef()

# Write to variables collectively
var_flt[start_y:end_y, start_x:end_x] = buf_flt
var_dbl[start_y:end_y, start_x:end_x] = buf_dbl

# Leaving collective I/O mode and entering independent I/O mode
f.begin_indep()

# Write to variables independently
var_flt[start_y:end_y, start_x:end_x] = buf_flt
var_dbl[start_y:end_y, start_x:end_x] = buf_dbl

# Close the file
f.close()
```
</details></li>
</ul>

<ul>
<li> A NetCDF4-Python example shows switching between collective and
independent I/O modes.</li>
<li> <details>
<summary>Example code fragment (click to expand)</summary>

```python
from mpi4py import MPI
import netCDF4
...
# Create the file
f = netCDF4.Dataset(filename, 'w', format="NETCDF3_64BIT_DATA", parallel=True, comm=MPI.COMM_WORLD)
...
# Metadata operations to define dimensions and variables
...

# Write to variables collectively
var_flt.set_collective(True)
var_flt[start_y:end_y, start_x:end_x] = buf_flt

var_dbl.set_collective(True)
var_dbl[start_y:end_y, start_x:end_x] = buf_dbl

# Write to variables independently
var_flt.set_collective(False)
var_flt[start_y:end_y, start_x:end_x] = buf_flt

var_dbl.set_collective(False)
var_dbl[start_y:end_y, start_x:end_x] = buf_dbl

# Close the file
f.close()
```
</details></li>
</ul>

---

## Blocking vs Nonblocking APIs
30 changes: 15 additions & 15 deletions docs/source/api/dimension_api.rst
Original file line number Diff line number Diff line change
@@ -1,30 +1,30 @@
.. currentmodule:: pnetcdf
==============
Dimension
==============

Dimension defines the shape and structure of variables and stores
coordinate data for multidimensional arrays. The ``Dimension`` object,
which is also a key component of ``File`` class, provides an interface
to access dimensions.

.. note::

``Dimension`` instances should be created using the :meth:`File.def_dim` method of a ``File`` instance,
not using :meth:`Dimension.__init__` directly.
Dimension defines the shape and structure of variables and stores coordinate
data for multidimensional arrays. The ``Dimension`` object, which is also a key
component of ``File`` class, provides an interface to access dimensions.

.. autoclass:: pnetcdf::Dimension
:members: getfile, isunlimited
:exclude-members: name, size

Dimension Attributes
The following class members are read-only and should not be modified by the user.
Read-only Python Attributes of Dimension Class
The following class members are read-only and should not be modified by the
user.

.. attribute:: name

String name of Dimension instance. This class member is read-only and
should not be modified by the user. To rename a dimension, use :meth:`File.rename_dim` method.
should not be modified by the user. To rename a dimension, use
:meth:`File.rename_dim` method.

**Type:** `str`

.. attribute:: size

The current size of Dimension (calls ``len`` on Dimension instance).

The current size of Dimension (calls ``len`` on Dimension instance).

**Type:** `int`
