Commit fad4e0e

author
Gen Tolhurst
committed
Merge branch 'main' of github.com:AusClimateService/plotting_maps
2 parents 67ffe57 + 3776549 commit fad4e0e

File tree

1 file changed

+122
-23
lines changed

README.md

Lines changed: 122 additions & 23 deletions
@@ -347,7 +347,8 @@ For example only, this would make a dataframe in this format:
347347
### Where can I find some worked examples to get started?
348348
<details>
349349
<summary> Expand </summary>
350-
I have collected [example_notebooks](https://github.com/AusClimateService/plotting_maps/tree/main/example_notebooks) which contain examples of creating plots with a variety of hazards and using a range of functionalities available.
350+
351+
I have collected [example notebooks](https://github.com/AusClimateService/plotting_maps/tree/main/example_notebooks) which contain examples of creating plots with a variety of hazards and using a range of functionalities available.
351352

352353
Notebooks used to make plots for specific requests and reports can be found under [reports](https://github.com/AusClimateService/plotting_maps/tree/main/reports). These are good references for the range of plots we can create using these functions and you are welcome to look through them and copy code you like.
353354

@@ -365,37 +366,46 @@ Statistic examples:
365366
* Basic example of acs_regional_stats [area_statistics_example.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/area_statistics_example.ipynb)
366367
* Region and ensemble member mean table for NCRA regions [ensemble-table.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/ensemble-table.ipynb)
367368
* Using acs_regional_stats to calculate area averages with custom regions [area_statistics_example_basin_gpkg.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/area_statistics_example_basin_gpkg.ipynb)
369+
368370
</details>
369371

370372
### Something is not working and I don't know why!
371373
<details>
372374
<summary> Expand </summary>
375+
373376
Here are some common suggestions for troubleshooting:
374-
- see “getting started” above and make sure you have followed all the instructions
375-
- Check you are using the right venv. This code is designed to work with hh5 analysis3-24.04 virtual environment.
376-
- Restart the kernel and rerun all cells from start. Especially if you have made a variety of modifications, you may have renamed a function/variable.
377-
- If python can't find the module, check you have the .py module in your working directory. If not cd to the directory with the module.
378-
- Make sure you have requested access to all the right gdata projects (eg gdata/ia39)
377+
378+
* See “getting started” above and make sure you have followed all the instructions.
379+
* Check you are using the right venv. This code is designed to work with the hh5 analysis3-24.04 virtual environment.
380+
* Restart the kernel and rerun all cells from the start, especially if you have made many modifications; you may have renamed a function or variable.
381+
* If Python can't find the module, check that the .py module is in your working directory. If not, cd to the directory containing the module.
382+
* Make sure you have requested access to all the right gdata projects (eg gdata/ia39).
383+
379384
</details>
380385

381386
### An argument I have used before using this code no longer works. What's happening?
382387
<details>
383388
<summary> Expand </summary>
389+
384390
During development, priorities and requests have changed what the functions needed to do. As a result, there are a few deprecated features and functionalities. Some things that were once needed are no longer required:
385-
- “show_logo”, it was initially requested to have an ACS logo in the figures. The comms team now prefers only the copywrite in the bottom
386-
- Contour and contourf are generally not recommended now due to errors in plotting and long computational time. They are left in the function because they can be useful for lower resolution data, eg ocean data.
387-
- “infile” is not used. The idea was to use this for well-organised data with a consistent DRS to enable a good plot to be made without lots of keyword inputs. The data we have is not organised consistently enough for this.
388-
- “regions_dict” in acs_plotting_maps.py made to module very slow to load Shapefiles can take many seconds to load. It is inefficient to load all these regions even when you don’t use them all. This was replaced with a class
389-
- “regions” in acs_area_stats had preloaded shapefiles. Shapefiles can take many seconds to load. It is inefficient to load all these regions even when you don’t use them all. This was replaced with “get_regions”
391+
392+
* “show_logo”: it was initially requested to have an ACS logo in the figures. The comms team now prefers only the copyright notice at the bottom.
393+
* Contour and contourf are generally not recommended now due to errors in plotting and long computational time. They are left in the function because they can be useful for lower resolution data, eg ocean data.
394+
* “infile” is not used. The idea was to use this for well-organised data with a consistent DRS to enable a good plot to be made without lots of keyword inputs. The data we have is not organised consistently enough for this.
395+
* “regions_dict” in acs_plotting_maps.py made the module very slow to load. Shapefiles can take many seconds to load, and it is inefficient to load all these regions even when you don’t use them all. This was replaced with a class.
396+
* “regions” in acs_area_stats had preloaded shapefiles. Shapefiles can take many seconds to load. It is inefficient to load all these regions even when you don’t use them all. This was replaced with “get_regions”.
397+
390398
```python
391399
from acs_area_statistics import acs_regional_stats, get_regions
392400
regions = get_regions(["ncra_regions", "australia"])
393401
```
402+
394403
</details>
395404

396405
### How can I add stippling (hatching) to plots to indicate model agreement?
397406
<details>
398407
<summary> Expand </summary>
408+
399409
The plotting scripts can add stippling to the plots using the stippling keyword(s). [Here is a notebook showing examples of using stippling](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/FAQ_example_stippling.ipynb).
400410

401411
You will need to calculate the mask and provide this as a dataarray with "lat" and "lon". The mask must be a True/False boolean mask. It does not have to be the same resolution as the underlying data (you may wish to coarsen the mask if the underlying data is high-resolution and noisy).
@@ -428,21 +438,25 @@ plot_acs_hazard_4pp(ds_gwl12=ds_gwl12[var],
428438
issued_date="",
429439
);
430440
```
441+
431442
</details>
432443

433444
### Why is the stippling weird?
434445
<details>
435446
<summary> Expand </summary>
447+
436448
You may need to check that the stippling is in the areas you expect it to be. There is a bug in contourf that causes the stippling to get confused when plotting a noisy high-resolution mask. If that is the case, I recommend coarsening the stippling mask.
437449
E.g.
438450
`new_stippling_mask = stippling_mask.coarsen(lat=2, boundary="pad").mean().coarsen(lon=2, boundary="pad").mean()>0.4`
439451

440452
(full example here https://github.com/AusClimateService/plotting_maps/blob/main/reports/fire_climate_classes_projections.ipynb)
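
As a minimal sketch of the coarsening step above, assuming a small synthetic mask (the array values and coordinates are made up for illustration), the following averages a boolean mask over 2-by-2 blocks and keeps the blocks where more than 40% of the original points were True. The mask is cast to float first so that the mean is well defined.

```python
import numpy as np
import xarray as xr

# Hypothetical 4x4 True/False stippling mask with "lat" and "lon" coordinates
stippling_mask = xr.DataArray(
    np.array([[1, 0, 0, 0],
              [0, 0, 0, 0],
              [1, 1, 0, 1],
              [1, 0, 1, 1]], dtype=bool),
    dims=("lat", "lon"),
    coords={"lat": [-10.0, -11.0, -12.0, -13.0],
            "lon": [130.0, 131.0, 132.0, 133.0]},
)

# Average over 2x2 blocks to get the fraction of True points in each block
fraction_true = (
    stippling_mask.astype(float)
    .coarsen(lat=2, boundary="pad").mean()
    .coarsen(lon=2, boundary="pad").mean()
)

# Keep stippling only where more than 40% of the block was True
new_stippling_mask = fraction_true > 0.4
```

A higher threshold keeps less stippling and a lower one keeps more, so it is worth plotting the coarsened mask to check it still matches the areas you expect.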
453+
441454
</details>
442455

443456
### Is there a way to use the 4pp plot with the average conditions for GWL1.2 and the change % for GWL1.5 to GWL3? Or does it only work for plots that use a consistent colourbar?
444457
<details>
445458
<summary> Expand </summary>
459+
446460
`plot_acs_hazard_1plus3` is a specific version of the plotting function to address this situation. While `plot_acs_hazard_4pp` assumes a shared colorbar and scale for all four maps, `plot_acs_hazard_1plus3` provides additional key word arguments to define a separate colorbar and scale for the first plot (as a baseline), while the last three figures share a different colorbar and scale.
447461

448462
See example here: [FAQ_example_4pp_1plus3.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/FAQ_example_4pp_1plus3.ipynb)
@@ -487,38 +501,35 @@ plot_acs_hazard_1plus3(ds_gwl12=ds_gwl12[var],
487501
outfile = "figures/FAQ_example_1plus3.png",
488502
)
489503
```
504+
490505
</details>
491506

492507
### How can I change the orientation (eg from vertical to horizontal) of the figures in a multipaneled plot?
493508
<details>
494509
<summary> Expand </summary>
510+
495511
For multi-panelled plots, we have provided a keyword `orientation` to easily change `"vertical"` stacked plots to `"horizontal"` aligned subplots. For four panelled plots there is also a `"square"` option for a 2-by-2 arrangement.
496512

497513
These options specify the axes grid, figsize, and location of titles etc.
498514

499515
See [FAQ_example_orientation.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/FAQ_example_orientation.ipynb) for an example.
500516
</details>
501517

502-
### Can I use my own shapefiles to define regions?
503-
<details>
504-
<summary> Expand </summary>
505-
Yes, you can provide any shapefiles you like. Here is an example: [FAQ_example_custom_mask.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/FAQ_example_custom_mask.ipynb).
506-
507-
We have provided some helpful Australian regions from /g/data/ia39, but the functions are flexible to take custom regions. [See more about the provided shapefiles here](https://github.com/aus-ref-clim-data-nci/shapefiles/).
508-
You will need to define [regionmask regions](https://regionmask.readthedocs.io/en/stable/notebooks/mask_3D.html) with unique abbreviations and names
509-
</details>
510518

511519
### I want to use a divergent colormap, but the neutral color isn't in the middle of my ticks. What can I do to align the centre of the colormap to zero?
512520
<details>
513521
<summary> Expand </summary>
522+
514523
When we plot anomalies, it is best to use divergent colormaps. However, some climate change signals are highly skewed or only in one direction. For example, heat hazards are nearly always increasing. To use divergent colormaps, but not waste space in the color scale on large cool anomalies, we can use the "vcentre" key word to centre the neutral centre of the colormap at zero, but only show relevant ticks on the scale.
515524

516525
See this notebook for an example: [FAQ_example_vcentre.ipynb](example_notebooks/FAQ_example_vcentre.ipynb)
526+
517527
</details>
518528

519529
### What does gwl mean?
520530
<details>
521531
<summary> Expand </summary>
532+
522533
GWLs are global warming levels. These are 20 year periods centred on the year when a climate model is projected to reach a specified global surface temperature above the pre-industrial era. Global climate models reach these temperature thresholds at different years.
523534

524535
For example, the Paris Agreement (2015) refers to global warming levels in its aims:
@@ -528,21 +539,25 @@ For example, the Paris Agreement (2012) refers to global warming levels in its a
528539
Find more information here https://github.com/AusClimateService/gwls
529540

530541
The plotting functions have been designed to accommodate present and future global warming levels. This is indicated by argument names containing "gwl12", "gwl15", "gwl20", "gwl30". If you want to use the function for other time periods or scenarios, you can still use these functions. The functions will work for any data in the right format (eg 2D xarray data array with lat and lon).
542+
531543
</details>
532544

533545
### I am not using GWLs but I want to use these functions. How can I change the subtitles?
534546
<details>
535547
<summary> Expand </summary>
548+
536549
The plotting functions have been designed to accommodate present and future global warming levels. This is indicated by argument names containing "gwl12", "gwl15", "gwl20", "gwl30". If you want to use the function for other time periods or scenarios, you can still use these functions. The functions will work for any data in the right format (eg 2D xarray data array with lat and lon).
537550

538551
You can use `subplot_titles` to provide a list of titles for each subplot in your figure. You may also use this to suppress the default subplot titles, or label the plots differently.
539552

540553
This example shows the subplot_title being renamed for sea level rise increments instead of GWLs: [FAQ_example_subplot_titles.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/FAQ_example_subplot_titles.ipynb)
554+
541555
</details>
542556

543557
### I only want to plot data below 30S latitude, is there a mask for this?
544558
<details>
545559
<summary> Expand </summary>
560+
546561
There is no specific mask for this, but it is easy to adjust your input to achieve this. Here is a notebook to demonstrate: [FAQ_example_crop_mask.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/FAQ_example_crop_mask.ipynb).
547562

548563
If you just want to plot the data below 30S, you can use ```plot_acs_hazard(data= ds.where(ds["lat"]<-30)[var] , ...)```
@@ -581,11 +596,13 @@ var="low_freq"
581596
df_summary = acs_regional_stats(ds=ds,var=var, mask=mask, dims = dims, how = ["min", "median", "max"])
582597
df_summary
583598
```
599+
584600
</details>
585601

586602
### How may I plot gridded data and station data on the same figure?
587603
<details>
588604
<summary> Expand </summary>
605+
589606
You can plot gridded data and station data on the same plot if they share the same colorscale and ticks. All you need to do is provide valid `data` and `station_df`. Similarly, this is possible for multipanelled plots.
590607

591608
```python
@@ -616,32 +633,114 @@ plot_acs_hazard(data=data[var],
616633
vcentre=0)
617634
```
618635
<img src="https://github.com/user-attachments/assets/0b641920-cc7d-46f1-b928-180a8212770b" width="500">
636+
637+
</details>
638+
639+
### Can I use my own shapefiles to define regions?
640+
<details>
641+
<summary> Expand </summary>
642+
643+
Yes, you can provide any shapefiles you like. Here is an example: [FAQ_example_custom_mask.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/FAQ_example_custom_mask.ipynb).
644+
645+
We have provided some helpful Australian regions from /g/data/ia39, but the functions are flexible to take custom regions. [See more about the provided shapefiles here](https://github.com/aus-ref-clim-data-nci/shapefiles/).
646+
You will need to define [regionmask regions](https://regionmask.readthedocs.io/en/stable/notebooks/mask_3D.html) with unique abbreviations and names
647+
619648
</details>
620649

621650
### Can I use any regions for the acs_regional_stats statistics function?
622651
<details>
623652
<summary> Expand </summary>
624-
Yes, provide any mask for your data. The more regions, the slower and
653+
654+
Yes, provide any mask for your data. Calculations take more memory and time when more regions are provided. For example, calculating statistics for 500 local government areas requires much more memory than for 10 State areas.
655+
656+
[FAQ_example_custom_mask.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/FAQ_example_custom_mask.ipynb) describes defining a mask from a shapefile then applying the acs_regional_stats function.
657+
658+
Depending on the format of the original shapefile, you may need to preprocess the regions to be in the correct format, for example, specifying which columns hold the names and abbrevs, and ensuring the index is unique.
659+
660+
```python
661+
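# assumes `import numpy as np`, `import regionmask`, and a geopandas dataframe `gdf` read from your shapefile
# you need to specify which columns hold the "name" and "abbrevs" values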
662+
# have a look at the table and see what makes sense, for example:
663+
name_column = "regionname"
664+
abbr_column = "short_name"
665+
666+
# specify the name of the geopandas dataframe. any str
667+
shapefile_name = "custom_regions"
668+
669+
# update the crs to lats and lons. Some original shapefiles will use northings etc
670+
gdf = gdf.to_crs(crs="GDA2020")
671+
672+
# ensure the index has unique values from zero
673+
gdf.index = np.arange(0, len(gdf))
674+
675+
regions= regionmask.from_geopandas(gdf,
676+
names=name_column,
677+
abbrevs=abbr_column,
678+
name=shapefile_name,
679+
overlap=True)
680+
```
681+
682+
You may also need to convert the CRS to latitude and longitude, and to ensure uniqueness by "dissolving" areas that share a name. In [area_statistics_example_basin_gpkg.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/area_statistics_example_basin_gpkg.ipynb), the geometries are read from a *.gpkg file, the northings/eastings are converted to latitudes and longitudes, and dissolve is used to create uniquely named regions.
683+
684+
```python
685+
# read in the data for the areas to average across
686+
gdf = gpd.read_file("/g/data/mn51/users/ah7841/NCBLevel2DrainageBasinGroup_gda2020_v01.gpkg")
687+
688+
#convert geometry to lat lon (from northings)
689+
gdf.geometry = gdf.geometry.to_crs("EPSG:4326")
690+
691+
# There are duplicated IDs. Merge geometries with the same ID
692+
gdf = gdf.dissolve(by="HydroID").reset_index()
693+
694+
# use the geopandas dataframe to make a regionmask object
695+
# you will need to change the names, abbrevs and, name for your custom file.
696+
regions = regionmask.from_geopandas(gdf,
697+
names= "Level2Name",
698+
abbrevs= "HydroID",
699+
name="NCBLevel2DrainageBasinGroup_gda2020_v01",
700+
overlap=True)
701+
```
702+
625703
</details>
626704

627-
### Can I use acs_regional_stats for NaNs?
705+
### Can I use acs_regional_stats for NaNs and infinite values?
628706
<details>
629707
<summary> Expand </summary>
630-
Some of the statistics will not work if you have NaNs. eg mean, std, var
708+
709+
Be careful when calculating statistics over areas with a lot of missing data. Investigate your own data and make sure that the statistics are still meaningful when the non-finite values are ignored. Depending on your data, consider filling missing data with a value (eg 0) if that results in more representative statistics.
710+
711+
New update (19 Nov 2020) allows for statistics on NaNs and infinite values by applying the following.
712+
713+
```python
714+
ds[var].values = np.ma.masked_invalid(ds[var].values)
715+
```
716+
717+
Previously, some of the statistics (eg mean, std, var) would not work if you had NaNs.
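
To see what masking non-finite values does to a statistic, here is a minimal sketch using a made-up one-dimensional array. It only illustrates the behaviour of `np.ma.masked_invalid` itself; how `acs_regional_stats` uses the masked values internally is not shown here.

```python
import numpy as np

# Hypothetical values containing a NaN and an infinite value
values = np.array([1.0, np.nan, 3.0, np.inf, 5.0])

# A plain mean is contaminated by the non-finite values
plain_mean = values.mean()

# masked_invalid hides NaN and +/-inf so they are ignored in statistics
masked = np.ma.masked_invalid(values)
masked_mean = masked.mean()   # mean of the three finite values: 1, 3, 5
n_valid = masked.count()      # number of values that were not masked
```

Comparing `n_valid` with the total number of points is a quick way to check how much of a region the statistic actually represents.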
718+
631719
</details>
632720

633721
### How do I calculate statistics for categorical data?
634722
<details>
635723
<summary> Expand </summary>
724+
636725
Different types of data need different tools to summarise the data. For example, some data is not numerical but is defined as a class or category eg ["forest", "grassland", "arid"]. We cannot calculate a `sum` or `mean` of different classes.
637726
Categorical statistics include the `mode` (most common category) and `proportion` (proportion of each category relative to the whole).
638727
If there is an order to the classes eg ["low", "moderate", "high", "extreme"], we can also calculate `min`, `median`, and `max` values.
728+
729+
[plotting_and_stats_examples.ipynb](https://github.com/AusClimateService/plotting_maps/blob/main/example_notebooks/plotting_and_stats_examples.ipynb) shows examples of plotting and calculating statistics of categorical data.
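
The categorical statistics described above can be sketched with the Python standard library. This is an illustration of the concepts only, using made-up category lists; it is not how `acs_regional_stats` computes them.

```python
from collections import Counter

# Hypothetical unordered categories for the grid cells in one region
classes = ["forest", "grassland", "forest", "arid", "forest", "grassland"]

counts = Counter(classes)
# mode: the most common category
mode = counts.most_common(1)[0][0]
# proportion: share of each category relative to the whole
proportion = {name: n / len(classes) for name, n in counts.items()}

# Ordered categories: rank each value by its position in the ordering
order = ["low", "moderate", "high", "extreme"]
ratings = ["moderate", "low", "high", "moderate", "extreme"]
ranks = sorted(order.index(r) for r in ratings)

minimum = order[ranks[0]]
maximum = order[ranks[-1]]
# middle rank; for an even number of values this takes the upper middle one
median = order[ranks[len(ranks) // 2]]
```

A `sum` or `mean` of the ranks is deliberately left out, since the spacing between classes such as "low" and "moderate" has no numerical meaning.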
730+
639731
</details>
640732

641733
### Calculating time series using acs_regional_stats
642734
<details>
643735
<summary> Expand </summary>
644-
use the dims keyword and don't include "time". This may be very memory intensive depending on your data size.
736+
737+
Although many examples for applying acs_regional_stats use dims=("lat", "lon") to reduce 2D data to regional averages, the function is very flexible. For example, if you have a time dimension, you can calculate regionally averaged (or min/median/max/any stat) time series by excluding the "time" dimension from the dims tuple. This may be very memory intensive depending on your data size, so request lots of memory if you need to.
738+
739+
740+
Future development will look to manage memory more effectively.
741+
742+
An example of extracting time series from point locations can be found here: https://github.com/AusClimateService/TimeSeriesExtraction
743+
645744
</details>
646745

647746
