Skip to content

Conversation

mjlosch
Copy link

@mjlosch mjlosch commented Sep 30, 2024

It is not clear if tape keys are correct. The gradients have not been tested.
There are no remaining recomputation warnings after adjusting logic in ihop_driver.F

On my MacBook, I can now type

../../../tools/genmake2 -ncad '-mods=../code_ad ../code.ihop' -devel
make depend
make adall

and everything compiles. Only at the link step I get a memory error:

final section layout:
    __TEXT/__text addr=0x1000011D4, size=0x00B25C20, fileOffset=0x000011D4, type=1
    __TEXT/__stubs addr=0x100B26DF4, size=0x00000534, fileOffset=0x00B26DF4, type=29
    __TEXT/__cstring addr=0x100B27328, size=0x0018ABA8, fileOffset=0x00B27328, type=13
    __TEXT/__const addr=0x100CB1ED0, size=0x00036A58, fileOffset=0x00CB1ED0, type=0
    __TEXT/__eh_frame addr=0x100CE8928, size=0x0001F6D0, fileOffset=0x00CE8928, type=19
    __DATA_CONST/__got addr=0x100D08000, size=0x00000378, fileOffset=0x00D08000, type=31
    __DATA_CONST/__const addr=0x100D08378, size=0x000001E0, fileOffset=0x00D08378, type=0
    __DATA/__data addr=0x100D0C000, size=0x02BB0EF0, fileOffset=0x00D0C000, type=0
    __DATA/__common addr=0x1038BCEF0, size=0x063BA450, fileOffset=0x00000000, type=26
    __DATA/__bss addr=0x109C77340, size=0x45DA1E67C0, fileOffset=0x00000000, type=26
[..., repeated many times ...] 
ld: ARM64 ADRP out of range (300164542464 max is +/-4GB): from _csystemtime_ (0x100B26D28) to _invclktck (0x46E3E5DAF8) in '_csystemtime_' from timer_stats.o for architecture arm64
collect2: error: ld returned 1 exit status
make[1]: *** [mitgcmuv_ad] Error 1
make: *** [ad_exe_target] Error 2

but it is not clear if the gradients are correct
for reference, but commented out, as it leads to errors
@mjlosch
Copy link
Author

mjlosch commented Sep 30, 2024

@IvanaEscobar I created the PR just to show my changes in a compact form.

Comment on lines +504 to +511
!$TAF STORE arr, narr, ray2d = tapelev2, key = ilev_2
!$TAF STORE angles%dalpha = tapelev2, key = ilev_2
!$TAF STORE bdry%bot, bdry%top = tapelev2, key = ilev_2
!$TAF STORE beam%nsteps = tapelev2, key = ilev_2
!$TAF STORE pos%rz, pos%sz = tapelev2, key = ilev_2
!$TAF STORE ssp%c, ssp%cmat = tapelev2, key = ilev_2
!$TAF STORE ssp%cz, ssp%czmat = tapelev2, key = ilev_2
!$TAF STORE ssp%rho = tapelev2, key = ilev_2
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be done in a similar way for tapelevel3 and tapelevel4.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use the PASSIVE statement does not work properly

Comment on lines 61 to 69
CML# ifdef ALLOW_AUTODIFF_TAMC
CML tkey = nts*(ikey_dynamics-1) + t
CML!$TAF STORE arr, bdry, ray2d = comlev1_ihop_nts, key = tkey
CML!$TAF STORE angles%dalpha = comlev1_ihop_nts, key = tkey
CML!$TAF STORE bdry%bot, bdry%top = comlev1_ihop_nts, key = tkey
CML!$TAF STORE beam%nsteps = comlev1_ihop_nts, key = tkey
CML!$TAF STORE pos%rz, pos%sz = comlev1_ihop_nts, key = tkey
CML!$TAF STORE ssp%c = comlev1_ihop_nts, key = tkey
CML# endif
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the commented part that leads to the segmentation fault at TAF's end.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may lead to issues due to ray2d and other derived types being extended into arrays. TAF seems to have a hard time with allocatable arrays within an array of derived types.

One way around this is to create a bunch of temporary arrays to store into tapes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I tried different subsets of the full set, but it always gave the segmentation fault. I store ray2d in other places without (obvious) problems.

@@ -1004,6 +1019,9 @@ SUBROUTINE gcmSSP( myThid )
k = k + 1
END DO
END DO
#ifdef ALLOW_AUTODIFF_TAMC
!$TAF STORE ssp%cMat = comlev1_ihop, key = ikey_dynamics
#endif
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These stores avoid recomputing tmpssp and ssp%cMat in the AD-code (something that TAF didn't warn about)

allows to remove an now unused tape
Comment on lines 739 to 743
!$TAF STORE ssp%c, ssp%cmat = comlev1_ihop, key = ikey_dynamics
!$TAF STORE ssp%cz, ssp%czmat = comlev1_ihop, key = ikey_dynamics
C for some reason TAF does not do this right, so we store ssp instead
CML!$TAF STORE ssp%rho = comlev1_ihop, key = ikey_dynamics
!$TAF STORE ssp = comlev1_ihop, key = ikey_dynamics
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as ssp is stored, we don't need the rest

Suggested change
!$TAF STORE ssp%c, ssp%cmat = comlev1_ihop, key = ikey_dynamics
!$TAF STORE ssp%cz, ssp%czmat = comlev1_ihop, key = ikey_dynamics
C for some reason TAF does not do this right, so we store ssp instead
CML!$TAF STORE ssp%rho = comlev1_ihop, key = ikey_dynamics
!$TAF STORE ssp = comlev1_ihop, key = ikey_dynamics
CML!$TAF STORE ssp%c, ssp%cmat = comlev1_ihop, key = ikey_dynamics
CML!$TAF STORE ssp%cz, ssp%czmat = comlev1_ihop, key = ikey_dynamics
C for some reason TAF does not do this right, so we store ssp instead
CML!$TAF STORE ssp%rho = comlev1_ihop, key = ikey_dynamics
!$TAF STORE ssp = comlev1_ihop, key = ikey_dynamics

@IvanaEscobar IvanaEscobar self-assigned this Sep 30, 2024
@mjlosch
Copy link
Author

mjlosch commented Oct 7, 2024

More comments:

  • moving IHOP_COST_INLOOP to s/r forward_step requires fewer store directives
  • CALL COST_IHOP can probably be moved to s/r cost_final (to follow the general logic), routine does not need any store directives

@IvanaEscobar
Copy link
Owner

Adding a note here on TAF inner workings as of 6.5.0:

A derived type should be able to support both passive and active components mixed. To enforce a passive component within a derived type, one may set it as such at the point of declaration.

TYPE myDerivedType
  INTEGER :: myInt
  REAL (KIND=8), ALLOCATABLE :: myAllocArr( : )
  REAL (KIND=8) :: myFixedArr( : )
END TYPE myDerivedType

!$TAF PASSIVE myDerivedType%myFixedArr

TYPE( myDerivedType ) :: foo

This will enforce any call of foo%myFixedArr to be set to passive.

NOTE: TAF should figure this out without human intervention. We have experienced TAF missing some derived type components mostly due to misplaced code flow within ihop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants