https://github.com/yitu-opensource/T2T-ViT/blob/main/models/token_performer.py#L18
I have fp16 enabled in my code, and the 1e-8 on this line, which is there to prevent division by zero, is too small to be effective in half precision. As a result, the network's loss becomes NaN because of this code:
https://github.com/yitu-opensource/T2T-ViT/blob/main/models/token_performer.py#L50
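
To illustrate why the epsilon stops helping, here is a minimal sketch that rounds values to IEEE 754 half precision using only the Python standard library. The helper `to_fp16` and the alternative value `1e-4` are my own illustrative assumptions, not anything from the repository, and the sketch assumes the epsilon ends up stored or added in fp16, which depends on how mixed precision is configured.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value.

    Hypothetical helper for illustration; uses the struct module's 'e'
    (binary16) format to do the rounding.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Smallest positive (subnormal) number that fp16 can represent.
FP16_MIN_POSITIVE = 2.0 ** -24

# The epsilon from the linked line, rounded to fp16.
eps_small = to_fp16(1e-8)

# A larger epsilon (assumed value, chosen only for comparison).
eps_large = to_fp16(1e-4)

# 1e-8 lies below half of the smallest fp16 subnormal, so it rounds to zero
# and no longer protects a denominator that is itself zero.
print("1e-8 in fp16:", eps_small)
print("1e-4 in fp16:", eps_large)
print("smallest positive fp16:", FP16_MIN_POSITIVE)
```

If this is indeed the cause, possible workarounds would be a larger epsilon or doing the division in float32 and casting back afterwards, but I have not verified which one the maintainers would prefer.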