sine2pi/Maxfactor

MaxFactor is a custom PyTorch optimizer with adaptive learning rates and specialized handling for matrix parameters. I wrote it for the model in the asr_model repository. I needed something that performs well but has a light memory footprint, since I do everything from my laptop.

Characteristics

  • Adaptive learning rates based on parameter norms and training step
  • Specialized matrix handling:
    • Separate row and column variance estimates for 2D parameters
    • Matrix updates use sign-based scaling with max values
  • Vector handling uses EMA of squared gradients (similar to RMSprop)
  • Update normalization using the infinity norm
  • Dynamic beta2 that changes with the step count according to beta2_decay
  • Automatic learning rate annealing that decays with the inverse square root of the step count
  • Parameter-specific updates scaled by parameter norms
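
The step-dependent items in the list above can be sketched in a few lines of plain Python. The formulas follow the Adafactor-style schedules that the description suggests. They are an assumption, not a copy of the MaxFactor source, and the function names and default values (`beta2_decay=-0.8`, `base_lr=0.01`, `eps2=1e-3`) are hypothetical.

```python
import math


def beta2_at(step, beta2_decay=-0.8):
    """Dynamic beta2: 0 on step 1, approaching 1 as training proceeds.

    Assumed form: 1 - step ** beta2_decay, with a negative beta2_decay.
    """
    return 1.0 - step ** beta2_decay


def annealed_lr(step, base_lr=0.01):
    """Cap the step size at base_lr, then let it decay as 1 / sqrt(step)."""
    return min(base_lr, 1.0 / math.sqrt(step))


def param_scale(param, eps2=1e-3):
    """Root-mean-square of a flat parameter list, floored at eps2.

    The floor keeps freshly initialised (near-zero) parameters moving.
    """
    rms = math.sqrt(sum(x * x for x in param) / len(param))
    return max(eps2, rms)


# Hypothetical two-element parameter, inspected at a few steps (steps start at 1).
weights = [3.0, 4.0]
for step in (1, 100, 10_000, 1_000_000):
    step_size = annealed_lr(step) * param_scale(weights)
    print(step, round(beta2_at(step), 4), step_size)
```

With these assumed defaults the square-root decay only takes over after step 10,000, because 1 / sqrt(10,000) equals the cap of 0.01. Before that point the step size is simply the base learning rate times the parameter scale.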

This optimizer combines elements from several optimization techniques with specialized matrix handling that could be beneficial for ASR and NLP neural network architectures.
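
To make the matrix handling more concrete, here is a minimal sketch of separate row and column variance estimates followed by infinity-norm normalization. It is written in plain Python so that it runs without dependencies, and it uses the Adafactor-style rank-1 reconstruction as an assumption; the actual MaxFactor code may combine the estimates differently, and the sign-based scaling step is not shown. All function names are hypothetical.

```python
import math


def update_factored_moments(grad, row_var, col_var, beta2):
    """Update row and column EMAs of squared gradients in place."""
    n_rows, n_cols = len(grad), len(grad[0])
    for i in range(n_rows):
        row_mean = sum(g * g for g in grad[i]) / n_cols
        row_var[i] = beta2 * row_var[i] + (1.0 - beta2) * row_mean
    for j in range(n_cols):
        col_mean = sum(grad[i][j] ** 2 for i in range(n_rows)) / n_rows
        col_var[j] = beta2 * col_var[j] + (1.0 - beta2) * col_mean


def estimate_variance(row_var, col_var, eps=1e-30):
    """Rank-1 reconstruction: V[i][j] = row_var[i] * col_var[j] / mean(row_var)."""
    mean_row = max(sum(row_var) / len(row_var), eps)  # eps guards all-zero gradients
    return [[r * c / mean_row for c in col_var] for r in row_var]


def normalized_update(grad, variance, eps=1e-30):
    """Scale by 1 / sqrt(V), then divide by the infinity norm if it exceeds 1."""
    update = [
        [g / (math.sqrt(v) + eps) for g, v in zip(g_row, v_row)]
        for g_row, v_row in zip(grad, variance)
    ]
    inf_norm = max(abs(u) for row in update for u in row)
    scale = max(1.0, inf_norm)
    return [[u / scale for u in row] for row in update]


# Hypothetical 2 x 2 gradient; beta2 = 0.0 mimics the very first step.
grad = [[1.0, 2.0], [3.0, 4.0]]
row_var, col_var = [0.0, 0.0], [0.0, 0.0]
update_factored_moments(grad, row_var, col_var, beta2=0.0)
variance = estimate_variance(row_var, col_var)
update = normalized_update(grad, variance)
print(row_var, col_var)
print(update)
```

In a factored scheme like this one, the state for an m x n weight matrix holds m + n numbers instead of m x n, which is where the memory saving comes from. The infinity-norm step keeps every entry of the update within [-1, 1] before the learning rate is applied.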

About

An optimizer that is memory efficient
