This repository contains code and links to data and ML models used in the paper Learning Euler Factors of Elliptic Curves by Angelica Babei, François Charton, Edgar Costa, Xiaoyu Huang, Kyu-Hwan Lee, David Lowry-Duda, Ashvni Narayanan, and Alexey Pozdnyakov. We apply transformer models and feedforward neural networks to predict Frobenius traces of elliptic curves.
The preprint is available at arXiv:2502.10357.
Contents
This code was run using SageMath version 10.2 with a small number of additional packages:
- pandas
- pytorch
- requests (for convenience)
- scikit-learn
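Assuming a working SageMath installation, these packages can be installed into Sage's own Python environment with `sage -pip` (a sketch; adjust to your setup):

```shell
# install the additional Python packages into SageMath's environment
sage -pip install pandas torch requests scikit-learn
```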
In addition, some of the code uses a particular version of Int2Int, which is pinned in the source tree.
It would be straightforward to remove SageMath as a dependency: one would need to install the packages above directly and write a few small utilities for prime generation and related tasks. In other words, the standard scientific Python stack, PyTorch, and a handful of utility functions suffice.
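For instance, Sage's prime iterator could be replaced by a few lines of plain Python (a sketch, not code from this repository):

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes <= n."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # mark all proper multiples of p as composite
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [p for p, is_prime in enumerate(sieve) if is_prime]

primes_up_to(30)  # -> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```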
Note that a small number of GPUs was used to generate the data for the paper, but the machine learning tasks ultimately don't require extensive computation. It would be possible to run this on CPUs with some patience.
The code is separated into three parts, corresponding to the relevant sections of the paper. The code for Sections 4 and 5 generates datafiles that are then given to Int2Int.
In Code/Section 4/generate_ap_data.ipynb, there is code that creates datafiles for predicting a_p.
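For context, the quantity being predicted is the Frobenius trace a_p = p + 1 - #E(F_p). For a curve in short Weierstrass form y^2 = x^3 + ax + b it can be computed naively for small p as below; this is purely illustrative and is independent of the notebook's actual data pipeline, which works with precomputed traces.

```python
def ap_naive(a, b, p):
    """Naively compute a_p = p + 1 - #E(F_p) for E: y^2 = x^3 + a*x + b
    over F_p, for an odd prime p not dividing the discriminant."""
    # number of solutions y to y^2 = v, for each v in F_p
    sqrt_count = {}
    for y in range(p):
        v = y * y % p
        sqrt_count[v] = sqrt_count.get(v, 0) + 1
    # affine points on the curve, plus the point at infinity
    n_points = 1 + sum(sqrt_count.get((x * x * x + a * x + b) % p, 0)
                       for x in range(p))
    return p + 1 - n_points

ap_naive(1, 1, 5)  # a_5 for y^2 = x^3 + x + 1 -> -3
```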
In Code/Section 4/train_and_load_predicting_ap_models.ipynb there are terminal commands for Int2Int to train and load the models described in Section 4, and the encoder-decoder models described in Section 7. The trained models are available at Trained transformer models for predicting traces of Frobenius of elliptic curves with conductor up to 10^6.
In Code/Section 5/5.1/generate_data_sec_5_1.ipynb, there is sample code that creates datafiles used in Section 5.1 for Int2Int.
In Code/Section 5/5.1/train_and_load_5_1_models.ipynb there are terminal commands for Int2Int to train and load the models described in Section 5.1. The trained models are available at Checkpoints And Train Logs for Section 5.1 in Learning Euler Factors of Elliptic Curves.
In Code/Section 5/5.2/generate_mod2data_no_duplicates.ipynb, there is simple code that creates datafiles for predicting a_p mod 2.
In Code/Section 5/5.2/train_and_load_mod2_no_duplicates_models.ipynb there are terminal commands for Int2Int to train and load the models described in Section 5.2. The trained models are available at Trained transformer models for predicting traces of Frobenius mod 2 of elliptic curves with conductor up to 10^7.
The code in Code/Section 6 is more involved, as it contains two different neural network implementations and experiments. This code is self-contained. The jupyter notebook for Section 6.1 is a complete record of an interactive session generating the data for section 6.1 of the paper.
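The actual architectures are defined in the Section 6 code itself; purely as an illustration of the kind of small feedforward classifier involved (layer sizes, input dimension, and class count here are hypothetical, not the paper's configuration):

```python
import torch
import torch.nn as nn

class TraceClassifier(nn.Module):
    """Toy feedforward network mapping a vector of input features
    (e.g. normalized Frobenius traces) to class logits. Sizes are
    illustrative only."""
    def __init__(self, n_inputs=100, hidden=256, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_inputs, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.net(x)

model = TraceClassifier()
logits = model(torch.randn(8, 100))  # a batch of 8 feature vectors
```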
The underlying data for this comes from ECQ8, A set of isogeny classes of elliptic curves of conductor up to 10^8 by Drew Sutherland. The set ECQ6 is at Frobenius traces for a set of isogeny classes of elliptic curves of conductor up to 10^6 by Edgar Costa; this is a subset of the isogeny classes of curves in ECQ8, but with Frobenius trace data.
[ECQ8] Sutherland, A. V. (2024). A set of isogeny classes of elliptic curves of conductor up to 10^8. Zenodo. https://doi.org/10.5281/zenodo.14847809

[ECQ6] Costa, E. (2025). Frobenius traces for a set of isogeny classes of elliptic curves of conductor up to 10^6. Zenodo. https://doi.org/10.5281/zenodo.15777475

[ECQ6small] Babei, A. (2025). Frobenius traces of small primes for a subset of isogeny classes of elliptic curves of conductor up to 10^6. Zenodo. https://doi.org/10.5281/zenodo.15832317
The paper also uses the set ECQ7, the subset of isogeny classes of curves in ECQ8 with conductor up to 10^7, along with their Frobenius traces.
All datasets ECQ6, ECQ6small, and ECQ8 are available under CC-BY-4.0. See their DOI pages for complete licensing information.
The transformer experiments use a pinned version of Int2Int. The Int2Int API has changed since these experiments were carried out; to replicate them, make sure to use the pinned version indicated in this repository!
This repository is not accepting contributions, but the authors plan to pursue applications of ML to math (and vice versa) in future work. Feel free to contact us with ideas, suggestions, or other proposals for projects and collaboration.
The code here is made available under the MIT License. See the LICENSE file for more.