Skip to content

Am/chore/param dedup #2306

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 8, 2025
Merged

Am/chore/param dedup #2306

merged 3 commits into from
May 8, 2025

Conversation

IceTDrinker
Copy link
Member

@IceTDrinker IceTDrinker commented May 2, 2025

C API and JS API parameters have been reduced with macro magic

to create the parameter dir I did :
copy v1_1 to v1_2, replace V1_1 prefix by V1_2

then run:

cargo run --release -p param_dedup -- --tfhe-path tfhe/ --to-deduplicate v1_2


This change is Reviewable

@cla-bot cla-bot bot added the cla-signed label May 2, 2025
@IceTDrinker IceTDrinker force-pushed the am/chore/param-dedup branch 2 times, most recently from e6adee8 to 6e2b899 Compare May 6, 2025 10:18
@IceTDrinker IceTDrinker force-pushed the am/chore/param-dedup branch from 6e2b899 to bafe248 Compare May 6, 2025 10:20
@zama-bot zama-bot removed the approved label May 6, 2025
@IceTDrinker IceTDrinker force-pushed the am/chore/param-dedup branch from bafe248 to 9812439 Compare May 6, 2025 11:37
@zama-bot zama-bot removed the approved label May 6, 2025
@IceTDrinker IceTDrinker force-pushed the am/chore/param-dedup branch from 9812439 to e52f3e5 Compare May 6, 2025 14:16
@IceTDrinker IceTDrinker marked this pull request as ready for review May 6, 2025 14:27
@IceTDrinker IceTDrinker requested a review from mayeul-zama as a code owner May 6, 2025 14:27
@IceTDrinker
Copy link
Member Author

small mistake in shortint/parameters/v1_2/mod.rs left 1.1 in the doc comment

@IceTDrinker IceTDrinker requested review from soonum and nsarlin-zama May 6, 2025 15:55
@IceTDrinker IceTDrinker force-pushed the am/chore/param-dedup branch from e52f3e5 to d8be6e4 Compare May 7, 2025 08:14
@zama-bot zama-bot removed the approved label May 7, 2025
Copy link
Contributor

@nsarlin-zama nsarlin-zama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 132 files reviewed, 10 unresolved discussions (waiting on @mayeul-zama and @soonum)


a discussion (no related file):
Only looked at param_dedup for the moment, nice tool ! Some small remarks to simplify a bit the code.


utils/param_dedup/src/main.rs line 144 at r1 (raw file):

        let module_name = dir_entry_name
            .to_str()
            .ok_or("Could not convert DirEntry name to rust str.")

Another option is to use to_string_lossy, it replaces invalid UTF-8 by a placeholder, I think this is enough in our case


utils/param_dedup/src/main.rs line 178 at r1 (raw file):

    shortint_parameters_per_version
        .sort_by(|(version_a, _dir_a), (version_b, _dir_b)| version_a.cmp(version_b));

Maybe at this point you can also filter to keep only versions that are < args.to_deduplicate on one side and get the to_deplicate on the other side.
That way you wont have to parse files that are not needed and you can remove the check below that the version is smaller than the one to deduplicate


utils/param_dedup/src/main.rs line 199 at r1 (raw file):

            ))
        })
        .unwrap();

Nitpick, but I think this can be simplified a bit since you don't really use the result (neither Ok nor Err):

    shortint_parameters_per_version
        .iter()
        .find(|(version, _dir)| {
            let version_as_str = format!("v{}_{}", version.major, version.minor);
            version_as_str == args.to_deduplicate
        })
        .unwrap_or_else(|| {
            panic!(
                "Could not find version to deduplicate: {}",
                args.to_deduplicate
            )
        });

utils/param_dedup/src/main.rs line 205 at r1 (raw file):

    let mut param_version_and_associated_file_parameters: HashMap<
        _,
        HashMap<PathBuf, HashSet<syn::Item>>,

Instead of HashMap<PathBuf, HashSet<syn::Item>>, you could have HashMap<syn::Ident, syn::ItemConst>. You index by Ident so it makes it resistant to the case where definitions are moved between versions, and it simplifies it a bit since you remove the HashSet.

Later you simply check with something like:

                        if let Some(existing_normalized_parameters) =
                            old_file_and_params.get(&current_normalized_param.ident)
                        {
                            if existing_normalized_parameters == &current_normalized_param {
                               // ...
                            }
                       }

utils/param_dedup/src/main.rs line 208 at r1 (raw file):

    > = shortint_parameters_per_version
        .iter()
        .map(|(version, dir)| ((version, dir), HashMap::new()))

I think here you can only use the version as key to simplify this a bit, especially when you have to use it later


utils/param_dedup/src/main.rs line 214 at r1 (raw file):

        let param_ident_prefix = shortint_param_dir
            .file_name()
            .ok_or("Could not convert get file name")

"Could not get file name"


utils/param_dedup/src/main.rs line 216 at r1 (raw file):

            .ok_or("Could not convert get file name")
            .unwrap()
            .to_str()

Similarly, could be made a bit less verbose with to_string_lossy


utils/param_dedup/src/main.rs line 282 at r1 (raw file):

                                    }
                                };
                                if !hash_set.insert(syn::Item::Const(normalized_param)) {

Since you only need the Const case, you could directly store the ItemConst in the hashset (ie: hash_set.insert(normalized_param))


utils/param_dedup/src/main.rs line 426 at r1 (raw file):

        // All const items have been mapped to old parameters, so we can remove all imports except
        // for the parameter types used in the file

Maybe it would be easier to do that with a cargo clippy --fix ? It would handle some corner cases, for example if you do not remove all the items but somehow an import ends up being not used anymore ?

Copy link
Member Author

@IceTDrinker IceTDrinker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 132 files reviewed, 10 unresolved discussions (waiting on @mayeul-zama, @nsarlin-zama, and @soonum)


utils/param_dedup/src/main.rs line 144 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

Another option is to use to_string_lossy, it replaces invalid UTF-8 by a placeholder, I think this is enough in our case

I guess


utils/param_dedup/src/main.rs line 178 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

Maybe at this point you can also filter to keep only versions that are < args.to_deduplicate on one side and get the to_deplicate on the other side.
That way you wont have to parse files that are not needed and you can remove the check below that the version is smaller than the one to deduplicate

you are right, I think I started by parsing everything hence this code


utils/param_dedup/src/main.rs line 199 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

Nitpick, but I think this can be simplified a bit since you don't really use the result (neither Ok nor Err):

    shortint_parameters_per_version
        .iter()
        .find(|(version, _dir)| {
            let version_as_str = format!("v{}_{}", version.major, version.minor);
            version_as_str == args.to_deduplicate
        })
        .unwrap_or_else(|| {
            panic!(
                "Could not find version to deduplicate: {}",
                args.to_deduplicate
            )
        });

used to return everything as an error as main was returning a Result<(), Box<dyn Error>>

but I was losing the line that generated the error so I reverted to unwraping everything


utils/param_dedup/src/main.rs line 205 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

Instead of HashMap<PathBuf, HashSet<syn::Item>>, you could have HashMap<syn::Ident, syn::ItemConst>. You index by Ident so it makes it resistant to the case where definitions are moved between versions, and it simplifies it a bit since you remove the HashSet.

Later you simply check with something like:

                        if let Some(existing_normalized_parameters) =
                            old_file_and_params.get(&current_normalized_param.ident)
                        {
                            if existing_normalized_parameters == &current_normalized_param {
                               // ...
                            }
                       }

the PathBuf is used on purpose to avoid mixing compact pk parameters and classic, as they need to be in the same file in previous versions otherwise we could have things like

PARAM_MESSAGE_2_CARRY_2 = v1_1::...::PARAM_COMPACT_PK etc.

so this is intended


utils/param_dedup/src/main.rs line 208 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

I think here you can only use the version as key to simplify this a bit, especially when you have to use it later

I agree but I think I might have needed the dir at some point


utils/param_dedup/src/main.rs line 214 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

"Could not get file name"

yep


utils/param_dedup/src/main.rs line 216 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

Similarly, could be made a bit less verbose with to_string_lossy

prefer avoiding weird stuff happening don't you think ?


utils/param_dedup/src/main.rs line 282 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

Since you only need the Const case, you could directly store the ItemConst in the hashset (ie: hash_set.insert(normalized_param))

don't recall why I put the wrapped thing in, I'll recheck


utils/param_dedup/src/main.rs line 426 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

Maybe it would be easier to do that with a cargo clippy --fix ? It would handle some corner cases, for example if you do not remove all the items but somehow an import ends up being not used anymore ?

I preferred the above for now, clippy fix proved annoying on my machine to get to work (don't recall why)

Copy link
Member Author

@IceTDrinker IceTDrinker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 132 files reviewed, 10 unresolved discussions (waiting on @mayeul-zama, @nsarlin-zama, and @soonum)


utils/param_dedup/src/main.rs line 208 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

I agree but I think I might have needed the dir at some point

it's to avoid having to recreate the dir, once it has been found it's stored alongside the version to not have to recreate the path

Copy link
Contributor

@nsarlin-zama nsarlin-zama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 132 files reviewed, 10 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)


utils/param_dedup/src/main.rs line 216 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

prefer avoiding weird stuff happening don't you think ?

As you wish, I think that if at some point we have filenames that are not utf-8 in our codebase we will have other problems 😁

@IceTDrinker IceTDrinker force-pushed the am/chore/param-dedup branch from d8be6e4 to fc4d95f Compare May 7, 2025 12:38
Copy link
Member Author

@IceTDrinker IceTDrinker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 132 files reviewed, 6 unresolved discussions (waiting on @mayeul-zama, @nsarlin-zama, and @soonum)


utils/param_dedup/src/main.rs line 178 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

you are right, I think I started by parsing everything hence this code

with <= as we need to analyze the last version to be able to find old parameters matching the new ones


utils/param_dedup/src/main.rs line 216 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

As you wish, I think that if at some point we have filenames that are not utf-8 in our codebase we will have other problems 😁

agreed, so the unwrap will protect us more in that case :)

Copy link
Contributor

@nsarlin-zama nsarlin-zama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 132 files reviewed, 6 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)


utils/param_dedup/src/main.rs line 178 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

with <= as we need to analyze the last version to be able to find old parameters matching the new ones

But then I think the == case will be parsed twice ?
Because you parse it when you build the hashmap and then again to do the dedup. And after that you skip ">=" so unless I'm missing something the one in the hashmap is parsed and then thrown away


utils/param_dedup/src/main.rs line 208 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

it's to avoid having to recreate the dir, once it has been found it's stored alongside the version to not have to recreate the path

but you have it anyways because later you iterate on shortint_parameters_per_version where the pair (version, dir) is also present. The one from the hashmap is actually not used.

Copy link
Contributor

@nsarlin-zama nsarlin-zama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 132 files reviewed, 6 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)


utils/param_dedup/src/main.rs line 205 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

the PathBuf is used on purpose to avoid mixing compact pk parameters and classic, as they need to be in the same file in previous versions otherwise we could have things like

PARAM_MESSAGE_2_CARRY_2 = v1_1::...::PARAM_COMPACT_PK etc.

so this is intended

But in that case wouldn't the ident be different too ? Because compact pk params have COMPACT_PK in their name so they shouldn't be mixed? I might not have the full picture here


utils/param_dedup/src/main.rs line 216 at r1 (raw file):

Previously, IceTDrinker (Arthur Meyre) wrote…

agreed, so the unwrap will protect us more in that case :)

ok, I understand the choice!

Copy link
Contributor

@nsarlin-zama nsarlin-zama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 132 files reviewed, 8 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)


a discussion (no related file):
the parameters update and the macro look good to me, except a small typo left


tfhe/docs/fhe-computation/compute/parameters.md line 38 at r1 (raw file):

The naming convention of these parameters indicates their capabilities. Taking `tfhe::parameters::v1_2::V1_2_PARAM_MESSAGE_2_CARRY_2_KS_PBS_TUNIFORM_2M128` as an example:
- `V1_2`: these parameters were introduced in **TFHE-rs** v1.0

Still a small mistake here, it should either be: "V1_2: these parameters were introduced in TFHE-rs v1.2" or "V1_0: these parameters were introduced in TFHE-rs v1.0"

(I'm slightly in favor of the second one, since it is an example to show how the naming work we don't have to update it every releases)

@IceTDrinker IceTDrinker force-pushed the am/chore/param-dedup branch from fc4d95f to 60b0b88 Compare May 7, 2025 14:03
Copy link
Member Author

@IceTDrinker IceTDrinker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 132 files reviewed, 6 unresolved discussions (waiting on @mayeul-zama, @nsarlin-zama, and @soonum)


tfhe/docs/fhe-computation/compute/parameters.md line 38 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

Still a small mistake here, it should either be: "V1_2: these parameters were introduced in TFHE-rs v1.2" or "V1_0: these parameters were introduced in TFHE-rs v1.0"

(I'm slightly in favor of the second one, since it is an example to show how the naming work we don't have to update it every releases)

thanks


utils/param_dedup/src/main.rs line 205 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

But in that case wouldn't the ident be different too ? Because compact pk params have COMPACT_PK in their name so they shouldn't be mixed? I might not have the full picture here

no I think you are right if we are not making any mistakes the names should be unique


utils/param_dedup/src/main.rs line 208 at r1 (raw file):

Previously, nsarlin-zama (Nicolas Sarlin) wrote…

but you have it anyways because later you iterate on shortint_parameters_per_version where the pair (version, dir) is also present. The one from the hashmap is actually not used.

ah right, the one in the hash map

@IceTDrinker IceTDrinker requested a review from nsarlin-zama May 7, 2025 14:07
Copy link
Contributor

@nsarlin-zama nsarlin-zama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 132 files reviewed, 3 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)


utils/param_dedup/src/main.rs line 226 at r3 (raw file):

        .unwrap();

    // Keep all previous versions including the one to deduplicate to have data to compare

now I think the part about the one to deduplicate is not relevant anymore

@IceTDrinker IceTDrinker force-pushed the am/chore/param-dedup branch from 60b0b88 to acbdfce Compare May 7, 2025 14:54
@IceTDrinker IceTDrinker requested a review from nsarlin-zama May 7, 2025 14:54
Copy link
Contributor

@nsarlin-zama nsarlin-zama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:, nice!

Reviewable status: 0 of 132 files reviewed, 3 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)

- this allows to keep relevant information with param_dedup, as param_dedup
uses syn, comments are lost as syn does not preserver comments in the AST
- update parameters deduped for classic and multi bit
@IceTDrinker IceTDrinker force-pushed the am/chore/param-dedup branch from acbdfce to b4c211a Compare May 7, 2025 16:02
@zama-bot zama-bot removed the approved label May 7, 2025
@IceTDrinker IceTDrinker merged commit d197a2a into main May 8, 2025
109 of 110 checks passed
@IceTDrinker IceTDrinker deleted the am/chore/param-dedup branch May 8, 2025 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants