-
Notifications
You must be signed in to change notification settings - Fork 176
Am/chore/param dedup #2306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Am/chore/param dedup #2306
Conversation
e6adee8
to
6e2b899
Compare
6e2b899
to
bafe248
Compare
bafe248
to
9812439
Compare
9812439
to
e52f3e5
Compare
small mistake in shortint/parameters/v1_2/mod.rs left 1.1 in the doc comment |
e52f3e5
to
d8be6e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 10 unresolved discussions (waiting on @mayeul-zama and @soonum)
a discussion (no related file):
Only looked at param_dedup
for the moment, nice tool ! Some small remarks to simplify a bit the code.
utils/param_dedup/src/main.rs
line 144 at r1 (raw file):
let module_name = dir_entry_name .to_str() .ok_or("Could not convert DirEntry name to rust str.")
Another option is to use to_string_lossy
, it replaces invalid UTF-8 by a placeholder, I think this is enough in our case
utils/param_dedup/src/main.rs
line 178 at r1 (raw file):
shortint_parameters_per_version .sort_by(|(version_a, _dir_a), (version_b, _dir_b)| version_a.cmp(version_b));
Maybe at this point you can also filter to keep only versions that are < args.to_deduplicate
on one side and get the to_deplicate
on the other side.
That way you wont have to parse files that are not needed and you can remove the check below that the version is smaller than the one to deduplicate
utils/param_dedup/src/main.rs
line 199 at r1 (raw file):
)) }) .unwrap();
Nitpick, but I think this can be simplified a bit since you don't really use the result (neither Ok nor Err):
shortint_parameters_per_version
.iter()
.find(|(version, _dir)| {
let version_as_str = format!("v{}_{}", version.major, version.minor);
version_as_str == args.to_deduplicate
})
.unwrap_or_else(|| {
panic!(
"Could not find version to deduplicate: {}",
args.to_deduplicate
)
});
utils/param_dedup/src/main.rs
line 205 at r1 (raw file):
let mut param_version_and_associated_file_parameters: HashMap< _, HashMap<PathBuf, HashSet<syn::Item>>,
Instead of HashMap<PathBuf, HashSet<syn::Item>>
, you could have HashMap<syn::Ident, syn::ItemConst>
. You index by Ident so it makes it resistant to the case where definitions are moved between versions, and it simplifies it a bit since you remove the HashSet.
Later you simply check with something like:
if let Some(existing_normalized_parameters) =
old_file_and_params.get(¤t_normalized_param.ident)
{
if existing_normalized_parameters == ¤t_normalized_param {
// ...
}
}
utils/param_dedup/src/main.rs
line 208 at r1 (raw file):
> = shortint_parameters_per_version .iter() .map(|(version, dir)| ((version, dir), HashMap::new()))
I think here you can only use the version as key to simplify this a bit, especially when you have to use it later
utils/param_dedup/src/main.rs
line 214 at r1 (raw file):
let param_ident_prefix = shortint_param_dir .file_name() .ok_or("Could not convert get file name")
"Could not get file name"
utils/param_dedup/src/main.rs
line 216 at r1 (raw file):
.ok_or("Could not convert get file name") .unwrap() .to_str()
Similarly, could be made a bit less verbose with to_string_lossy
utils/param_dedup/src/main.rs
line 282 at r1 (raw file):
} }; if !hash_set.insert(syn::Item::Const(normalized_param)) {
Since you only need the Const case, you could directly store the ItemConst
in the hashset (ie: hash_set.insert(normalized_param)
)
utils/param_dedup/src/main.rs
line 426 at r1 (raw file):
// All const items have been mapped to old parameters, so we can remove all imports except // for the parameter types used in the file
Maybe it would be easier to do that with a cargo clippy --fix
? It would handle some corner cases, for example if you do not remove all the items but somehow an import ends up being not used anymore ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 10 unresolved discussions (waiting on @mayeul-zama, @nsarlin-zama, and @soonum)
utils/param_dedup/src/main.rs
line 144 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
Another option is to use
to_string_lossy
, it replaces invalid UTF-8 by a placeholder, I think this is enough in our case
I guess
utils/param_dedup/src/main.rs
line 178 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
Maybe at this point you can also filter to keep only versions that are <
args.to_deduplicate
on one side and get theto_deplicate
on the other side.
That way you wont have to parse files that are not needed and you can remove the check below that the version is smaller than the one to deduplicate
you are right, I think I started by parsing everything hence this code
utils/param_dedup/src/main.rs
line 199 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
Nitpick, but I think this can be simplified a bit since you don't really use the result (neither Ok nor Err):
shortint_parameters_per_version .iter() .find(|(version, _dir)| { let version_as_str = format!("v{}_{}", version.major, version.minor); version_as_str == args.to_deduplicate }) .unwrap_or_else(|| { panic!( "Could not find version to deduplicate: {}", args.to_deduplicate ) });
used to return everything as an error as main was returning a Result<(), Box<dyn Error>>
but I was losing the line that generated the error so I reverted to unwraping everything
utils/param_dedup/src/main.rs
line 205 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
Instead of
HashMap<PathBuf, HashSet<syn::Item>>
, you could haveHashMap<syn::Ident, syn::ItemConst>
. You index by Ident so it makes it resistant to the case where definitions are moved between versions, and it simplifies it a bit since you remove the HashSet.Later you simply check with something like:
if let Some(existing_normalized_parameters) = old_file_and_params.get(¤t_normalized_param.ident) { if existing_normalized_parameters == ¤t_normalized_param { // ... } }
the PathBuf is used on purpose to avoid mixing compact pk parameters and classic, as they need to be in the same file in previous versions otherwise we could have things like
PARAM_MESSAGE_2_CARRY_2 = v1_1::...::PARAM_COMPACT_PK etc.
so this is intended
utils/param_dedup/src/main.rs
line 208 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
I think here you can only use the version as key to simplify this a bit, especially when you have to use it later
I agree but I think I might have needed the dir at some point
utils/param_dedup/src/main.rs
line 214 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
"Could not get file name"
yep
utils/param_dedup/src/main.rs
line 216 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
Similarly, could be made a bit less verbose with
to_string_lossy
prefer avoiding weird stuff happening don't you think ?
utils/param_dedup/src/main.rs
line 282 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
Since you only need the Const case, you could directly store the
ItemConst
in the hashset (ie:hash_set.insert(normalized_param)
)
don't recall why I put the wrapped thing in, I'll recheck
utils/param_dedup/src/main.rs
line 426 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
Maybe it would be easier to do that with a
cargo clippy --fix
? It would handle some corner cases, for example if you do not remove all the items but somehow an import ends up being not used anymore ?
I preferred the above for now, clippy fix proved annoying on my machine to get to work (don't recall why)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 10 unresolved discussions (waiting on @mayeul-zama, @nsarlin-zama, and @soonum)
utils/param_dedup/src/main.rs
line 208 at r1 (raw file):
Previously, IceTDrinker (Arthur Meyre) wrote…
I agree but I think I might have needed the dir at some point
it's to avoid having to recreate the dir, once it has been found it's stored alongside the version to not have to recreate the path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 10 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)
utils/param_dedup/src/main.rs
line 216 at r1 (raw file):
Previously, IceTDrinker (Arthur Meyre) wrote…
prefer avoiding weird stuff happening don't you think ?
As you wish, I think that if at some point we have filenames that are not utf-8 in our codebase we will have other problems 😁
d8be6e4
to
fc4d95f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 6 unresolved discussions (waiting on @mayeul-zama, @nsarlin-zama, and @soonum)
utils/param_dedup/src/main.rs
line 178 at r1 (raw file):
Previously, IceTDrinker (Arthur Meyre) wrote…
you are right, I think I started by parsing everything hence this code
with <= as we need to analyze the last version to be able to find old parameters matching the new ones
utils/param_dedup/src/main.rs
line 216 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
As you wish, I think that if at some point we have filenames that are not utf-8 in our codebase we will have other problems 😁
agreed, so the unwrap will protect us more in that case :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 6 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)
utils/param_dedup/src/main.rs
line 178 at r1 (raw file):
Previously, IceTDrinker (Arthur Meyre) wrote…
with <= as we need to analyze the last version to be able to find old parameters matching the new ones
But then I think the == case will be parsed twice ?
Because you parse it when you build the hashmap and then again to do the dedup. And after that you skip ">=" so unless I'm missing something the one in the hashmap is parsed and then thrown away
utils/param_dedup/src/main.rs
line 208 at r1 (raw file):
Previously, IceTDrinker (Arthur Meyre) wrote…
it's to avoid having to recreate the dir, once it has been found it's stored alongside the version to not have to recreate the path
but you have it anyways because later you iterate on shortint_parameters_per_version
where the pair (version, dir) is also present. The one from the hashmap is actually not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 6 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)
utils/param_dedup/src/main.rs
line 205 at r1 (raw file):
Previously, IceTDrinker (Arthur Meyre) wrote…
the PathBuf is used on purpose to avoid mixing compact pk parameters and classic, as they need to be in the same file in previous versions otherwise we could have things like
PARAM_MESSAGE_2_CARRY_2 = v1_1::...::PARAM_COMPACT_PK etc.
so this is intended
But in that case wouldn't the ident be different too ? Because compact pk params have COMPACT_PK
in their name so they shouldn't be mixed? I might not have the full picture here
utils/param_dedup/src/main.rs
line 216 at r1 (raw file):
Previously, IceTDrinker (Arthur Meyre) wrote…
agreed, so the unwrap will protect us more in that case :)
ok, I understand the choice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 8 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)
a discussion (no related file):
the parameters update and the macro look good to me, except a small typo left
tfhe/docs/fhe-computation/compute/parameters.md
line 38 at r1 (raw file):
The naming convention of these parameters indicates their capabilities. Taking `tfhe::parameters::v1_2::V1_2_PARAM_MESSAGE_2_CARRY_2_KS_PBS_TUNIFORM_2M128` as an example: - `V1_2`: these parameters were introduced in **TFHE-rs** v1.0
Still a small mistake here, it should either be: "V1_2
: these parameters were introduced in TFHE-rs v1.2" or "V1_0
: these parameters were introduced in TFHE-rs v1.0"
(I'm slightly in favor of the second one, since it is an example to show how the naming work we don't have to update it every releases)
fc4d95f
to
60b0b88
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 6 unresolved discussions (waiting on @mayeul-zama, @nsarlin-zama, and @soonum)
tfhe/docs/fhe-computation/compute/parameters.md
line 38 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
Still a small mistake here, it should either be: "
V1_2
: these parameters were introduced in TFHE-rs v1.2" or "V1_0
: these parameters were introduced in TFHE-rs v1.0"(I'm slightly in favor of the second one, since it is an example to show how the naming work we don't have to update it every releases)
thanks
utils/param_dedup/src/main.rs
line 205 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
But in that case wouldn't the ident be different too ? Because compact pk params have
COMPACT_PK
in their name so they shouldn't be mixed? I might not have the full picture here
no I think you are right if we are not making any mistakes the names should be unique
utils/param_dedup/src/main.rs
line 208 at r1 (raw file):
Previously, nsarlin-zama (Nicolas Sarlin) wrote…
but you have it anyways because later you iterate on
shortint_parameters_per_version
where the pair (version, dir) is also present. The one from the hashmap is actually not used.
ah right, the one in the hash map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 3 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)
utils/param_dedup/src/main.rs
line 226 at r3 (raw file):
.unwrap(); // Keep all previous versions including the one to deduplicate to have data to compare
now I think the part about the one to deduplicate is not relevant anymore
60b0b88
to
acbdfce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 132 files reviewed, 3 unresolved discussions (waiting on @IceTDrinker, @mayeul-zama, and @soonum)
- this allows to keep relevant information with param_dedup, as param_dedup uses syn, comments are lost as syn does not preserver comments in the AST
- update parameters deduped for classic and multi bit
acbdfce
to
b4c211a
Compare
C API and JS API parameters have been reduced with macro magic
to create the parameter dir I did :
copy v1_1 to v1_2, replace V1_1 prefix by V1_2
then run:
cargo run --release -p param_dedup -- --tfhe-path tfhe/ --to-deduplicate v1_2
This change is