-
Notifications
You must be signed in to change notification settings - Fork 202
Bug fixes: ungraceful test crash fixes #637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Bug fixes: ungraceful test crash fixes #637
Conversation
Changed functions to use std::time::Instant which is monotonic, to avoid Rust panics with SystemTime. Replaced some unwraps with an expect. Removed redundant brackets, and secp256k1::
Changed fn clean_old_pending_outpoints to retain keys that are younger than an hour, instead of collecting older than an hour ones as a vector, and then using a new for loop to deleting them. linting with cargo fmt
# Optimization flow.rs ; async sync_missing_relay_past_headers # Fix - Save memory; changed jobs to pass an iterator to try_join_all instead of collecting a vector and then passing it. - Changed error check to return an explicit error, instead of implicit
This reverts commit 0bffb60.
create_temp_db now returns a Result
create_test_db returns a Result
fixes cargo test crashes (ungraceful termination)
create_temp_db now returns Result to fix test crashing ungracefully
create_temp_db returns result to prevent test crashing ungracefully, so we need to add error handling
tokio-runtime-worker subtracted below zero and caused overflow during cargo test, and ungraceful termination
serve_result causes testing to crash ungracefully using panic! when error recieved. Now error is propagated
create temp db returns Result
create_temp_db returns Result, add error handling
…eads to hitting File Descriptor Limit and failing tests due to panic in get_kaspa_tempdir func in database/src/utils.rs
… reaches System File Descriptor limit and crashes testing ungracefully leaving temp files in system
this is fixed in PR: The other test failing is the thread 'tx::generator::test::test_generator_inputs_250k_outputs_2_sweep' has overflowed its stack
fatal runtime error: stack overflow I think it might be just that the vector with 250k elements is in stack, so it overflows. You could allocate it in heap with fixed size, but then you are fixing the test, and not sure how it reflects to testing the making of utxo_entries in wallet/core/src/tx/generator/generator.rs @michaelsutton @coderofstuff @someone235 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial review. I'll probably on be able to get back to this after the next few weeks but please take a look.
self | ||
); | ||
// Prevent underflow; [ERROR] thread 'tokio-runtime-worker' panicked at notify/src/subscription/single.rs:388:13: attempt to subtract with overflow | ||
let _ = UTXOS_CHANGED_SUBSCRIPTIONS.fetch_update(Ordering::SeqCst, Ordering::SeqCst, |count| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we just did:
UTXOS_CHANGED_SUBSCRIPTIONS.fetch_sub(1, Ordering::SeqCst).saturating_sub(1)
Also, I wonder what the root cause of UTXOS_CHANGED_SUBSCRIPTIONS
being 0
is. The above fixes the underflow, but the incorrect usage that allowed the situation still needs to be determined. the expectation here is that UTXOS_CHANGED_SUBSCRIPTIONS
should always be above 0 when drop occurs OR document the situations when it can be 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
afaik saturating_sub only applies to old value from fetch_sub not the atomic itself, so it doesnt prevent actual atomic variable from becoming negative. Suggested fetch_update solution only applies if the value is greater than zero
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -113,6 +113,9 @@ jobs: | |||
target/ | |||
key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }} | |||
|
|||
- name: Run cargo tests non--release |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We moved away from test
and switched to using nextest
. But I do wonder why nextest doesn't capture the errors that test is displaying here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cargo test
still shows tests failing that nextest does not, so what is a recommended solution?
match serve_result { | ||
Ok(_) => info!("P2P Server stopped: {}", serve_address), | ||
Err(err) => panic!("P2P, Server {serve_address} stopped with error: {err:?}"), | ||
Err(err) => log::error!("P2P, Server {serve_address} stopped with error: {err:?}"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error in the use
at line 8 and just use that here.
@@ -1820,7 +1821,7 @@ async fn run_kip10_activation_test() { | |||
let mut genesis_multiset = MuHash::new(); | |||
consensus.append_imported_pruning_point_utxos(&initial_utxo_collection, &mut genesis_multiset); | |||
consensus.import_pruning_point_utxo_set(config.genesis.hash, genesis_multiset).unwrap(); | |||
consensus.init(); | |||
let wait_handles = consensus.init(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comments on why this is necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was done in previous tests as well.
Here
pub fn init(&self) -> Vec<JoinHandle<()>> {
self.consensus.run_processors()
}
and
pub fn shutdown(&mut self) {
self.core.shutdown();
self.join();
}
where core is wrapped in Arc, so its reference counted. Without it some background threads might stay running, thought it might be related to getting System File Descriptor problems and not getting /tmp/rusty-kaspa cleanups early.
@@ -1872,6 +1873,8 @@ async fn run_kip10_activation_test() { | |||
let status = consensus.add_utxo_valid_block_with_parents((index + 1).into(), vec![index.into()], vec![spending_tx.clone()]).await; | |||
assert!(matches!(status, Ok(BlockStatus::StatusUTXOValid))); | |||
assert!(consensus.lkg_virtual_state.load().accepted_tx_ids.contains(&tx_id)); | |||
consensus.shutdown(wait_handles); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here and the rest of similar changes: Add comments on why this is necessary.
…fore even generator function call
a0ae6d4
to
fb2607a
Compare
Bug fix
What I've done
Additional thoughts