basic check to prevent 'orphan' clusters during prune #222

espg · 2025-06-17T22:20:43Z

Additional check inside of prune to ensure that each cluster is connected to at least one other cluster

espg · 2025-06-17T22:44:32Z

Worth reviewing the logic so we're clear on what is happening:

        if problems == 0:
            for row in mod:
                if np.sum(row) > 0:
                    problems += ~(np.max(mod[:,row].sum(axis=0)) >= 2)
            if problems == 0:
                subset.append(i)
                OC[i, :] = np.zeros(rowlength)

First line: problems == 0 means that we haven't removed any station from the processing dataset, i.e., each station in the processing matrix has at least one entry and will be processed by GAMIT at least once.

Lines 2 and 3: iterate through each row of the M by N station matrix, where the M rows are clusters and the N columns are the stations to process. The condition np.sum(row) > 0 restricts the check to non-pruned clusters; we don't care about clusters that have already been removed, and that includes the current cluster under consideration for removal.

Line 4: we entered this loop because removing the current cluster didn't cause any problems in station coverage. Now we check whether removing it causes any problems in overlap coverage. mod[:,row] takes a row (cluster), uses it as a mask over the stations (columns) in that cluster, and subsets the M by N matrix to M by stations-in-that-cluster (this only behaves as a mask if mod is a boolean array). The .sum(axis=0) gives the count for each of those stations across all clusters: a count of 1 means the station isn't tied anywhere else, and a count of 2 or higher means the station is overlapped/tied. In plain English, mod[:,row].sum(axis=0) returns, for a given cluster/row, the number of clusters that each of its stations appears in. The comparison np.max(mod[:,row].sum(axis=0)) >= 2 is True if at least one station in that cluster is also present in another cluster. Since True there means no problem, the ~ flips it: we get False when the cluster is still tied and True when it is not, and we add that value to problems.
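To make the line 4 check concrete, here is a small sketch on a toy matrix. The matrix values and the helper name cluster_is_tied are made up for illustration; only the mod[:, row].sum(axis=0) expression comes from the snippet above.

```python
import numpy as np

# Toy matrix: 3 clusters (rows) x 5 stations (columns), invented for this example.
# Boolean dtype is assumed so that mod[:, row] acts as a column mask.
mod = np.array([
    [1, 1, 1, 0, 0],   # cluster 0 shares station 2 with cluster 1
    [0, 0, 1, 1, 0],   # cluster 1 shares station 2 with cluster 0
    [0, 0, 0, 0, 1],   # cluster 2 shares nothing
], dtype=bool)

def cluster_is_tied(mod, row):
    """Hypothetical helper: True if any station in `row` is in two or more clusters."""
    counts = mod[:, row].sum(axis=0)   # per-station counts across all clusters
    return bool(counts.max() >= 2)

# Per-cluster station counts and the resulting tied / not tied flags.
station_counts = [mod[:, row].sum(axis=0).tolist() for row in mod]
tied = [cluster_is_tied(mod, row) for row in mod]
```

Here station_counts is [[1, 1, 2], [2, 1], [1]], so clusters 0 and 1 are tied through station 2 and cluster 2 is the one that would add to problems.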

Line 5 and beyond: when removing a cluster, we have to check all other clusters to see whether we've removed all of their overlap/ties, so we iterate through the full overcluster matrix every time we consider removing a cluster and see whether any clusters are impacted. If none are, we modify the overcluster matrix and remove that cluster. If removing it leaves any other cluster without ties, we don't remove it.
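Putting the pieces together, the whole pass could look roughly like the sketch below. This is a stand-in, not the actual prune function: prune_sketch, the way mod is built, and the station-coverage count are assumptions; only the inner orphan check and the subset / OC update follow the snippet above.

```python
import numpy as np

def prune_sketch(OC):
    """Hypothetical stand-in for the prune loop; assumes OC is clusters x stations."""
    OC = np.asarray(OC, dtype=bool).copy()
    subset = []                      # indices of clusters that were removed
    rowlength = OC.shape[1]
    for i in range(OC.shape[0]):
        mod = OC.copy()
        mod[i, :] = np.zeros(rowlength, dtype=bool)   # tentatively drop cluster i
        # Assumed station-coverage check: every station must stay in some cluster.
        problems = int(np.sum(mod.sum(axis=0) == 0))
        if problems == 0:
            # Overlap check: every remaining cluster must keep at least one tie.
            for row in mod:
                if np.sum(row) > 0:
                    problems += int(np.max(mod[:, row].sum(axis=0)) < 2)
            if problems == 0:
                subset.append(i)
                OC[i, :] = np.zeros(rowlength, dtype=bool)
    return subset, OC

# Clusters 1 and 3 are duplicates, so one of them is redundant.
OC = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
], dtype=bool)
removed, pruned = prune_sketch(OC)
```

With this input only cluster 1 is removed. Cluster 3 is then kept, because dropping it as well would leave clusters 0 and 2 without any shared station.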

espg · 2025-06-17T22:45:55Z

This doesn't have any logic to check for a minimum number of ties/overlaps. It just checks that clusters have overlap post pruning.

espg · 2025-06-17T23:16:10Z

One more note on this: if we have a case where we haven't overclustered something, such as the rejection_threshold triggering:

...then adding in the logic in this PR will result in no pruning at all for the clusters. This is because if overcluster doesn't add overlap stations for a row, then problems will always be non-zero in line 5 above. Line 4 will not generate any problems for any cluster until it gets to the cluster that didn't expand, which will report that it isn't tied to anything.
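A toy reproduction of this edge case, under the same assumptions as before (boolean clusters x stations matrix; orphans_after_removing is a made-up helper that only mirrors the line 4 check):

```python
import numpy as np

def orphans_after_removing(OC, i):
    """Hypothetical helper: untied non-empty clusters left if cluster i is dropped."""
    mod = OC.copy()
    mod[i, :] = False
    return sum(
        int(np.max(mod[:, row].sum(axis=0)) < 2)
        for row in mod if row.any()
    )

OC = np.array([
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],   # redundant duplicate of cluster 1
    [0, 0, 0, 1, 1],   # did not expand: shares no station with anyone
], dtype=bool)

orphan_counts = [orphans_after_removing(OC, i) for i in range(OC.shape[0])]
```

orphan_counts comes out as [1, 1, 1, 0]: every candidate except the unexpanded cluster is blocked by it, including the redundant duplicate that would otherwise be pruned. The 0 for cluster 3 itself does not help, since removing it would leave stations 3 and 4 uncovered and the station-coverage check rejects it first.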

This is very much an edge case, but we should be aware of it. We could add another function to check for orphans between the overcluster and prune steps explicitly, and then error / warn or apply different logic when we detect the edge case.
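A possible shape for that explicit check, as a sketch only. find_orphans and check_before_prune are hypothetical names, and the input is assumed to be the clusters x stations matrix produced by overcluster:

```python
import warnings
import numpy as np

def find_orphans(OC):
    """Return indices of non-empty clusters that share no station with another cluster."""
    OC = np.asarray(OC, dtype=bool)
    counts = OC.sum(axis=0)          # how many clusters each station belongs to
    return [i for i, row in enumerate(OC) if row.any() and counts[row].max() < 2]

def check_before_prune(OC):
    """Warn (rather than fail) when the edge case is present; returns the orphan indices."""
    orphans = find_orphans(OC)
    if orphans:
        warnings.warn(f"clusters {orphans} have no tie stations; prune would remove nothing")
    return orphans
```

Running this between the overcluster and prune steps would let the caller decide whether to warn, error, or switch to different logic before prune silently becomes a no-op.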

demiangomez · 2025-06-18T11:25:12Z

I think that when an orphan is detected, we should probably remove it from the dataset to allow the processing to continue, but we also need to make PGAMIT aware of this so that it prints a message to the user. Maybe we could do this through an exception during prune. Since no pruning will occur anyway, this would allow the processing to continue while PGAMIT prints a message in the log so that the user becomes aware of the problem.
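One way the exception route could look, as a rough sketch. OrphanClusterError, prune_guard and run_with_orphan_handling are all hypothetical, the logger name is a placeholder, and "remove it from the dataset" is interpreted here as dropping the orphan cluster rows, which is an assumption:

```python
import logging
import numpy as np

log = logging.getLogger("prune_sketch")   # placeholder logger name

class OrphanClusterError(Exception):
    """Hypothetical exception carrying the indices of untied clusters."""
    def __init__(self, orphans):
        super().__init__(f"clusters without tie stations: {orphans}")
        self.orphans = orphans

def prune_guard(OC):
    """Raise if any non-empty cluster shares no station with another cluster."""
    counts = OC.sum(axis=0)
    orphans = [i for i, row in enumerate(OC) if row.any() and counts[row].max() < 2]
    if orphans:
        raise OrphanClusterError(orphans)

def run_with_orphan_handling(OC):
    """Caller side: log the problem, drop the orphan rows, and carry on."""
    try:
        prune_guard(OC)
        return OC, []
    except OrphanClusterError as err:
        log.warning("%s; removing them so processing can continue", err)
        keep = np.ones(OC.shape[0], dtype=bool)
        keep[err.orphans] = False
        return OC[keep], err.orphans
```

The exception keeps the detection inside the prune step while leaving the decision about what to print and whether to continue with the caller.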

basic check to prevent 'orphan' clusters during prune

d7f80b8
