Open
Changes from 237 commits
248 commits
0e6b04e
small tweak to readme
philipwosull Jul 19, 2024
2deabd8
adding code for gsmc method
philipwosull Aug 29, 2024
4e506ae
adding some files I forgot to push
philipwosull Aug 29, 2024
5bfd0d8
added documentation for new functions. Mostly finished but still a li…
philipwosull Sep 1, 2024
1baad5b
removing format library to try to fix compilation issue on FAS cluster
philipwosull Sep 3, 2024
1ae29ca
added diagnostic flag to make high mem usage optional
philipwosull Sep 5, 2024
4bb7583
fixed sd_lp issue for gsmc in summary
philipwosull Sep 6, 2024
0e579eb
refactored Plan object code to drastically reduce memory usage
philipwosull Sep 7, 2024
eb23a5b
updated some function doc pages
philipwosull Sep 7, 2024
2be4d4f
Weights now calculated in seperate function and parent tries are (may…
philipwosull Sep 11, 2024
fa2e5ca
renamed parent tries mat to parent unsuccessful tries mat
philipwosull Sep 11, 2024
466fd95
added population tempering
philipwosull Sep 11, 2024
a053c70
can now pass vector of k values in instead of one for entire run
philipwosull Sep 15, 2024
400afa0
added basic version of original smc with optimal weights
philipwosull Sep 16, 2024
37b81c7
added basic version of original smc with optimal weights
philipwosull Sep 16, 2024
a002c5b
added district counting functions
philipwosull Oct 7, 2024
d652bb2
basic smc no longer stores useless info
philipwosull Oct 12, 2024
73ce6fc
takes into account index changes for final resampling for ancestor re…
philipwosull Oct 14, 2024
26fba06
added task tracker document
philipwosull Oct 15, 2024
3dd1bc7
added new file for smc_withMCMC steps code
philipwosull Oct 15, 2024
0d2c678
updated task tracking doc
philipwosull Oct 15, 2024
a30e4ec
created and tested seperate plan update function and added code to ex…
philipwosull Oct 16, 2024
b24e115
put splitting functions in their own file
philipwosull Nov 1, 2024
369afb2
reorganized file structure. Added seperate files for weight computati…
philipwosull Nov 1, 2024
e4c7b57
created all purpose code to handle smc and gsmc with optimal weights
philipwosull Nov 1, 2024
3c8c4a9
removed old gsmc and basic_smc files, now replaced by new consolidate…
philipwosull Nov 1, 2024
8143211
refactored code so R output now orders region ids by when they were s…
philipwosull Nov 1, 2024
4d1e88f
tweaked find edge to cut function to better integrate with merge split
philipwosull Nov 2, 2024
4906982
more tweaks to the updating functions from a cut tree
philipwosull Nov 2, 2024
8feac8b
started working on merge split stuff
philipwosull Nov 2, 2024
76147e6
built out skeleton for adding merge split steps
philipwosull Nov 2, 2024
9e16744
almost done merge split. Just need to add MH ratio rejection step
philipwosull Nov 3, 2024
083b3b9
mh rejection step now added for generalized split still need to test …
philipwosull Nov 3, 2024
45a5be5
merge split now works for one district split as well
philipwosull Nov 3, 2024
d04f123
fixed issue with not correctly spacing out merge split steps
philipwosull Nov 7, 2024
8639baf
fixed issue with bad mcmc weights. Now just leaves them unchanged
philipwosull Nov 7, 2024
b19d03c
added support for counting specific region ids within a plan instead …
philipwosull Nov 8, 2024
cb3b8c2
tweaked merge split to just do 1/acceptance rate again, will make it …
philipwosull Nov 10, 2024
9da5906
removed reference to file not on github
philipwosull Nov 10, 2024
ee74cb7
made original smc support not multiprocessing
philipwosull Nov 10, 2024
9dacb0f
makes it so when not doing multiprocess the output from each run is p…
philipwosull Nov 12, 2024
6d0b077
made it possible to do a final merge split step at the end
philipwosull Nov 13, 2024
3e5eb3b
changed so its now possible to run merge split after the final smc step
philipwosull Nov 14, 2024
8a2ec6c
can now specify multple of expected mcmc steps
philipwosull Nov 20, 2024
8f0500d
Cory redist merge split now works
philipwosull Nov 28, 2024
732ecbc
attempting to rename package to gredist
philipwosull Dec 7, 2024
78e59a6
two more files
philipwosull Dec 7, 2024
d81750e
rename types
CoryMcCartan Dec 7, 2024
c356234
fixed final compilation issue. package successfully renamed
philipwosull Dec 7, 2024
5aa34ab
started major code refactoring. Most things returned from function ar…
philipwosull Dec 8, 2024
f926a0c
added function to reorder by oldest split in cpp
philipwosull Dec 10, 2024
b28f5c9
significant refactor. Now outputs integer types back to R as integers…
philipwosull Dec 11, 2024
337c4de
signficantly sped up weight computation by using hash map and doing e…
philipwosull Dec 11, 2024
c0e052a
added support for simple backwards kernel weights
philipwosull Dec 20, 2024
9e370e7
added support for splitting regions where one of the pieces is within…
philipwosull Jan 1, 2025
9171573
added function for complete manual splitting of map and saving the fo…
philipwosull Jan 4, 2025
033aa05
renamed redist gsmc file
philipwosull Jan 4, 2025
f1e372a
last push before refactor to supporting multiple plan subtypes
philipwosull Jan 7, 2025
891066f
rewrote functions to support abstract plan type
philipwosull Jan 13, 2025
8a6817a
switched sampling in smc back to local random.h file, not cpp random …
philipwosull Jan 14, 2025
4579d0b
fixed k estimation for gsmc
philipwosull Jan 16, 2025
2314a11
added support for two new splitting types and verified validation sti…
philipwosull Jan 17, 2025
8e29aff
added support for running with non-blank initial plans and only runni…
philipwosull Jan 19, 2025
4782e0b
added forest adjacency stuff
philipwosull Jan 20, 2025
e7e0e5f
updated tree splitter to avoid unnecssary vector allocations for pop_…
philipwosull Jan 22, 2025
9e3c18e
passes basic validation:
philipwosull Jan 23, 2025
bcd8c84
spanning forest passes validation for uniform valid edge and expo wei…
philipwosull Jan 25, 2025
ba892ec
added new splitter types
philipwosull Jan 26, 2025
963d2cb
validation still works after minor code refactoring
philipwosull Jan 30, 2025
23e1c17
started working on adding proper weights for custom size splitting sc…
philipwosull Feb 2, 2025
87676e9
weights now partially work for custom sizes but still slightly off
philipwosull Feb 3, 2025
79e0c6f
some minor cleanup of tree splitting functions
philipwosull Feb 7, 2025
6a9544f
tweaked k estimation but half splits and by two still don't work
philipwosull Feb 8, 2025
ee07d45
tweaked R files so now one function handles gsmc cpp call and other t…
philipwosull Feb 17, 2025
ede4fa5
tweaked weight computation, still working on making custom size scheu…
philipwosull Feb 17, 2025
bb3660e
made seperate function for getting ajd regions and effective forest b…
philipwosull Feb 17, 2025
0dba418
made abstract splitting schedule base class and derived classes for a…
philipwosull Feb 18, 2025
838a09d
starting working on constraint abstract class and scoring function cl…
philipwosull Feb 19, 2025
a24069f
optimal weight function now has support for scoring function and stre…
philipwosull Feb 19, 2025
322f61d
now supports custom adj sizing
philipwosull Feb 19, 2025
4df0dd6
fixed hinge constraint bug
philipwosull Feb 22, 2025
cbfb15b
tweaked original smc to allow for setting processes and num threads p…
philipwosull Feb 22, 2025
4858283
tweaked region selection prob in gsmc to now be size^alpha
philipwosull Feb 23, 2025
999bbd8
RNG generation is now threadsafe
philipwosull Mar 2, 2025
b68fe50
tweaked diagnostic that redist_gsmc puts out
philipwosull Mar 3, 2025
487dcd3
consolidated mergesplit files and changed to just specify desired num…
philipwosull Mar 3, 2025
8979e27
make PRNG more robust with long_jump functionality
philipwosull Mar 3, 2025
4e26019
Adding Plan method for computing log tau terms
philipwosull Mar 4, 2025
32a5f75
cleaned up code and removed extraneous plan variables
philipwosull Mar 5, 2025
2e5b3bc
added compactness parameter and all alg versions passed 6x6 validation
philipwosull Mar 5, 2025
dfeaadb
removed more unncessary attributes from plan object
philipwosull Mar 7, 2025
a500803
removed recursion from functions which use cut tree to update plans
philipwosull Mar 8, 2025
82c2706
refactor of get valid edges in tree with speed up by only requiring o…
philipwosull Mar 8, 2025
8e42811
Now removed unnessecary copying in the splitting step
philipwosull Mar 9, 2025
eaeda4f
removed duplicate tree splitters, now only uses one for entire gsmc f…
philipwosull Mar 9, 2025
15f3bd8
forgot new UST Sampler files
philipwosull Mar 9, 2025
7ef437c
some minor tweaks to starting prinout message
philipwosull Mar 10, 2025
6dc8bc6
removed feature unnesarily computing log tau for simple graph space w…
philipwosull Mar 10, 2025
11d6d8a
added parallel versions of some redistmetrics functions to speed up s…
philipwosull Mar 15, 2025
2d94604
added more parallel versions of some redistmetrics functions
philipwosull Mar 15, 2025
c427528
adding weights file
philipwosull Mar 15, 2025
36c83c4
more tweaks to make parallel redistmetric functions more thread safe.…
philipwosull Mar 15, 2025
750abc0
fixed issue with parallel polsby popper. If no pool object RcppThread…
philipwosull Mar 15, 2025
5ca5610
implemented new merge split using Plan class. Passes validation
philipwosull Mar 19, 2025
7d0cef1
some more code refactor to help support linking edge stuff
philipwosull Mar 20, 2025
79f292c
skeleton of linking edge done. Just need to add multigraph tau
philipwosull Mar 20, 2025
71a2db0
linking edge space sampling passes validation
philipwosull Mar 21, 2025
33d242e
some code tweaks
philipwosull Mar 21, 2025
6af2ef4
minor typo in tree splitter types file
philipwosull Mar 21, 2025
3ed2cac
exposed exported generic gsmc file as export
philipwosull Mar 21, 2025
96dc627
forgot generic redist man page
philipwosull Mar 21, 2025
ce63e94
more man page tweaks
philipwosull Mar 21, 2025
086c3c9
removed unneccesary copy in smc plans function
philipwosull Mar 21, 2025
91318bf
reduced memory usage in plan reorder
philipwosull Mar 22, 2025
a7d5acd
smc merge split working for forest, still doesn't pass validation for…
philipwosull Mar 22, 2025
820a179
caught error with not correctly updating linking probabilities/comput…
philipwosull Mar 23, 2025
6531948
smc+ms works for linking edge
philipwosull Mar 24, 2025
21913a3
refactored diagnostics and added code to track the size of regions we…
philipwosull Mar 25, 2025
bc0efe0
added for mcmc too
philipwosull Mar 25, 2025
4b74e61
added constants for alg types
philipwosull Mar 25, 2025
adab046
adding county check in merge split
philipwosull Mar 26, 2025
2d4b2a6
added county split checking
philipwosull Mar 26, 2025
b3d5557
added more diagnostics for merge split seperation bug
philipwosull Mar 26, 2025
5af72f6
hopefully fixed issue with incorrect hiearchical tree drawing for gen…
philipwosull Mar 26, 2025
f776cb7
trying to turn off pop bounds in sample_sub_ust to see if that fixes …
philipwosull Mar 26, 2025
e2e4ac9
trying again to fix
philipwosull Mar 27, 2025
7452576
indexing typo in update
philipwosull Mar 27, 2025
e6f9d8d
added quick multithreaded sorting function for district statistics
philipwosull Mar 31, 2025
a14f8db
had non uploaded function in Rcpp file
philipwosull Mar 31, 2025
ef8bb31
didnt remove stuff from Rcpp exports last time
philipwosull Mar 31, 2025
842e0db
very hacky fix for county issue GRAPH SPACE ONLY
philipwosull Apr 10, 2025
696b5b4
adding functions for computing log optimal weights and target density
philipwosull Apr 10, 2025
3187a56
another bug in county checking, quick fix but needs more verification
philipwosull Apr 10, 2025
0157522
attempted to fix county boundary counting issue
philipwosull Apr 11, 2025
e98f306
major refactor. Switched from arma columns to uint8 vectors for plan ids
philipwosull Apr 13, 2025
c0b3bdb
starting to refactor forests to be smaller
philipwosull Apr 13, 2025
cea469d
tweaking forest assignment to take more space upfront
philipwosull Apr 13, 2025
5adc7bd
Now support 65535 counties instead of 255
philipwosull Apr 13, 2025
6cdcfdb
hopefully counties now work for graph plans optimal weights
philipwosull Apr 15, 2025
89d5399
fixed another uint issue
philipwosull Apr 15, 2025
eba70c4
switched determinants to PSD version and added shallow copy
philipwosull Apr 15, 2025
dcf6730
refactor to avoid using as much memory when no intial plans
philipwosull Apr 23, 2025
d679b97
more memory usage refactoring
philipwosull Apr 24, 2025
7fac0e6
more memory usage reduction
philipwosull Apr 24, 2025
e389547
resampling done in c++ to save memory
philipwosull Apr 24, 2025
fdc0433
more stuff done in Rcpp to save mem
philipwosull Apr 25, 2025
73ba889
made extra printing controlled by verbose
philipwosull Apr 25, 2025
e9554a3
fixed bug with incorrect copying for odd split numbers
philipwosull Apr 28, 2025
354ad0c
fixed minor typo
philipwosull Apr 28, 2025
fc949cd
added functions to compute log target density and the log target for …
philipwosull May 3, 2025
c341b20
added incomplete implementation of pair hash
philipwosull May 22, 2025
4c1bbe2
overhauled county component stuff for graph and forest space
philipwosull Jun 2, 2025
70b1836
smc and smc+mcmc pass validation now
philipwosull Jun 4, 2025
bf09868
constraint code now scores districts only on per constraint basis. De…
philipwosull Jun 8, 2025
bd16605
mergesplit tweak
philipwosull Jun 8, 2025
2750687
added support for MMD plans and nseats for partial SMD plans
philipwosull Jun 12, 2025
6a2709c
Added but not tested code for multi member district district only splits
philipwosull Jun 16, 2025
03c189d
added hard constraints to manual weights cpp code
philipwosull Jun 17, 2025
fb9009b
caught potential bug in MMD split district only where big districts c…
philipwosull Jun 17, 2025
480f50a
nvm allowing districts to be split again
philipwosull Jun 17, 2025
b1b13a1
renamed and changed redist_gsmc function inputs to match old redist_smc
philipwosull Jun 19, 2025
da355f6
some more cleanup tweaks to gsmc cpp file
philipwosull Jun 19, 2025
de276c3
added support for seeding random trees on initialized non blank fores…
philipwosull Jun 20, 2025
3aa720f
Major refactor of code, eliminated county components and replaced wit…
philipwosull Jun 24, 2025
db69bcb
fixed minor bug in simple weights
philipwosull Jun 24, 2025
d4ce7e8
caught bug in not catching diff adjacent components
philipwosull Jun 24, 2025
61cbe73
fixed error in redist_gsmc where it would not do split districts only
philipwosull Jun 25, 2025
3c388fa
change multigraph to be vec of vec of int array of size 3
philipwosull Jun 25, 2025
896fc3b
now display the number of threads in the threadpool
philipwosull Jun 25, 2025
3ab54c6
renamed redist gsmc to redist smc and moved old smc to legacy version…
philipwosull Jun 26, 2025
6c63099
renamed everything from gredist back to redist
philipwosull Jun 26, 2025
8cfa4ee
more renaming back to redist stuff and reset plot_maps file
philipwosull Jun 27, 2025
5b4f6c4
attempting to add seq_alpha back but failing validation
philipwosull Jun 27, 2025
44c5465
seq alpha passes validation
philipwosull Jun 27, 2025
85033ac
removed old smc cpp code and replaced it with new code
philipwosull Jun 28, 2025
7f03402
fixed bug in MMD district splits where incorrect split sizes from pre…
philipwosull Jun 29, 2025
7a59ba3
starting to overhaul diagnostics
philipwosull Jun 29, 2025
ab6373c
made it so you always compute multidsistrict selection probability fo…
philipwosull Jun 29, 2025
dc68e07
more tweaks and fixed a ms bug with tree size counts
philipwosull Jul 1, 2025
c69bad1
removed some pointless tests and fixed issue with null nseats for ref…
philipwosull Jul 1, 2025
000a32b
removed so non used functions, consolidated a file, and reskinned mer…
philipwosull Jul 1, 2025
6f1d60b
caught issue with multigraph check if hier valid not working properly
philipwosull Jul 2, 2025
17e023f
added incumbency constraint to updated smc code, avoided duplicate by…
philipwosull Jul 2, 2025
9d3ced8
more constraints added to new code
philipwosull Jul 2, 2025
fa0cffe
overhauled Wilson code to reuse as many vectors as possible
philipwosull Jul 9, 2025
5725d4d
made tree pop stacks fixed size
philipwosull Jul 9, 2025
f7fba3e
updating SMC tests for new code interface changes
philipwosull Jul 12, 2025
77df900
updated merge split tests for new function inputs
philipwosull Jul 13, 2025
b0fe550
added new soft and hard plan constraints and other tweaks
philipwosull Jul 13, 2025
b112b07
code now passes all tests on Philip mac
philipwosull Jul 14, 2025
2fe10f8
removed all parallel redistmetrics functions. Moved to gsmcs repo
philipwosull Jul 14, 2025
4111e4a
clean ups
philipwosull Jul 14, 2025
5a37f79
fixing merge conflicts
philipwosull Jul 14, 2025
9ea023e
updated code to avoid unnecessary repeated memory allocation in weigh…
philipwosull Jul 14, 2025
a2eb9d7
Added updated version of eval_segregation and eval_grp_pow and grp_po…
philipwosull Jul 14, 2025
120acfe
adding some of the new updates under 5.0
philipwosull Jul 15, 2025
3382833
more tweaks to news and readme
philipwosull Jul 15, 2025
666d857
addressing comments on plans_helpers
philipwosull Jul 15, 2025
f2c3c0c
updating main readme
philipwosull Jul 15, 2025
4849d02
reverting changes to dot files
philipwosull Jul 15, 2025
6d0bed7
starting to work on short explainer for c++ code
philipwosull Jul 15, 2025
70d475d
revising redist smc
philipwosull Jul 16, 2025
f0cf574
switching naming conventions. total_seats to nseats, nseats to seats,…
philipwosull Jul 17, 2025
6eb02b5
switched shortburst mergesplit backend to forest space
philipwosull Jul 17, 2025
0626438
more renaming of region_sizes to seats
philipwosull Jul 17, 2025
860cdac
cleaning up redist alg helpers
philipwosull Jul 17, 2025
785b5f6
cleaning up redist alg helpers
philipwosull Jul 17, 2025
0c11a58
removing cli prefix before cli_warn and cli_abort
philipwosull Jul 17, 2025
479845f
refactored summary function
philipwosull Jul 18, 2025
766b077
now pass district pops out from redist smc and ms to avoid needing to…
philipwosull Jul 19, 2025
96f8ccb
renaming score_districts_only to only_districts for constraints
philipwosull Jul 19, 2025
961368d
added thresh to constraint R code, still need to add to c++
philipwosull Jul 19, 2025
b8fe0dd
more minor tweaks
philipwosull Jul 20, 2025
74d5c24
moved some constraint functions to map calc
philipwosull Jul 20, 2025
86c744a
changes to linking edge multigraph tau
philipwosull Jul 22, 2025
34faa16
in the middle of fixing linking edge with counties on
philipwosull Jul 22, 2025
5460e68
hierarchical linking edge count now passes some tests
philipwosull Jul 23, 2025
fc871e6
linking edge with counties on now passes first round of tests. Method…
philipwosull Jul 23, 2025
0098a52
linking edge now passes all hierarchical validations
philipwosull Jul 23, 2025
7f50da7
resolving some merge conflicts
philipwosull Jul 23, 2025
cfd3691
all cli lib calls now prefixed with cli colon colon
philipwosull Jul 23, 2025
730a22e
slightly sped up hierarchical linking edge calculation
philipwosull Jul 25, 2025
0a93fa2
more efficiency tweaks to compute log linking edge
philipwosull Jul 25, 2025
90c1bbb
for optimal weights moved linking edge computation from weights
philipwosull Jul 25, 2025
4f5daa8
flipped linking edge term ratio in optimal weights. fixed
philipwosull Jul 25, 2025
cd8799b
added and tested faster method to convert log linking edge for merged…
philipwosull Jul 26, 2025
7b07488
now only compute linking edge selection probability when needed, inst…
philipwosull Jul 28, 2025
f68a826
fixed issue with district populations not being resampled correctly
philipwosull Jul 29, 2025
a7e6a79
when inferring seat sizes only warns about not-tight bounds once
philipwosull Jul 29, 2025
d5d7f83
added mergesplit parallel back with deprecation warning
philipwosull Jul 30, 2025
8dd1677
max_dev now works for multimember plans
philipwosull Jul 31, 2025
93e6c3d
more code tweaks
philipwosull Jul 31, 2025
69a4b66
more mostly style changes for pr
philipwosull Jul 31, 2025
70fdd1b
added thresholding in constraints and new test for custom plan based …
philipwosull Aug 4, 2025
d70da95
adding some missing man pages
philipwosull Aug 4, 2025
b4f5b28
made summary backwards compatible with previous redist plans
philipwosull Aug 4, 2025
90dfcaa
moved constants to package file
philipwosull Aug 5, 2025
4472305
modified constraints cpp code to reduce memory usage
philipwosull Aug 8, 2025
e71879f
thresholding now turns constraints into indicators
philipwosull Aug 10, 2025
01da894
added missing array header
philipwosull Aug 21, 2025
69d2972
can now exactly specify what number of regions plan constraints apply to
philipwosull Aug 21, 2025
010d185
fixed problem with non-unique thread ids
philipwosull Aug 22, 2025
3b0ea35
fixed remaining thread id assignment issues. Now ensures thread ids a…
philipwosull Aug 22, 2025
32eabe2
fixed redist parity bug with MMD
philipwosull Aug 22, 2025
76bf3a7
renaming some mergesplit parameters
philipwosull Sep 19, 2025
a48ba78
removing stale debugging statement
philipwosull Sep 20, 2025
770fb00
forgot to rename some ms move to step in diagnostics
philipwosull Sep 22, 2025
a207d6e
updated diagnostics to return rhats and diagnostics df
philipwosull Oct 11, 2025
2 changes: 1 addition & 1 deletion .Rbuildignore
@@ -21,4 +21,4 @@ builder.sh
 ^explore$
 ^\.github$
 ^LICENSE\.md$
-^CRAN-SUBMISSION$
+^CRAN-SUBMISSION$
6 changes: 4 additions & 2 deletions DESCRIPTION
@@ -1,12 +1,14 @@
 Package: redist
-Version: 4.3.0
-Date: 2025-07-07
+Version: 5.0.0.1
+Date: 2025-08-08
 Title: Simulation Methods for Legislative Redistricting
 Authors@R: c(
 person("Christopher T.", "Kenny", email = "christopherkenny@fas.harvard.edu",
 role = c("aut", "cre"), comment = c(ORCID = "0000-0002-9386-6860")),
 person("Cory", "McCartan", email = "mccartan@psu.edu", role = "aut",
 comment = c(ORCID = "0000-0002-6251-669X")),
+person("Philip", "O'Sullivan", email = "posullivan@fas.harvard.edu", role = "aut",
+comment = c(ORCID = "0000-0002-9665-2462")),
 person("Ben", "Fifield", email = "benfifield@gmail.com", role = "aut",
 comment = c(ORCID = "0000-0002-2247-0201")),
 person("Kosuke", "Imai", email = "imai@harvard.edu", role = "aut",
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -59,6 +59,7 @@ S3method(ungroup,redist_plans)
 S3method(weights,redist_plans)
 export(add_constr_compet)
 export(add_constr_custom)
+export(add_constr_custom_plan)
 export(add_constr_edges_rem)
 export(add_constr_fry_hold)
 export(add_constr_grp_hinge)
@@ -91,6 +92,7 @@ export(get_plans_matrix)
 export(get_plans_weights)
 export(get_pop_tol)
 export(get_sampling_info)
+export(get_seats_matrix)
 export(get_target)
 export(group_frac)
 export(is_contiguous)
42 changes: 42 additions & 0 deletions NEWS.md
@@ -1,3 +1,45 @@
+# 5.0.0
+* Replaces old SMC weights with generally lower-variance optimal weights.
+* Adds the option to add mergesplit MCMC steps at any point during an SMC run.
+Adding mergesplit steps can help achieve convergence for plans with a larger
+number of districts without increasing the sample size.
+* Improves SMC and mergesplit MCMC performance by pre-allocating and reusing as
+much memory as possible while drawing spanning trees.
+* Introduces new methods for sampling plans for both SMC and mergesplit MCMC.
+The final output is still plans drawn from the same distribution as before, but
+the new sampling spaces and splitting methods perform better in some
+scenarios.
+* Introduces a new method for splitting in SMC: generalized region splits.
+Instead of splitting off one district at a time, this allows for splitting into
+two arbitrarily sized regions. For an equal sample size, generalized region splits
+tend to converge more slowly, but they are typically much faster (in some cases
+twice as fast or more) since on average they draw spanning trees on smaller
+subgraphs than single district splits.
+* Adds support for sampling multimember district plans with both SMC and
+mergesplit MCMC under some mild conditions. The district seat sizes (how many
+legislators a district can have) must be a range of values, e.g. (3, 4, 5), and no
+district seat size can be the sum of two others.
+* When counties are used, `redist_mergesplit` now samples from the same target
+distribution as `redist_smc` (it guarantees no more than the number of districts
+minus 1 splits).
+* `redist_mergesplit` inputs now work differently.
+    * `nsims` is now the number of plans saved.
+    * `warmup` is the number of steps to run the chain for before collecting any samples.
+    * `thin` means the chain is run for `thin - 1` steps between saved plans.
+    * Overall, the chain is run for `warmup + nsims * thin` steps and returns `nsims` plans.
+* Adds the option to incorporate rejection sampling for all constraints in SMC
+and mergesplit MCMC. Any constraint can now include a threshold argument `thresh`:
+for a newly split plan, if either of the two new regions has a raw score
+greater than or equal to `thresh`, then the plan is automatically rejected.
+This amounts to giving plans where any region has a score at or above `thresh` a
+probability of 0.
+* Updates the target distribution when counties are turned on. For more details
Comment (Member): Can you give any more details here, since the paper is not available?

Comment (Member): We will have a preprint by the time this is released
+see the forthcoming working paper.
+* The mergesplit backend for `redist_shortburst` now uses uniform edge sampling
+in forest space instead of sampling in graph space, and all `k`-related
+parameters have been removed.
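As a rough illustration of the new `redist_mergesplit` step accounting described in the 5.0.0 notes above (`warmup + nsims * thin` total steps, `nsims` plans returned), the R sketch below lays out which chain steps would be saved. `ms_chain_schedule` is a hypothetical helper, not a redist function, and it assumes the first plan is saved `thin` steps after warmup ends; the exact offset inside the sampler may differ.

```r
# Hypothetical helper (not part of redist): which chain steps get saved
# under the `warmup`, `thin`, and `nsims` description in the notes above.
ms_chain_schedule <- function(nsims, warmup = 0, thin = 1) {
  stopifnot(nsims >= 1, warmup >= 0, thin >= 1)
  # Total number of chain steps, as stated in the notes
  total_steps <- warmup + nsims * thin
  # Assumption: a plan is saved at the end of each block of `thin` steps,
  # so `thin - 1` unsaved steps sit between consecutive saved plans.
  saved_steps <- warmup + seq_len(nsims) * thin
  list(total_steps = total_steps, saved_steps = saved_steps)
}

sched <- ms_chain_schedule(nsims = 4, warmup = 10, thin = 3)
sched$total_steps  # 22
sched$saved_steps  # 13 16 19 22
```

Under these assumptions, asking for 4 plans with 10 warmup steps and `thin = 3` runs the chain for 22 steps in total.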

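The `thresh` rule from the 5.0.0 notes above can be sketched in a few lines of R. This is only an illustration of the stated rule (reject when either new region scores at or above `thresh`); `reject_split` and `log_indicator` are hypothetical names, and how the package implements the check internally is not shown in this diff.

```r
# Hypothetical illustration of the `thresh` rule, not redist code.
# `region_scores` holds the raw constraint scores of the two new regions.
reject_split <- function(region_scores, thresh) {
  stopifnot(length(region_scores) == 2, is.numeric(thresh))
  # Reject if either region scores greater than or equal to `thresh`
  any(region_scores >= thresh)
}

# Equivalent view: a rejected plan gets probability 0, i.e. log-probability -Inf.
log_indicator <- function(region_scores, thresh) {
  if (reject_split(region_scores, thresh)) -Inf else 0
}

reject_split(c(0.2, 0.7), thresh = 0.7)   # TRUE: a score equal to thresh is rejected
reject_split(c(0.2, 0.6), thresh = 0.7)   # FALSE
log_indicator(c(0.2, 0.7), thresh = 0.7)  # -Inf
```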

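The multimember seat-size conditions in the 5.0.0 notes above (sizes form a range, and no size is the sum of two others) can be checked with a short R sketch. `valid_seat_sizes` is a hypothetical helper, not a redist function. It assumes "range" means consecutive integers and that a size may be paired with itself when forming sums; the package's own validation may read the conditions differently.

```r
# Hypothetical check of the multimember seat-size conditions, not redist code.
valid_seat_sizes <- function(sizes) {
  sizes <- sort(unique(as.integer(sizes)))
  # Condition 1 (assumed reading): sizes are consecutive integers
  is_range <- all(diff(sizes) == 1L)
  # Condition 2 (assumed reading): no size equals the sum of two sizes,
  # where the two summands may be the same size
  pair_sums <- outer(sizes, sizes, `+`)
  no_sums <- !any(sizes %in% pair_sums)
  is_range && no_sums
}

valid_seat_sizes(c(3, 4, 5))        # TRUE: the smallest possible sum is 6
valid_seat_sizes(c(3, 4, 5, 6, 7))  # FALSE: 7 = 3 + 4
valid_seat_sizes(c(3, 5))           # FALSE: not a range
```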
 # 4.3.0
 * Improves SMC performance by pre-allocating some memory while drawing spanning trees.
 * Replaces SMC label-counting adjustments (exact and importance-sampling-based) with a new backward kernel that eliminates approximation error and requires far less computation
227 changes: 217 additions & 10 deletions R/RcppExports.R
@@ -53,6 +53,14 @@ dist_dist_diff <- function(p, i_dist, j_dist, x_center, y_center, x, y) {
     .Call(`_redist_dist_dist_diff`, p, i_dist, j_dist, x_center, y_center, x, y)
 }

+get_region_multigraph <- function(adj_list, region_ids) {
+    .Call(`_redist_get_region_multigraph`, adj_list, region_ids)
+}
+
+get_region_laplacian <- function(adj_list, region_ids) {
+    .Call(`_redist_get_region_laplacian`, adj_list, region_ids)
+}
+
 log_st_map <- function(g, districts, counties, n_distr) {
     .Call(`_redist_log_st_map`, g, districts, counties, n_distr)
 }
@@ -69,6 +77,48 @@ calcPWDh <- function(x) {
     .Call(`_redist_calcPWDh`, x)
 }

+#'
Comment (Member): Do we need to export these from C++ to R if they're not used by any package R code?

If we are going to keep them exported (from C++) but internal, then I'd like to change the names, because people can still see these functions with `:::` and the names are super verbose.

+#' @returns A list with the following
+#' - `uncut_tree`: The spanning tree drawn on the region stored as a
+#' 0-indexed directed edge adjacency graph.
+#' - `num_attempts`: The number of attempts it took to draw the tree.
+#'
+#' @keywords internal
+#' @noRd
+draw_a_tree_on_a_region <- function(adj_list, counties, pop, ndists, num_regions, num_districts, region_id_to_draw_tree_on, lower, upper, region_ids, region_sizes, verbose) {
+    .Call(`_redist_draw_a_tree_on_a_region`, adj_list, counties, pop, ndists, num_regions, num_districts, region_id_to_draw_tree_on, lower, upper, region_ids, region_sizes, verbose)
+}
+
+#' Splits a multidistrict into two new regions within population bounds
+#'
+#' Splits a multidistrict into two new valid regions by drawing spanning
+#' trees uniformly at random and attempting to find an edge to cut until
+#' a successful cut is made.
+#'
+#' @title Split a multidistrict into two regions
+#'
+#' @inheritParams run_redist_smc
+#' @noRd
+perform_a_valid_multidistrict_split <- function(adj_list, counties, pop, ndists, num_regions, num_districts, region_id_to_split, target, lower, upper, region_ids, region_sizes, split_dval_min, split_dval_max, split_district_only, verbose = FALSE, k_param = 1L) {
+    .Call(`_redist_perform_a_valid_multidistrict_split`, adj_list, counties, pop, ndists, num_regions, num_districts, region_id_to_split, target, lower, upper, region_ids, region_sizes, split_dval_min, split_dval_max, split_district_only, verbose, k_param)
+}
+
+draw_trees_on_a_region <- function(adj_list, counties, pop, ndists, region_id_to_draw_tree_on, region_size, lower, target, upper, region_ids, num_tree, num_threads, verbose) {
+    .Call(`_redist_draw_trees_on_a_region`, adj_list, counties, pop, ndists, region_id_to_draw_tree_on, region_size, lower, target, upper, region_ids, num_tree, num_threads, verbose)
+}
+
+attempt_splits_on_a_region <- function(adj_list, counties, pop, ndists, init_num_regions, region_id_to_split, lower, target, upper, region_ids, region_sizes, splitting_schedule_str, k_param, num_plans, num_threads, verbose) {
+    .Call(`_redist_attempt_splits_on_a_region`, adj_list, counties, pop, ndists, init_num_regions, region_id_to_split, lower, target, upper, region_ids, region_sizes, splitting_schedule_str, k_param, num_plans, num_threads, verbose)
+}
+
+compute_log_unnormalized_target_density_components <- function(adj_list, counties, pop, constraints, pop_temper, compute_pop_temper, rho, ndists, total_seats, num_regions, district_seat_sizes, lower, target, upper, region_ids, region_sizes, output_type, num_threads) {
+    .Call(`_redist_compute_log_unnormalized_target_density_components`, adj_list, counties, pop, constraints, pop_temper, compute_pop_temper, rho, ndists, total_seats, num_regions, district_seat_sizes, lower, target, upper, region_ids, region_sizes, output_type, num_threads)
+}
+
+compute_plans_log_optimal_weights <- function(adj_list, counties, pop, constraints, pop_temper, rho, splitting_schedule_str, ndists, total_seats, district_seat_sizes, num_regions, lower, target, upper, region_ids, region_sizes, num_threads) {
+    .Call(`_redist_compute_plans_log_optimal_weights`, adj_list, counties, pop, constraints, pop_temper, rho, splitting_schedule_str, ndists, total_seats, district_seat_sizes, num_regions, lower, target, upper, region_ids, region_sizes, num_threads)
+}
+
 group_pct_top_k <- function(m, group_pop, total_pop, k, n_distr) {
     .Call(`_redist_group_pct_top_k`, m, group_pop, total_pop, k, n_distr)
 }
@@ -89,20 +139,32 @@ prec_cooccur <- function(m, idxs, ncores = 0L) {
     .Call(`_redist_prec_cooccur`, m, idxs, ncores)
 }

-group_pct <- function(m, group_pop, total_pop, n_distr) {
-    .Call(`_redist_group_pct`, m, group_pop, total_pop, n_distr)
+group_pct <- function(plans_mat, group_pop, total_pop, n_distr, ncores = 1L) {
+    .Call(`_redist_group_pct`, plans_mat, group_pop, total_pop, n_distr, ncores)
 }

+pop_tally <- function(districts, pop, n_distr, ncores = 1L) {
+    .Call(`_redist_pop_tally`, districts, pop, n_distr, ncores)
+}
+
-pop_tally <- function(districts, pop, n_distr) {
-    .Call(`_redist_pop_tally`, districts, pop, n_distr)
+infer_region_seats <- function(region_pops, lower, upper, total_seats, num_threads = 0L) {
+    .Call(`_redist_infer_region_seats`, region_pops, lower, upper, total_seats, num_threads)
 }

-max_dev <- function(districts, pop, n_distr) {
-    .Call(`_redist_max_dev`, districts, pop, n_distr)
+max_dev <- function(districts, pop, n_distr, multimember_districts = FALSE, nseats = -1L, seats_matrix = matrix(1,1), num_threads = 1L) {
+    .Call(`_redist_max_dev`, districts, pop, n_distr, multimember_districts, nseats, seats_matrix, num_threads)
 }

-ms_plans <- function(N, l, init, counties, pop, n_distr, target, lower, upper, rho, constraints, control, k, thin, verbosity) {
-    .Call(`_redist_ms_plans`, N, l, init, counties, pop, n_distr, target, lower, upper, rho, constraints, control, k, thin, verbosity)
+order_district_stats <- function(district_stats, ndists, num_threads) {
+    .Call(`_redist_order_district_stats`, district_stats, ndists, num_threads)
+}
+
+order_columns_by_district <- function(df, columns, ndists, num_threads = 0L) {
+    .Call(`_redist_order_columns_by_district`, df, columns, ndists, num_threads)
+}
+
+ms_plans <- function(nsims, warmup, thin, ndists, total_seats, district_seat_sizes, adj_list, counties, pop, target, lower, upper, rho, init_plan, init_seats, sampling_space_str, merge_prob_type, control, constraints, verbosity = 3L, diagnostic_mode = FALSE) {
+    .Call(`_redist_ms_plans`, nsims, warmup, thin, ndists, total_seats, district_seat_sizes, adj_list, counties, pop, target, lower, upper, rho, init_plan, init_seats, sampling_space_str, merge_prob_type, control, constraints, verbosity, diagnostic_mode)
 }

 pareto_dominated <- function(x) {
@@ -125,6 +187,125 @@ resample_lowvar <- function(wgts) {
.Call(`_redist_resample_lowvar`, wgts)
}

#' Reorders all the plans in the vector by the order in which regions were split
#'
#' Takes a vector of plans and uses the vector of dummy plans to reorder
#' each plan by the order in which its regions were split.
#'
#'
#' @title Reorders all the plans in the vector by the order in which regions were split
#'
#' @param pool A threadpool for multithreading
#' @param plan_ptrs_vec A vector of pointers to plans
#' @param dummy_plans_vec A vector of pointers to dummy plans
#'
#' @details Modifications
#' - Each plan in the `plan_ptrs_vec` object is reordered by when each region was split
#' - Each plan is a shallow copy of the plans in `plan_ptrs_vec`
#'
#' @noRd
#' @keywords internal
NULL

maximum_input_sizes <- function() {
.Call(`_redist_maximum_input_sizes`)
}

#' Checks that a matrix of seat counts is valid
#'
#' Checks that a matrix of seat counts associated with a set of plans is valid,
#' meaning that every region has a positive seat count and that, for each plan,
#' the seats sum to the total number of seats (`nseats`).
#' If any check fails, an error is thrown.
#'
#' @param init_seats A matrix of seat counts associated with the plans
#' @param num_regions The number of regions in the plan.
#' @param nseats The total number of seats in the map
#' @param seats_range Vector of the seat counts a district is allowed to have
#' @param split_districts_only Whether to check that all regions except the last
#' are districts. (The last region is allowed to be a district too.)
#' @param num_threads The number of threads to use. Defaults to 1.
#'
#' @details Modifications
#' - None
#'
#' @keywords internal
#' @noRd
validate_init_seats_cpp <- function(init_seats, num_regions, nseats, seats_range, split_districts_only, num_threads = 1L) {
invisible(.Call(`_redist_validate_init_seats_cpp`, init_seats, num_regions, nseats, seats_range, split_districts_only, num_threads))
}
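
The two checks described in the documentation above (positive seat counts, and per-plan sums equal to `nseats`) can be sketched in plain R. This is only an illustration, not the package's C++ implementation: the name `validate_seats_sketch` is hypothetical, the layout of one plan per column and one region per row is an assumption, and the `seats_range` and `split_districts_only` checks are left out.

```r
# Hypothetical pure-R sketch; NOT the C++ routine behind validate_init_seats_cpp.
# Assumption: seats_mat has one row per region and one column per plan.
validate_seats_sketch <- function(seats_mat, nseats) {
  # Every region must have a strictly positive seat count.
  if (any(seats_mat <= 0)) {
    stop("every region must have a positive number of seats")
  }
  # For each plan (column), the seats must sum to the total number of seats.
  bad_plans <- which(colSums(seats_mat) != nseats)
  if (length(bad_plans) > 0) {
    stop("seats do not sum to nseats in plan(s): ",
         paste(bad_plans, collapse = ", "))
  }
  invisible(TRUE)
}

# Two plans with three regions each, both summing to 4 seats.
seats <- cbind(c(1, 2, 1), c(2, 1, 1))
validate_seats_sketch(seats, nseats = 4)
```

Like the wrapper above, the sketch returns invisibly on success and signals failure with an error rather than a return value.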

#' Get canonically relabeled plans matrix
#'
#' Given a matrix of 1-indexed plans (or partial plans), this function
#' returns a new plans matrix with all the plans labelled canonically.
#' The canonical labelling of a plan is the one where the region of the
#' first vertex gets mapped to 1, the region of the lowest-indexed vertex
#' outside that region gets mapped to 2, and so on. This is guaranteed to
#' give the same labelling for any two plans that differ only by a
#' permutation of their region ids.
#'
#'
#' @param plans_mat A matrix of 1-indexed plans
#' @param num_regions The number of regions in the plan
#' @param num_threads The number of threads to use. Defaults to number of machine threads.
#'
#' @details Modifications
#' - None
#'
#' @returns A matrix of canonically labelled plans
#'
#' @keywords internal
#' @noRd
get_canonical_plan_labelling <- function(plans_mat, num_regions, num_threads = 0L) {
.Call(`_redist_get_canonical_plan_labelling`, plans_mat, num_regions, num_threads)
}
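
The canonical labelling described above amounts to relabelling regions by order of first appearance. A minimal pure-R sketch follows; the helper names `canonical_labels` and `canonical_plans_sketch` are hypothetical, and the layout of one plan per column is an assumption. It is not the multithreaded C++ routine that the wrapper calls.

```r
# Hypothetical pure-R sketch of canonical relabelling; not the package's C++ code.
canonical_labels <- function(plan) {
  # unique(plan) lists region ids in order of first appearance, so match()
  # maps the first vertex's region to 1, the next new region to 2, and so on.
  match(plan, unique(plan))
}

# Assumption: plans_mat stores one plan per column.
canonical_plans_sketch <- function(plans_mat) {
  out <- apply(plans_mat, 2, canonical_labels)
  # apply() drops to a vector when there is a single vertex; restore the shape.
  matrix(out, nrow = nrow(plans_mat))
}

# Two labellings of the same partition that differ only by a permutation of ids.
a <- c(3L, 3L, 1L, 2L, 1L)
b <- c(2L, 2L, 3L, 1L, 3L)
identical(canonical_labels(a), canonical_labels(b))
```

Because the new label depends only on where each region first appears, any permutation of the region ids yields the same output vector.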

#' Count how many times each plan appears in a plans matrix
#'
#' Given a matrix of 1-indexed plans (or partial plans), this function
#' returns a list mapping each plan vector, stored as one concatenated
#' string, to the number of times the plan appears.
#'
#' If `use_canonical_ordering` is set to true then the plans will be
#' relabelled using the canonical relabelling function
#' `get_canonical_plan_labelling`. This guarantees that the same plan
#' will not be counted as several distinct plans when its labels appear
#' in different permutations. If `use_canonical_ordering` is not set to
#' true then it's possible the count will be incorrect because of
#' different permutations of the same underlying plan.
#'
#'
#' @param input_plans_mat A matrix of 1-indexed plans
#' @param num_regions The number of regions in the plan
#' @param use_canonical_ordering Whether or not to reorder the plans using the
#' canonical ordering on plans.
#' @param num_threads The number of threads to use. Defaults to number of machine threads.
#'
#' @details Modifications
#' - None
#'
#' @returns A list mapping plans (each stored as a concatenated string) to
#' how many times they appear in the matrix
#'
#' @keywords internal
#' @noRd
get_plan_counts <- function(input_plans_mat, num_regions, use_canonical_ordering = TRUE, num_threads = 0L) {
.Call(`_redist_get_plan_counts`, input_plans_mat, num_regions, use_canonical_ordering, num_threads)
}
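
The counting scheme described above, including the effect of `use_canonical_ordering`, can be illustrated in plain R. This is a sketch under stated assumptions rather than the package's implementation: `count_plans_sketch` is a hypothetical name, plans are assumed to be stored one per column, and the comma-separated key format is chosen here for readability and may differ from the string format the C++ code uses.

```r
# Hypothetical pure-R sketch of plan counting; not the package's C++ code.
# Assumption: plans_mat stores one plan per column.
count_plans_sketch <- function(plans_mat, use_canonical_ordering = TRUE) {
  keys <- apply(plans_mat, 2, function(plan) {
    if (use_canonical_ordering) {
      # Relabel regions by order of first appearance so permuted labels agree.
      plan <- match(plan, unique(plan))
    }
    # Assumed key format: labels joined with commas.
    paste(plan, collapse = ",")
  })
  tab <- table(keys)
  # Return a named list mapping each plan string to its count.
  setNames(as.list(as.integer(tab)), names(tab))
}

# Columns 1 and 2 are the same partition under different labels.
plans <- cbind(c(1, 1, 2, 2), c(2, 2, 1, 1), c(1, 2, 2, 2))
count_plans_sketch(plans)
count_plans_sketch(plans, use_canonical_ordering = FALSE)
```

With canonical ordering the first two columns collapse into a single key; without it they are counted as different plans, which is the miscount the documentation warns about.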

resample_plans_lowvar <- function(normalized_weights, plans_mat, region_pops_mat, region_sizes_mat, reorder_sizes_mat) {
.Call(`_redist_resample_plans_lowvar`, normalized_weights, plans_mat, region_pops_mat, region_sizes_mat, reorder_sizes_mat)
}

get_log_number_linking_edges <- function(adj_list, counties, constraints, ndists, nseats, num_regions, region_ids) {
.Call(`_redist_get_log_number_linking_edges`, adj_list, counties, constraints, ndists, nseats, num_regions, region_ids)
}

get_merged_log_number_linking_edges <- function(adj_list, counties, constraints, ndists, nseats, num_regions, region_ids, region1_id, region2_id) {
.Call(`_redist_get_merged_log_number_linking_edges`, adj_list, counties, constraints, ndists, nseats, num_regions, region_ids, region1_id, region2_id)
}

plan_joint <- function(m1, m2, pop) {
.Call(`_redist_plan_joint`, m1, m2, pop)
}
@@ -149,8 +330,34 @@ k_biggest <- function(x, k = 1L) {
.Call(`_redist_k_biggest`, x, k)
}

smc_plans <- function(N, l, counties, pop, n_distr, target, lower, upper, rho, districts, n_drawn, n_steps, constraints, control, verbosity = 1L) {
.Call(`_redist_smc_plans`, N, l, counties, pop, n_distr, target, lower, upper, rho, districts, n_drawn, n_steps, constraints, control, verbosity)
#' Run SMC (optionally with merge-split steps too)
#'
#' Uses the SMC method with optimal weights and merge-split steps to generate a sample of `nsims` plans in C++.
#'
#'
#' Using the procedure outlined in <PAPER HERE>, this function uses Sequential
#' Monte Carlo (SMC) methods to generate a sample of `nsims` plans.
#'
#'
#' @param ndists The number of districts the final plans will have
#' @param adj_list A 0-indexed adjacency list representing the undirected graph
#' of the underlying map on which the plans are drawn
#' @param counties Vector of county labels of each vertex in `adj_list`
#' @param pop A vector of the population associated with each vertex in `adj_list`
#' @param target Ideal population of a valid district. Population deviation is calculated
#' relative to this value
#' @param lower Acceptable lower bound on a valid district's population
#' @param upper Acceptable upper bound on a valid district's population
#' @param nsims The number of plans (samples) to draw
#' @param k_param The k parameter from the SMC algorithm: an edge is chosen from among the top `k_param` edges
#' @param control Named list of additional parameters.
#' @param num_threads The number of threads the threadpool should use
#' @param verbosity What level of detail to print out while the algorithm is
#' running <ADD OPTIONS>
#' @keywords internal
#' @noRd
run_redist_smc <- function(nsims, total_seats, ndists, district_seat_sizes, initial_num_regions, adj_list, counties, pop, step_types, target, lower, upper, rho, sampling_space_str, control, constraints, verbosity, diagnostic_level, region_id_mat, region_sizes_mat, log_weights) {
.Call(`_redist_run_redist_smc`, nsims, total_seats, ndists, district_seat_sizes, initial_num_regions, adj_list, counties, pop, step_types, target, lower, upper, rho, sampling_space_str, control, constraints, verbosity, diagnostic_level, region_id_mat, region_sizes_mat, log_weights)
}

splits <- function(dm, community, nd, max_split) {