CTGenerator
oschulte edited this page Jul 4, 2017 · 5 revisions
Solves the Contingency Table Problem described in Qian et al. CIKM 2014. Implements the solution in that paper, which uses the Fast Moebius Transform.
The procedure is passed connection objects for several databases:
- `con_std` connects to a `data_db` database with the original data (e.g. unielwin_std). [This should be renamed `con_data`.]
- `con_setup` is a database connection that connects to a metadata database `setup_db` (e.g. unielwin_std_setup). The metadata comprise first-order random variables called functor nodes (e.g. 1Nodes, RNodes, FNodes). Optional arguments:
  - `FunctorSet`: a table in `setup_db`. Restricts the computation to the functor nodes listed in `FunctorSet`. Default setting: contains all functor nodes.
  - `Groundings`: a table in `setup_db`. Contains population variables (e.g. Student). The contingency tables are expanded with entity IDs (e.g. student-id), so that the computation returns counts for individuals. Default setting: empty.
- `con_bn` connects to a `bn_db` database that contains metadata for learning (e.g. the lattice of relationship chains).
- `con_ct` connects to a `ct_db` database with the contingency tables that are constructed by a dynamic programming algorithm. [`db_db` and `ct_db` should be merged.]
- After running CTGenerator, `ct_db` contains the contingency table for the first-order random variables listed in `setup_db.FunctorSet` and the data stored in `data_db`. If `setup_db.Groundings` contains first-order population variables, then the contingency table lists counts for each tuple of population members.
Assumes the following steps have been taken:
- The script `transfer.sql` has been run to transfer metadata from `setup_db` to `bn_db`.
- The relationship chain lattice has been generated in `bn_db`.
- Additional metadata has been generated using metadata_3.sql or a variant, depending on which option was chosen (link analysis on or off).
Then does the following:
- Builds contingency tables for each population variable (`BuildCT_Pvars`).
- For each relationship chain length, builds contingency tables for that length (`BuildCT_Rnodes_join`).
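The bottom-up order of the two steps above can be sketched as follows. This is an illustration only: `chain_lattice` and `build_order` are hypothetical helpers, not functions of CTGenerator, and the sketch makes the simplifying assumption that every subset of relationships counts as a chain, whereas the real lattice is read from `bn_db`.

```python
from itertools import combinations


def chain_lattice(relationships):
    """Group relationship sets by size (the chain length).

    Simplifying assumption: every non-empty subset is treated as a chain.
    """
    names = sorted(relationships)
    return {
        length: [tuple(chain) for chain in combinations(names, length)]
        for length in range(1, len(names) + 1)
    }


def build_order(population_vars, relationships):
    """List the tables to build: population variables first,
    then relationship chains in order of increasing length."""
    order = [("pvar", p) for p in population_vars]
    for length, chains in sorted(chain_lattice(relationships).items()):
        order += [("chain", length, chain) for chain in chains]
    return order


# Hypothetical schema with two population variables and two relationships.
order = build_order(["Student", "Course"], ["Registered", "RA"])
```

Processing shorter chains first means that the tables for a chain of length n can reuse the tables already built for its sub-chains, which is what makes the procedure a dynamic program.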
If link analysis is off, the procedure uses simple table joins. If it is on, it performs a virtual join using the Moebius Transform.
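The idea behind the virtual join can be illustrated for a single relationship. The sketch below assumes that counts for absent links are obtained by subtracting the counts for existing links from the counts that ignore the relationship; the toy tables and variable names are hypothetical, and the code is not the CTGenerator implementation.

```python
from collections import Counter
from itertools import product

# Hypothetical toy database.
students = {"jack": "hi", "kim": "lo"}        # student -> intelligence
courses = {"c101": "easy", "c102": "hard"}    # course -> difficulty
registered = {("jack", "c101"), ("kim", "c101"), ("kim", "c102")}

# Counts over existing links: a plain join against the relationship table.
ct_true = Counter((students[s], courses[c]) for s, c in registered)

# Counts that ignore the relationship: the cross product of the
# per-entity value counts, which never touches the relationship table.
ct_star = Counter(product(students.values(), courses.values()))

# Counts over absent links, obtained by subtraction rather than by
# enumerating every unlinked (student, course) pair.
ct_false = {key: ct_star[key] - ct_true[key] for key in ct_star}

# Brute-force check that does enumerate the unlinked pairs.
brute_false = Counter(
    (students[s], courses[c])
    for s, c in product(students, courses)
    if (s, c) not in registered
)
```

The subtraction avoids materialising the table of unlinked pairs, which is typically far larger than the relationship table itself.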
- Make this a self-contained repository.
- Add screenshots
- Add a gallery of examples