-
Notifications
You must be signed in to change notification settings - Fork 0
Linear Bounded Automata enumeration program
The enumeration program was written in C++
to provide a better performance for the intensive task of enumerating and simulating the automata.
It is located in the repository: lba_test_call.cpp
Uses libtbb
and C++17
for_each
for parallelization.
To install libtbb:
sudo apt install libtbb-dev
The program can be compiled with g++
:
g++ -O3 -std=c++17 -o [EXECUTABLE_FILE] [PATH_TO_CPP] -ltbb
The program contains a series of define directives that need to be changed in case one want to simulate different quantity of states or tape size. The reason for this is that we optimized the simulation procedure by restricting to binary alphabet and using BITSET, a data structure "close to an array of bits" that requires its size to be known at compilation time. So, if you desire to change the number of states, or tape size, you will need to adjust the directives in the source file, and recompile it. These directives are:
- N_STATES: number of states for LBA
- OUTPUT_CAP: maximum tape size to consider (this means only strings with up to this size will be produced)
- STATE_BITS: amount of bits required to codify N_STATES (should BE ceiling of log base 2 of N_STATES)
- POS_BITS: amount of bits required to codify OUTPUT_CAP (should BE ceiling of log base 2 of OUTPUT_CAP)
Allowing more alphabets would require further changes in the source code itself, since it was optimized specifically for binary alphabets
Once it is compiled, as "LBA_ENUM", for instance, it can be executed with no parameters:
./LBA_ENUM
In this case, root transitions will be enumerated and paralellized across CPU cores. Each parallel call will generate a file with its partial results. The program creates a dist
directory in the same path where the executable resides, and the partial results files will be within.
It can also be executed in sharded mode, simply by providing an integer argument:
./LBA_ENUM [SHARD_NUMBER]
For more information on the root transitions and sharding, see the sections below.
Each pair of (state, symbol) will output a transition (state, symbol, direction), (where direction is direction to move tape head). We recursively go through each of these pairs and enumerate their possible transitions.
If we have
We list the possible transitions for (n, 1), and send each of them to a CPU core, where it does the recursion starting from (n, 0) down to (0, 0).
Each of these initial transition calls will output a file with the information of strings produced, in the dist
folder.
Sharding here means "breaking the whole enumeration in shards: sub-enumeration".
Instead of parallelizing over all
This allow for smaller subtasks that we can run separately and even distribute among different computers.
For instance, consider that the
Each shard outputs a collection of files (one for each transition), they will be in directory dist/shard[i]/
This sharding can be extend further so we can call two levels depth (so
To deeper understand the foundation of the method, refer to this document
The partial results text files are in the following format:
Count:[number of LBAs simulated]
Halted:[number of LBAs that halted/produced valid results/did not exceed tape size]
[string1]:[amount of LBAs that produced string1]
[string2]:[amount of LBAs that produced string2]
(...)
==
[steps1]:[amount of LBAs that halted after 'steps1' steps]
[steps2]:[amount of LBAs that halted after 'steps2' steps]
(...)
You can see below an example of one of the partial results file when running all LBAs with 3 states:
Count:153664
Halted:110661
01:55918
10:91904
11:94600
00:52727
101:5719
010:2864
111:6159
000:3081
001:3240
110:6314
011:3077
100:6314
0001:1
0000:1
0101:2
1111:2
1000:3
0011:1
1110:3
0111:2
1101:15
0010:1
1011:15
1010:4
0100:14
1100:2
==
10:18
9:6
8:13
5:1900
7:68
6:128
4:1904
3:18816
2:87808
Note that this distribution is not in the final format for the BDM (Block decomposition method).
See here for more information on analyzing the distribution and preparing the BDM dataset.