This project implements a lexical analyzer (lexer) for a custom programming language. The lexer uses Non-Deterministic Finite Automata (NFA) and Deterministic Finite Automata (DFA) to recognize different token types, including integers, decimals, identifiers, booleans, operators, and delimiters.

Features:
- Converts NFAs to DFAs for efficient token recognition.
- Supports multiple token types, using priorities to resolve lexemes that match more than one type.
- Tokenizes input source code and reports the type of each token.
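
The NFA-to-DFA conversion listed above is usually done with the subset construction, where each DFA state is a set of NFA states. The following is a minimal sketch of that technique, not the project's actual code: the class `SubsetConstruction`, its nested `Nfa` and `Dfa` types, and the `integerDfa` helper are hypothetical names chosen for illustration.

```java
import java.util.*;

// Hypothetical illustration of the subset construction; not the project's classes.
class SubsetConstruction {
    static final char EPSILON = '\0'; // assumed marker for epsilon edges

    /** NFA transitions: state -> (symbol -> set of target states). */
    static final class Nfa {
        final Map<Integer, Map<Character, Set<Integer>>> delta = new HashMap<>();
        final int start;
        final Set<Integer> accepting;

        Nfa(int start, Set<Integer> accepting) {
            this.start = start;
            this.accepting = accepting;
        }

        void addEdge(int from, char symbol, int to) {
            delta.computeIfAbsent(from, k -> new HashMap<>())
                 .computeIfAbsent(symbol, k -> new TreeSet<>())
                 .add(to);
        }

        Set<Integer> targets(int state, char symbol) {
            return delta.getOrDefault(state, Map.of()).getOrDefault(symbol, Set.of());
        }
    }

    /** DFA whose states are sets of NFA states. */
    static final class Dfa {
        final Map<Set<Integer>, Map<Character, Set<Integer>>> delta = new HashMap<>();
        final Set<Set<Integer>> accepting = new HashSet<>();
        Set<Integer> start;

        boolean accepts(String input) {
            Set<Integer> state = start;
            for (char c : input.toCharArray()) {
                state = delta.getOrDefault(state, Map.of()).get(c);
                if (state == null) return false; // no transition: implicit dead state
            }
            return accepting.contains(state);
        }
    }

    /** All states reachable from the given states using only epsilon edges. */
    static Set<Integer> epsilonClosure(Nfa nfa, Set<Integer> states) {
        Set<Integer> closure = new TreeSet<>(states);
        Deque<Integer> work = new ArrayDeque<>(states);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (int t : nfa.targets(s, EPSILON)) {
                if (closure.add(t)) work.push(t);
            }
        }
        return closure;
    }

    /** Builds the DFA by exploring every reachable set of NFA states. */
    static Dfa toDfa(Nfa nfa, Set<Character> alphabet) {
        Dfa dfa = new Dfa();
        dfa.start = epsilonClosure(nfa, Set.of(nfa.start));
        Deque<Set<Integer>> work = new ArrayDeque<>();
        Set<Set<Integer>> seen = new HashSet<>();
        work.push(dfa.start);
        seen.add(dfa.start);
        while (!work.isEmpty()) {
            Set<Integer> current = work.pop();
            if (!Collections.disjoint(current, nfa.accepting)) dfa.accepting.add(current);
            for (char symbol : alphabet) {
                Set<Integer> moved = new TreeSet<>();
                for (int s : current) moved.addAll(nfa.targets(s, symbol));
                if (moved.isEmpty()) continue;
                Set<Integer> next = epsilonClosure(nfa, moved);
                dfa.delta.computeIfAbsent(current, k -> new HashMap<>()).put(symbol, next);
                if (seen.add(next)) work.push(next);
            }
        }
        return dfa;
    }

    /** Example: an NFA for "one or more digits", converted to a DFA. */
    static Dfa integerDfa() {
        Nfa nfa = new Nfa(0, Set.of(1));
        Set<Character> digits = new TreeSet<>();
        for (char c = '0'; c <= '9'; c++) {
            digits.add(c);
            nfa.addEdge(0, c, 1);
        }
        nfa.addEdge(1, EPSILON, 0); // loop back so further digits are accepted
        return toDfa(nfa, digits);
    }
}
```

In this sketch, `SubsetConstruction.integerDfa().accepts("42")` returns `true`, while the empty string and `"4a"` are rejected. Missing transitions are treated as an implicit dead state instead of being stored explicitly.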

Prerequisites:
- Java Development Kit (JDK) 11 or later
- Any Java IDE (e.g., IntelliJ IDEA, Eclipse, or VS Code)
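
Installation and usage:
- Clone the repository:

      git clone https://github.com/yourusername/your-repo.git
      cd your-repo

- Compile and run the project:

      javac -d out src/**/*.java
      java -cp out main.Main

  Note: in bash, `**` matches nested directories only when the `globstar` option is enabled (`shopt -s globstar`); otherwise it behaves like `*`.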

The lexer classifies tokens according to the following rules:

Identifiers:
- Must start with a lowercase letter (`a-z`).
- Can contain only lowercase letters.
- Example: `varname`, `hello`, `testvar`

Integers:
- Consist of one or more digits (`0-9`).
- Example: `42`, `12345`

Decimals:
- Consist of an integer part followed by a `.` and a fractional part (up to 5 digits).
- Example: `3.14`, `42.001`
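
The decimal rule maps naturally onto a small hand-written state machine. The sketch below is illustrative only: `DecimalDfa` and its state names are hypothetical, and it assumes the fractional part needs at least one digit, which the rule above does not state explicitly. A counter stands in for five separate fraction states to keep the code short.

```java
// Hypothetical hand-coded recognizer for the decimal rule; not the project's code.
class DecimalDfa {
    private enum State { START, INT_PART, DOT, FRACTION, REJECT }

    static final int MAX_FRACTION_DIGITS = 5;

    static boolean matches(String lexeme) {
        State state = State.START;
        int fractionDigits = 0; // stands in for states FRACTION_1..FRACTION_5
        for (int i = 0; i < lexeme.length() && state != State.REJECT; i++) {
            char c = lexeme.charAt(i);
            boolean digit = c >= '0' && c <= '9';
            switch (state) {
                case START:
                case INT_PART:
                    if (digit) {
                        state = State.INT_PART;
                    } else if (c == '.' && state == State.INT_PART) {
                        state = State.DOT; // a dot is only valid after at least one digit
                    } else {
                        state = State.REJECT;
                    }
                    break;
                case DOT:
                case FRACTION:
                    if (digit && ++fractionDigits <= MAX_FRACTION_DIGITS) {
                        state = State.FRACTION;
                    } else {
                        state = State.REJECT; // non-digit or too many fraction digits
                    }
                    break;
                default:
                    state = State.REJECT;
            }
        }
        // Assumption: at least one fractional digit is required to accept.
        return state == State.FRACTION;
    }
}
```

Under these assumptions, `DecimalDfa.matches("3.14")` and `DecimalDfa.matches("42.001")` return `true`, while `"42"`, `"3."`, `".5"`, and `"1.123456"` are rejected.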

Booleans:
- Recognized keywords: `true` and `false`.
- Example: `true`, `false`
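
Because `true` and `false` also satisfy the identifier rule, some priority order is needed to decide between the two types. The sketch below shows one way this could work; it is not the project's implementation. It uses `java.util.regex` patterns as a stand-in for the DFAs, and both the `TokenPriority` class and the assumed order (booleans before identifiers, decimals before integers) are illustrative guesses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical priority-based classifier; regexes stand in for the project's DFAs.
class TokenPriority {
    // Insertion order doubles as priority: earlier entries win (assumed order).
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("BOOLEAN", Pattern.compile("true|false"));
        RULES.put("IDENTIFIER", Pattern.compile("[a-z]+"));
        RULES.put("DECIMAL", Pattern.compile("[0-9]+\\.[0-9]{1,5}"));
        RULES.put("INTEGER", Pattern.compile("[0-9]+"));
    }

    /** Returns the first token type whose pattern matches the whole lexeme. */
    static String classify(String lexeme) {
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            if (rule.getValue().matcher(lexeme).matches()) {
                return rule.getKey();
            }
        }
        return "UNKNOWN";
    }
}
```

With this ordering, `TokenPriority.classify("true")` yields `BOOLEAN`, whereas `classify("truth")` falls through to `IDENTIFIER`.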

Arithmetic operators:
- Supported: `+`, `-`, `*`, `/`, `%`, `^`.
- Example: `a + b`, `x * y`

Assignment operator:
- Single equal sign (`=`) for assignment.
- Example: `x = 10;`

Delimiters:
- Supported: `;`, `,`, `(`, `)`, `{`, `}`, `[`, `]`.
- Example: `if (true) {}`

Example input:

    x = 10;
    y = 3.14159;
    z = x + y;
    if (true) { z = z * 2; }

The lexer classifies each token in the input and outputs its type.
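
To make the expected classification concrete, here is a compact hand-written scanner that applies the rules above with longest-match scanning. It is a sketch under stated assumptions, not the project's lexer: the `SampleLexer` class, the type names, and the `TYPE(lexeme)` output format are all hypothetical, and the five-digit limit on fractions is not enforced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner for the sample input; names and output format are assumptions.
class SampleLexer {
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            String type;
            if (isDigit(c)) {
                while (i < source.length() && isDigit(source.charAt(i))) i++;
                type = "INTEGER";
                // A '.' followed by a digit extends the match to a decimal.
                if (i + 1 < source.length() && source.charAt(i) == '.'
                        && isDigit(source.charAt(i + 1))) {
                    i++;
                    while (i < source.length() && isDigit(source.charAt(i))) i++;
                    type = "DECIMAL";
                }
            } else if (isLower(c)) {
                while (i < source.length() && isLower(source.charAt(i))) i++;
                String word = source.substring(start, i);
                // Assumed priority: boolean keywords win over identifiers.
                type = word.equals("true") || word.equals("false") ? "BOOLEAN" : "IDENTIFIER";
            } else if ("+-*/%^".indexOf(c) >= 0) {
                i++;
                type = "OPERATOR";
            } else if (c == '=') {
                i++;
                type = "ASSIGNMENT";
            } else if (";,(){}[]".indexOf(c) >= 0) {
                i++;
                type = "DELIMITER";
            } else {
                i++;
                type = "UNKNOWN";
            }
            tokens.add(type + "(" + source.substring(start, i) + ")");
        }
        return tokens;
    }

    private static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    private static boolean isLower(char c) { return c >= 'a' && c <= 'z'; }
}
```

For the first line of the example, `SampleLexer.tokenize("x = 10;")` returns `[IDENTIFIER(x), ASSIGNMENT(=), INTEGER(10), DELIMITER(;)]`. Note that `if` comes out as an `IDENTIFIER` in this sketch, since the rules above define no keywords other than `true` and `false`.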
This project is licensed under the MIT License. See the LICENSE file for details.