
TranslodeP2C is an AI-powered pseudocode-to-C++ transformer, built with a seq2seq model. It preprocesses structured pseudocode, trains on paired datasets, and generates efficient C++ code. With an intuitive Streamlit UI, TranslodeP2C enables seamless and intelligent code synthesis from natural language descriptions. 🚀


AbsarRaashid3/TranslodeP2C


TranslodeP2C

Overview

TranslodeP2C is an AI-powered pseudocode-to-C++ conversion system.
Leveraging a Transformer-based seq2seq model,
it translates pseudocode descriptions into structured C++ programs.
The project includes preprocessing, vocabulary building, training,
and inference, with an interactive Streamlit UI.

Features

  • Transformer-based sequence-to-sequence model for code generation.
  • Converts pseudocode to C++ using deep learning.
  • Preprocessing and vocabulary management for structured learning.
  • Training pipeline with customizable hyperparameters.
  • Inference system with greedy decoding.
  • Streamlit-based web UI for user-friendly interactions.

Installation

Prerequisites

Ensure you have the following installed:

  • Python 3.8+
  • PyTorch
  • Streamlit
  • tqdm

Setup

  1. Clone the repository: git clone https://github.com/absarraashid3/translodep2c.git, then change into it: cd translodep2c
  2. Install dependencies: pip install -r requirements.txt
  3. Prepare your dataset and place it in data/train/split/.

Usage

Preprocessing

Convert TSV training data into paired pseudocode-code format:

 python src/preprocess.py --input_tsv "C:\Projects\GenAi\data\train\split\spoc-train-train.tsv" --output_txt "C:\Projects\GenAi\data\train_pairs.txt" 
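
For orientation, the conversion can be sketched as follows. This is a minimal illustration, not the contents of src/preprocess.py: the column names `text` and `code`, the tab-separated output format, and the function name `tsv_to_pairs` are all assumptions made for the example.

```python
import csv
import io

def tsv_to_pairs(tsv_file, out_file, src_col="text", tgt_col="code"):
    """Write one 'pseudocode<TAB>code' line per TSV row that has both fields.

    The column names are assumptions; adjust them to match the real dataset.
    """
    # QUOTE_NONE keeps double quotes inside C++ snippets untouched.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    written = 0
    for row in reader:
        src = (row.get(src_col) or "").strip()
        tgt = (row.get(tgt_col) or "").strip()
        if not src or not tgt:
            continue  # skip rows without pseudocode or without code
        out_file.write(f"{src}\t{tgt}\n")
        written += 1
    return written

# Tiny in-memory example; the second row has no pseudocode and is skipped.
sample = "text\tcode\tline\nread n\tcin >> n;\t1\n\t}\t2\nprint n\tcout << n;\t3\n"
out = io.StringIO()
count = tsv_to_pairs(io.StringIO(sample), out)
```

With real files, the two `io.StringIO` objects would be replaced by files opened with `open(path, encoding="utf-8", newline="")`.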

Building Vocabulary

Generate vocabulary pickle files from training pairs:

 python src/vocab.py --pairs_file "C:\Projects\GenAi\data\train_pairs.txt" --src_vocab_file "src/src_vocab.pkl" --tgt_vocab_file "src/tgt_vocab.pkl" 
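
The vocabulary step can be pictured roughly like this. It is a sketch under stated assumptions: whitespace tokenization, the four special tokens, and the helper names `build_vocab` and `encode` are illustrative and may differ from what src/vocab.py actually does.

```python
import pickle
from collections import Counter

# Assumed special tokens; the project's own token names may differ.
SPECIALS = ["<pad>", "<sos>", "<eos>", "<unk>"]

def build_vocab(sentences, min_freq=1):
    """Map each whitespace-separated token to an integer id."""
    counts = Counter(tok for s in sentences for tok in s.split())
    vocab = {tok: i for i, tok in enumerate(SPECIALS)}
    # Most frequent tokens first, ties broken alphabetically for determinism.
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab

def encode(sentence, vocab):
    """Wrap a sentence in <sos>/<eos> and map unknown tokens to <unk>."""
    unk = vocab["<unk>"]
    body = [vocab.get(tok, unk) for tok in sentence.split()]
    return [vocab["<sos>"]] + body + [vocab["<eos>"]]

src_vocab = build_vocab(["read n", "print n"])
restored = pickle.loads(pickle.dumps(src_vocab))  # round trip, as a .pkl file would
ids = encode("read m", restored)  # "m" is out of vocabulary
```

Separate vocabularies for the source (pseudocode) and target (C++) sides would be built the same way from the two halves of each pair.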

Training the Model

Train the Transformer model for pseudocode-to-C++ conversion:

 python src/train.py --pairs_file "C:\Projects\GenAi\data\train_pairs.txt" --src_vocab_file "src/src_vocab.pkl" --tgt_vocab_file "src/tgt_vocab.pkl" --epochs 10 --batch_size 8 

Inference

Generate C++ code from input pseudocode:

 python src/infer.py --model_checkpoint transformer_seq2seq.pt --src_vocab_file "src/src_vocab.pkl" --tgt_vocab_file "src/tgt_vocab.pkl" --pseudocode "read n print factorial of n" 
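
Greedy decoding, which the inference step uses, picks the highest-scoring token at every position until an end-of-sequence token appears. The sketch below shows the loop in isolation; `next_token_scores` is a hypothetical stand-in for the trained model, and the toy scorer exists only to make the example runnable without PyTorch.

```python
def greedy_decode(next_token_scores, sos_id, eos_id, max_len=50):
    """Repeatedly append the single best next token until <eos> or max_len.

    next_token_scores(prefix) must return one score per vocabulary id.
    """
    tokens = [sos_id]
    for _ in range(max_len):
        scores = next_token_scores(tokens)
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == eos_id:
            break
        tokens.append(best)
    return tokens[1:]  # drop the start-of-sequence marker

# Toy scorer over a 4-token vocabulary: 0=<sos>, 1=<eos>, 2 and 3 are words.
TOY_NEXT = {0: 2, 2: 3, 3: 1}  # last token -> preferred next token

def toy_scores(prefix):
    scores = [0.0] * 4
    scores[TOY_NEXT[prefix[-1]]] = 1.0
    return scores

result = greedy_decode(toy_scores, sos_id=0, eos_id=1)
```

In the real pipeline the returned ids would be mapped back to C++ tokens through the target vocabulary.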

Web Application

Launch the Streamlit UI:

 streamlit run src/app.py 

Enter pseudocode and get auto-generated C++ code!

Future Enhancements

  • Implement beam search decoding for better predictions.
  • Fine-tune the model to support more programming languages.
  • Optimize the model for faster inference.
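
To illustrate the first item, here is a minimal beam search sketch. It is not part of the project: `next_log_probs` is a hypothetical stand-in for the model, and the toy probability table is invented so that the difference from greedy decoding is visible.

```python
import math

def beam_search(next_log_probs, sos_id, eos_id, beam_width=3, max_len=50):
    """Keep the beam_width best partial sequences by summed log-probability."""
    beams = [([sos_id], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, lp in enumerate(next_log_probs(tokens)):
                candidates.append((tokens + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == eos_id:
                finished.append((tokens, score))  # hypothesis is complete
            else:
                beams.append((tokens, score))
        if not beams:
            break
    best_tokens, _ = max(finished or beams, key=lambda c: c[1])
    return [t for t in best_tokens[1:] if t != eos_id]

# Toy model: 0=<sos>, 1=<eos>, 2 and 3 are words. Rows give P(next | last token).
TOY_PROBS = {
    0: [0.0, 0.0, 0.6, 0.4],
    2: [0.0, 0.5, 0.25, 0.25],
    3: [0.0, 0.9, 0.05, 0.05],
}

def toy_log_probs(prefix):
    return [math.log(p) if p > 0 else float("-inf") for p in TOY_PROBS[prefix[-1]]]

wide = beam_search(toy_log_probs, sos_id=0, eos_id=1, beam_width=2)
narrow = beam_search(toy_log_probs, sos_id=0, eos_id=1, beam_width=1)
```

With a beam width of 1 the search behaves like greedy decoding and commits to token 2 (total probability 0.6 × 0.5 = 0.30), while a width of 2 finds the more probable sequence starting with token 3 (0.4 × 0.9 = 0.36). Practical implementations often add length normalization, which this sketch omits.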

🚀 Transform pseudocode into real C++ with TranslodeP2C!
