GitHub - yousef-younes/RADAr

RADAr: A Transformer-based Autoregressive Decoder Architecture for Hierarchical Text Classification

This repo contains the code for the RADAr model. RADAr is a sequence-to-sequence model for Hierarchical Text Classification.

Here is a brief description of the functionality of the different modules in the repo.

We use the same data splits and preprocessing as HBGL, so the HBGL preprocessing must first be run on each dataset.
Then run the prepare_data.py module from the data_preparation directory. It splits each sample into text and label and creates six files: train.src, train.tgt, val.src, val.tgt, test.src, and test.tgt. The .src files contain the text, while the .tgt files contain the labels.
To get the data statistics, use the datasets_statistics notebook.
The organize_labels_level_wise.py module separates the labels level-wise.
The organize_labels_path_wise.py module organizes the labels into separate paths.
The sweep_code.py module is used for hyperparameter optimization.
The T5_BART_exps.py module is used to get the results with the T5 and BART models.
The analyze_results notebook is used to analyze the model results.
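As a rough illustration of the data layout and the two label organizations described above, here is a minimal sketch. It is not the code from prepare_data.py or the organize_labels_* modules. It assumes one sample per line in each file, whitespace-separated labels in the .tgt files, and a hypothetical child-to-parent map (`toy_parent`) with invented label names. The helper names `read_split`, `group_by_level`, and `group_by_path` are also made up for this example.

```python
import tempfile
from pathlib import Path


def read_split(data_dir, split):
    """Pair line i of <split>.src with line i of <split>.tgt.

    Assumption: one sample per line, labels separated by whitespace.
    """
    src = Path(data_dir, f"{split}.src").read_text(encoding="utf-8").splitlines()
    tgt = Path(data_dir, f"{split}.tgt").read_text(encoding="utf-8").splitlines()
    if len(src) != len(tgt):
        raise ValueError(f"{split}: {len(src)} texts but {len(tgt)} label lines")
    return [(text, labels.split()) for text, labels in zip(src, tgt)]


def label_depth(label, parent):
    """Depth of a label in the hierarchy; top-level labels have depth 1."""
    depth = 1
    while label in parent:
        label = parent[label]
        depth += 1
    return depth


def group_by_level(labels, parent):
    """Level-wise view: one list of labels per hierarchy level, top level first."""
    levels = {}
    for label in labels:
        levels.setdefault(label_depth(label, parent), []).append(label)
    return [levels[d] for d in sorted(levels)]


def group_by_path(labels, parent):
    """Path-wise view: one root-to-leaf path per label with no child in the sample."""
    parents_in_sample = {parent[l] for l in labels if l in parent}
    leaves = [l for l in labels if l not in parents_in_sample]
    paths = []
    for leaf in leaves:
        path = [leaf]
        while path[-1] in parent:
            path.append(parent[path[-1]])
        paths.append(path[::-1])  # reverse so the path runs root -> leaf
    return paths


# Toy data, invented for this example only.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "train.src").write_text(
        "a paper about parsing\na paper about groups\n", encoding="utf-8")
    Path(tmp, "train.tgt").write_text("cs nlp\nmath algebra\n", encoding="utf-8")
    samples = read_split(tmp, "train")

toy_parent = {"ml": "cs", "nlp": "cs", "algebra": "math"}  # child -> parent
toy_labels = ["cs", "ml", "nlp", "math", "algebra"]

print(samples[0])
print(group_by_level(toy_labels, toy_parent))
print(group_by_path(toy_labels, toy_parent))
```

The level-wise view groups labels by depth, while the path-wise view keeps each root-to-leaf chain together. The actual target format used by the repo's modules may differ from both.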

To run the model

After preparing the data with the data preparation modules, specify the data, log, model, and results directories and the other parameters in the YAML file corresponding to each dataset.
To train and test the model, use the command: python main.py xxx

where "xxx" is a three-letter acronym for the dataset: wos, nyt, or rcv, for the WOS, NYT, and RCV1-V2 datasets respectively.
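The mapping from the dataset acronym to its configuration file can be sketched as follows. This is not the code in main.py. It assumes the acronym selects a file named `<acronym>_config.yaml`, which matches the wos_config.yaml, nyt_config.yaml, and rcv_config.yaml files in the repo. The `resolve_config` and `parse_args` helpers are hypothetical names introduced here.

```python
import argparse
from pathlib import Path

# Acronyms accepted on the command line and the datasets they stand for.
DATASETS = {"wos": "WOS", "nyt": "NYT", "rcv": "RCV1-V2"}


def resolve_config(acronym, repo_root="."):
    """Return the path of the YAML config assumed to belong to a dataset acronym."""
    key = acronym.lower()
    if key not in DATASETS:
        raise ValueError(
            f"unknown dataset {acronym!r}; expected one of {sorted(DATASETS)}")
    return Path(repo_root) / f"{key}_config.yaml"


def parse_args(argv=None):
    """Parse a single positional argument: the three-letter dataset acronym."""
    parser = argparse.ArgumentParser(description="Train and test on one dataset.")
    parser.add_argument("dataset", choices=sorted(DATASETS),
                        help="dataset acronym: wos, nyt, or rcv")
    return parser.parse_args(argv)


# Equivalent to invoking the script with "wos" as its only argument.
args = parse_args(["wos"])
config_path = resolve_config(args.dataset)
print(config_path)
```

Restricting the argument with `choices` makes a mistyped acronym fail before any data or model is loaded.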

Repository contents:
data
data_preparation
logs/wos
README.md
T5_BART_exps.py
analyze_results.ipynb
label_tokenizer.py
main.py
model.py
nyt_config.yaml
rcv_config.yaml
requirements.txt
sweep_code.py
utils.py
wos_config.yaml