HAlign-G: A rapid and low-memory multiple-genome aligner for large-scale closely related genomes

HAlign-G is a tool written in C++ for aligning multiple genomes. It runs on Linux and Windows.

Installation

Linux/WSL (Windows Subsystem for Linux) - from Anaconda

1. Install WSL for Windows. Instructional video 1 or 2 (Copyright belongs to the original work).

2. Download and install Anaconda. Download Anaconda for different systems here. Instructional video of Anaconda installation 1 or 2 (Copyright belongs to the original work).

3. Install HAlign-G.

#1 Create and activate a conda environment for HAlign-G
conda create -n haligng_env
conda activate haligng_env

#2 Add channels to conda
conda config --add channels malab
conda config --add channels conda-forge

#3 Install HAlign-G
conda install -c malab -c conda-forge halign-g

#4 Test HAlign-G
halign-g -h

Usage

Usage of halign-g:
halign-g Input_file Output_file [-r/--reference val] [-p/--threads val] [-bwt/--bwt val] [-dc/--divide_conquer val] [-sv/--svLen val] [-h/--help]
        Input_file      :  Input file/folder path[Please use .fasta as the file suffix or a folder]
        Output_file     :  Output file path[Please use .maf or .fasta as the file suffix]
        -r/--reference            : The reference sequence name [Please delete all whitespace] (default value:[Longest])
        -p/--threads              : The number of threads (default value:1)
        -bwt/--bwt                : The global BWT threshold (default value:15)
        -dc/--divide_conquer      : The divide & conquer Kband threshold (default value:10000)
        -sv/--svLen               : The structure variation length threshold (default value:200)
        -h/--help                 : Show this help message (default value:false)

HAlign-G1 is designed for multiple sequence alignment (MSA). Among the three versions, it provides the highest alignment quality and supports both FASTA and MAF formats.
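As a minimal sketch of the usage text above, the following script assembles a `halign-g` command and runs it only when the tool and the input are present. The file names `genomes.fasta` and `aligned.maf` and the thread count are hypothetical; the flags and their values are the ones listed in the help output.

```shell
#!/bin/sh
# Hypothetical file names; paths must not contain whitespace in this sketch.
input="genomes.fasta"
output="aligned.maf"
threads=4

# Flags come from the usage text: -p threads, -bwt BWT threshold,
# -sv structural variation length threshold (15 and 200 are the documented defaults).
cmd="halign-g $input $output -p $threads -bwt 15 -sv 200"
echo "$cmd"

# Run only if halign-g is installed and the input file exists.
if command -v halign-g >/dev/null 2>&1 && [ -f "$input" ]; then
    $cmd
else
    echo "halign-g or $input not found; skipping run" >&2
fi
```

Changing the output suffix to `.fasta` should select FASTA output instead of MAF, per the `Output_file` description.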

Usage of halign-g2:
halign-g2 Input_file Output_file [-r/--reference val] [-p/--threads val] [-dc/--divide_conquer val] [-h/--help]
        Input_file      :  Input file/folder path[Please use .fasta as the file suffix or a folder]
        Output_file     :  Output file path[Please use .maf or .fasta as the file suffix]
        -r/--reference            : The reference sequence name [Please delete all whitespace] (default value:[Longest])
        -p/--threads              : The number of threads (default value:1)
        -dc/--divide_conquer      : The divide & conquer Kband threshold (default value:10000)
        -h/--help                 : Show this help message (default value:false)

HAlign-G2 is tailored for multi-genome alignment within the same species. It is optimized to detect a greater number of structural variations, while also supporting both FASTA and MAF formats.
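The `-r/--reference` option asks for a sequence name with all whitespace deleted. The sketch below shows one way to prepare such a name before building a `halign-g2` command. It assumes that "delete all whitespace" means stripping spaces, tabs and newlines from the name; the header `sample 01 chr1`, the folder `genomes_dir` and the output file are hypothetical. The script only prints the command.

```shell
#!/bin/sh
# Hypothetical raw sequence name; replace it with one from your own FASTA file.
raw_name="sample 01 chr1"

# Assumption: stripping every whitespace character yields the name -r expects.
ref_name=$(printf '%s' "$raw_name" | tr -d '[:space:]')

# Hypothetical input folder and output file; flags are from the usage text.
cmd="halign-g2 genomes_dir aligned.fasta -r $ref_name -p 4"
echo "$cmd"
```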

Usage of halign-mum:
halign-mum Input_file Output_file [-r/--reference val] [-t/--threads val] [-l/--load val] [-f/--filter val] [-m/--memory val] [-h/--help]
        Input_file      :  Input folder path
        Output_file     :  Output file path[Please use .maf or .hal as the file suffix]
        -r/--reference            : The reference file name (default value:[Longest])
        -t/--threads              : The number of threads (default value:32)
        -l/--load                 : Whether the second run and the result directory has a postfix structure file (default value:0)
        -f/--filter               : Filter-level: 0-None, 1-General, 2-Strict (default value:2)
        -m/--memory               : Maximum available memory / GB (default value:28)
        -h/--help                 : Show this help message (default value:false)

HAlign-MUM is part of the HAlign-G2 framework, built on the same algorithmic logic, but extended to support cross-species genome alignment. Unlike HAlign-G1 and G2, it only supports the MAF format.
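Because the three commands list different output suffixes in their usage text, a small pre-flight check can catch a mismatched file name before a long run. The sketch below is illustrative only: `check_suffix` is a hypothetical helper, not part of HAlign-G, and the folder, output name and option values are assumptions. The accepted suffixes are copied from the `Output_file` lines of the help output.

```shell
#!/bin/sh
# check_suffix TOOL OUTPUT
# Succeeds if OUTPUT ends in a suffix that the usage text lists for TOOL.
check_suffix() {
    case "$1:$2" in
        halign-g:*.maf|halign-g:*.fasta)   return 0 ;;
        halign-g2:*.maf|halign-g2:*.fasta) return 0 ;;
        halign-mum:*.maf|halign-mum:*.hal) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical input folder and output file.
input_dir="genomes_dir"
output="cross_species.maf"

if check_suffix halign-mum "$output"; then
    # -t threads, -m memory in GB, -f filter level (1 = General), per the usage text.
    cmd="halign-mum $input_dir $output -t 8 -m 16 -f 1"
    echo "$cmd"
else
    echo "unsupported output suffix for halign-mum: $output" >&2
fi
```

Lowering `-t` and `-m` from their documented defaults (32 threads, 28 GB) is a choice made here for a smaller machine, not a recommendation from the authors.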

Contacts

The software tools are developed and maintained by ZOU's lab.

For more tools and information, visit our GitHub.
