Showing 55 open source projects for "fastq"

  • 1
    Downloads: 0 This Week
  • 2
    123FASTQ

    An intuitive and efficient tool for preprocessing Illumina FASTQ reads

    123FASTQ performs all preprocessing steps for Illumina next-generation sequencing reads (FASTQ files) more easily than ever.

    Download the quick user manual for the latest version: https://dl.adbioinformatics.net/NGSNeeds/myTools/123Fastq_v1.3_Manual.pdf

    Authors: Milad Eidi, Samaneh Abdolalizadeh, Mohammad Hossein Nassirpour
    Supervisors: Javad Zahiri, PhD (University of California San Diego); Masoud Garshasbi, PhD (Tarbiat Modares University, Tehran, Iran)

    If you use 123FASTQ, please cite this preprint: 123FASTQ: an intuitive and efficient tool for preprocessing Illumina FASTQ reads, https://www.biorxiv.org/content/10.1101/2024.03.08.584032v1

    Take care of the details and ensure you use the latest version. ...
    Downloads: 10 This Week
  • 3

    BBMap

    BBMap short read aligner, and other bioinformatic tools.

    ...BBNorm: Kmer-based error-correction and normalization tool. Dedupe: Simplifies assemblies by removing duplicate or contained subsequences that share a target percent identity. Reformat: Reformats reads between fasta/fastq/scarf/fasta+qual/sam, interleaved/paired, and ASCII-33/64, at over 500 MB/s. BBDuk: Filters, trims, or masks reads with kmer matches to an artifact/contaminant file. ...and more!
    Leader badge
    Downloads: 281 This Week
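
The "ASCII-33/64" in the Reformat description refers to the two offsets used to encode Phred quality scores as printable characters. The following is a minimal Python sketch of converting a quality string from offset 64 to offset 33. It is not BBMap's implementation; the function name is hypothetical, and it assumes plain Phred scores (it rejects the negative scores that old Solexa-scaled files can contain).

```python
def phred64_to_phred33(quality: str) -> str:
    """Re-encode a FASTQ quality string from ASCII offset 64 to ASCII offset 33.

    Hypothetical helper for illustration only. Assumes non-negative Phred scores.
    """
    converted = []
    for ch in quality:
        score = ord(ch) - 64  # Phred score under the offset-64 encoding
        if score < 0:
            raise ValueError(f"character {ch!r} is not a valid offset-64 quality")
        converted.append(chr(score + 33))  # same score under the offset-33 encoding
    return "".join(converted)


if __name__ == "__main__":
    # 'h' encodes Q40 at offset 64; Q40 at offset 33 is 'I'.
    print(phred64_to_phred33("hhh@"))
```

The score itself never changes; only the character used to print it does, which is why the conversion is a fixed shift of 31 code points.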
  • 4
    miRDeep*

    Please cite: An, J., Lai, J., Lehman, M.L. and Nelson, C.C. (2013) miRDeep*: an integrated application tool for miRNA identification from RNA sequencing data. Nucleic Acids Res, 41, 727-737. We will create an index for you if you tell us your species of interest (j.an@qut.edu.au). To download the command-line version, click "Browse All Files" and get "MDS_command_line_Vxx.zip". For plant miRNA prediction, see miRPlant on SourceForge.
    Downloads: 1 This Week
  • 5
    FastQC

    A quality control analysis tool for high throughput sequencing data

    ...Its goal is to provide a simple way to check the quality of raw sequence data coming from high throughput sequencing pipelines. It does this by running a modular set of analyses on one or more raw sequence files in fastq or bam format. It then produces a report summarizing the results and highlighting any areas where the library may appear unusual. This should then direct you to where your data may have problems and allow you to take the necessary steps to correct them before doing any further analysis. FastQC is not tied to any specific type of sequencing technique, so it can be used to look at libraries of various experiment types (Genomic Sequencing, ChIP-Seq, RNA-Seq, BS-Seq, etc.).
    Downloads: 41 This Week
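
To make the idea of one such analysis concrete, here is a minimal Python sketch that computes the mean Phred quality at each read position of a FASTQ stream. It is an illustration only and does not reflect how FastQC is implemented; the function names are hypothetical, and it assumes four-line records with offset-33 quality encoding.

```python
import io
from typing import Iterator, TextIO


def read_fastq(handle: TextIO) -> Iterator[tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line carries no data we need
        quality = handle.readline().rstrip("\n")
        yield header.strip()[1:], sequence, quality


def mean_quality_per_position(handle: TextIO, offset: int = 33) -> list[float]:
    """Return the mean Phred score at each read position (reads may differ in length)."""
    totals: list[int] = []
    counts: list[int] = []
    for _, _, quality in read_fastq(handle):
        for i, ch in enumerate(quality):
            if i == len(totals):  # first read long enough to reach this position
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [total / count for total, count in zip(totals, counts)]


if __name__ == "__main__":
    sample = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nAC\n+\n!!\n")
    print(mean_quality_per_position(sample))  # [20.0, 20.0, 40.0, 40.0]
```

A position where the mean drops sharply, typically toward the 3' end of the reads, is the kind of pattern such a report is meant to surface.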
  • 6

    miRSim

    Seed-based RNA-Seq Simulator

    The miRSim tool can generate synthetic RNA-Seq data in standard fastq/fasta format by using sequence-specific properties (i.e., the seed and the xseed, which is the part of the sequence remaining after the seed is removed). Additionally, miRSim generates the ground truth in CSV format, which provides information about genomic location, CIGAR string, sequence, and expression counts.
    Downloads: 0 This Week
  • 7

    slimfastq

    An efficient lossless compression for fastq files.

    slimfastq is a CLI application that compresses and decompresses fastq files. It features:
      * High compression ratio
      * Relatively low CPU/memory usage
      * Truly lossless compression/decompression
    Downloads: 0 This Week
  • 8

    MPI-dot2dot

    A Parallel Tool to Find DNA Tandem Repeats on Multicore Clusters

    MPI-dot2dot is a parallel tool to accelerate the identification of Tandem Repeats in multisequence datasets. This tool receives as input a multisequence file in FASTQ or FASTA format. It uses MPI processes and OpenMP threads to exploit the compute capabilities of multicore clusters.
    Downloads: 0 This Week
  • 9

    GapFiller

    A de novo local assembler for paired reads

    GapFiller is a seed-and-extend local assembler to fill the gap within paired reads. It can be used for both DNA and RNA and it has been tested on Illumina data. GapFiller can be used whenever a sequence is to be assembled starting from reads lying on its ends, provided a loose estimate of sequence length.
    Downloads: 0 This Week
  • 10
    NGSReadsTreatment

    Note: To run the new version use Java version 13.

    NGSReadsTreatment, a computational tool for the removal of duplicated reads in paired-end or single-end datasets. NGSReadsTreatment can handle reads from any platform with the same or different sequence lengths. Using the probabilistic structure Cuckoo Filter, the redundant reads are identified and removed by comparing the reads with themselves. Thus, no prerequisite is required beyond the set of reads. NGSReadsTreatment was compared with other redundancy removal tools in analyzing different...
    Downloads: 0 This Week
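
As a rough illustration of reference-free duplicate removal, the Python sketch below keeps the first occurrence of each read sequence. It deliberately swaps in an exact hash set for the Cuckoo Filter named above, so it is not NGSReadsTreatment's method: a Cuckoo Filter is a probabilistic structure that trades a small false-positive rate for a much smaller memory footprint. The function name and the record layout are assumptions made for this example.

```python
import hashlib

Record = tuple[str, str, str]  # (read_id, sequence, quality); layout assumed for this sketch


def remove_duplicate_reads(records: list[Record]) -> list[Record]:
    """Keep the first read seen for each distinct sequence, preserving input order."""
    seen: set[bytes] = set()
    unique: list[Record] = []
    for read_id, sequence, quality in records:
        # Store a fixed-size digest rather than the full sequence to bound memory per read.
        digest = hashlib.blake2b(sequence.encode("ascii"), digest_size=16).digest()
        if digest in seen:
            continue  # an identical sequence was already kept
        seen.add(digest)
        unique.append((read_id, sequence, quality))
    return unique


if __name__ == "__main__":
    reads = [("r1", "ACGT", "IIII"), ("r2", "ACGT", "####"), ("r3", "TTTT", "IIII")]
    print([read_id for read_id, _, _ in remove_duplicate_reads(reads)])  # ['r1', 'r3']
```

For paired-end data, the same idea would apply with the key built from both mates together, so that a pair is dropped only when both of its reads repeat.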
  • 11

    CusVarDB

    CusVarDB generates a variant protein database from NGS datasets

    ...Create the variant protein database

    Apart from the main modules, the program also supports additional functions such as:
    1. Download the SRA file
    2. Convert the SRA file to fastq file format
    3. Download the annotation (ANNOVAR) database
    A dry-run concept is also provided to customize the commands.

    Executables are available at http://bioinfo-tools.com/Downloads/CusVarDB/V1.0.0/
    Test dataset is available at http://bioinfo-tools.com/Downloads/CusVarDB/V1.0.0/test_dataset.rar
    Downloads: 0 This Week
  • 12

    MoPAC

    The Modular Pipeline for the Analysis of CRISPR screens

    To facilitate the comparison of gene essentialities in two or more cell samples, we propose MoPAC (Modular Pipeline for Analysis of CRISPR screens), a Shiny-driven interactive tool for differential essentiality analysis in CRISPR/Cas9 screens. For installation and usage instructions please refer to the wiki page.
    Downloads: 4 This Week
  • 13
    RGT

    Repeat Genotyping Tool

    ...This method is error-prone because it is likely to mistakenly genotype structures that were not identified before, as unidentified SNPs, and of course the whole process is very time-consuming. RGT identifies SSR structures from raw fastq reads, then identifies and exports the germline alleles to the user, along with 2D plots of unit counts (as electrophoresis plots). It also exports 3D plots of repeat unit counts against each other at loci where there is more than one expanding repeat.
    Downloads: 0 This Week
  • 14

    selectseq

    Get specific sequences from a FASTA or FASTQ file.

    A command-line utility to manipulate biological sequences from a FASTA or FASTQ file. Given a list of identifiers, it can extract only that subset of the sequences (or its complement, i.e., the sequences NOT in the list). It can also extract only sequence number N. Compressed sequence files are supported if readable by zcat.
    Downloads: 0 This Week
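
The subset-or-complement selection described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not selectseq's code: the function names and the `invert` parameter are hypothetical, and the input is assumed to be well-formed FASTQ with exactly four lines per record and no blank lines.

```python
from typing import Iterable, Iterator

Record = tuple[str, str, str, str]  # header, sequence, separator, quality


def fastq_records(lines: Iterable[str]) -> Iterator[Record]:
    """Group consecutive lines four at a time into FASTQ records."""
    stripped = (line.rstrip("\n") for line in lines)
    # Passing the same iterator four times makes zip consume four lines per tuple.
    return zip(stripped, stripped, stripped, stripped)


def select_records(
    lines: Iterable[str], wanted: set[str], invert: bool = False
) -> Iterator[Record]:
    """Yield records whose identifier is in `wanted`, or not in it when `invert` is True."""
    for record in fastq_records(lines):
        # The identifier is the header text after '@', up to the first whitespace.
        read_id = record[0][1:].split()[0]
        if (read_id in wanted) != invert:
            yield record


if __name__ == "__main__":
    data = ["@a desc\n", "AC\n", "+\n", "II\n", "@b\n", "GT\n", "+\n", "!!\n"]
    for header, sequence, separator, quality in select_records(data, {"a"}):
        print(header, sequence, separator, quality, sep="\n")
```

Because `lines` can be any iterable of strings, the same function would work on an open file or on a `gzip.open(path, "rt")` handle, which mirrors the compressed-input support mentioned in the description.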
  • 15

    MarDRe

    MapReduce-based tool to remove duplicate DNA reads

    MarDRe is a de novo MapReduce-based parallel tool to remove duplicate and near-duplicate DNA reads through the clustering of single-end and paired-end sequences from FASTQ/FASTA datasets. This tool allows bioinformaticians to avoid analyzing unnecessary reads, reducing the time of subsequent procedures with the dataset. MarDRe is the Big Data counterpart of ParDRe (link above), which employs HPC technologies (i.e., hybrid MPI/multithreading) to reduce runtime on multicore systems. Instead, MarDRe takes advantage of the MapReduce programming model to significantly improve ParDRe performance on distributed systems, especially on cloud-based infrastructures. ...
    Downloads: 0 This Week
  • 16

    HSRA

    Hadoop spliced read aligner for RNA-seq data

    ...This tool allows bioinformatics researchers to efficiently distribute their mapping tasks over the nodes of a cluster by combining a fast multithreaded spliced aligner (HISAT2) with Apache Hadoop, which is a distributed computing framework for scalable Big Data processing. HSRA currently supports single-end and paired-end read alignments from FASTQ/FASTA datasets. Moreover, our tool uses the Hadoop Sequence Parser (HSP) library (link above) to efficiently read the input datasets stored on the Hadoop Distributed File System (HDFS), being able to process datasets compressed with Gzip and BZip2 codecs.
    Downloads: 0 This Week
  • 17

    hppRNA

    A Snakemake-based handy parameter-free pipeline for RNA-Seq analysis

    The hppRNA package is dedicated to end-to-end RNA-Seq analysis of a large number of samples simultaneously, and is formulated in the Snakemake pipeline management system. It starts from fastq files and produces a gene/isoform expression matrix, differentially expressed genes, and sample clusters, and detects SNPs and fusion genes, by combining state-of-the-art software. The first version handles protein-coding genes, lncRNAs and circRNAs and includes six core workflows: (1) TopHat - Cufflinks - Cuffdiff; (2) Subread - featureCounts - DESeq2; (3) STAR - RSEM - EBSeq; (4) Bowtie - eXpress - edgeR; (5) kallisto - sleuth; (6) HISAT - StringTie - Ballgown. ...
    Downloads: 0 This Week
  • 18

    VoltMR

    Pure Java NGS mapping software that runs on Hadoop 2.0

    ...Using 100 cores, VoltMR finishes mapping, sorting, duplicate marking, and local realignment of a typical exome sample (10 GB) in 30 minutes. It uses about 10 GB to 15 GB of RAM for each Hadoop mapper and reducer. Currently, VoltMR takes fastq as input and outputs bam/ADAM format. For DNA mapping, GATK-compatible realignment/recalibration follows mapping. For RNA mapping, a splice-aware algorithm is implemented. Volt is open source, released under "LGPLv3".
    Downloads: 1 This Week
  • 19

    wham_bam

    Simple script to convert all fastq files in a directory into bam files

    The script takes fastq files from sequencing runs (or converted from bam files using bam2fastq) and aligns them to a user-selected genome. Additional options restrict conversion to reads above a certain mapping score, remove duplicates, and generate bed files (requires Bedtools in the path).
    Downloads: 0 This Week
  • 20

    SeqPig

    Use Apache Pig to process your large sequencing datasets!

    ...It provides import and export functions for file formats commonly used for sequencing data, as well as a collection of Pig user-defined functions (UDFs) to help process aligned and unaligned sequence data. Currently SeqPig supports BAM/SAM, FastQ and Qseq input and output. For more information see the manual at http://seqpig.sourceforge.net/
    Downloads: 0 This Week
  • 21

    ClinQC

    ClinQC: A tool for quality control of Sanger and NGS data in clinic

    ClinQC is an integrated and user-friendly pipeline for quality control, filtering and trimming of Sanger and NGS sequencing data for hundreds to thousands of samples/patients in a single run in clinical research. It can analyze raw sequencing data and produces unified output as FASTQ files per sample/patient with Sanger quality encoding. First, ClinQC converts input read files from their native formats to a common FASTQ format and removes adapters and PCR primers. Next, it splits barcoded samples, filters duplicates, contamination and low-quality sequences, and generates a QC report.
    Downloads: 0 This Week
  • 22

    ArtificialFastqGenerator

    Outputs artificial FASTQ files derived from a reference genome.

    ArtificialFastqGenerator takes the reference genome (in FASTA format) as input and outputs artificial FASTQ files in the Sanger format. It can accept Phred base quality scores from existing FASTQ files, and use them to simulate sequencing errors. Since the artificial FASTQs are derived from the reference genome, the reference genome provides a gold-standard for calling variants (Single Nucleotide Polymorphisms (SNPs) and insertions and deletions (indels)).
    Downloads: 0 This Week
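
Using Phred scores to simulate sequencing errors rests on the standard definition of a Phred score Q as an error probability p = 10^(-Q/10). The Python sketch below applies that definition to introduce substitution errors into a read copied from a reference. It is only an illustration of the principle and is not ArtificialFastqGenerator's error model; the function names are hypothetical, offset-33 (Sanger) encoding is assumed, and indels are not modelled.

```python
import random


def error_probability(phred: int) -> float:
    """Convert a Phred score to the probability that the base call is wrong."""
    return 10 ** (-phred / 10)


def simulate_read(
    reference: str, quality: str, rng: random.Random, offset: int = 33
) -> str:
    """Copy `reference`, substituting each base with the probability implied by its quality."""
    if len(reference) != len(quality):
        raise ValueError("reference and quality strings must have the same length")
    bases = []
    for base, q in zip(reference, quality):
        if rng.random() < error_probability(ord(q) - offset):
            # Substitution error: choose uniformly among the other three bases.
            base = rng.choice([b for b in "ACGT" if b != base])
        bases.append(base)
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same read
    # 'I' is Q40 (p = 0.0001) and '#' is Q2 (p is about 0.63) at offset 33.
    print(simulate_read("ACGTACGT", "IIII####", rng))
```

Because every simulated read starts as an exact copy of the reference, any difference in the output is a known, injected error, which is what makes the reference usable as a gold standard when evaluating variant callers.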
  • 23

    bio-cargo

    CARGO - Compressed ARchival for GenOmics

    CARGO is a high-level framework that can semi-automatically generate software systems optimized for the compressed storage of arbitrary types of large genomic data collections. Straightforward applications of CARGO methods to compress FASTQ and SAM format archives require only a few lines of code, produce solutions that match and sometimes outperform specialized format-tailored compressors, and scale well to multi-TB datasets.
    Downloads: 0 This Week
  • 24
    An integrated pipeline for forensic analysis from SNP panel data.
    1. SNP caller: takes a FASTQ file and a reference SNP panel as input and generates SNP calls
    2. Kinship analysis
    3. Ancestry prediction
    4. Data quality check
    5. Replicate analysis
    6. Mixture analysis module, available by request
    Downloads: 0 This Week
  • 25
    Flexbar

    flexible barcode and adapter removal for sequencing platforms

    ...It demultiplexes barcoded runs and removes adapter sequences. Moreover, trimming and filtering features are provided. Flexbar supports next-generation sequencing data in fasta and fastq format, e.g. from the Illumina platform. Reference: Matthias Dodt, Johannes T. Roehr, Rina Ahmed, Christoph Dieterich: Flexbar — flexible barcode and adapter processing for next-generation sequencing platforms. Biology 2012, 1(3):895-905.
    Downloads: 0 This Week