Browse free open source Linguistics software and projects below.

  • 1
    Apertium: Machine Translation Toolbox

    The free and open-source rule-based machine translation platform

    Apertium is a toolbox to build open-source shallow-transfer machine translation systems, especially suitable for related language pairs: it includes the engine, maintenance tools, and open linguistic data for several language pairs.
    Downloads: 11 This Week
  • 2
    oopinyinguide
    OO Pinyin Guide is a Java extension for OpenOffice 3 or higher. It enables the user to add pinyin transliteration over Chinese characters inside a text document. This tool can be useful for people learning or teaching Chinese.
    Downloads: 12 This Week
  • 3
    DSL-KeyPad

    Multilingual input tool. Latin, Cyrillic, IPA, Math, historic, etc.

    “DSL KeyPad” is a utility written in AutoHotkey 2.0, designed for inputting a wide range of characters using hotkeys and auxiliary functions. Its primary focus is on enhancing input capabilities for Latin and Cyrillic scripts, allowing typing in multiple languages without the need for a separate keyboard layout for each language. It requires the common QWERTY (English US) and ЙЦУКЕН (Russian) keyboard layouts. More than 6,300 Unicode characters are available. Additionally, it supports typing in Germanic runes, Glagolitic, Old Turkic, Old Permic, Phoenician, Carian, Lycian, Ugaritic, etc.
    Downloads: 7 This Week
  • 4

    Discriminative Language Editor

    Discriminative language editor based on ontologies

    A text editor written in Java that detects discriminative expressions while the user is typing. When the internal ontology-based analyzer detects a potentially discriminative expression, the user is alerted by underlining of the related words in the text. A descriptive message about the issue is also shown when the cursor is placed over the flagged expression.
    Downloads: 1 This Week
  • 5
    EzerKb is a virtual keyboard for Windows. It emulates a keyboard with, for example, Russian, Greek, or Hebrew characters without actually installing a keyboard driver for that language. EzerKb works with most (but not all) Windows programs.
    Downloads: 1 This Week
  • 6
    A simple Java GUI tool for looking at the Spectrum and Cepstrum of a sound clip.
    Downloads: 1 This Week
  • 7

    Automatic Compound Processing (AuCoPro)

    Automatic compound splitting and semantic analysis of compounds

    The central problem addressed in this project is a multidisciplinary (linguistics and computational linguistics) investigation into the sharing of knowledge and resources between closely related languages, specifically in relation to the automatic processing of compounds. We will explore the possibility of creating new knowledge about closely related languages and of efficiently developing additional, more advanced resources for (a) compound segmentation and (b) the semantic analysis of compounds. The project will therefore be divided into two interrelated subprojects, to be executed simultaneously. The focus will be on Afrikaans (with Dutch as the closely related, well-resourced language), which will lay the groundwork for future work on other closely related language pairs.
    Downloads: 0 This Week
  • 8

    Bermuda Text-to-Speech

    This project includes basic NLP and DSP techniques for Text-to-Speech

    This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. See the TTS demo at http://rslp.racai.ro/index.php?page=tts and, to read more about our other NLP and TTS tools, see http://nlptools.racai.ro
    Downloads: 0 This Week
  • 9

    BibleNLP

    Natural Language Processing using The Holy Bible

    This project attempts to develop natural language processing routines as applied to a Bible text domain. Many common technologies (e.g., tokenization, Brill POS tagger) are used in conjunction with theoretical paradigms (e.g., hierarchical word definition trees, phrasal concordance).
    Downloads: 0 This Week
  • 10
    This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read the BioNLP shared task format (http://2011.bionlp-st.org/) and the i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction involves finding the events in a text and the parameters of each event. The method is based on SVM, but other ML algorithms can be adopted. The details of the method are explained in the following paper: Ehsan Emadzadeh, Azadeh Nikfarjam, and Graciela Gonzalez. 2011. Double Layered Learning for Biological Event Extraction from Text. In Proceedings of the BioNLP 2011 Workshop Companion Volume for Shared Task, Portland, Oregon, June. Association for Computational Linguistics.
    Downloads: 0 This Week
  • 12
    CHALICE
    Connecting Historical Authorities with Links, Contexts and Entities. CHALICE is a historic placename gazetteer for the UK, published as Linked Data and linked to other widely-used sources of placename reference information on the semantic web.
    Downloads: 0 This Week
  • 13
    Colloquium QDA

    A free and open source qualitative ethnographic interview coding tool.

    Colloquium QDA is a tool for custom coding and analyzing qualitative ethnographic interviews. To run, make sure you first have JRE 8 or later installed (http://www.oracle.com/technetwork/java/javase/downloads/). Colloquium QDA is an open source cross-platform Java Swing app utilizing an embedded Java DB with Lucene integrated search.
    Downloads: 0 This Week
  • 14
    Color to Word

    Turn colors into words

    The program turns a color into a list of 10 words, obtained according to a custom-designed algorithm based on letter shape and position in the alphabet.
    - Click inside the frame on the left to pick a color through the color chooser window.
    - The program matches the color against the colors corresponding to a list of all the English words contained in the file wordcolor.txt.
    - The first 10 matches appear in the frame on the right.
    - Right-click, then Copy, to copy the word matches and the RGB values.
    This version comes with a text file (wordcolor.txt) containing all the English words followed by Red, Green, Blue channel values for the corresponding color. The colors were obtained through a modified version of the program "Text to Color" by the same author, available for download on GitHub and SourceForge on the profile page of Fonazza-Stent. The next version (coming soon) will include a tool to convert a custom word list into a word+color list named wordcolor.txt.
    Downloads: 0 This Week
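The color-to-word lookup described for Color to Word can be pictured as a nearest-color search over a word list. The sketch below is not the project's code: the line format ("word R G B", optionally comma-separated), the squared Euclidean distance in RGB space, the function names, and the four sample entries are all assumptions made for illustration. The project's own algorithm is only described as being based on letter shape and position in the alphabet.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def parse_wordcolor_line(line: str) -> Tuple[str, RGB]:
    """Parse one line, ASSUMED to look like 'word R G B' or 'word,R,G,B'."""
    parts = line.replace(",", " ").split()
    word = parts[0]
    r, g, b = (int(value) for value in parts[1:4])
    return word, (r, g, b)


def squared_distance(a: RGB, b: RGB) -> int:
    """Squared Euclidean distance between two RGB colors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_words(target: RGB, entries: List[Tuple[str, RGB]], limit: int = 10) -> List[str]:
    """Return up to `limit` words whose colors are nearest to `target`."""
    ranked = sorted(entries, key=lambda entry: squared_distance(entry[1], target))
    return [word for word, _ in ranked[:limit]]


# Made-up sample entries standing in for the contents of wordcolor.txt.
sample_lines = ["rose 200 30 60", "sky 90 160 230", "leaf 40 170 60", "coal 20 20 20"]
entries = [parse_wordcolor_line(line) for line in sample_lines]
print(closest_words((210, 40, 70), entries, limit=2))
```

With a real word list, the entries would be read from the file line by line and the default limit of 10 would reproduce the "first 10 matches" behaviour described above.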
  • 17

    DarlandPhilosophy

    Dennis J. Darland's Philosophy

    Representation of my philosophy (currently limited to philosophy of language) in the languages Prolog or Life. These languages must be acquired separately and Ruby is also needed. However the main purpose is to show how some philosophy problems can be solved. The source code and output are sufficient for that.
    Downloads: 0 This Week
  • 18
    DawNLITE is a Natural-Language-based Image Transmoding Engine. The software transforms an image to a video as recorded by a virtual camera panning and zooming over the image, following a natural language text description of the image.
    Downloads: 0 This Week
  • 19

    Dendrarium

    A system for maintaining constituency parse trees

    Dendrarium is used to select and verify constituency parse trees generated by the Świgra parser. The system is used at the Institute of Computer Science of the Polish Academy of Sciences (Instytut Podstaw Informatyki PAN) to build Składnica, a treebank for the Polish language.
    Downloads: 0 This Week
  • 20

    Drug Extraction

    Drug name extraction

    Drug name recognition and normalisation/grounding to DrugBank ids and standard names. The package provides two taggers:
    1. DrugTagger: CRF-based, with a DrugBank presence feature (see the feature set for details).
    2. DrugnameGazetteer: gazetteer/dictionary-based, with a dictionary created from the DrugBank.ca database.
    Both taggers include grounding/normalisation to DrugBank ids and standard names.
    Feature set: Word, Word-1, Word+1, Word-1_Word, Word_Word+1, DrugBankPresence, POS. The DrugBankPresence feature indicates the presence of the drug name in DrugBank.
    Using CONLL-Evaluation: processed 32065 tokens with 3656 phrases; found: 3251 phrases; correct: 2786. Accuracy: 95.25%; precision: 85.70%; recall: 76.20%; FB1: 80.67.
    Using GATE Corpus Benchmark: strict: P: 0.65, R: 0.73, F1: 0.69; lenient: P: 0.74, R: 0.84, F1: 0.78.
    For details on how to reproduce the evaluation, see the README. To use the standalone version for tagging, download DrugExtractionStandalone.tar.gz from Files.
    Downloads: 0 This Week
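The CoNLL-style phrase scores quoted for Drug Extraction follow from the three phrase counts in the description. The short sketch below recomputes them using the standard definitions of precision, recall, and F1. The function name is hypothetical and this is not the project's evaluation script. The accuracy figure is not recomputed, since the listing does not give the per-token counts it would need.

```python
def phrase_scores(gold: int, found: int, correct: int) -> dict:
    """Compute phrase-level precision, recall and F1 as percentages."""
    precision = correct / found   # share of predicted phrases that are right
    recall = correct / gold       # share of gold phrases that were recovered
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": round(100 * precision, 2),
        "recall": round(100 * recall, 2),
        "f1": round(100 * f1, 2),
    }


# Counts taken from the listing: 3656 gold phrases, 3251 found, 2786 correct.
scores = phrase_scores(gold=3656, found=3251, correct=2786)
print(scores)
```

The rounded values agree with the reported precision of 85.70%, recall of 76.20%, and FB1 of 80.67.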
  • 21
    A small parser for the Esperanto language, intended as an API for other programs. The program takes text in Esperanto and returns an object graph describing that utterance. It has expansion points for transforms of this graph, in the spirit of transformational grammars. The blog entry describing the program's function and motivation can be found here: http://coder32768.blogspot.com/2014/04/gort-klaatu-barada-esperanto.html
    Downloads: 0 This Week
  • 22
    This is a project to convert linguistic field data into other usable formats.
    Downloads: 0 This Week
  • 23
    An agent-based situated language learning simulation that focuses on lexical learning and grounding, featuring a unigram syntax structure and a CFG-based semantic grammar. Created as an MSc thesis project using Python.
    Downloads: 0 This Week
  • 24
    InGen is a Java-based tool that automatically extracts keywords from a given LaTeX document and creates an index for those keywords.
    Downloads: 0 This Week
  • 25
    IsItQt (Qt)

    Identifies whether a Linux program was created with Qt, and with which version

    IsItQt is a Linux console application that identifies whether a program was created using Qt and, in most cases, which version of Qt was used. Article about usage: http://www.cplusplus.com/articles/y3TbqMoL/
    Downloads: 0 This Week