Open Source Java Natural Language Processing (NLP) Tools for Linux

Browse free, open source Java natural language processing (NLP) tools and projects for Linux below.

  • 1
    VnCoreNLP

    A Vietnamese natural language processing toolkit

    VnCoreNLP is a Java-based natural language processing toolkit tailored for Vietnamese. It offers a fast and accurate pipeline for essential NLP tasks, facilitating research and application development in Vietnamese language processing.
    Downloads: 7 This Week
  • 2
    Stanford CoreNLP

    Stanford CoreNLP, a Java suite of core NLP tools

    CoreNLP is a one-stop shop for natural language processing in Java. It lets users derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports 6 languages: Arabic, Chinese, English, French, German, and Spanish. The centerpiece of CoreNLP is the pipeline. Pipelines take in raw text, run a series of NLP annotators on the text, and produce a final set of annotations. Pipelines produce CoreDocuments, data objects that contain all of the annotation information, accessible with a simple API, and serializable to a Google Protocol Buffer.
    Downloads: 3 This Week
  • 3
    JWNL is a Java API for accessing the WordNet relational dictionary. WordNet is widely used for developing NLP applications, and a Java API such as JWNL will allow developers to more easily use Java for building NLP applications.
    Downloads: 14 This Week
  • 4
    Phrasal

    Statistical phrase-based machine translation system

    Stanford Phrasal is a state-of-the-art statistical phrase-based machine translation system, written in Java. At its core, it provides much the same functionality as the core of Moses. Distinctive features include an easy-to-use API for implementing new decoding model features, the ability to translate using phrases that include gaps (Galley et al. 2010), and conditional extraction of phrase tables and lexical reordering models. Developed by The Natural Language Processing Group at Stanford University, a team of faculty, postdocs, programmers and students who work together on algorithms that allow computers to process and understand human languages. Our work ranges from basic research in computational linguistics to key applications in human language technology, and covers areas such as sentence understanding, automatic question answering, machine translation, syntactic parsing and tagging, and sentiment analysis.
    Downloads: 2 This Week
  • 5
    AminePlatform

    Amine is a Multi-Layer Platform for the development of Intelligent Systems

    Amine is an Artificial Intelligence Multi-Layer Java Open Source Platform dedicated to the development of various kinds of Intelligent Systems and Agents (Knowledge-Based, Ontology-Based, Conceptual Graph (CG) Based, NLP, Reasoning and Learning, etc.). Ontologies and knowledge bases (KBs) can be created and manipulated with various processes. CG theory is used as the main knowledge representation language. Amine provides two languages: PROLOG+CG, which extends PROLOG with CG and Amine modules, and SYNERGY, a visual activation/propagation-based language. SYNERGY treats CGs as activable/executable graphs. For more detail, see //amine-platform.sourceforge.net/
    Downloads: 10 This Week
  • 6
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult-to-access resources. It is a command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, for example text within a document within a zip archive. Here you will find binary downloads and discussion (https://sourceforge.net/p/crgrep/discussion/). The actual development and issue tracking can be found here: https://bitbucket.org/cryanfuse/crgrep
    Downloads: 5 This Week
  • 7
    MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
    Downloads: 5 This Week
  • 8
    OpenNLP provides the organizational structure for coordinating several different projects which approach some aspect of Natural Language Processing. OpenNLP also defines a set of Java interfaces and implements some basic infrastructure for NLP compon
    Downloads: 6 This Week
  • 9
    TXM

    Unicode XML TEI text analysis platform

    TXM is a free and open-source cross-platform Unicode and XML based text analysis environment and graphical client, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with access control built in. Download the latest version of TXM: http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en. TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP full text search engine (http://cwb.sourceforge.net) and a range of statistical functions (factorial analysis, classification, co-occurrence analysis, etc.) based on R packages (http://www.r-project.org). Read the scientific background at the Textométrie project web site http://textometrie.ens-lyon.fr/?lang=en. Read a full description at the TEI Tools wiki http://wiki.tei-c.org/index.php/TXM.
    Downloads: 15 This Week
  • 10
    masmt

    A framework for multi-agent system development

    MaSMT is a Java-based multi-agent system development framework, designed especially for developing English to Sinhala machine translation systems. Its architecture can also be used to develop any multi-agent based system. References: B. Hettige, A. S. Karunananda, G. Rzevski, "Multi-agent solution for managing complexity in English to Sinhala Machine Translation", International Journal of Design & Nature and Ecodynamics, Volume 11, Issue 2, 2016, 88 – 96. B. Hettige, A. S. Karunananda, G. Rzevski, "MaSMT: A Multi-agent System Development Framework for English-Sinhala Machine Translation", International Journal of Computational Linguistics and Natural Language Processing (IJCLNLP), Volume 2, Issue 7, July 2013.
    Downloads: 7 This Week
  • 11
    This OHNLP project has released "pipelines" contributed by members of the OHNLP Consortium. The pipelines are based on the Apache UIMA framework. medKAT/P, MedCoref, MedTagger, MedXN, and cTAKES are licensed under Apache License V2.0. MedTime is licensed under GNU General Public License version 3.0 (GPLv3). cTAKES development has moved to apache.org. See http://ctakes.apache.org/
    Downloads: 2 This Week
  • 12
    OpenCCG, the OpenNLP CCG Library, is a collection of natural language processing components and tools which provide support for parsing and realization with Combinatory Categorial Grammar (CCG).
    Downloads: 3 This Week
  • 13
    Graphical Grammar Studio

    A user-friendly grammar tool for natural language processing tasks

    Full documentation with tutorials is included in the download package. Graphical Grammar Studio (GGS) is a tool for applying grammars that behave as word acceptors/consumers and annotators. GGS grammars can be used to find and annotate sequences of words in a given input that satisfy certain conditions. Its purpose is to support the creation of NLP tools such as phrase chunkers, named entity finders, and pronoun co-reference solvers. A grammar is represented by a state machine which can be visualized, edited and applied. A grammar is organized in graphs of nodes. Nodes are used for consuming words from the input, for executing jumps to other graphs in the grammar, or for creating annotations. GGS has a unique feature: it allows the user to write JavaScript code to be executed for nodes of the grammar. This is useful for checking grammatical agreement, among other things. The user can declare variables (including complex JavaScript structures), check boolean conditions, use variables in annotations, etc.
    Downloads: 2 This Week
  • 14
    IceNLP is an open source Natural Language Processing (NLP) toolkit for analyzing and processing Icelandic text. The toolkit is implemented in Java.
    Downloads: 1 This Week
  • 15
    Maximum entropy is a powerful method for constructing statistical models of classification tasks, such as part of speech tagging in Natural Language Processing. Several example applications using maxent can be found in the OpenNLP Tools Library.
    Downloads: 1 This Week
  • 16

    BioC

    We describe a simple XML format to share text documents and annotations

    A minimalist approach to sharing text documents and data annotations. It allows a large number of different annotations to be represented. Project files contain simple code to hold, read, and write data and perform sample processing; BioC-formatted corpora; and BioC tools that work with BioC corpora. BioC goals are simplicity, interoperability, broad use, and reuse. There should be little investment required to learn to use a format or a software module to process that format. We are interested in reuse, and we focus on common NLP tasks that are broadly useful for text mining.
    Downloads: 1 This Week
  • 17
    JVnSegmenter is a Java-based, open-source Vietnamese word segmentation tool. The segmentation model was trained on about 8,000 sentences using Conditional Random Fields (FlexCRFs). This tool should be useful for the Vietnamese NLP community.
    Downloads: 1 This Week
  • 18
    MutationFinder is a biomedical natural language processing (NLP) system for extracting mentions of point mutations from free text. MutationFinder achieves high performance (99% precision, 81% recall on blind test data) as an information extraction system
    Downloads: 1 This Week
  • 19
    Next Generation Programming

    Compose Software Without Writing Any Programming Code

    "Next Generation Programming - Programming Without Coding Software" is a drag-and-drop wizard for creating simple or complex applications without writing any programming language code. The software is written in Java and is intended for both novice and expert programmers, who can build software with visual tools such as drag-and-drop components and visual editors. It can be used to compose simple or complex applications: database programs, circuit design, and code generation and upload to chips for designed circuits (ESP8266 and ESP32 chips). The software is much simpler to use than PWCT (https://sourceforge.net/projects/doublesvsoop/) and has more features than PWCT, such as SCADA. Please start by looking at the examples on the website; this way, you can learn the features of the software and how to use it in a very short time. More information (documents, videos, examples): negep.epizy.com
    Downloads: 1 This Week
  • 20
    The Aikernel is an intelligence server and cell runtime environment that uses natural language processing and other pattern matching with Activators, Contexts, and Concepts to allow multitasking between installed cells.
    Downloads: 0 This Week
  • 21
    Ansj Chinese word segmentation

    Ansj word segmentation

    A Java implementation of ict. Word segmentation is faster than in the open source version of ict. Features include Chinese word segmentation, name recognition, part-of-speech tagging, and user-defined dictionaries. The Chinese word segmentation is based on n-Gram+CRF+HMM. Segmentation speed reaches about 2 million words per second (tested on a MacBook Air), and accuracy can exceed 96%. It currently implements Chinese word segmentation, Chinese name recognition, user-defined dictionaries, keyword extraction, automatic summarization, and keyword tagging. It can be applied to natural language processing and related areas, and is suitable for projects that require high segmentation quality.
    Downloads: 0 This Week
  • 22
    Apache OpenNLP

    Apache OpenNLP is a machine learning-based NLP library that provides tools for text-processing tasks such as tokenization, sentence segmentation, and named entity recognition.
    Downloads: 0 This Week
  • 23
    AutoSummary uses Natural Language Processing to generate a contextually-relevant synopsis of plain text. It uses statistical and rule-based methods for part-of-speech tagging, word sense disambiguation, sentence deconstruction and semantic analysis.
    Downloads: 0 This Week
  • 24

    Bermuda Text-to-Speech

    This project includes basic NLP and DSP techniques for Text-to-Speech

    See the TTS demo at: http://rslp.racai.ro/index.php?page=tts This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. If you want to read more about our other NLP and TTS tools, check out http://nlptools.racai.ro.
    Downloads: 0 This Week
  • 25
    This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read BioNLP shared task format (http://2011.bionlp-st.org/) and i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and the parameters for an event in a text. The method is based on SVM but other ML algorithms can be adopted. The method details are explained in the following paper: Ehsan Emadzadeh, Azadeh Nikfarjam, and Graciela Gonzalez. 2011. Double Layered Learning for Biological Event Extraction from Text. In Proceedings of the BioNLP 2011 Workshop Companion Volume for Shared Task, Portland, Oregon, June. Association for Computational Linguistic
    Downloads: 0 This Week