Showing 188 open source projects for "scoring"

View related business solutions
  • Atera all-in-one platform IT management software with AI agents Icon
    Atera all-in-one platform IT management software with AI agents

    Ideal for internal IT departments or managed service providers (MSPs)

    Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.
    Learn More
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 1
    ImageReward

    ImageReward

    [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences

    ImageReward is the first general-purpose human preference reward model (RM) designed for evaluating text-to-image generation, introduced alongside the NeurIPS 2023 paper ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation. Trained on 137k expert-annotated image pairs, ImageReward significantly outperforms existing scoring methods like CLIP, Aesthetic, and BLIP in capturing human visual preferences. It is provided as a Python package (image-reward) that enables quick scoring of generated images against textual prompts, with APIs for ranking, scoring, and filtering outputs. Beyond evaluation, ImageReward supports Reward Feedback Learning (ReFL), a method for directly fine-tuning diffusion models such as Stable Diffusion using human-preference feedback, leading to demonstrable improvements in image quality.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    NGBoost

    NGBoost

    Natural Gradient Boosting for Probabilistic Prediction

    ngboost is a Python library that implements Natural Gradient Boosting, as described in "NGBoost: Natural Gradient Boosting for Probabilistic Prediction". It is built on top of Scikit-Learn and is designed to be scalable and modular with respect to the choice of proper scoring rule, distribution, and base learner. A didactic introduction to the methodology underlying NGBoost is available in this slide deck.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    Automated Interpretability

    Automated Interpretability

    Code for Language models can explain neurons in language models paper

    The automated-interpretability repository implements tools and pipelines for automatically generating, simulating, and scoring explanations of neuron (or latent feature) behavior in neural networks. Instead of relying purely on manual, ad hoc interpretability probing, this repo aims to scale interpretability by using algorithmic methods that produce candidate explanations and assess their quality. It includes a “neuron explainer” component that, given a target neuron or latent feature, proposes natural language explanations or heuristics (e.g. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    X For You Feed Algorithm

    X For You Feed Algorithm

    Algorithm powering the For You feed on X

    X For You Feed Algorithm is the open-sourced core recommendation system that powers the For You feed on X (the social network formerly known as Twitter), and it represents one of the first times a major social platform has published production-level ranking code for public review and experimentation. The repository contains the full pipeline that ingests user engagement and content candidate data, processes it through retrieval, hydration, filtering, scoring, and selection layers, and ultimately ranks posts to show what appears in a user’s feed. At its heart, the system uses a transformer-based model adapted from xAI’s Grok architecture to predict probabilities for various user actions (such as likes, replies, reposts, clicks, and negative signals), then combines those into a weighted final score that drives ranking.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Grafana: The open and composable observability platform Icon
    Grafana: The open and composable observability platform

    Faster answers, predictable costs, and no lock-in built by the team helping to make observability accessible to anyone.

    Grafana is the open source analytics & monitoring solution for every database.
    Learn More
  • 5
    TreeQuest

    TreeQuest

    A Tree Search Library with Flexible API for LLM Inference-Time Scaling

    TreeQuest, developed by SakanaAI, is a versatile Python library implementing adaptive tree search algorithms—such as AB‑MCTS—for enhancing inference-time performance of large language models (LLMs). It allows developers to define custom state-generation and scoring functions (e.g., via LLMs), and then efficiently explores possible answer trees during runtime. With support for multi-LLM collaboration, checkpointing, and mixed policies, TreeQuest enables smarter, trial‑and‑error question answering by leveraging both breadth (multiple attempts) and depth (iterative refinement) strategies to find better outputs dynamically
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Job Recommend

    Job Recommend

    The basics of building a job recommendation workflow

    ...You can study how to transform raw text into features and how to evaluate simple heuristics or baseline models. The code encourages experimentation, inviting you to swap scoring rules, adjust weights, or plug in alternative representations. It serves as a starting point for understanding recommendation pipelines before moving to production-grade systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    CTFd

    CTFd

    CTFs as you need them

    ...It comes with everything you need to run a CTF and it's easy to customize with plugins and themes. Create your own challenges, categories, hints, and flags from the Admin Interface. Dynamic Scoring Challenges. Unlockable challenge support. Challenge plugin architecture to create your own custom challenges. Static & Regex-based flags. Custom flag plugins. Unlockable hints. File uploads to the server or an Amazon S3-compatible backend. Limit challenge attempts & hide challenges. Automatic bruteforce protection. Individual and Team-based competitions. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    TKD Scoring Wi-Fi

    TKD Scoring Wi-Fi

    TKD Scoring Wi-Fi Server supporting Android and IPhone clients

    ...This looks even more professional and easier to use than any of the programs I’ve ever seen used at comps.” and “Hey... just installed everything and it’s unreal! Everything works perfectly, looks great!” encouraged us to make it public for everybody Like any scoring system the “TKD Scoring Wi-Fi” system has few basic components: *TKD Scoring Wi-Fi Server(PC or Android) *TKD Scoring Wi-Fi Client(Android or IPhone) *TKD Scoring Wi-Fi Remote Score Display (Android)
    Leader badge
    Downloads: 49 This Week
    Last Update:
    See Project
  • 9
    StabilityMatrix

    StabilityMatrix

    Multi-Platform Package Manager for Stable Diffusion

    ...It provides a framework to run experiments systematically—capturing inputs, model configurations, outputs, and metrics—so researchers and practitioners can reason about differences in quality, robustness, and failure modes. The repository often bundles tooling for automated prompt sweeping, scoring heuristics (such as diversity, coherence, or task-specific metrics), and visualization helpers to make comparisons interpretable. This approach is useful for model selection, prompt engineering, and benchmarking new checkpoints against baseline models under reproducible conditions. By turning ad-hoc tests into tracked experiments, StabilityMatrix reduces bias, surfaces subtle regressions, and accelerates iteration when tuning generative systems.
    Downloads: 120 This Week
    Last Update:
    See Project
  • Total Network Visibility for Network Engineers and IT Managers Icon
    Total Network Visibility for Network Engineers and IT Managers

    Network monitoring and troubleshooting is hard. TotalView makes it easy.

    This means every device on your network, and every interface on every device is automatically analyzed for performance, errors, QoS, and configuration.
    Learn More
  • 10
    Empirical

    Empirical

    Test and evaluate LLMs and model configurations

    Empirical is the fastest way to test different LLMs and model configurations, across all the scenarios that matter for your application.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Elfeed Emacs Web Feed Reader

    Elfeed Emacs Web Feed Reader

    An Emacs web feeds client

    Elfeed is an extensible web feed reader for Emacs, supporting both Atom and RSS. It requires Emacs 24.3 and is available for download from MELPA or el-get.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    ZomboDB

    ZomboDB

    Making Postgres and Elasticsearch work together like it's 2023

    ZomboDB is a PostgreSQL extension that integrates Elasticsearch directly into Postgres, allowing for powerful full-text search and analytics capabilities. It manages Elasticsearch indices transparently, ensuring transactional consistency and enabling complex queries through SQL.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Science Olympiad Scoring System

    Science Olympiad Scoring System

    Excel based scoring system for Science Olympiad tournaments

    ...NOTE: Excel 2008 for Mac does NOT support macros at all, thus many parts of this system won't work. Virtually any other version of Office will work. Be sure to signup for the mailing list to be informed of updates Note a 'SO Scoring Best Practices' PDF is available to give tips and tricks used at the National Tournament.
    Leader badge
    Downloads: 23 This Week
    Last Update:
    See Project
  • 14
    MLflow

    MLflow

    Open source platform for the machine learning lifecycle

    MLflow is a platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc), wherever you currently run ML code (e.g. in notebooks, standalone applications or the cloud).
    Downloads: 10 This Week
    Last Update:
    See Project
  • 15
    BIG-bench

    BIG-bench

    Beyond the Imitation Game collaborative benchmark for measuring

    ...Rather than focusing on a single metric or domain, it aggregates many hand-authored tasks that test reasoning, commonsense, math, linguistics, ethics, and creativity. Tasks are intentionally heterogeneous: some are multiple-choice with exact scoring, others are free-form generation judged by model-based or human evaluation. The suite provides a common JSON task format and an evaluation harness so research groups can contribute new tasks and reproduce results consistently. It emphasizes robustness analysis—looking at scale trends, calibration, and areas where models systematically fail—to guide model development beyond raw accuracy. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Feast

    Feast

    Feature Store for Machine Learning

    ...Feast is the fastest path to manage existing infrastructure to productionize analytic data for model training and online inference. Make features consistently available for training and serving by managing an offline store (to process historical data for scale-out batch scoring or model training), a low-latency online store (to power real-time prediction), and a battle-tested feature server (to serve pre-computed features online). Avoid data leakage by generating point-in-time correct feature sets so data scientists can focus on feature engineering rather than debugging error-prone dataset joining logic. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 17
    BentoML

    BentoML

    Unified Model Serving Framework

    BentoML simplifies ML model deployment and serves your models at a production scale. Support multiple ML frameworks natively: Tensorflow, PyTorch, XGBoost, Scikit-Learn and many more! Define custom serving pipeline with pre-processing, post-processing and ensemble models. Standard .bento format for packaging code, models and dependencies for easy versioning and deployment. Integrate with any training pipeline or ML experimentation platform. Parallelize compute-intense model inference...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18
    Supermemory

    Supermemory

    Memory engine and app that is extremely fast, scalable

    ...It often incorporates clustering, semantic search, and summarization modules to reduce cognitive load and surface key ideas, which makes it useful for research, study, writing, and long-term project tracking. Users can interact with the system via conversational queries or traditional search interfaces, and the system leverages vector embeddings and memory scoring to prioritize the most relevant results.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    Marvin

    Marvin

    A batteries-included library for building AI-powered software

    ...These functions differ from conventional ones in that they don’t rely on source code, but instead generate their outputs on-demand through AI. With AI functions, you don't have to write complex code for tasks like extracting entities from web pages, scoring sentiment, or categorizing items in your database. Just describe your needs, call the function, and you're done. AI functions work with native data types, so you can seamlessly integrate them into any codebase and chain them into sophisticated pipelines. In addition to AI functions, Marvin also introduces more flexible bots. Bots are highly capable AI assistants that can be given specific instructions and personalities or roles.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 20
    ONNX

    ONNX

    Open standard for machine learning interoperability

    ...It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types. Currently we focus on the capabilities needed for inferencing (scoring). ONNX is widely supported and can be found in many frameworks, tools, and hardware. Enabling interoperability between different frameworks and streamlining the path from research to production helps increase the speed of innovation in the AI community.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 21
    Lingua-Py

    Lingua-Py

    The most accurate natural language detection library for Python

    Its task is simple: It tells you which language some text is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages. Language detection is often done as part of large machine learning frameworks or natural language processing...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    openbench

    openbench

    Provider-agnostic, open-source evaluation infrastructure

    openbench is an open-source, provider-agnostic evaluation infrastructure designed to run standardized, reproducible benchmarks on large language models (LLMs), enabling fair comparison across different model providers. It bundles dozens of evaluation suites — covering knowledge, reasoning, math, code, science, reading comprehension, long-context recall, graph reasoning, and more — so users don’t need to assemble disparate datasets themselves. With a simple CLI interface (e.g. bench eval...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    Qwen3-VL-Embedding

    Qwen3-VL-Embedding

    Multimodal embedding and reranking models built on Qwen3-VL

    Qwen3-VL-Embedding (with its companion Qwen3-VL-Reranker) is a state-of-the-art multimodal embedding and reranking model suite built on the open-sourced Qwen3-VL foundation, developed to handle diverse inputs including text, images, screenshots, and videos. The core embedding model maps such inputs into semantically rich vectors in a unified representation space, enabling similarity search, clustering, and cross-modal retrieval. The reranking model then precisely scores relevance between a...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Atropos

    Atropos

    Language Model Reinforcement Learning Environments frameworks

    ...This framework facilitates experimentation with RLHF (Reinforcement Learning from Human Feedback), RLAIF, or multi-turn training approaches by abstracting environment logic, scoring, and logging into reusable components.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    CLIP

    CLIP

    CLIP, Predict the most relevant text snippet given an image

    CLIP (Contrastive Language-Image Pretraining) is a neural model that links images and text in a shared embedding space, allowing zero-shot image classification, similarity search, and multimodal alignment. It was trained on large sets of (image, caption) pairs using a contrastive objective: images and their matching text are pulled together in embedding space, while mismatches are pushed apart. Once trained, you can give it any text labels and ask it to pick which label best matches a given...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next