A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Open Source OCR Engine
Contexts Optical Compression
OCR software, free and offline
PDF to Markdown with vision models
OCRmyPDF adds an OCR text layer to scanned PDF files
OCR expert VLM powered by Hunyuan's native multimodal architecture
Offline OCR command-line program for Windows that recognizes text in images
Awesome multilingual OCR toolkits based on PaddlePaddle
A cross-platform software for text translation and recognition
Library for OCR-related tasks powered by Deep Learning
Visual Causal Flow
A pure JavaScript Multilingual OCR
Ready-to-use OCR with 80+ supported languages
State-of-the-art (SoTA) text-to-video pre-trained model
Readest is a modern, feature-rich ebook reader
Implementation of Video Diffusion Models
Free open-source non-linear video editor
Qwen3-VL, the multimodal large language model series by Alibaba Cloud
Convert AI papers to GUI
Implementation of Make-A-Video, a new SOTA text-to-video generator
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Qwen3-omni is a natively end-to-end, omni-modal LLM
WindowTextExtractor allows you to get text from any OS
Wan2.1: Open and Advanced Large-Scale Video Generative Model