Best Open Source Search Engines 2025

Search Engines

Browse free open source Search Engines and projects below.

1

WebHarvest - web data extraction tool

Web data extraction (web data mining, web scraping) tool. It uses well-proven XML and text-processing technologies to easily extract useful data from arbitrary web pages.

14 Reviews

Downloads: 45 This Week

Last Update: 2025-10-25
2

OpenWebSpider

OpenWebSpider is an Open Source multi-threaded Web Spider (robot, crawler) and search engine with a lot of interesting features!

4 Reviews

Downloads: 7 This Week

Last Update: 2017-03-12
3

Pavuk Web Spider and Performance Measure

A function-testing, performance-measuring, site-mirroring web spider that is widely portable and capable of using scenarios to process a wide range of web transactions, including SSL and forms.

2 Reviews

Downloads: 4 This Week

Last Update: 2013-04-24
4

Fetchgals

A multi-threaded web spider that finds free porn thumbnail galleries by visiting a list of known TGPs (Thumbnail Gallery Posts). It optionally downloads the located pictures and movies. A TGP list is included. Public-domain Perl script that runs on Linux.

2 Reviews

Downloads: 1 This Week

Last Update: 2013-03-12
5

WallPaper (alias crawlpaper)

WallPaper (alias crawlpaper) is a desktop changer (NOT a screensaver) that includes a web crawler for picture download, an audio stream ripper, an audio player, a mini MP3 tag editor, etc. It also includes support for .zip and .rar files and an interface to the Berkeley DB code for small databases.

2 Reviews

Downloads: 1 This Week

Last Update: 2025-06-14
6

Web Spider, Web Crawler, Email Extractor

Free tool that extracts emails, phone numbers, and custom text from the web using Java regex

The Files section includes WebCrawlerMySQL.jar, which supports MySQL connections. This free web spider and crawler extracts information from the web by parsing millions of pages. Data is stored in a Derby database and is not lost if the spider is force-closed.

- Free web spider, parser, extractor, and crawler
- Extracts emails, phone numbers, and custom text from the web
- Exports to Excel files
- Saves data to Derby and MySQL databases
- Written in Java; cross-platform

See also the free email sender: https://sourceforge.net/projects/gitst-free-email-ender/

Install Microsoft OpenJDK to start the application: https://www.microsoft.com/openjdk

Downloads: 2 This Week

Last Update: 2022-12-25
7

dorker-py

Discover files and hidden paths by performing advanced searches

Google dorking with Dorker Py: discover files and hidden paths by performing advanced searches.

Downloads: 2 This Week

Last Update: 2023-08-26
8

JSpider

A Java implementation of a flexible and extensible web spider engine. Optional modules allow functionality to be added (searching for dead links, testing the performance and scalability of a site, creating a sitemap, etc.).

4 Reviews

Downloads: 1 This Week

Last Update: 2021-06-28
9

Generic Web Crawler (GWC)

A toolkit for crawling information from web pages by combining different kinds of "actions". Actions are simple operations such as navigating to a specified URL or extracting text from the HTML. A graphical user interface is also available.

Downloads: 1 This Week

Last Update: 2015-10-10
10

AGEM Web Crawler & Spider

Software designed to meet the need some people have for a robust web crawler or spider. It automatically navigates websites and web pages, extracting the links to other pages.

Downloads: 0 This Week

Last Update: 2014-04-22
11

ASpider

Robust, feature-rich, multi-threaded CLI web spider written in Java, using Apache Commons HttpClient v3.0. ASpider downloads any files matching your given MIME types from a website. By default it tries to match email addresses with regular expressions, logging all results using log4j.

Downloads: 0 This Week

Last Update: 2013-03-08
12

Arachnid Web Spider Framework

Arachnid is a Java-based web spider framework. It includes a simple HTML parser object that parses an input stream containing HTML content. Simple Web spiders can be created by sub-classing Arachnid and adding a few lines of code called after each page

Downloads: 0 This Week

Last Update: 2013-03-08
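
To illustrate the sub-classing pattern described for Arachnid, here is a minimal sketch in Python rather than Java. It does not use Arachnid's API: the class name LinkCollector and the example URLs are hypothetical, and the parser comes from Python's standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical spider component: collects absolute link targets from a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Called by the parser for every opening tag; we only care about <a href=...>.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


collector = LinkCollector("https://example.com/dir/")
collector.feed('<p><a href="page.html">one</a> <a href="/top">two</a></p>')
print(collector.links)
```

A real spider would feed each fetched page to such a parser and queue the collected links for later visits.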
13

Aracnis

Aracnis is a Java-based framework for building distributed web spiders. These spiders can be used to accomplish a variety of tasks, for example, screen-scraping and link integrity checking.

Downloads: 0 This Week

Last Update: 2015-07-13
14

Arn0lD

A new web crawler with a sophisticated search process specialized by language.

Downloads: 0 This Week

Last Update: 2013-03-07
15

BTV Rename

The goal of this project is 100% TVDB recognition and SxxExx renaming of all files generated by BeyondTV while maintaining the BeyondTV database. Currently the project relies on TVrage and scans the entire folder which is selected by the user in the conf

Downloads: 0 This Week

Last Update: 2015-05-01
16

C++ web crawler library

arachne is a C++ library for HTTP crawling and for link, text, and metadata extraction, designed to run in a distributed environment.

Downloads: 0 This Week

Last Update: 2014-02-28
17

Crawler.NET

Crawler.NET is a component-based distributed framework for web traversal intended for the .NET platform. It comprises loosely coupled units, each realizing a specific web crawler task. The main design goals are efficiency and flexibility.

1 Review

Downloads: 0 This Week

Last Update: 2013-03-22
18

Distributed Webhunter

Webhunter is a distributed, multi-threaded web crawler designed for both general indexing and crawling the web for focused content.

Downloads: 0 This Week

Last Update: 2013-04-05
19

Easyspider - Distributed Web Crawler

Easy Spider is a distributed Perl Web Crawler Project from 2006

Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code for crawling webpages, distributing them to a server, and generating XML files from them. The client can be any computer (Windows or Linux), and the server stores all data.

Websites that use EasySpider crawling for article-writing software:

- https://www.artikelschreiber.com/en/
- https://www.unaique.net/en/
- https://www.unaique.com/
- https://www.artikelschreiben.com/
- https://www.buzzerstar.com/
- https://easyperlspider.sourceforge.io/
- https://www.sebastianenger.com/
- https://www.artikelschreiber.com/opensource/

It is fun to look at code from a few years ago and see how one has improved. If you want to write text automatically, try https://www.artikelschreiber.com/en/ or https://www.unaique.net/en/!

1 Review

Downloads: 0 This Week

Last Update: 2025-03-16
20

Funnel - Web Spider

Funnel is a project for use on intranets, or selected sites on the Internet to gather together and index information from several different sources and make it available through a sane, usable interface.

Downloads: 0 This Week

Last Update: 2013-04-19
21

Harvest Web Indexing

Harvest is a web indexing package. Originally designed for distributed indexing, it can form a powerful system for indexing both large and small web sites. It now also includes Harvest-NG, a highly efficient, modular, Perl-based web crawler.

Downloads: 0 This Week

Last Update: 2013-04-09
22

J-Obey (Robots.txt Crawler Module)

J-Obey is a Java library/package that gives people writing their own crawlers a stable robots.txt parser. If you are writing a web crawler of some sort, you can use J-Obey to take out the hassle of writing a robots.txt parser/interpreter.

Downloads: 0 This Week

Last Update: 2015-08-05
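
As a rough illustration of what a robots.txt parser does for a crawler, the sketch below uses Python's standard-library urllib.robotparser rather than J-Obey, whose Java API is not shown here. The robots.txt content, the user agent name ExampleBot, and the URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that the crawler has already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask the parser before fetching each URL.
allowed = parser.can_fetch("ExampleBot", "https://example.com/index.html")
blocked = parser.can_fetch("ExampleBot", "https://example.com/private/data.html")
delay = parser.crawl_delay("ExampleBot")

print(allowed, blocked, delay)
```

A crawler would typically cache one parser per host and consult it before every request.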
23

Methabot Web Crawler

Methanol is a scriptable multi-purpose web crawling system with an extensible configuration system and speed-optimized architectural design. Methabot is the web crawler of Methanol.

2 Reviews

Downloads: 0 This Week

Last Update: 2013-05-15
24

Regular Expression web replication

Yet another web crawler? Yes, but this one uses the full power of regular expressions to accept or reject, examine or ignore, save or refuse pages. You can also use MIME types to do all this. Powerful and flexible.

Downloads: 0 This Week

Last Update: 2013-05-30
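
The accept/reject idea can be sketched in a few lines of Python. This is not the project's own code or configuration format: the patterns, the decide function, and the example URLs are hypothetical and only show how URL regexes and MIME-type regexes might combine into one decision.

```python
import re

# Hypothetical rules: crawl only the docs area, skip archives and binaries,
# and save only HTML or plain-text responses.
ACCEPT_URL = re.compile(r"^https?://example\.com/docs/")
REJECT_URL = re.compile(r"\.(?:zip|exe)$", re.IGNORECASE)
SAVE_MIME = re.compile(r"^text/(?:html|plain)$")


def decide(url: str, mime_type: str) -> str:
    """Return 'reject', 'save', or 'examine' for a fetched page."""
    if REJECT_URL.search(url) or not ACCEPT_URL.search(url):
        return "reject"
    if SAVE_MIME.match(mime_type):
        return "save"
    # Accepted URL, but the content type is not one we store:
    # look at it (for example, to follow links) without saving it.
    return "examine"


print(decide("https://example.com/docs/a.html", "text/html"))
print(decide("https://example.com/docs/a.zip", "application/zip"))
print(decide("https://example.com/docs/logo.png", "image/png"))
```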
25

Scrapeo

Web spider and SERP scraper

Downloads: 0 This Week

Last Update: 2014-07-05