crawler
| Date | Project Name | 🎉 | Language, stars, age | Tags |
|---|---|---|---|---|
| 01/27 | [Kimurai v2.2.0](https://github.com/vifreefly/kimuraframework) – Modern Ruby web scraping framework supporting headless or antidetect Chrome/Firefox browsing and HTTP requests for JavaScript-rendered sites. | 8 | Ruby · 1069 ⭐ · 2714 days old | ruby headless-chrome scraper crawler scrapy kimurai |
| 01/27 | [The CROWler v1.1.15](https://github.com/pzaino/thecrowler) – Self-hosted, event-driven platform for browser-based web crawling, scraping, detection, and automation with rulesets, plugins, agents, and search API. | 5 | Go · 51 ⭐ · 805 days old | golang go crawler crawling indexing indexer |
| 01/24 | [The CROWler v1.1.14](https://github.com/pzaino/thecrowler) – Self-hosted, event-driven platform for browser-based web crawling, scraping, detection, and automation with rulesets, plugins, agents, and search API. | 5 | Go · 51 ⭐ · 805 days old | golang go crawler crawling indexing indexer |
| 01/23 | [The CROWler v1.1.13](https://github.com/pzaino/thecrowler) – Self-hosted, event-driven platform for browser-based web crawling, scraping, detection, and automation with rulesets, plugins, agents, and search API. | 5 | Go · 51 ⭐ · 805 days old | golang go crawler crawling indexing indexer |
| 01/22 | [The CROWler v1.1.12](https://github.com/pzaino/thecrowler) – Self-hosted, event-driven platform for browser-based web crawling, scraping, detection, and automation with rulesets, plugins, agents, and search API. | 5 | Go · 51 ⭐ · 805 days old | golang go crawler crawling indexing indexer |
| 01/20 | [The CROWler v1.1.11](https://github.com/pzaino/thecrowler) – Self-hosted, event-driven platform for browser-based web crawling, scraping, detection, and automation with rulesets, plugins, agents, and search API. | 5 | Go · 51 ⭐ · 805 days old | golang go crawler crawling indexing indexer |
| 01/19 | [Browsertrix Crawler v1.11.1](https://github.com/webrecorder/browsertrix-crawler) – Standalone browser-based high-fidelity crawling system running customizable crawls in a single Docker container. | 6 | TypeScript · 957 ⭐ · 1911 days old | javascript typescript crawler crawling wacz warc web-archiving |