Gergő Móricz
cb53b2b31e
patch for delay race condition
2025-06-19 16:22:14 +02:00
Gergő Móricz
7296b76e78
re-instate delay
2025-06-19 15:59:30 +02:00
Gergő Móricz
43bf5df711
remove logs
2025-06-19 15:54:05 +02:00
Gergő Móricz
eecf6081a2
feat(queue-jobs): implement new cc flow into singular addScrapeJob
2025-06-19 15:54:05 +02:00
Gergő Móricz
1de4f65b8c
bump sdk
2025-06-19 15:54:05 +02:00
Gergő Móricz
892dda19a0
fix(batch-scrape): actually store maxConcurrency
2025-06-19 15:54:05 +02:00
Gergő Móricz
59f496b269
lol i don't know how my own system works nvm this is useless
2025-06-19 15:54:05 +02:00
Gergő Móricz
33111af186
index test fix lol
2025-06-19 15:54:05 +02:00
Gergő Móricz
87aab1886f
feat(sdk): maxConcurrency
2025-06-19 15:54:05 +02:00
Gergő Móricz
2b367fcd78
fixes, more extensive testing
2025-06-19 15:54:05 +02:00
Gergő Móricz
c6232e6f46
fix
2025-06-19 15:54:05 +02:00
Gergő Móricz
8e4285a0f1
wip
2025-06-19 15:54:05 +02:00
Gergő Móricz
435cb1608b
wip
2025-06-19 15:53:16 +02:00
Gergő Móricz
a8e3c29664
feat(scrape, extract): creditsUsed, tokensUsed fields (FIR-2336) (#1683)
* fix(scrape): log FIRE-1 credits billed on failures properly
* fix dumb things
* feat(scrape, extract): creditsUsed fields
* fix(extract): call it tokensUsed
* Trigger Build
* dumb mistake, search does separate billing
2025-06-18 21:49:20 +02:00
Gergő Móricz
fbd81b4168
fix(scrape): log FIRE-1 credits billed on failures properly (FIR-2331) (#1682)
* fix(scrape): log FIRE-1 credits billed on failures properly
* fix dumb things
2025-06-18 21:47:58 +02:00
Gergő Móricz
ebc1de9d60
feat(crawl-status): refactor to work after a redis flush (#1664)
2025-06-18 18:58:04 +02:00
devin-ai-integration[bot]
cd2e0f868c
Add deployment type field to bug report template (#1681)
- Add 'Deployment Type' field to Environment section
- Allows users to specify Cloud (firecrawl.dev) vs Self-hosted
- Helps maintainers better triage issues based on deployment context
- Positioned logically after OS field in existing template structure
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Nick <nicolascamara29@gmail.com>
2025-06-18 12:26:15 -03:00
Thomas Kosmas
199115c7be
stop testing new mu
2025-06-18 00:48:50 +03:00
Thomas Kosmas
f46f845efc
fix: send the request to new mu version before the main one to achieve better sync
2025-06-17 20:37:45 +03:00
Thomas Kosmas
ee7b29b3f6
feat: Test mu v3 (#1678)
* Test mu v3
* fix env
2025-06-17 20:13:19 +03:00
Gergő Móricz
5ca8e2e98e
feat(index): store short titles and descriptions (#1677)
2025-06-17 19:09:07 +02:00
devin-ai-integration[bot]
9710bdffc0
Improve URL filtering error messages with specific denial reasons (FIR-2352) (#1676)
* Improve URL filtering error messages with specific denial reasons
- Add FilterResult and FilterLinksResult interfaces for structured error reporting
- Define DenialReason enum with specific, human-readable error messages
- Update filterURL method to return structured results with denial reasons
- Update filterLinks method to collect and return denial reasons for each URL
- Modify error handling in queue-worker.ts to use specific denial reasons
- Add comprehensive tests for different URL filtering scenarios
- Maintain backward compatibility while improving error specificity
Fixes: Misleading 'includePaths/excludePaths rules' error now shows actual denial reason (robots.txt, exclude patterns, depth limits, etc.)
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* Fix test compilation error for FilterLinksResult interface
- Update crawler.test.ts to use filteredLinks.links.length instead of filteredLinks.length
- Update test expectations to use filteredLinks.links array
- Resolves TypeScript compilation error preventing CI from passing
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: mogery@sideguide.dev <mogery@sideguide.dev>
2025-06-17 19:00:29 +02:00
Nicolas
c6482eaf2d
Nick: prevent additional logging on /extract scrapes
2025-06-13 18:17:17 -03:00
Gergő Móricz
ea321b4936
fix search test timeouts
2025-06-13 17:42:55 +02:00
Thomas Kosmas
38c5795282
feat(vertex): fix vertex ai provider bug and update model references to use "gemini-2.5-pro" (#1668)
2025-06-13 18:29:03 +03:00
Gergő Móricz
0bf23071ff
feat(index): add domain splitting for improved map querying (#1666)
v1.11.0
2025-06-13 15:22:45 +02:00
Gergő Móricz
07224b8cd4
feat: use index in search and extract (#1660)
2025-06-13 12:30:28 +02:00
Gergő Móricz
f296342731
feat(index): remove unused columns (#1662)
2025-06-12 16:51:40 +02:00
Gergő Móricz
89e42b1137
fix(api): remove query parameter sanitization that was breaking extracts (#1661)
2025-06-12 15:37:45 +02:00
Gergő Móricz
3c03d07051
feat: add credits_billed everywhere (FIR-2286) (#1655)
* feat: add credits_billed everywhere
also a bit of logging improvement for logJob
* fix(queue-worker): db auth check before doing rpc for crawl/batch_scrape
2025-06-11 23:06:55 +02:00
Nicolas
bf3b2a359a
Improve concurrency limit email notifications (#1658)
* Update email_notification.ts
* Update email_notification.ts
* Update email_notification.ts
2025-06-11 17:14:54 -03:00
Pulkit Saini
255be2a2ff
Fix PLAYWRIGHT_MICROSERVICE_URL env var to use /scrape endpoint (#1654)
The correct environment variable should be PLAYWRIGHT_MICROSERVICE_URL=http://playwright-service:3000/scrape instead of PLAYWRIGHT_MICROSERVICE_URL=http://playwright-service:3000/html
2025-06-11 16:53:32 +02:00
Gergő Móricz
19dd086eb3
improve auto recharge logging
2025-06-11 16:26:06 +02:00
Nicolas
9964d11c20
Update v1.ts
2025-06-09 17:13:56 -03:00
Arkit
117af9f6e2
fix(readme): clarify that scrape_options must be a ScrapeOptions instance in crawl_url and typo fixes. (#1647)
2025-06-08 20:52:10 -03:00
Nicolas
623d39801f
Merge branch 'main' of https://github.com/mendableai/firecrawl
2025-06-06 17:29:09 -03:00
Nicolas
07b77e1a1e
Update __init__.py
2025-06-06 17:23:57 -03:00
Gergő Móricz
a32911756e
fix(js-sdk/tests): fix the testing situation (FIR-2253) (#1644)
2025-06-06 22:23:56 +02:00
Gergő Móricz
4659155b76
remove logs
2025-06-06 00:25:28 +02:00
Gergő Móricz
3b6be76d3e
debug(index): time insights
2025-06-06 00:03:34 +02:00
Gergő Móricz
8ef3e8484a
feat(gcs-jobs): ditch exists check to cut lookup time in half (#1641)
2025-06-05 23:43:30 +02:00
Gergő Móricz
6e8873762a
feat(apps/test-suite): add Rafa's index benchmark notebook
2025-06-05 23:22:48 +02:00
Gergő Móricz
6d1b9bf1fe
debug(api/scrape): more logging
2025-06-05 22:52:41 +02:00
Gergő Móricz
0c7f864ea4
debug(api/scrape): increased logging to diagnose scrape fluke length
2025-06-05 22:51:25 +02:00
Gergő Móricz
4337992636
feat(sdk): Index parameters + other missing parameters (#1638)
2025-06-05 22:22:22 +02:00
Gergő Móricz
1de0ae392c
Index testing improvements (FIR-2214) (#1637)
* feat(api/tests/scrape): index improvements
* fix(api/test/scrape): add waits to allow batch insert to happen
* fix: ...
2025-06-05 22:10:06 +02:00
Gergő Móricz
78580f65df
feat(webhook): refactor callWebhook and add logWebhook (FIR-2218) (#1629)
* feat(webhook): refactor callWebhook and add logWebhook
* feat(queue-worker): fix crawl pre-finishing logic (#1628)
* feat(ci): verify typescript errors
* fix(ci):
* feat(api/tests): add webhook tests + refactor batch scrape lib (#1630)
* feat(api/tests): add webhook tests + refactor batch scrape lib
* fix(ci):
* feat(webhook/log): insert queue
2025-06-05 22:04:22 +02:00
Gergő Móricz
f050b169e2
feat(api/index): port queryIndexAtSplitLevel to RPC (FIR-2241) (#1640)
* feat(api/index): port queryIndexAtSplitLevel to RPC
* Update apps/api/src/services/index.ts
2025-06-05 22:02:41 +02:00
Gergő Móricz
a08d52e45d
feat(scrapeURL/index): don't put results by "dumb" engines into the index
2025-06-05 22:01:29 +02:00
Thomas Kosmas
af88218fad
feat: update mu (#1639)
* update to mu v2
* feat(ci): add RUNPOD_MUV2_POD_ID
* stupid change to make CI run
---------
Co-authored-by: Gergő Móricz <mo.geryy@gmail.com>
2025-06-05 22:27:00 +03:00