- Add parsePDF field to ScrapeOptions class for Search API usage
- Add parse_pdf parameter to both sync and async scrape_url methods
- Add parameter handling logic to pass parsePDF to API requests
- Add comprehensive tests for parsePDF functionality
- Maintain backward compatibility with existing API
The parsePDF parameter controls PDF processing behavior:
- When true (the default): PDF content is extracted and converted to markdown
- When false: the PDF is returned base64-encoded and billed at a flat credit rate
Resolves missing parsePDF support in Python SDK v2.9.0
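A minimal usage sketch, assuming the parse_pdf keyword described above; the API key and URL are placeholders:

```python
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-YOUR_API_KEY")

# Default behavior: the PDF is parsed and its content returned as markdown.
parsed = app.scrape_url("https://example.com/whitepaper.pdf", parse_pdf=True)

# With parse_pdf=False the raw PDF comes back base64-encoded instead,
# billed at a flat credit rate as noted above.
raw = app.scrape_url("https://example.com/whitepaper.pdf", parse_pdf=False)
```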
Co-Authored-By: Micah Stairs <micah@sideguide.dev>
* feat: add followInternalLinks parameter as a semantic replacement for allowBackwardLinks
- Add followInternalLinks parameter to the crawl API with the same functionality as allowBackwardLinks
- Update transformation logic to use followInternalLinks with precedence over allowBackwardLinks
- Add parameter to Python SDK crawl methods with proper precedence handling
- Add parameter to Node.js SDK CrawlParams interface
- Add comprehensive tests for new parameter and backward compatibility
- Maintain full backward compatibility for existing allowBackwardLinks usage
- Add deprecation notices in documentation while preserving functionality
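A hypothetical sketch of the precedence rule described above (the real transformation lives in the API layer; the names mirror the request fields):

```python
# Hypothetical illustration of the precedence rule: followInternalLinks wins
# when both flags are present, otherwise allowBackwardLinks is honored.
def resolve_follow_internal_links(params: dict) -> bool:
    if "followInternalLinks" in params:
        return bool(params["followInternalLinks"])
    return bool(params.get("allowBackwardLinks", False))

assert resolve_follow_internal_links({"followInternalLinks": False, "allowBackwardLinks": True}) is False
```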
Co-Authored-By: Nick <nicolascamara29@gmail.com>
* fix: revert accidental cache=True changes to preserve original cache parameter handling
- Revert cache=True back to cache=cache in generate_llms_text methods
- Preserve original parameter passing behavior for cache parameter
- Fix accidental hardcoding of cache parameter to True
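A simplified, hypothetical view of the corrected behavior: the caller's cache value is forwarded into the request payload instead of a hardcoded True.

```python
# Hypothetical helper mirroring the fix: build the payload from the
# caller-supplied cache flag instead of always sending True.
def build_llmstxt_params(url: str, cache: bool = True) -> dict:
    return {"url": url, "cache": cache}  # was effectively {"url": url, "cache": True}

assert build_llmstxt_params("https://example.com", cache=False)["cache"] is False
```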
Co-Authored-By: Nick <nicolascamara29@gmail.com>
* refactor: rename followInternalLinks to crawlEntireDomain across API, SDKs, and tests
- Rename followInternalLinks parameter to crawlEntireDomain in API schema
- Update Node.js SDK CrawlParams interface to use crawlEntireDomain
- Update Python SDK methods to use crawl_entire_domain parameter
- Update test cases to use new crawlEntireDomain parameter name
- Maintain backward compatibility with allowBackwardLinks
- Update transformation logic to give crawlEntireDomain precedence over allowBackwardLinks
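A usage sketch, assuming the keyword-argument form of crawl_url in this SDK version; the URL is a placeholder:

```python
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-YOUR_API_KEY")

# New name: crawl the entire domain rather than only pages under the start URL.
result = app.crawl_url("https://example.com/blog", crawl_entire_domain=True)

# The old flag is still accepted for backward compatibility; if both are
# supplied, crawl_entire_domain takes precedence.
legacy = app.crawl_url("https://example.com/blog", allow_backward_links=True)
```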
Co-Authored-By: Nick <nicolascamara29@gmail.com>
* fix: add missing cache parameter to generate_llms_text and update documentation references
Co-Authored-By: Nick <nicolascamara29@gmail.com>
* Update apps/python-sdk/firecrawl/firecrawl.py
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Nick <nicolascamara29@gmail.com>
Co-authored-by: Gergő Móricz <mo.geryy@gmail.com>
* feat(python-sdk/CrawlWatcher): remove max payload size from WebSocket
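For context, the client-side payload cap in the websockets library is its max_size argument; a hedged sketch of a watcher connection without that cap (not the SDK's exact code, and the URL is a placeholder):

```python
import asyncio
import websockets

async def watch(ws_url: str) -> None:
    # max_size=None disables the client-side message size limit, so large
    # crawl payloads are not rejected mid-stream.
    async with websockets.connect(ws_url, max_size=None) as ws:
        async for message in ws:
            print(message)

# asyncio.run(watch("wss://example.com/crawl-stream"))  # placeholder URL
```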
* Update __init__.py
---------
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
* Fix LLMs.txt cache bug with subdomains and add bypass option (#1519)
Co-Authored-By: hello@sideguide.dev <hello+firecrawl@sideguide.dev>
* Nick:
* Update LLMs.txt test file to use helper functions and concurrent tests
Co-Authored-By: hello@sideguide.dev <hello+firecrawl@sideguide.dev>
* Remove LLMs.txt test file as requested
Co-Authored-By: hello@sideguide.dev <hello+firecrawl@sideguide.dev>
* Change parameter name to 'cache' and keep 7-day expiration
Co-Authored-By: hello@sideguide.dev <hello+firecrawl@sideguide.dev>
* Update generate-llmstxt-supabase.ts
* Update JS and Python SDKs to include cache parameter
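A usage sketch for the SDK-side flag, assuming the cache keyword added here (cached results are reused within the 7-day window; cache=False forces regeneration):

```python
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-YOUR_API_KEY")

# Reuse a cached llms.txt result if one exists within the 7-day window.
cached = app.generate_llms_text("https://example.com", cache=True)

# Bypass the cache and generate a fresh result.
fresh = app.generate_llms_text("https://example.com", cache=False)
```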
Co-Authored-By: hello@sideguide.dev <hello+firecrawl@sideguide.dev>
* Fix LLMs.txt cache implementation to use normalizeUrl and exact matching
Co-Authored-By: hello@sideguide.dev <hello+firecrawl@sideguide.dev>
* Revert "Fix LLMs.txt cache implementation to use normalizeUrl and exact matching"
This reverts commit d05b9964677b7b2384453329d2ac99d841467053.
* Nick:
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: hello@sideguide.dev <hello+firecrawl@sideguide.dev>
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
* This fixes issue #1512 by making the milliseconds field optional in WaitAction and adding a validator to ensure exactly one of milliseconds or selector is provided.
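A minimal sketch of that constraint using Pydantic v2 (the SDK's actual model and validator may differ):

```python
from typing import Literal, Optional

from pydantic import BaseModel, model_validator

class WaitAction(BaseModel):
    type: Literal["wait"] = "wait"
    milliseconds: Optional[int] = None  # now optional
    selector: Optional[str] = None

    @model_validator(mode="after")
    def _exactly_one(self):
        # Exactly one of milliseconds or selector must be set.
        if (self.milliseconds is None) == (self.selector is None):
            raise ValueError("provide exactly one of 'milliseconds' or 'selector'")
        return self
```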
Co-Authored-By: hello@sideguide.dev <hello+firecrawl@sideguide.dev>
* Update firecrawl.py
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: hello@sideguide.dev <hello+firecrawl@sideguide.dev>
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
* Fix: Handle both dict and model instances in actions parameter
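A hedged sketch of the normalization idea: accept plain dicts and Pydantic model instances in the same actions list and serialize both to dicts before building the request:

```python
from typing import Any, Dict, List, Union

from pydantic import BaseModel

def normalize_actions(actions: List[Union[Dict[str, Any], BaseModel]]) -> List[Dict[str, Any]]:
    # Model instances are dumped to plain dicts (Pydantic v2 shown here);
    # dicts pass through untouched, so callers can freely mix both styles.
    return [
        a.model_dump(exclude_none=True) if isinstance(a, BaseModel) else a
        for a in actions
    ]
```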
Co-Authored-By: Nicolas Camara <nicolascamara29@gmail.com>
* Update __init__.py
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Nicolas Camara <nicolascamara29@gmail.com>
* sdk-fix/schema-check
* version bump
* add schema validation for extract and jsonOptions parameters
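A hedged sketch of the kind of check this adds: accept either a JSON-schema dict or a Pydantic model class for the extract / jsonOptions schema, and reject anything else early:

```python
from typing import Any

from pydantic import BaseModel

def ensure_json_schema(schema: Any) -> dict:
    # Model classes are converted to their JSON schema (Pydantic v2 shown);
    # dicts are assumed to already be JSON schema; anything else fails fast.
    if isinstance(schema, type) and issubclass(schema, BaseModel):
        return schema.model_json_schema()
    if isinstance(schema, dict):
        return schema
    raise ValueError("schema must be a dict or a pydantic BaseModel subclass")
```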
* Update firecrawl.py
---------
Co-authored-by: Nicolas <nicolascamara29@gmail.com>