* Add branding format support to JS and Python SDKs
Co-Authored-By: abi@sideguide.dev <abimex@gmail.com>
* Add comprehensive unit tests for branding format in both SDKs
- Add JS SDK unit tests in branding.test.ts with 4 test cases
- Add Python SDK unit tests in test_branding.py with 5 test cases
- Update Python SDK normalize_document_input to handle colorScheme -> color_scheme conversion
- Add model_config extra='allow' to BrandingProfile for future extensibility
- All tests pass locally (25 JS tests, 5 Python tests)
Co-Authored-By: abi@sideguide.dev <abimex@gmail.com>
* Bump SDK versions for branding format release
- Bump JS SDK version from 4.4.1 to 4.5.0
- Bump Python SDK version from 4.5.0 to 4.6.0
Version bumps reflect the addition of branding format support and comprehensive unit tests.
Co-Authored-By: abi@sideguide.dev <abimex@gmail.com>
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: abi@sideguide.dev <abimex@gmail.com>
* cancelled added in stats if job is cancelled
* chore: stop tracking local venv in apps/python-sdk/.venv
* revert: exclude local docker-compose port mapping change from PR
* chore: ignore local SDK venv and remove from tracking
* redis rate limit url added
* removed redis rate limit
* Update crawl.py
* Update v2 async search type annotations from SearchResponse to SearchData
- Remove SearchResponse export from firecrawl.types for v2 usage
- Aligns type annotations with actual runtime behavior
- v2 async search methods already return SearchData directly
- v1 methods continue to use SearchResponse as expected
- Resolves Linear ticket ENG-3321
Co-Authored-By: rafael@sideguide.dev <rafael@sideguide.dev>
* bump version
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: rafael@sideguide.dev <rafael@sideguide.dev>
Co-authored-by: rafaelmmiller <150964962+rafaelsideguide@users.noreply.github.com>
* feat: add maxPages parameter to PDF parser
- Extend parsersSchema to support both string array ['pdf'] and object array [{'type':'pdf','maxPages':10}] formats
- Add shouldParsePDF and getPDFMaxPages helper functions for consistent parser handling
- Update PDF processing to respect maxPages limit in both RunPod MU and PdfParse processors
- Modify billing calculation to use actual pages processed instead of total pages
- Add comprehensive tests for object format parsers, page limiting, and validation
- Maintain backward compatibility with existing string array format
The maxPages parameter is optional and defaults to unlimited when not specified.
Page limiting occurs before processing to avoid unnecessary computation and billing
is based on the effective page count for fairness.
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* fix: correct parsersSchema to handle individual parser items
- Change union from array-level to item-level in parsersSchema
- Now accepts array where each item is either string 'pdf' or object {'type':'pdf','maxPages':10}
- When parser is string 'pdf', maxPages is undefined (no limit)
- When parser is object, use specified maxPages value
- Maintains backward compatibility with existing ['pdf'] format
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* fix: remove maxPages logic from scrapePDFWithParsePDF per PR feedback
- Remove maxPages parameter and truncation logic from scrapePDFWithParsePDF
- Keep maxPages logic only in scrapePDFWithRunPodMU where it provides cost savings
- Addresses feedback from mogery: pdf-parse doesn't cost anything extra to process all pages
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* test: add maxPages parameter tests for crawl and search endpoints
- Add crawl endpoint test with PDF maxPages parameter
- Add search endpoint test with PDF maxPages parameter
- Verify maxPages works end-to-end across all endpoints (scrape, crawl, search)
- Ensure schema inheritance and data flow work correctly
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* fix: remove problematic crawl and search tests for maxPages
- Remove crawl test that incorrectly uses direct PDF URL
- Remove search test that relies on unreliable external search results
- maxPages functionality verified through schema inheritance and data flow analysis
- Comprehensive tests already exist in parsers.test.ts for core functionality
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* feat: add maxPages parameter support to Python and JavaScript SDKs
- Add PDFParser class to Python SDK with max_pages field validation (1-1000)
- Update Python SDK parsers field to support Union[List[str], List[Union[str, PDFParser]]]
- Add parsers preprocessing in Python SDK to convert snake_case to camelCase
- Update JavaScript SDK parsers type to Array<string | { type: 'pdf'; maxPages?: number }>
- Add maxPages validation to JavaScript SDK ensureValidScrapeOptions
- Maintain backward compatibility with existing ['pdf'] string array format
- Support mixed formats in both SDKs
- Add comprehensive test files for both SDKs
Addresses GitHub comment requesting SDK support for maxPages parameter.
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* cleanup: remove temporary test files
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* fix: correct parsers schema to support mixed string and object arrays
- Fix parsers schema to properly handle mixed arrays like ['pdf', {type: 'pdf', maxPages: 5}]
- Resolves backward compatibility issue that was causing webhook test failures
- All parser formats now work: ['pdf'], [{type: 'pdf'}], [{type: 'pdf', maxPages: 10}], mixed arrays
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* Delete SDK_MAXPAGES_IMPLEMENTATION.md
* feat: increase maxPages limit from 1000 to 10000 pages
- Update backend Zod schema validation in types.ts
- Update JavaScript SDK client-side validation
- Update API test cases to use new 10000 limit
- Addresses GitHub comment feedback from nickscamara
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* fix: update Python SDK maxPages limit from 1000 to 10000
- Fix validation discrepancy between Python SDK (1000) and backend/JS SDK (10000)
- Ensures consistent maxPages validation across all SDKs
- Addresses critical bug identified in PR review
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* fix: remove SDK-side maxPages validation per PR feedback
- Remove maxPages range validation from JavaScript SDK validation.ts
- Remove maxPages range validation from Python SDK types.py
- Keep backend API validation as single source of truth
- Addresses GitHub comment from mogery
Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>
* Nick:
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: thomas@sideguide.dev <thomas@sideguide.dev>
Co-authored-by: Nicolas <nicolascamara29@gmail.com>