Major refactoring of browser strategy implementations to improve code organization and reliability:
- Move CrawlResultContainer and RunManyReturn types from async_webcrawler to models.py
- Simplify browser lifecycle management in AsyncWebCrawler
- Standardize browser strategy interface with _generate_page method
- Improve headless mode handling and browser args construction
- Clean up Docker and Playwright strategy implementations
- Fix session management and context handling across strategies
BREAKING CHANGE: The browser strategy interface now requires a _generate_page method.
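
For custom strategies affected by this change, the new requirement can be pictured roughly as below. This is a minimal sketch, not the project's actual code: only the names BaseBrowserStrategy and _generate_page come from these commit messages, while the session_id parameter, the (page, context) return shape, and the FakeStrategy and LegacyStrategy classes are assumptions made for illustration.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Tuple


class BaseBrowserStrategy(ABC):
    """Illustrative stand-in for the real base class (signature is assumed)."""

    @abstractmethod
    async def _generate_page(self, session_id: str) -> Tuple[Any, Any]:
        """Return a (page, context) pair for the given session."""


class FakeStrategy(BaseBrowserStrategy):
    """Hypothetical strategy that satisfies the interface with placeholder values."""

    async def _generate_page(self, session_id: str) -> Tuple[Any, Any]:
        return (f"page:{session_id}", f"context:{session_id}")


class LegacyStrategy(BaseBrowserStrategy):
    """Hypothetical pre-refactor strategy that never defined _generate_page."""


def can_instantiate(cls: type) -> bool:
    # Abstract classes with unimplemented methods raise TypeError on construction.
    try:
        cls()
        return True
    except TypeError:
        return False
```

Under these assumptions, a subclass that omits _generate_page fails at construction time rather than at first use, which is one way such a requirement can be enforced.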
Moves common browser functionality into the BaseBrowserStrategy class to reduce code duplication and improve maintainability. Key changes:
- Adds shared browser argument building and session management to base class
- Standardizes storage state handling across strategies
- Improves process cleanup and error handling
- Consolidates CDP URL management and container lifecycle
BREAKING CHANGE: browser_mode="custom" is renamed to browser_mode="cdp" for consistency.
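
Callers that still pass the old value could bridge the rename with a small shim like the one below. This is a sketch under stated assumptions: the helper normalize_browser_mode and the _RENAMED_MODES table are hypothetical and not part of the library; only the "custom" to "cdp" rename comes from this commit.

```python
import warnings

# Assumption: the only renamed value is the one named in the commit message.
_RENAMED_MODES = {"custom": "cdp"}


def normalize_browser_mode(mode: str) -> str:
    """Map a legacy browser_mode value to its current name, warning on old values."""
    if mode in _RENAMED_MODES:
        new_mode = _RENAMED_MODES[mode]
        warnings.warn(
            f'browser_mode="{mode}" was renamed to "{new_mode}"',
            DeprecationWarning,
            stacklevel=2,
        )
        return new_mode
    # Any other value is passed through unchanged.
    return mode
```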
Reorganize browser strategy code into separate modules for better maintainability and separation of concerns. Also improve the Docker implementation:
- Add Alpine- and Debian-based Dockerfiles to offer a choice of container images
- Enhance Docker registry to share configuration with BuiltinBrowserStrategy
- Add CPU and memory limits to container configuration
- Improve error handling and logging
- Update documentation and examples
BREAKING CHANGE: DockerConfig, DockerRegistry, and DockerUtils have been moved to new locations and their APIs have been updated.
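
The CPU and memory limits mentioned above map naturally onto the docker run flags --cpus and --memory. The following is a rough sketch of that mapping, not the updated DockerConfig API: the ContainerLimits dataclass, build_docker_run_args, the default port 9222, and the image name in the usage note are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContainerLimits:
    """Hypothetical config holder; not the project's DockerConfig."""

    image: str
    host_port: int
    container_port: int = 9222  # assumed debugging port, for illustration only
    cpus: Optional[float] = None  # e.g. 1.5 CPUs
    memory: Optional[str] = None  # e.g. "2g"


def build_docker_run_args(cfg: ContainerLimits) -> List[str]:
    """Build a `docker run` argument list, adding limits only when they are set."""
    args = ["docker", "run", "-d", "-p", f"{cfg.host_port}:{cfg.container_port}"]
    if cfg.cpus is not None:
        args += ["--cpus", str(cfg.cpus)]
    if cfg.memory is not None:
        args += ["--memory", cfg.memory]
    args.append(cfg.image)
    return args
```

For example, ContainerLimits(image="example/chrome:latest", host_port=9222, cpus=1.5, memory="2g") would yield a command ending in `--cpus 1.5 --memory 2g example/chrome:latest`; the list can be passed to subprocess.run without shell quoting.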
Implements a new browser strategy that runs Chrome in Docker containers,
providing better isolation and cross-platform consistency. Features include:
- Connect and launch modes for different container configurations
- Persistent storage support for maintaining browser state
- Container registry for efficient reuse
- Comprehensive test suite for Docker browser functionality
This addition allows users to run browser automation workloads in isolated
containers, improving security and resource management.
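
One way to picture the container registry's reuse behavior is a lookup keyed by a hash of the container configuration, so that identical configurations share a container. The sketch below is an assumption about the general technique, not the actual DockerRegistry: the ContainerRegistry class, its method names, and the launch callback are hypothetical.

```python
import hashlib
import json
from typing import Callable, Dict


class ContainerRegistry:
    """Hypothetical in-memory registry mapping a config hash to a container ID."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, str] = {}

    @staticmethod
    def config_hash(config: dict) -> str:
        # sort_keys makes the hash independent of dictionary key order.
        payload = json.dumps(config, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get_or_launch(self, config: dict, launch: Callable[[dict], str]) -> str:
        """Return the container ID for this config, launching only on a miss."""
        key = self.config_hash(config)
        if key not in self._by_hash:
            self._by_hash[key] = launch(config)
        return self._by_hash[key]
```

A persistent registry would also need to store the mapping on disk and check that a recorded container is still running before reusing it; both are left out of this sketch.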