
Add the ability to capture web pages in MHTML format, which bundles all page resources into a single file. This enables complete page archival and offline viewing.

- Add capture_mhtml parameter to CrawlerRunConfig
- Implement MHTML capture using CDP in AsyncPlaywrightCrawlerStrategy
- Add mhtml field to CrawlResult and AsyncCrawlResponse models
- Add comprehensive tests for MHTML capture functionality
- Update documentation with MHTML capture details
- Add exclude_all_images option for better memory management

Breaking changes: None
7. **`screenshot`**, **`pdf`**, & **`capture_mhtml`**:
   - If `True`, captures a screenshot, PDF, or MHTML snapshot after the page is fully loaded.
   - The results go to `result.screenshot` (base64), `result.pdf` (bytes), or `result.mhtml` (string).
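Because the three captures come back in different types (base64 text, raw bytes, and a plain string), each one needs slightly different handling when written to disk. The following is a minimal sketch of that step, not part of the library: `save_captures` is a hypothetical helper, the output file names (including the `.png` extension for the screenshot) are assumptions, and a `SimpleNamespace` stands in for a real crawl result so the example runs without a browser.

```python
import base64
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_captures(result, out_dir):
    """Write whichever captures are present on `result` into `out_dir`.

    Hypothetical helper: only the field names (screenshot, pdf, mhtml) and
    their types come from the documentation above; the file names are assumed.
    Returns a dict mapping capture name to the path that was written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = {}

    # screenshot is base64 text, so decode it back to image bytes first.
    screenshot = getattr(result, "screenshot", None)
    if screenshot:
        path = out / "page.png"  # assumed image format
        path.write_bytes(base64.b64decode(screenshot))
        written["screenshot"] = path

    # pdf is already raw bytes and can be written as-is.
    pdf = getattr(result, "pdf", None)
    if pdf:
        path = out / "page.pdf"
        path.write_bytes(pdf)
        written["pdf"] = path

    # mhtml is a string; encode explicitly and write bytes so that the
    # line endings inside the snapshot are not rewritten by text mode.
    mhtml = getattr(result, "mhtml", None)
    if mhtml:
        path = out / "page.mhtml"
        path.write_bytes(mhtml.encode("utf-8"))
        written["mhtml"] = path

    return written


if __name__ == "__main__":
    # Stand-in for a crawl result; the payloads are placeholders, not real captures.
    fake_result = SimpleNamespace(
        screenshot=base64.b64encode(b"placeholder image bytes").decode("ascii"),
        pdf=b"%PDF-1.4 placeholder",
        mhtml="MIME-Version: 1.0\r\nContent-Type: multipart/related\r\n",
    )
    with tempfile.TemporaryDirectory() as tmp:
        for name, path in save_captures(fake_result, tmp).items():
            print(name, path.name, path.stat().st_size)
```

In a real run, `result` would come from a crawl whose `CrawlerRunConfig` has `screenshot`, `pdf`, and/or `capture_mhtml` set to `True`; fields that were not requested are simply skipped by the helper.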