diff --git a/docs/md_v2/blog/index.md b/docs/md_v2/blog/index.md index 28ccfa6..f7c8494 100644 --- a/docs/md_v2/blog/index.md +++ b/docs/md_v2/blog/index.md @@ -4,6 +4,15 @@ Welcome to the Crawl4AI blog! Here you'll find detailed release notes, technical ## Latest Release +### [0.4.2 - Configurable Crawlers, Session Management, and Smarter Screenshots](releases/0.4.2.md) +*December 12, 2024* + +The 0.4.2 update brings massive improvements to configuration, making crawlers and browsers easier to manage with dedicated objects. You can now import/export local storage for seamless session management. Plus, long-page screenshots are faster and cleaner, and full-page PDF exports are now possible. Check out all the new features to make your crawling experience even smoother. + +[Read full release notes →](releases/0.4.2.md) + +--- + ### [0.4.1 - Smarter Crawling with Lazy-Load Handling, Text-Only Mode, and More](releases/0.4.1.md) *December 8, 2024* @@ -35,3 +44,4 @@ Curious about how Crawl4AI has evolved? Check out our [complete changelog](https - Star us on [GitHub](https://github.com/unclecode/crawl4ai) - Follow [@unclecode](https://twitter.com/unclecode) on Twitter - Join our community discussions on GitHub + diff --git a/docs/md_v2/blog/releases/0.4.2.md b/docs/md_v2/blog/releases/0.4.2.md new file mode 100644 index 0000000..6f8f39e --- /dev/null +++ b/docs/md_v2/blog/releases/0.4.2.md @@ -0,0 +1,86 @@ +## 🚀 Crawl4AI 0.4.2 Update: Smarter Crawling Just Got Easier (Dec 12, 2024) + +### Hey Developers, + +I’m excited to share Crawl4AI 0.4.2—a major upgrade that makes crawling smarter, faster, and a whole lot more intuitive. I’ve packed in a bunch of new features to simplify your workflows and improve your experience. Let’s cut to the chase! + +--- + +### 🔧 **Configurable Browser and Crawler Behavior** + +You’ve asked for better control over how browsers and crawlers are configured, and now you’ve got it. With the new `BrowserConfig` and `CrawlerRunConfig` objects, you can set up your browser and crawling behavior exactly how you want. No more cluttering `arun` with a dozen arguments—just pass in your configs and go. + +**Example:** +```python +from crawl4ai import BrowserConfig, CrawlerRunConfig, AsyncWebCrawler + +browser_config = BrowserConfig(headless=True, viewport_width=1920, viewport_height=1080) +crawler_config = CrawlerRunConfig(cache_mode="BYPASS") + +async with AsyncWebCrawler(config=browser_config) as crawler: + result = await crawler.arun(url="https://example.com", config=crawler_config) + print(result.markdown[:500]) +``` + +This setup is a game-changer for scalability, keeping your code clean and flexible as we add more parameters in the future. + +Remember: If you like to use the old way, you can still pass arguments directly to `arun` as before, no worries! + +--- + +### 🔐 **Streamlined Session Management** + +Here’s the big one: You can now pass local storage and cookies directly. Whether it’s setting values programmatically or importing a saved JSON state, managing sessions has never been easier. This is a must-have for authenticated crawls—just export your storage state once and reuse it effortlessly across runs. + +**Example:** +1. Open a browser, log in manually, and export the storage state. +2. Import the JSON file for seamless authenticated crawling: + +```python +result = await crawler.arun( + url="https://example.com/protected", + storage_state="my_storage_state.json" +) +``` + +--- + +### 🔢 **Handling Large Pages: Supercharged Screenshots and PDF Conversion** + +Two big upgrades here: + +- **Blazing-fast long-page screenshots**: Turn extremely long web pages into clean, high-quality screenshots—without breaking a sweat. It’s optimized to handle large content without lag. + +- **Full-page PDF exports**: Now, you can also convert any page into a PDF with all the details intact. Perfect for archiving or sharing complex layouts. + +--- + +### 🔧 **Other Cool Stuff** + +- **Anti-bot enhancements**: Magic mode now handles overlays, user simulation, and anti-detection features like a pro. +- **JavaScript execution**: Execute custom JS snippets to handle dynamic content. No more wrestling with endless page interactions. + +--- + +### 📊 **Performance Boosts and Dev-friendly Updates** + +- Faster rendering and viewport adjustments for better performance. +- Improved cookie and local storage handling for seamless authentication. +- Better debugging with detailed logs and actionable error messages. + +--- + +### 🔠 **Use Cases You’ll Love** + +1. **Authenticated Crawls**: Login once, export your storage state, and reuse it across multiple requests without the headache. +2. **Long-page Screenshots**: Perfect for blogs, e-commerce pages, or any endless-scroll website. +3. **PDF Export**: Create professional-looking page PDFs in seconds. + +--- + +### Let’s Get Crawling + +Crawl4AI 0.4.2 is ready for you to download and try. I’m always looking for ways to improve, so don’t hold back—share your thoughts and feedback. + +Happy Crawling! 🚀 +