1 Commit

Author SHA1 Message Date
UncleCode
f78c46446b feat(deep-crawling): improve URL normalization and domain filtering
Enhance URL handling in deep crawling with:
- New URL normalization functions for consistent URL formats
- Improved domain filtering with subdomain support
- URLPatternFilter added to the public API
- Better URL deduplication in BFS strategy

These changes improve crawling accuracy and reduce duplicate visits.
2025-03-06 22:45:57 +08:00
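
The commit message above names three ideas: URL normalization, subdomain-aware domain filtering, and deduplication during BFS. The sketch below shows one way these could fit together. It is not the code from this commit: `normalize_url`, `is_allowed_domain`, and `bfs_order` are hypothetical names, the specific normalization rules are assumptions, and the `links` dictionary stands in for real page fetching.

```python
from collections import deque
from urllib.parse import urlsplit, urlunsplit

# Assumed default ports; an explicit default port is treated as redundant.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Hypothetical normalizer: lowercase scheme and host, drop the fragment,
    drop an explicit default port, and strip trailing slashes from the path.
    Userinfo (user:pass@) is discarded in this sketch."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def is_allowed_domain(url: str, allowed: set[str]) -> bool:
    """Accept a host if it equals an allowed domain or is a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed)


def bfs_order(start: str, links: dict[str, list[str]], allowed: set[str]) -> list[str]:
    """Breadth-first traversal that deduplicates on the normalized URL.
    `links` maps a normalized URL to the raw hrefs found on that page."""
    start_norm = normalize_url(start)
    seen = {start_norm}
    queue = deque([start_norm])
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)
        for href in links.get(current, []):
            candidate = normalize_url(href)
            if candidate not in seen and is_allowed_domain(candidate, allowed):
                seen.add(candidate)
                queue.append(candidate)
    return order
```

Deduplicating on the normalized form is what keeps variants such as `https://EXAMPLE.com/a#top` and `https://example.com/a/` from being visited twice. The suffix check uses a leading dot so that `notexample.com` does not match `example.com`.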