UncleCode
|
c38ac29edb
|
perf(crawler): major performance improvements & raw HTML support
- Switch to lxml parser (~4x speedup)
- Add raw HTML & local file crawling support
- Fix cache headers & async cleanup
- Add browser process monitoring
- Optimize BeautifulSoup operations
- Pre-compile regex patterns
Breaking: Raw HTML input must now be passed with the new URL prefixes
Fixes: #256, #253
|
2024-11-13 19:40:40 +08:00 |
|