UncleCode
f9fe6f89fe
feat(database): implement version management and migration checks during initialization
2024-11-17 18:09:33 +08:00
UncleCode
2a82455b3d
feat(crawl): implement direct crawl functionality and introduce CacheMode for improved caching control
2024-11-17 17:17:34 +08:00
UncleCode
4b45b28f25
feat(docs): enhance deployment documentation with one-click setup, API security details, and Docker Compose examples
2024-11-16 18:44:47 +08:00
UncleCode
9139ef3125
feat(docker): update Dockerfile for improved installation process and enhance deployment documentation with Docker Compose setup and API token security
2024-11-16 18:19:44 +08:00
UncleCode
c38ac29edb
perf(crawler): major performance improvements & raw HTML support
- Switch to lxml parser (~4x speedup)
- Add raw HTML & local file crawling support
- Fix cache headers & async cleanup
- Add browser process monitoring
- Optimize BeautifulSoup operations
- Pre-compile regex patterns
Breaking: Raw HTML handling requires new URL prefixes
Fixes: #256, #253
2024-11-13 19:40:40 +08:00
UncleCode
c5aa1bec18
Merge pull request #229 from bizrockman/main
Prevent "NoneType has no attribute get" errors
2024-11-06 07:31:07 +01:00
UncleCode
67a23c3182
feat(core): Release v0.3.73 with Browser Takeover and Docker Support
Major changes:
- Add browser takeover feature using CDP for authentic browsing
- Implement Docker support with full API server documentation
- Enhance Markdown with tag preservation system
- Improve parallel crawling performance
This release focuses on authenticity and scalability, introducing the ability
to use users' own browsers while providing containerized deployment options.
Breaking changes include modified browser handling and API response structure.
See CHANGELOG.md for detailed migration guide.
2024-11-05 20:04:18 +08:00
bizrockman
796dbaf08c
Rename episode_11_3_Extraction_Strategies:_Cosine.md to episode_11_3_Extraction_Strategies_Cosine.md
Name that will work in Windows
2024-11-04 20:19:43 +01:00
bizrockman
3a3c88a2d0
Rename episode_11_2_Extraction_Strategies:_LLM.md to episode_11_2_Extraction_Strategies_LLM.md
Name that will work in Windows
2024-11-04 20:19:20 +01:00
bizrockman
870296fa7e
Rename episode_11_1_Extraction_Strategies:_JSON_CSS.md to episode_11_1_Extraction_Strategies_JSON_CSS.md
Name that will work in Windows
2024-11-04 20:18:58 +01:00
bizrockman
a28046c233
Rename episode_08_Media_Handling:_Images,_Videos,_and_Audio.md to episode_08_Media_Handling_Images_Videos_and_Audio.md
Name that will work in Windows
2024-11-04 20:18:26 +01:00
UncleCode
19c3f3efb2
Refactor tutorial markdown files: Update numbering and formatting
2024-10-30 20:58:07 +08:00
UncleCode
9307c19f35
Update documents, upload new version of quickstart.
2024-10-30 20:39:35 +08:00
UncleCode
3529c2e732
Update new tutorial documents and add them to the docs folder.
2024-10-30 00:16:18 +08:00
UncleCode
4239654722
Update Documentation
2024-10-27 19:24:46 +08:00