26 Commits

Author SHA1 Message Date
UncleCode
f7574230a1 Update API server request object, test_docker file, and README 2024-11-07 19:29:31 +08:00
UncleCode
b51263664e feat(api): add CORS support and static file serving, update root redirect 2024-11-05 21:02:47 +08:00
UncleCode
67a23c3182 feat(core): Release v0.3.73 with Browser Takeover and Docker Support
Major changes:
- Add browser takeover feature using CDP for authentic browsing
- Implement Docker support with full API server documentation
- Enhance Markdown with tag preservation system
- Improve parallel crawling performance

This release focuses on authenticity and scalability, introducing the ability
to use users' own browsers while providing containerized deployment options.
Breaking changes include modified browser handling and API response structure.

See CHANGELOG.md for detailed migration guide.
2024-11-05 20:04:18 +08:00
UncleCode
c4c6227962 Creating the API server component 2024-11-04 20:33:15 +08:00
unclecode
ca0336af9e feat: Add error handling for rate limit exceeded in form submission
This commit adds error handling for rate limit exceeded in the form submission process. If the server returns a 429 status code, the client will display an error message indicating the rate limit has been exceeded and provide information on when the user can try again. This improves the user experience by providing clear feedback and guidance when rate limits are reached.
2024-07-08 20:24:00 +08:00
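
The 429 handling described in this commit can be illustrated with a minimal sketch. The function name `rate_limit_message` is hypothetical, and reading the wait time from a `Retry-After` header is an assumption; the commit does not say how the server reports when the user can try again.

```python
from typing import Mapping, Optional


def rate_limit_message(status_code: int, headers: Mapping[str, str]) -> Optional[str]:
    """Return a user-facing message for a 429 response, or None otherwise.

    Hypothetical helper: assumes the server may send a Retry-After header
    holding a whole number of seconds.
    """
    if status_code != 429:
        return None
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.isdigit():
        return f"Rate limit exceeded. Please try again in {int(retry_after)} seconds."
    # No usable hint from the server: fall back to a generic message.
    return "Rate limit exceeded. Please try again later."
```

A caller would pass the response status and headers and show the returned string to the user whenever it is not `None`.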
unclecode
65ed1aeade feat: Add rate limiting functionality with custom handlers 2024-07-08 20:02:12 +08:00
unclecode
d58286989c UPDATE DOCUMENTS 2024-06-30 00:34:02 +08:00
unclecode
144cfa0eda Switch to ChromeDriverManager due to some issues with downloading the Chrome driver 2024-06-26 13:00:17 +08:00
unclecode
8c77a760fc Fixed:
- Redirect "/" to mkdocs
2024-06-22 20:54:32 +08:00
unclecode
b9bf8ac9d7 Fix mounting the "/" to mkdocs site folder 2024-06-22 20:41:39 +08:00
unclecode
d6182bedd7 chore:
- Add demo page to the new mkdocs
- Set website home page to mkdocs
2024-06-22 20:36:01 +08:00
unclecode
e7705e661a ADD MKDocs 2024-06-21 17:56:54 +08:00
unclecode
b3a0edaa6d - User agent
- Extract Links
- Extract Metadata
- Update Readme
- Update REST API document
2024-06-08 17:59:42 +08:00
unclecode
8e73a482a2 feat: Add screenshot functionality to crawl_urls
Add a `screenshot` parameter to the `crawl_urls` function in `main.py`, so users can request a screenshot of the page during the crawling process. The default value is `False`.
2024-06-07 15:23:32 +08:00
unclecode
0533aeb814 v0.2.3:
- Extract all media tags
- Take screenshot of the page
2024-06-07 15:23:13 +08:00
UncleCode
7381fa95e6
Merge pull request #3 from QIN2DIM/main
fix(main): UnicodeDecodeError
2024-05-23 09:29:28 +08:00
Unclecode
53d1176d53 chore: Update extraction strategy to support GPU, MPS, and CPU, add batch processing for CPU devices 2024-05-19 16:18:58 +00:00
QIN2DIM
5cee084340 fix(main): UnicodeDecodeError
File "T:\_GitHubProjects\Forks\crawl4ai\main.py", line 70, in read_index
    partials[filename[:-5]] = file.read()

UnicodeDecodeError: 'gbk' codec can't decode byte 0xa4 in position 149: illegal multibyte sequence
2024-05-18 23:31:11 +08:00
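
The traceback above is typical of reading a UTF-8 file on a system whose default locale encoding is GBK. A minimal sketch of the usual remedy, assuming the fix is to pass an explicit encoding to `open()`. The helper name `read_partial` is hypothetical and is not taken from `main.py`.

```python
import os
import tempfile


def read_partial(path: str) -> str:
    """Read a text file as UTF-8 regardless of the platform's locale encoding."""
    with open(path, "r", encoding="utf-8") as file:
        return file.read()


# Demonstration: write a file containing non-ASCII text, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.html")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("<p>héllo ¤</p>")
    content = read_partial(sample_path)
```

Without the `encoding` argument, `open()` falls back to the locale's preferred encoding, which is why the same file can load on one machine and fail on another.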
Unclecode
bf00c26a83 chore: Update Dockerfile to install chromium-chromedriver and spacy library 2024-05-18 09:16:52 +00:00
unclecode
d7b37e849d chore: Update CrawlRequest model to use NoExtractionStrategy as default 2024-05-17 16:50:38 +08:00
unclecode
5b80be956d Update:
- Debug
- Refactor code for new version
2024-05-16 17:31:44 +08:00
unclecode
f6e59157bf - Test all methods
- Update index.html
- Update Readme
- Resolve some bugs
2024-05-14 21:27:41 +08:00
ntohidi
aa126e436b Add CORS middleware for allowing all origins to make requests 2024-05-10 12:27:40 +02:00
unclecode
3ff1d15702 Change the project folder name from crawler to crawl4ai 2024-05-09 22:16:28 +08:00
unclecode
181250cb93 chore: Add function to clear the database 2024-05-09 19:42:43 +08:00
unclecode
b8e743cd8d Initial Commit 2024-05-09 19:10:25 +08:00