mirror of
https://github.com/unclecode/crawl4ai.git
synced 2025-11-25 09:23:21 +00:00
Implements a full-featured CLI for Crawl4AI with the following capabilities: - Basic and advanced web crawling - Configuration management via YAML/JSON files - Multiple extraction strategies (CSS, XPath, LLM) - Content filtering and optimization - Interactive Q&A capabilities - Various output formats - Comprehensive documentation and examples Also includes: - Home directory setup for configuration and cache - Environment variable support for API tokens - Test suite for CLI functionality
11 lines
331 B
YAML
11 lines
331 B
YAML
type: "llm"
|
|
provider: "openai/gpt-4o-mini"
|
|
api_token: "env:OPENAI_API_KEY"
|
|
instruction: "Extract all articles with their titles, authors, publication dates and main topics in a structured format"
|
|
params:
|
|
chunk_token_threshold: 4096
|
|
overlap_rate: 0.1
|
|
word_token_rate: 0.75
|
|
temperature: 0.3
|
|
max_tokens: 1000
|
|
verbose: true |