# Installation 💻

Crawl4AI can be installed as a Python package today. Docker and local server options are planned and are described below.

## Option 1: Python Package Installation (Recommended)

Crawl4AI is available on PyPI. Choose the variant that best fits your needs:

### Basic Installation

For basic web crawling and scraping tasks:

```bash
pip install crawl4ai
playwright install  # Download the browser binaries Playwright needs
```

### Installation with PyTorch

For advanced text clustering (includes CosineSimilarity cluster strategy):

```bash
pip install "crawl4ai[torch]"  # Quotes keep shells such as zsh from expanding the brackets
```

### Installation with Transformers

For text summarization and Hugging Face models:

```bash
pip install "crawl4ai[transformer]"  # Quotes keep shells such as zsh from expanding the brackets
```

### Full Installation

For all features:

```bash
pip install "crawl4ai[all]"  # Quotes keep shells such as zsh from expanding the brackets
```

### Development Installation

For contributors who plan to modify the source code:

```bash
git clone https://github.com/unclecode/crawl4ai.git
cd crawl4ai
pip install -e ".[all]"
playwright install  # Download the browser binaries Playwright needs
```

💡 After installing with the `torch`, `transformer`, or `all` option, it's recommended to run the following CLI command to download the required models:

```bash
crawl4ai-download-models
```

This step is optional, but it can improve the crawler's performance and speed. You only need to run it once after installation.
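
Before downloading models, it can help to see which optional dependencies are actually importable in the current environment. The following is a minimal sketch using only the standard library. It assumes that the `torch` and `transformer` extras provide the `torch` and `transformers` modules respectively; that mapping is an assumption rather than something taken from the project's packaging, and `check_optional_deps` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Assumed mapping from install extra to the top-level module it provides.
# Adjust it to match the project's actual dependency list.
ASSUMED_EXTRA_MODULES = {
    "torch": "torch",
    "transformer": "transformers",
}


def check_optional_deps(mapping=ASSUMED_EXTRA_MODULES):
    """Return {extra_name: bool}, True when the extra's module can be found."""
    return {
        extra: find_spec(module) is not None
        for extra, module in mapping.items()
    }


if __name__ == "__main__":
    for extra, available in check_optional_deps().items():
        status = "available" if available else "missing"
        print(f"{extra}: {status}")
```

If an extra shows as missing, re-running the matching `pip install` command in the same environment is the likely fix.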

## Option 2: Using Docker (Coming Soon)

Docker support for Crawl4AI is currently in progress and will be available soon. This will allow you to run Crawl4AI in a containerized environment, ensuring consistency across different systems.

## Option 3: Local Server Installation

For those who prefer to run Crawl4AI as a local server, instructions will be provided once the Docker implementation is complete.

## Verifying Your Installation

After installation, you can verify that Crawl4AI is working correctly by running a simple Python script:

```python
import asyncio
from crawl4ai import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(verbose=True) as crawler:
        result = await crawler.arun(url="https://www.example.com")
        print(result.markdown[:500])  # Print first 500 characters

if __name__ == "__main__":
    asyncio.run(main())
```

This script should crawl the example website and print the first 500 characters of the extracted Markdown.
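
If the script fails with an import error, one quick thing to rule out is that the package is missing from the active environment. The sketch below uses only the standard library and does not launch a browser. It assumes the distribution name is `crawl4ai`, as in the `pip install` commands above; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="crawl4ai"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("crawl4ai is not installed in this environment")
    else:
        print(f"crawl4ai {found} is installed")
```

A `None` result usually means `pip` installed the package into a different interpreter or virtual environment than the one running the script.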

## Getting Help

If you encounter any issues during installation or usage, please check the [documentation](https://crawl4ai.com/mkdocs/) or raise an issue on the [GitHub repository](https://github.com/unclecode/crawl4ai/issues).

Happy crawling! 🕷️🤖