When selecting a chunking method, you can also enable auto-keyword or auto-question generation to increase retrieval rates. This feature uses a chat model to produce a specified number of keywords and questions from each created chunk, generating a layer of higher-level information from the original content.
Auto-keyword refers to the auto-keyword generation feature of RAGFlow. It uses a chat model to generate a set of keywords or synonyms from each chunk to correct errors and enhance retrieval accuracy. This feature is implemented as a slider under **Page rank** on the **Configuration** page of your knowledge base.
- Between 3 and 5 (inclusive): Recommended if you have chunks of approximately 1,000 characters.
- Maximum: 30. If your chunk size increases, you can increase the value accordingly. Note that the marginal benefit decreases as the value increases.
An Auto-keyword value must be an integer. If you set it to a non-integer, say 1.7, it will be rounded down to the nearest integer, which in this case is 1.
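The rounding rule above can be illustrated with a short Python sketch. The function name `normalize_auto_keyword` is hypothetical, and the lower bound of 0 and the error raised for out-of-range input are assumptions made for illustration; only the round-down behavior and the maximum of 30 come from this page.

```python
import math

AUTO_KEYWORD_MAX = 30  # maximum stated in this section


def normalize_auto_keyword(value: float) -> int:
    """Round a slider value down to an integer and check its range.

    The lower bound of 0 and the ValueError on out-of-range input are
    assumptions of this sketch, not documented RAGFlow behavior.
    """
    rounded = math.floor(value)  # 1.7 becomes 1
    if not 0 <= rounded <= AUTO_KEYWORD_MAX:
        raise ValueError(
            f"Auto-keyword must be between 0 and {AUTO_KEYWORD_MAX}, got {value}"
        )
    return rounded


print(normalize_auto_keyword(1.7))  # prints 1
print(normalize_auto_keyword(4))    # prints 4
```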
Auto-question is a feature of RAGFlow that automatically generates questions from chunks of data using a chat model. These questions (e.g. who, what, and why) also help correct errors and improve the matching of user queries. You can find this feature as a slider under **Page rank** on the **Configuration** page of your knowledge base.
Values:
- 0: (Default) Disabled.
- 1 or 2: Recommended if you have chunks of approximately 1,000 characters.
- Maximum: 10. Can also be used to correct bad cases.
- Typical use cases: Scenarios requiring FAQ retrieval, such as product manuals and policy documents.
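If you manage knowledge base settings from a script rather than the slider, the two values might be carried in a small configuration fragment like the one sketched below. The key names `auto_keywords` and `auto_questions` and the helper `build_enrichment_config` are assumptions for illustration and are not confirmed by this page; check the RAGFlow API reference for the actual field names before relying on them.

```python
import json

AUTO_KEYWORD_MAX = 30   # maximum stated for Auto-keyword
AUTO_QUESTION_MAX = 10  # maximum stated for Auto-question


def build_enrichment_config(auto_keywords: int = 0, auto_questions: int = 0) -> dict:
    """Return a config fragment holding both values.

    The key names are hypothetical. A default of 0 (disabled) is stated
    for Auto-question and assumed here for Auto-keyword as well.
    """
    if not 0 <= auto_keywords <= AUTO_KEYWORD_MAX:
        raise ValueError(f"auto_keywords must be 0..{AUTO_KEYWORD_MAX}")
    if not 0 <= auto_questions <= AUTO_QUESTION_MAX:
        raise ValueError(f"auto_questions must be 0..{AUTO_QUESTION_MAX}")
    return {"auto_keywords": auto_keywords, "auto_questions": auto_questions}


# A starting point for chunks of roughly 1,000 characters.
config = build_enrichment_config(auto_keywords=4, auto_questions=1)
print(json.dumps(config))
```

Validating both ranges in one place keeps an out-of-range value from reaching the knowledge base configuration unnoticed.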
:::tip NOTE
An Auto-question value must be an integer. If you set it to a non-integer, say 1.7, it will be rounded down to the nearest integer, which in this case is 1.
The Auto-keyword and Auto-question values relate closely to the chunk size in your knowledge base. However, if you are new to this feature and unsure which values to start with, here are some suggested values gathered from our community: