Mirror of https://github.com/Unstructured-IO/unstructured.git (synced 2025-10-26 07:30:45 +00:00)
arXiv Topic Modelling
This directory contains an example of how to use the arxiv Python package (a wrapper for the arXiv API), the BERTopic Python package (transformer-based topic modelling),
and several functions from the unstructured library to run topic modelling on queried arXiv research papers. The notebook is deliberately simple, but it can easily be modified for more complicated use cases.
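The flow described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names (build_document, fetch_documents, clean_documents, fit_topics, run_example) are hypothetical, and the choice of clean_extra_whitespace as the unstructured function is an assumption. Third-party imports are kept inside the functions so the file can be imported before the dependencies are installed.

```python
def build_document(title: str, summary: str) -> str:
    """Join a paper's title and abstract into one string for topic modelling."""
    return f"{title.strip()}. {summary.strip()}"


def fetch_documents(query: str, max_results: int = 200) -> list[str]:
    """Query the arXiv API and return one document per paper."""
    import arxiv  # third-party: the arXiv API wrapper

    search = arxiv.Search(query=query, max_results=max_results)
    client = arxiv.Client()
    return [build_document(r.title, r.summary) for r in client.results(search)]


def clean_documents(docs: list[str]) -> list[str]:
    """Collapse newlines and repeated spaces using an unstructured cleaner."""
    # Assumption: the notebook may use different unstructured functions.
    from unstructured.cleaners.core import clean_extra_whitespace

    return [clean_extra_whitespace(doc) for doc in docs]


def fit_topics(docs: list[str]):
    """Fit a BERTopic model with default settings on the cleaned documents."""
    from bertopic import BERTopic  # third-party: transformer-based topic model

    topic_model = BERTopic()
    topics, _probabilities = topic_model.fit_transform(docs)
    return topic_model, topics


def run_example(query: str, max_results: int = 200):
    """Fetch, clean, and model papers; return a per-topic summary table."""
    # BERTopic generally needs a reasonably large corpus; a handful of
    # documents is unlikely to produce meaningful topics.
    docs = clean_documents(fetch_documents(query, max_results))
    topic_model, _topics = fit_topics(docs)
    return topic_model.get_topic_info()
```

With the dependencies installed, calling run_example("cat:cs.CL") would query arXiv's computation-and-language category and return BERTopic's topic summary table. Keeping fetch, clean, and fit as separate steps makes it easy to swap in a different query or cleaner.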
To get started, use the following steps:
- Ensure you have Python 3.10 or higher installed on your system
- Create a new Python virtual environment
- Run pip install -r requirements.txt to install the dependencies
- Run PYTHONPATH=. jupyter notebook from this directory to launch the notebook
At this point, you'll be able to run the topic modelling example notebook.
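If you are unsure whether your interpreter meets the Python 3.10 requirement from the first step, a small check along these lines may help. It is only a sketch: the function name python_is_supported and the constant MIN_VERSION are hypothetical and not part of this example directory.

```python
import sys

# Minimum version stated in the setup steps above.
MIN_VERSION = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def describe_interpreter() -> str:
    """Build a one-line status message for the running interpreter."""
    current = ".".join(str(part) for part in sys.version_info[:3])
    required = ".".join(str(part) for part in MIN_VERSION)
    status = "OK" if python_is_supported() else "too old"
    return f"Python {current} ({status}; {required} or higher is required)"
```

Running print(describe_interpreter()) inside the virtual environment shows which interpreter the notebook would use.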