arXiv Topic Modelling

This directory contains an example of how to use the arxiv Python package (a wrapper for the arXiv API), the BERTopic Python package (transformer-based topic modelling), and several functions from the unstructured library to run topic modelling on queried arXiv research papers. The notebook is very simple, but it can easily be modified for more complicated use cases.
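To give a rough idea of how these three pieces fit together, here is a minimal sketch. It is not the notebook's actual code: the function names are hypothetical, and modelling abstracts rather than full paper text is an assumption made to keep the example short. Third-party imports are placed inside the functions so the module can be loaded before the dependencies are installed.

```python
from collections import defaultdict
from typing import Dict, List


def fetch_abstracts(query: str, max_results: int = 50) -> List[str]:
    """Query arXiv and return the abstracts of the matching papers."""
    import arxiv  # third-party wrapper for the arXiv API

    search = arxiv.Search(query=query, max_results=max_results)
    return [result.summary for result in arxiv.Client().results(search)]


def clean_abstract(text: str) -> str:
    """Collapse newlines and repeated spaces using an unstructured cleaner."""
    from unstructured.cleaners.core import clean_extra_whitespace

    return clean_extra_whitespace(text)


def assign_topics(docs: List[str]) -> List[int]:
    """Fit BERTopic on the documents and return one topic id per document."""
    from bertopic import BERTopic

    topics, _probabilities = BERTopic().fit_transform(docs)
    return list(topics)


def group_by_topic(docs: List[str], topics: List[int]) -> Dict[int, List[str]]:
    """Group documents by topic id (BERTopic uses -1 for outliers)."""
    grouped: Dict[int, List[str]] = defaultdict(list)
    for doc, topic in zip(docs, topics):
        grouped[topic].append(doc)
    return dict(grouped)
```

A typical run would chain the helpers: fetch abstracts for a query such as `cat:cs.CL`, pass each one through `clean_abstract`, call `assign_topics` on the cleaned list, and inspect the result of `group_by_topic`. BERTopic generally needs more than a handful of documents to produce meaningful topics, so a very small `max_results` is unlikely to work well.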

To get started, use the following steps:

  • Ensure you have Python 3.10 or higher installed on your system
  • Create a new Python virtual environment
  • Run pip install -r requirements.txt to install the dependencies
  • Run PYTHONPATH=. jupyter notebook from this directory to launch the notebook

At this point, you'll be able to run the topic modelling example notebook.
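The first two setup steps can also be scripted with the standard library alone. The sketch below is one possible way to do it, under stated assumptions: the helper names and the `.venv` directory are illustrative choices, not something the example prescribes.

```python
import sys
import venv
from pathlib import Path

# Minimum interpreter version stated in the setup steps.
MIN_PYTHON = (3, 10)


def python_is_supported(version=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version[:2]) >= minimum


def create_environment(path: str = ".venv") -> Path:
    """Create a virtual environment with pip available and return its path."""
    env_dir = Path(path)
    venv.create(env_dir, with_pip=True)
    return env_dir
```

After calling `create_environment()`, activate the environment in your shell, then continue with the `pip install -r requirements.txt` and `PYTHONPATH=. jupyter notebook` steps listed above.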