unstructured/examples/arxiv-topic-modelling
Matt Robinson d9c035edb1
docs: no more bricks (#1967)
### Summary

We no longer use the "bricks" terminology for partitioning functions, etc
in the library. This PR updates various references to bricks within the
repo and the docs. This is just an initial pass to swap the terminology
out, it'll likely be helpful to reorganize the docs a bit as well.

---------

Co-authored-by: qued <64741807+qued@users.noreply.github.com>
Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
2023-11-02 09:43:26 -05:00

arXiv Topic Modelling

This directory contains an example of how to use the arxiv Python package (a wrapper for the arXiv API), the BERTopic Python package (transformer-based topic modelling), and several functions from the unstructured library to run topic modelling on queried arXiv research papers. The notebook is very simple, but it can easily be modified for more complicated use cases.
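As a rough illustration of how the three packages fit together, here is a minimal sketch. It is not the notebook's actual code: the helper names (build_query, fetch_abstracts, fit_topics, run_example), the example category and keyword, and the choice of clean_extra_whitespace as the unstructured cleaning function are all assumptions made for illustration.

```python
from typing import List


def build_query(category: str, keywords: List[str]) -> str:
    """Build an arXiv API query string such as: cat:cs.CL AND all:"topic modelling"."""
    # Hypothetical helper: combines one category filter with quoted keyword terms.
    parts = [f"cat:{category}"] + [f'all:"{kw}"' for kw in keywords]
    return " AND ".join(parts)


def fetch_abstracts(query: str, max_results: int = 50) -> List[str]:
    """Query arXiv and return one whitespace-normalized abstract per paper."""
    # Third-party imports are local so build_query works without them installed.
    import arxiv
    from unstructured.cleaners.core import clean_extra_whitespace

    search = arxiv.Search(query=query, max_results=max_results)
    abstracts = []
    for paper in arxiv.Client().results(search):
        # Abstracts arrive hard-wrapped; collapse newlines and repeated spaces.
        abstracts.append(clean_extra_whitespace(paper.summary))
    return abstracts


def fit_topics(docs: List[str]):
    """Fit a default BERTopic model and return its topic summary table."""
    from bertopic import BERTopic

    topic_model = BERTopic()
    topic_model.fit_transform(docs)
    return topic_model.get_topic_info()


def run_example() -> None:
    """End-to-end run; needs network access and the packages installed."""
    # Assumed example query; the notebook may use a different one.
    query = build_query("cs.CL", ["topic modelling"])
    docs = fetch_abstracts(query, max_results=200)
    print(fit_topics(docs))
```

Calling run_example() downloads abstracts and fits the model, so it needs network access and may take a few minutes. BERTopic generally needs at least a few hundred documents to produce stable topics, which is why the sketch requests 200 results rather than a handful.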

To get started, use the following steps:

  • Ensure you have Python 3.8 or higher installed on your system
  • Create a new Python virtual environment
  • Run pip install -r requirements.txt to install the dependencies
  • Run PYTHONPATH=. jupyter notebook from this directory to launch the notebook

At this point, you'll be able to run the topic modelling example notebook.