SEC Sentiment Analysis Model
This directory contains an example of how to use the SEC API, the Unstructured SEC pipeline API,
and several bricks from the unstructured library to train a sentiment analysis model for the
risk factors section of S-1 filings. To get started, use the following steps:
- Ensure you have Python 3.8 or higher installed on your system
- Create a new Python virtual environment
- Run `pip install -r requirements.txt` to install the dependencies
- Run `PYTHONPATH=. jupyter notebook` from this directory to launch the notebook
At this point, you'll be able to run the sentiment analysis example notebook.
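
If the notebook fails to start, a quick preflight check can help narrow down which setup step was missed. The following is only a minimal sketch, not part of this example's code: the helper names (`python_ok`, `in_virtualenv`, `preflight`) are hypothetical, and the virtual environment check assumes an environment created with the standard `venv` module or a recent `virtualenv`.

```python
import sys
from pathlib import Path

MIN_PYTHON = (3, 8)  # minimum version named in the setup steps


def python_ok(version_info=sys.version_info):
    """Return True if the given version meets the minimum (hypothetical helper)."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def in_virtualenv():
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def preflight(directory="."):
    """Return a list of problems; an empty list means the setup looks ready."""
    problems = []
    if not python_ok():
        problems.append("Python 3.8 or higher is required")
    if not in_virtualenv():
        problems.append("no active virtual environment detected")
    if not (Path(directory) / "requirements.txt").is_file():
        problems.append("requirements.txt not found in " + str(directory))
    return problems


if __name__ == "__main__":
    # Run from this directory, before `pip install -r requirements.txt`.
    for problem in preflight():
        print("warning:", problem)
```

Running the script from this directory prints one warning per problem it finds and prints nothing when all three checks pass.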