Fixes recursion limit error that was being raised when partitioning
Excel documents of a certain size.
Previously we used a recursive method to find subtables within an Excel
sheet. However, this would run afoul of Python's recursion depth limit
when a sheet contained a contiguous block of more than 1000 cells. The
function has been updated to use the NetworkX library, which avoids
Python recursion issues.
* Updated `_get_connected_components` to use `networkx` graph methods
rather than implementing our own algorithm for finding contiguous groups
of cells within a sheet.
* Added a test and an example doc that reproduce the `RecursionError`
raised prior to this change.
* Added `networkx` to `extra_xlsx` dependencies and `pip-compile`d.
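
To illustrate the failure mode, here is a minimal, stdlib-only sketch. It is not the code in `unstructured` and it does not use `networkx`; `fill_recursive`, `fill_iterative`, and the `cells` set are hypothetical names made up for this example. It assumes a sheet can be modelled as a set of `(row, col)` coordinates of non-empty cells, and it contrasts a recursive flood fill with a non-recursive traversal of the same contiguous block.

```python
import sys
from collections import deque

# Hypothetical helpers for illustration only; not part of `unstructured`.
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def fill_recursive(cells, start, seen):
    """Depth-first flood fill that uses one call-stack frame per visited cell."""
    seen.add(start)
    row, col = start
    for d_row, d_col in NEIGHBORS:
        nxt = (row + d_row, col + d_col)
        if nxt in cells and nxt not in seen:
            fill_recursive(cells, nxt, seen)


def fill_iterative(cells, start):
    """Breadth-first flood fill that keeps pending cells in a queue instead."""
    seen = {start}
    queue = deque([start])
    while queue:
        row, col = queue.popleft()
        for d_row, d_col in NEIGHBORS:
            nxt = (row + d_row, col + d_col)
            if nxt in cells and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# One contiguous block: a single column of 5000 non-empty cells.
cells = {(row, 0) for row in range(5000)}

old_recursion_limit = sys.getrecursionlimit()
try:
    # Pin the limit so the outcome does not depend on the environment.
    sys.setrecursionlimit(1000)
    try:
        fill_recursive(cells, (0, 0), set())
        recursive_ok = True
    except RecursionError:
        recursive_ok = False
    component = fill_iterative(cells, (0, 0))
finally:
    sys.setrecursionlimit(old_recursion_limit)
```

In this sketch the recursive version needs one frame per cell in the block, so it fails once the block outgrows the recursion limit, while the queue-based version is bounded only by available memory.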
#### Testing:
The following, run from a Python terminal, should raise a `RecursionError`
on `main` and succeed on this branch:
```python
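import sys

from unstructured.partition.xlsx import partition_xlsx

old_recursion_limit = sys.getrecursionlimit()
try:
    # Pin the limit to 1000 so the result does not depend on the environment.
    sys.setrecursionlimit(1000)
    filename = "example-docs/more-than-1k-cells.xlsx"
    partition_xlsx(filename=filename)
finally:
    # Restore the original limit whether or not partitioning succeeded.
    sys.setrecursionlimit(old_recursion_limit)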
```
Note: the recursion limit varies by context. On my own system, the
default appears to be 3000 in a notebook but 1000 in a terminal. The
documented Python default recursion limit is 1000.