mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-06-27 02:30:08 +00:00

Part two of: https://github.com/Unstructured-IO/unstructured/pull/2842

Main changes compared to part one:
* hash computation includes the element's sequence number on the page, its page number, the document filename, and the element's text
* there are more tests for deterministic behavior of IDs returned by partitioning functions and for their uniqueness (guaranteed at the document level, and with high probability across multiple documents)

This PR addresses the following issue: https://github.com/Unstructured-IO/unstructured/issues/2461
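The hash inputs listed in the commit message can be illustrated with a minimal sketch. It is not the library's implementation: the helper name `element_id`, the SHA-256 choice, the field order, the `|` separator, the 32-character truncation, and the filename `nested.html` are all assumptions made for illustration. It only shows why adding the sequence number keeps IDs distinct when the same text appears twice in one document, as the heading and spans in the HTML file on this page do.

```python
import hashlib


def element_id(text: str, sequence_number: int, page_number: int, filename: str) -> str:
    """Hypothetical helper: derive a deterministic ID from the four inputs
    named in the commit message. Hash function, field order, separator, and
    truncation length are illustrative assumptions."""
    data = f"{filename}|{page_number}|{sequence_number}|{text}"
    return hashlib.sha256(data.encode("utf-8")).hexdigest()[:32]


# Same text, same page, same (hypothetical) file, different position on the page.
first = element_id("Example heading.", 0, 1, "nested.html")
second = element_id("Example heading.", 3, 1, "nested.html")

# Recomputing with identical inputs gives the identical ID (determinism).
repeat = element_id("Example heading.", 0, 1, "nested.html")

print(first != second)  # position on the page separates the two IDs
print(first == repeat)  # identical inputs reproduce the same ID
```

Hashing only the text would give both headings the same ID. Including the position on the page separates them within one document, and including the filename makes collisions across documents unlikely rather than impossible.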
24 lines
380 B
HTML
<!DOCTYPE html>
<html>

<head>
<title>Simple Nested HTML</title>
</head>

<body>
<h1>Example heading.</h1>
<div>
<span>This is a span.</span>
<span>This is another span.</span>
</div>
<br>
<h1>Example heading.</h1>
<div>
<span>This is a span.</span>
<span>This is another span.</span>
</div>

</body>

</html>