mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-06-27 02:30:08 +00:00

Currently we filter `img` tags
(`unstructured/partition/html/partition.py`, L226-L229 at commit 2addb19473)
before tags are converted to Elements by the HTML partitioner. More
importantly, we don’t currently have a defined “block” mapping to
support them. This change adds those mappings and the logic to process them.
It also respects `extract_image_block_types` and
`extract_image_block_to_payload` (as we do with pdfs) to determine
whether base64 is included in the metadata.
The partitioned Image elements set their text to the img tag’s alt text
if available.
The partitioned Image Elements include the [url in the
metadata](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/documents/elements.py#L209)
(rather than image_base64) if the img tag src is a url.
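
The rules above can be sketched with the standard library alone. This is an illustration of the described behavior, not the partitioner's actual code: `image_records` and `ImgCollector` are hypothetical names, the `image_url` and `image_mime_type` metadata keys are assumptions, and requiring both options before including base64 is one plausible reading of how they combine.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImgCollector(HTMLParser):
    """Collects the attributes of every <img> tag encountered."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<img .../>) are also routed here by HTMLParser.
        if tag == "img":
            self.images.append(dict(attrs))


def image_records(html, extract_image_block_types=None,
                  extract_image_block_to_payload=False):
    """Hypothetical helper: one dict per <img>, following the rules described above."""
    parser = ImgCollector()
    parser.feed(html)
    # Assumption: base64 is included only when both options ask for it.
    want_payload = (extract_image_block_to_payload
                    and "Image" in (extract_image_block_types or []))
    records = []
    for attrs in parser.images:
        src = attrs.get("src") or ""
        # The element text is the alt text, or empty when there is none.
        record = {"type": "Image", "text": attrs.get("alt") or "", "metadata": {}}
        if urlparse(src).scheme in ("http", "https"):
            # A URL src is recorded as a URL rather than as base64.
            record["metadata"]["image_url"] = src
        elif src.startswith("data:") and ";base64," in src and want_payload:
            header, payload = src.split(";base64,", 1)
            record["metadata"]["image_mime_type"] = header[len("data:"):]
            record["metadata"]["image_base64"] = payload
        records.append(record)
    return records
```

Calling `image_records(html, ["Image"], True)` on a page with one URL image and one data-URI image would yield a record carrying the URL for the first and a base64 payload for the second; with the defaults, the second record's metadata stays empty.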
## Testing
Unit tests have been added for explicit coverage.
Existing integration tests and other unit test fixtures have been
updated to account for the `Image` elements now present.
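
When reviewing an updated fixture, it can help to tally the element types it contains. The sketch below assumes, as in the fixture shown on this page, that each tag's `class` attribute holds the element type; `count_element_types` and `ClassCounter` are hypothetical helpers, not part of the test suite.

```python
from collections import Counter
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Tallies the class attribute of every tag that has one."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls:
            self.counts[cls] += 1


def count_element_types(html):
    """Return a Counter mapping element type (class value) to occurrences."""
    parser = ClassCounter()
    parser.feed(html)
    return parser.counts
```

A fixture updated for this change should report a non-zero count under `Image` wherever the source page contains img tags.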
---------
Co-authored-by: ryannikolaidis <ryannikolaidis@users.noreply.github.com>
138 lines
4.5 KiB
HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<meta content="width=device-width, initial-scale=1.0" name="viewport"/>
<title>
</title>
</head>
<body>
<h1 class="Title" id="e1f5fcc433282a2fa999a1dec593f59a">
Welcome to your team space!
</h1>
<p class="NarrativeText" id="c1a9308ef0747d4d4e4224516ecddb18">
We've added some suggestions and placeholders. Everything is customizable.
</p>
<p class="NarrativeText" id="9fe679c448c45285aa560d5baa9f634a">
Get started with page templates:
</p>
<p class="UncategorizedText" id="12417eb1d253a51abccfc40d8997d7a2">
Template - Project plan
</p>
<p class="UncategorizedText" id="a4badd544200f7ecea78294c7aacc954">
Template - Meeting notes
</p>
<p class="UncategorizedText" id="02a0bfe6b0d146cb2c448c6015a044b3">
Template - Weekly status report
</p>
<p class="NarrativeText" id="6f4ad946533ed790b0f2b0c14579e408">
Check out Get the most out of your team space for more tips.
</p>
<h1 class="Title" id="25a3e4bdfe373e340a842e2a5e5bd88d">
About
</h1>
<p class="NarrativeText" id="40b2705662d37139613e14e7583d1547">
What is your team all about?
</p>
<h1 class="Title" id="2af202b40e0d26ac56612727b0a8061c">
Mission and vision
</h1>
<p class="NarrativeText" id="4f12363a074edf7fbde93a4e424933f1">
What is your team's mission? What is your vision?
</p>
<h1 class="Title" id="cbf6597e9e1412cdc347ca2c48754289">
Meet the team
</h1>
<p class="UncategorizedText" id="af3236ec30847a0d5e80d5c4c48d24b3">
Add team members to your space.
</p>
<img alt="" class="Image" id="33831fbc138ef739d88d4f83b4cfc58d"/>
<h1 class="Title" id="240725efee18f416b470f886d83e54a3">
Team member
</h1>
<p class="UncategorizedText" id="a8359a51dc7bc16fc9f2f412dfad01d7">
Role
</p>
<p class="UncategorizedText" id="4d2982f8ec1f943ba5887ea5e1c41722">
Responsibility
</p>
<img alt="" class="Image" id="1709eac9e1289421c96b86fa773e85ba"/>
<h1 class="Title" id="8e408d997b6afdcc6dc7c5d2f60d51fe">
Team member
</h1>
<p class="UncategorizedText" id="f86b21d5900d7c26053ce0d49624e22b">
Role
</p>
<p class="UncategorizedText" id="ad6b52393cba4295aa11d461df801ec9">
Responsibility
</p>
<img alt="" class="Image" id="3fa16ff3939638c6415d5d1367aa01be"/>
<h1 class="Title" id="92cda6e10ddc39a6274a39bd28d78fd6">
Team member
</h1>
<p class="UncategorizedText" id="c6bb501cb86fef4a7e6af33b44408860">
Role
</p>
<p class="UncategorizedText" id="1e867147aebd2e2042c0b79216eb8ad6">
Responsibility
</p>
<h1 class="Title" id="72969103d9798a14b6937a5f17e95250">
Contact us
</h1>
<p class="NarrativeText" id="a1f62f9caaa9e0ab38abfecc9992beb6">
How can someone reach out to your team?
</p>
<div class="EmailAddress" id="d6507473bd42ae2c5043ef9682f5b71f">
team@email.com
</div>
<p class="UncategorizedText" id="d68042b1765da182a599d7f147d2abef">
Tickets
</p>
<p class="UncategorizedText" id="717b067188e80741597eb37455bf4fbe">
Jira board
</p>
<p class="UncategorizedText" id="16455e060585b3e0817764ca31c32151">
#channel
</p>
<h1 class="Title" id="f773ae2bc874cb28cff580d0b63a627a">
Important Pages
</h1>
<p class="NarrativeText" id="8a7363b7d1eb2cb37430121d27168de0">
List them here
</p>
<img alt="" class="Image" id="030568cacd3b66ce8ee6c6c3c9be840f"/>
<h1 class="Title" id="fd9d745f22dffbb155b2e8022e2dc2e4">
Onboarding FAQs
</h1>
<p class="UncategorizedText" id="71d0ef13e2b308bf6c79c3153f3ed35f">
Add resources for new hires
</p>
<img alt="" class="Image" id="7e882f807cf95f54e80ea3d7b75f6edd"/>
<h1 class="Title" id="16fda0efe288d0c8d1cf18b1037b5b0e">
Meeting notes
</h1>
<p class="NarrativeText" id="8bf5be7f0d4a4b5248347885f68f6b89">
Add links to meeting notes
</p>
<img alt="" class="Image" id="56b696bc7b11d0f3e1165cb157426dcc"/>
<h1 class="Title" id="c6fe156426f03a42912623025777f8c8">
Team goals
</h1>
<p class="NarrativeText" id="d8f7425068e3b4e6e99affa00d268060">
List them here
</p>
<h1 class="Title" id="e4589df20d851e29530dbf5f97444eca">
Team news
</h1>
<p class="NarrativeText" id="37a3e4a1755417a6944ff64115257147">
Create a blog post to share team news. It will automatically appear here once it's published.
</p>
<h1 class="Title" id="e26ff7fd8e8e12c8aa704e6f97275fbf">
Blog stream
</h1>
<p class="NarrativeText" id="18220fb2182492f64b3504513de4fbef">
Create a blog post to share news and announcements with your team and company.
</p>
</body>
</html>