haystack/setup.py

import os
import re
from io import open

from setuptools import find_packages, setup


def parse_requirements(filename):
    """
    Parse a pip requirements file and return the list of required packages.
    Blank lines, commented lines, --find-links directives and git+https
    requirements are excluded.

    Args:
        filename: path to the pip requirements file

    Returns:
        list of required packages with their version constraints

    """
    with open(filename) as file:
        parsed_requirements = file.read().splitlines()
    # Strip each line once; skipping empty lines avoids an IndexError on blank lines.
    stripped_lines = (line.strip() for line in parsed_requirements)
    parsed_requirements = [line
                           for line in stripped_lines
                           if line and not (line.startswith("#") or line.startswith('--find-links') or ("git+https" in line))]

    return parsed_requirements


def get_dependency_links(filename):
    """
    Parse a pip requirements file looking for --find-links directives.

    Args:
        filename: path to the pip requirements file

    Returns:
        list of the URLs given in --find-links directives
    """
    with open(filename) as file:
        parsed_requirements = file.read().splitlines()
    dependency_links = []
    for line in parsed_requirements:
        line = line.strip()
        if line.startswith('--find-links'):
            # Accept "--find-links=<url>" and "--find-links <url>"; keep any "=" inside the URL.
            link = line[len('--find-links'):].lstrip('= ').strip()
            if link:
                dependency_links.append(link)
    return dependency_links


dependency_links = get_dependency_links('requirements.txt')
parsed_requirements = parse_requirements('requirements.txt')


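def versionfromfile(*filepath):
    """
    Read the version string from the `__version__ = "..."` line of a file.

    Args:
        *filepath: path components of the file containing the version

    Returns:
        the version string; a RuntimeError is raised if none is found
    """
    infile = os.path.join(*filepath)
    with open(infile) as fp:
        version_match = re.search(
            r"^__version__\s*=\s*['\"]([^'\"]*)['\"]", fp.read(), re.M
        )
        if version_match:
            return version_match.group(1)
        raise RuntimeError("Unable to find version string in {}.".format(infile))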


here = os.path.abspath(os.path.dirname(__file__))
_version: str = versionfromfile(here, "haystack", "_version.py")

# Read the README up front so the file handle is closed before setup() runs.
with open("README.md", "r", encoding="utf-8") as readme_file:
    _long_description = readme_file.read()

setup(
    name="farm-haystack",
    version=_version,
    author="Malte Pietsch, Timo Moeller, Branden Chan, Tanay Soni",
    author_email="malte.pietsch@deepset.ai",
    description="Neural Question Answering & Semantic Search at Scale. Use modern transformer based models like BERT to find answers in large document collections",
    long_description=_long_description,
    long_description_content_type="text/markdown",
    keywords="QA Question-Answering Reader Retriever semantic-search search BERT roberta albert squad mrc transfer-learning language-model transformer",
    license="Apache",
    url="https://github.com/deepset-ai/haystack",
    download_url=f"https://github.com/deepset-ai/haystack/archive/{_version}.tar.gz",
    packages=find_packages(exclude=["*.tests", "*.tests.*", "tests.*", "tests"]),
    dependency_links=dependency_links,
    install_requires=parsed_requirements,
    python_requires=">=3.7.0",
    tests_require=["pytest"],
    classifiers=[
        "Intended Audience :: Science/Research",
        "License :: OSI Approved :: Apache Software License",
        "Programming Language :: Python :: 3",
        "Topic :: Scientific/Engineering :: Artificial Intelligence",
    ]
)