Kevin Hu ebdaa0e359
feat(ingest): Feast ingestion integration (#2605)
* Add feast testing setup

* Init Feast test script

* Add feast to dependencies

* Update feast descriptors

* Sort integrations

* Working feast pytest

* Clean up feast docker-compose file

* Expand Feast tests

* Setup feast classes

* Add continuous and bytes data to feature types

* Update field type mapping

* Add PDLs

* Add MLFeatureSetUrn.java

* Comment out feast setup

* Add snapshot file and update inits

* Init Feast golden files generation

* Clean up Feast ingest

* Feast testing comments

* Yield Feature snapshots

* Fix Feature URN naming

* Update feast MCE

* Update Feature URN prefix

* Add MLEntity

* Update golden files with entities

* Specify feast sources

* Add feast source configs

* Working feast docker ingestion

* List entities and features before adding tables

* Add featureset names

* Remove unused

* Rename feast image

* Update README

* Add env to feast URNs

* Fix URN naming

* Remove redundant URN names

* Fix enum backcompatibility

* Move feast testing to docker

* Move URN generators to mce_builder

* Add source for features

* Switch TypeClass -> enum_type

* Rename source -> sourceDataset

* Add local Feast ingest image builds

* Rename Entity -> MLPrimaryKey

* Restore features and keys for each featureset

* Do not json encode source configs

* Remove old source properties from feature sets

* Regenerate golden file

* Fix race condition with Feast tests

* Exclude unknown source

* Update feature datatype enum

* Update README and fix typos

* Fix Entity typo

* Fix path to local docker image

* Specify feast config and version

* Fix feast env variables

* PR fixes

* Refactor feast ingest constants

* Make feature sources optional for back-compatibility

* Remove unused GCP files

* adding docker publish workflow

* Simplify name+namespace in PrimaryKeys

* adding docker publish workflow

* debug

* final attempt

* final final attempt

* final final final commit

* Switch to published ingestion image

* Update name and namespace in java files

* Rename FeatureSet -> FeatureTable

* Regenerate codegen

* Fix initial generation errors

* Update snapshot jsons

* Regenerated schemas

* Fix URN formats

* Revise builds

* Clean up feast URN builders

* Fix naming typos

* Fix Feature Set -> Feature Table

* Fix comments

* PR fixes

* All you need is Urn

* Regenerate snapshots and update validation

* Add UNKNOWN data type

* URNs for source types

* Add note on docker requirement

* Fix typo

* Reorder aspect unions

* Refactor feast ingest functions

* Update snapshot jsons

* Rebuild

Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-06-09 15:07:04 -07:00

52 lines
1.7 KiB
Python

import pytest
from datahub.ingestion.run.pipeline import Pipeline
from tests.test_helpers import mce_helpers
# from datahub.ingestion.run.pipeline import Pipeline
# from tests.test_helpers import mce_helpers
from tests.test_helpers.docker_helpers import wait_for_port
# make sure that mock_time is excluded here because it messes with feast
@pytest.mark.slow
def test_feast_ingest(docker_compose_runner, pytestconfig, tmp_path):
    test_resources_dir = pytestconfig.rootpath / "tests/integration/feast"

    with docker_compose_runner(
        test_resources_dir / "docker-compose.yml", "feast"
    ) as docker_services:
        wait_for_port(docker_services, "testfeast", 6565)

        # The setup container listens on this port once the test cases have been set up.
        wait_for_port(docker_services, "testfeast_setup", 6789)

        # Run the metadata ingestion pipeline.
        pipeline = Pipeline.create(
            {
                "run_id": "feast-test",
                "source": {
                    "type": "feast",
                    "config": {
                        "core_url": "localhost:6565",
                        "use_local_build": True,
                    },
                },
                "sink": {
                    "type": "file",
                    "config": {
                        "filename": f"{tmp_path}/feast_mces.json",
                    },
                },
            }
        )
        pipeline.run()
        pipeline.raise_from_status()

        # Verify the output.
        output = mce_helpers.load_json_file(str(tmp_path / "feast_mces.json"))
        golden = mce_helpers.load_json_file(
            str(test_resources_dir / "feast_mce_golden.json")
        )
        mce_helpers.assert_mces_equal(output, golden)
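
The test relies on two readiness checks: the Feast core port (6565), and a second port (6789) that the setup container opens only after the test data has been registered. The sketch below illustrates that kind of port-based readiness check using only the standard library. It is not the actual `wait_for_port` from `tests.test_helpers.docker_helpers`, which takes the docker services object and a container name; the function name `wait_for_tcp_port` and its `timeout` and `interval` parameters are assumptions made for illustration.

```python
import socket
import time


def wait_for_tcp_port(
    host: str, port: int, timeout: float = 30.0, interval: float = 0.5
) -> bool:
    """Hypothetical helper: poll until host:port accepts a TCP connection.

    Returns True once a connection succeeds, or False if the deadline
    passes without a successful connection.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A call such as `wait_for_tcp_port("localhost", 6789)` would block until the setup container signals completion by opening its port, which is roughly the role the second `wait_for_port` call plays in the test above.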