datahub/wherehows-etl/build.gradle

apply plugin: 'application'

mainClassName = 'metadata.etl.Launcher'

configurations {
  // Libraries needed at compilation time but not to be
  // exported as part of the distribution
  provided
  all*.exclude group: 'org.slf4j', module: 'slf4j-log4j12'
  all*.exclude group: 'log4j'
  all*.resolutionStrategy {
    dependencySubstitution {
      substitute module('org.slf4j:slf4j-log4j12') with module('ch.qos.logback:logback-classic:1.1.7')
      // prefer 'log4j-over-slf4j' over 'log4j'
      substitute module('log4j:log4j') with module('org.slf4j:log4j-over-slf4j:1.7.21')
    }
  }
}

dependencies {
  compile project(':wherehows-common')
  compile project(':wherehows-hadoop')
  compile project(':restli-client')
  compile externalDependency.jsch
  compile externalDependency.http_client
  compile externalDependency.http_core
  compile externalDependency.jackson_databind
  compile externalDependency.jackson_core
  compile externalDependency.jackson_annotations
  compile externalDependency.json_path
  compile externalDependency.akka
  compile externalDependency.slf4j_api
  compile externalDependency.slf4j_log4j
  compile externalDependency.hive_exec
  compile externalDependency.hadoop_hdfs
  compile externalDependency.jython
  compile externalDependency.mysql
  compile externalDependency.htrace
  compile fileTree(dir: 'extralibs', include: ['*.jar']) // externalDependency.oracle/teradata/gsp

  provided project(":wherehows-hadoop")
  provided project(":restli-client")

  testCompile externalDependency.testng
}

// Copy schemaFetch.jar from the 'provided' configuration into
// src/main/resources/jar before the jar task runs
task copyFiles(type: Copy, dependsOn: compileJava) {
  from configurations.provided
  into 'src/main/resources/jar'
  include 'schemaFetch.jar'
}

sourceSets {
  main {
    java {
      srcDir 'src/java'
    }
    resources {
      srcDir 'src/resources'
    }
  }
}

jar {
  dependsOn 'copyFiles'
  manifest {
    attributes 'Main-Class': 'metadata.etl.Launcher'
  }
}