About this OpenLDAP ETL

The openldap-etl provides an ETL channel for communicating with an OpenLDAP server.

OpenLDAP Docker Image

Attention

The docker compose file is for a macOS environment. If you are running in a Linux environment, use the official osixia/docker-openldap. This docker compose file comes with an OpenLDAP server and a phpLDAPadmin portal, and it is based on this, with modifications.

Start OpenLDAP and phpLDAPadmin

docker-compose up 

Log in via phpLDAPadmin

Open localhost:7080 in your browser and enter the following credentials to log in:

Login: cn=admin,dc=example,dc=org
Password: admin

Seed Group and Users

Import sample-ldif.txt from the phpLDAPadmin portal to create your organization. sample-ldif.txt contains information about:

  • group: we set up a people group
  • people under the people group: the members of the Simpsons family.

Run ETL Script

Once the organization is set up, we can run the openldap-etl.py script. The script queries a user by given name (Homer) and restricts the returned attributes to a small set. It also looks up Homer's manager, if there is one. The script is mostly based on ldap-etl.py. However, the important attribute sAMAccountName does not exist in OpenLDAP, so the script had to be modified slightly. Once Homer is found, the script assembles his information and his manager's name into corp_user_info and publishes it as a message to the MetadataChangeEvent topic. Run pip install --user -r requirements.txt, then run python openldap-etl.py. You should see output like the following:

{'auditHeader': None, 'proposedSnapshot': ('com.linkedin.pegasus2avro.metadata.snapshot.CorpUserSnapshot', {'urn': "urn:li:corpuser:'Homer Simpson'", 'aspects': [{'active': True, 'email': 'hsimpson', 'fullName': "'Homer Simpson'", 'firstName': "b'Homer", 'lastName': "Simpson'", 'departmentNumber': '1001', 'displayName': 'Homer Simpson', 'title': 'Mr. Everything', 'managerUrn': "urn:li:corpuser:'Bart Simpson'"}]}), 'proposedDelta': None} has been successfully produced!
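
To make the assembly step easier to follow, here is a minimal sketch of how an LDAP entry could be turned into a message with the same shape as the output above. It is not the code from openldap-etl.py. The function names (first_value, build_corp_user_mce) are hypothetical, and the mapping from LDAP attributes (cn, givenName, sn, uid and so on) to the fields of corp_user_info is an assumption. The sketch expects each entry as a dict of attribute name to a list of byte strings and decodes those bytes, so its values will not carry the stray quotes and b' prefix visible in the sample output. Querying LDAP and publishing to Kafka are left out.

```python
from typing import Dict, List, Optional

# Record type name taken from the sample output above.
SNAPSHOT_TYPE = "com.linkedin.pegasus2avro.metadata.snapshot.CorpUserSnapshot"

# Assumed entry shape: attribute name -> list of byte-string values.
LdapEntry = Dict[str, List[bytes]]


def first_value(entry: LdapEntry, name: str) -> Optional[str]:
    """Return the first value of an attribute as text, or None if absent."""
    values = entry.get(name)
    if not values:
        return None
    return values[0].decode("utf-8")


def build_corp_user_mce(entry: LdapEntry,
                        manager_entry: Optional[LdapEntry] = None) -> dict:
    """Assemble a MetadataChangeEvent-shaped dict for one user.

    The attribute-to-field mapping below is an assumption for illustration.
    """
    full_name = first_value(entry, "cn")
    corp_user_info = {
        "active": True,
        "email": first_value(entry, "uid"),  # the sample output shows a uid-like value here
        "fullName": full_name,
        "firstName": first_value(entry, "givenName"),
        "lastName": first_value(entry, "sn"),
        "departmentNumber": first_value(entry, "departmentNumber"),
        "displayName": first_value(entry, "displayName"),
        "title": first_value(entry, "title"),
    }
    # Only reference a manager when one was actually found.
    if manager_entry is not None:
        manager_name = first_value(manager_entry, "cn")
        corp_user_info["managerUrn"] = f"urn:li:corpuser:{manager_name}"

    snapshot = {"urn": f"urn:li:corpuser:{full_name}", "aspects": [corp_user_info]}
    return {
        "auditHeader": None,
        "proposedSnapshot": (SNAPSHOT_TYPE, snapshot),
        "proposedDelta": None,
    }


if __name__ == "__main__":
    # Hypothetical entries mirroring the sample organization.
    homer = {
        "cn": [b"Homer Simpson"],
        "givenName": [b"Homer"],
        "sn": [b"Simpson"],
        "uid": [b"hsimpson"],
        "departmentNumber": [b"1001"],
        "displayName": [b"Homer Simpson"],
        "title": [b"Mr. Everything"],
    }
    bart = {"cn": [b"Bart Simpson"]}
    print(build_corp_user_mce(homer, bart))
```

Keeping the assembly in a function that has no LDAP or Kafka dependency makes it possible to check the message shape without a running OpenLDAP server.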