Job Title: Data Engineer

Location: Houston, TX

Duration: 6 months





Job Responsibilities


We are free from legacy concerns, and you will have the opportunity to use
modern technologies and work with a cutting-edge data science platform that
runs Pachyderm on top of Kubernetes.


We need strong technical expertise in Data Engineering, but beyond that, this
is an opportunity to help us set up a best-practice data science process, to
help us determine the direction of future tooling, and to be a central part of
a team that will spearhead how the company engages in Data Science.




• Strong experience with Python and relevant libraries (PySpark, Pandas,
  etc.).

• The ability to work across structured, semi-structured, and unstructured
  data, extracting information and identifying irregularities and linkages
  across disparate data sets.

• Meaningful experience in distributed processing (Spark, Hadoop, EMR, etc.).

• Deep understanding of information security principles to ensure compliant
  handling and management of client data.

• Experience working collaboratively in a close-knit team and communicating
  complex solutions clearly.

• Experience with traditional data warehousing / ETL tools (SAP HANA,
  Informatica, Talend, Pentaho, DataStage, etc.).

• Experience and interest in cloud infrastructure (Azure, AWS, Google Cloud
  Platform, Databricks, etc.) and containerisation (Kubernetes, Docker,




• Experience programming with Julia.

• Experience or interest in building robust and practical data pipelines on
  top of cloud infrastructure (Pachyderm, Kubeflow, etc.).


Bonuses to include as part of your application

• Links to online profiles you use, such as GitHub, Twitter, etc.

• A description of your work history (whether as a resume or a LinkedIn
  profile).




