Senior Data Engineer
We are looking for a passionate, experienced Senior Data Engineer with strong Python or Scala skills to join our expanding analytics team of data- and algorithm-focused individuals. In this role, you will help grow and improve BrainFinance’s data environment, and as such will be responsible for building, deploying, and maintaining data pipelines for batch and real-time data analytics. You will work alongside experienced developers and researchers in a growing, exciting team of quantitative-minded individuals at the core of the business’ operations. We expect you to be comfortable working in a dynamic start-up environment that requires high autonomy, resourcefulness, and strong problem-solving skills.
As part of the analytics team, you will contribute to the development of our data environment through the integration and evaluation of a large number of data sources. You will work closely with both our technical experts and our machine learning experts to build and support our analytical operations.
With us you will:
Develop and maintain data pipelines;
Evaluate and integrate third-party data;
Analyze, parse and extract data from structured and unstructured datasets;
Gather and process raw data at scale;
Create and maintain various datasets using complex data transformations in both batch and real-time modes;
Implement event-based and status-based rule engines;
Identify and correct data quality issues;
Design and maintain machine learning serving infrastructure;
Work closely with internal partners (technology, machine learning, and business experts).
What we are looking for:
BS/BA in Computer Science, or equivalent experience;
3+ years of professional experience in data pipeline development and maintenance;
Strong experience with data engineering processes, including developing scripts to integrate and normalize third-party data via APIs and web scraping/crawling (BeautifulSoup, Scrapy, Selenium, etc.);
Experience with most of the following technologies: batch processing (Spark or MapReduce); stream processing (Spark Streaming or similar); event processing (Kafka, RabbitMQ, etc.); NoSQL databases (Elasticsearch, MongoDB, Cassandra, etc.); visualization tools (Tableau, GCP Datalab, etc.); containerization (Docker); microservice architectures; pipelining frameworks (Luigi and/or Airflow);
Strong knowledge of computer science fundamentals: data structures, algorithms, programming languages, distributed systems, and information retrieval;
Experience writing automated unit and functional tests;
Strong knowledge of UNIX environment including shell scripting;
Experience with versioning tools (Git);
Experience working in an Agile development environment;
Excellent communication skills.
Bonus points:
Experience with other programming languages (Scala, Java, C#);
Knowledge of data mining and machine learning;
Experience with transactional, geospatial, text and image data;
Experience working with analytics teams;
Experience in the finance industry;
A genuine thirst for knowledge.