We are looking for a Machine Learning (ML) Data Engineer who will partner with application teams to assist with data analysis and research, conduct tactical data extracts, build balanced ML and data pipelines, deploy AI/ML models, and build reports that measure the efficiency of deployed models.
Key Responsibilities:
1. Understanding business objectives and developing models that help to achieve them, along with metrics to track their progress.
2. Analyzing the ML algorithms that could be used to solve a given problem and ranking them by their success probability.
3. Exploring and visualizing data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world.
4. Verifying data quality and, where needed, ensuring it through data cleaning.
5. Supervising the data acquisition process if more data is needed.
6. Finding available datasets online that could be used for training.
7. Defining validation strategies.
8. Defining the preprocessing or feature engineering to be done on a given dataset.
9. Defining data augmentation pipelines.
10. Training models and tuning their hyperparameters.
11. Analyzing the errors of the model and designing strategies to overcome them.
12. Deploying models to production.
Required:
Hands-on experience in data warehousing, ETL, data modeling, and reporting.
4-5 years of hands-on experience in productizing and deploying Big Data platforms and applications.
Hands-on experience working with relational/SQL databases, distributed columnar data stores/NoSQL databases, time-series databases, Spark Streaming, Kafka, Hive, Redshift, and more.
Familiarity with data pipelines and ML pipelines end to end, from data extraction to insight generation.
Highly skilled in SQL, Python, Spark, AWS S3, Hive Data Catalog, Parquet, Redshift, Airflow, and Tableau or similar tools.
Proven experience in building a custom enterprise data warehouse or implementing tools like data catalogs, Spark, Tableau, Kubernetes, and Docker.
Deep knowledge of data structures and algorithms.
Strong verbal and written communication skills are a must, along with the ability to work effectively across internal and external organizations and virtual teams.
AWS Certified Data Engineer
If you are interested in this opportunity, please send your resume to careers@viai.tech with the subject line.
BDNT LABS is a pioneering technology company based in Adilabad. As trailblazers in the tech industry, we continually push the boundaries to deliver cutting-edge solutions. We're currently seeking a seasoned architect to champion our software development endeavors.