Data Science Engineer

Location: Remote (Core PST hours) – Multiple locations across the US

Duration: 6 months, with potential for extension through other project phases

Responsibilities:

  • Design, develop, test, deploy, maintain, and enhance machine learning pipelines using Argo Workflow orchestration on Kubernetes (AKS).
  • Participate in and contribute to design reviews with the platform engineering team, determining design, technologies, project priorities, deadlines, and deliverables.
  • Collaborate closely with Data Lake and Data Science teams to comprehend data structure and machine learning algorithms.
  • Understand ETL pipelines, ingress/egress methodologies, and design patterns.
  • Implement real-time Argo workflow pipelines, integrate them with machine learning models, and deliver data and model results into the business stakeholders’ Data Lake.
  • Develop distributed Machine Learning Pipelines for training & inferencing using Argo, Spark & AKS.
  • Construct highly scalable backend REST APIs to gather data from Data Lake and other use-cases/scenarios.
  • Deploy applications in Azure Kubernetes Service using GitLab CI/CD, Jenkins, Docker, kubectl, Helm, and Kubernetes manifests.
  • Apply branching, tagging, and version-maintenance practices across different environments in GitLab.
  • Review code developed by other developers and provide feedback to ensure best practices, accuracy, testability, and efficiency.
  • Debug, track, and resolve issues by analyzing the sources of issues and their impact on application, network, or service operations and quality.
  • Conduct functional, benchmark, and performance testing and tuning for built workflows.
  • Assess, design, and optimize resource capacities (e.g., Memory, GPU, etc.) for ML-based resource-intensive workloads.
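To make the backend REST API responsibility above concrete, here is a minimal sketch of a JSON endpoint. It uses only the Python standard library rather than Flask/Django, and the route, port, and payload are hypothetical stand-ins for a real Data Lake query:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory stand-in for a Data Lake query result.
FAKE_LAKE = {"model": "churn-v2", "auc": 0.91}

class MetricsHandler(BaseHTTPRequestHandler):
    """Serves model metrics as JSON; the route name is illustrative only."""

    def do_GET(self):
        if self.path == "/api/v1/metrics":
            body = json.dumps(FAKE_LAKE).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

def serve(port: int = 8080) -> HTTPServer:
    """Create (but don't block on) a server; call serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), MetricsHandler)
```

In a production pipeline this handler would be replaced by a Flask or Django view backed by actual Data Lake queries; the shape of the request/response contract is the same.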


Required Skills/Technologies:

  • Bachelor’s/Master’s degree in Computer Science or Data Science.
  • 5 to 8 years of experience in software development, including data structures and algorithms.
  • 5 to 7 years of experience with programming languages such as Python or Java, with database languages (e.g., SQL), and with NoSQL databases.
  • 5 years of experience developing large-scale infrastructure, distributed systems, or networks, along with experience in compute technologies and storage architecture.
  • Strong understanding of microservices architecture and experience building and deploying REST APIs using Python, Flask, and Django.
  • 5 years of experience writing unit and functional tests with PyTest and unittest, including mocking external services, for functional and non-functional requirements.
  • Strong understanding and experience with Kubernetes for availability and scalability of the application in Azure Kubernetes Service.
  • Experience in building and deploying applications with Azure, using third-party tools (e.g., Docker, Kubernetes, and Terraform).
  • Experience with cloud tools like Azure and Google Cloud Platform.
  • Experience with development tools, CI/CD pipelines such as GitLab CI/CD, Artifactory, Cloudbees, and Jenkins.
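The testing requirement above (PyTest/unittest with mocked external services) can be sketched as follows. The function name and the feature-store URL are hypothetical; the point is that the external HTTP call is patched so the test runs without network access:

```python
import json
import urllib.request
from unittest import mock

def fetch_feature_count(url: str) -> int:
    """Call an (assumed) external feature-store endpoint and count its features."""
    with urllib.request.urlopen(url) as resp:
        payload = json.loads(resp.read())
    return len(payload["features"])

def test_fetch_feature_count():
    # Replace the external HTTP call with a canned response.
    fake_resp = mock.MagicMock()
    fake_resp.read.return_value = json.dumps(
        {"features": ["f1", "f2", "f3"]}
    ).encode()
    fake_resp.__enter__.return_value = fake_resp
    with mock.patch("urllib.request.urlopen", return_value=fake_resp):
        assert fetch_feature_count("https://example.invalid/features") == 3
```

The same pattern works under PyTest discovery (`pytest test_module.py`) or plain unittest; mocking at the `urllib.request.urlopen` boundary keeps the test deterministic and fast.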


Preferred Skills/Attributes:

  • Python, Kubernetes, Argo Workflows, Argo Events, Hive, SQL, NoSQL, REST APIs, Helm, Docker, Jenkins.
This job description outlines the expectations, qualifications, and responsibilities of the Data Science Engineer role, focusing on expertise in cloud technologies, data science, and infrastructure development.
