Active Jobs
Location: Remote (Core PST hours) – Multiple locations across the US
Duration: 6 months, with potential for extension into subsequent project phases
Responsibilities:
- Design, develop, test, deploy, maintain, and enhance machine learning pipelines using Argo Workflows orchestration on Kubernetes (AKS).
- Participate in and contribute to design reviews with the platform engineering team, determining design, technologies, project priorities, deadlines, and deliverables.
- Collaborate closely with the Data Lake and Data Science teams to understand data structures and machine learning algorithms.
- Understand ETL pipelines, ingress/egress methodologies, and design patterns.
- Implement real-time Argo Workflows pipelines, integrate them with machine learning models, and deliver data and model results to the Data Lake for business stakeholders.
- Develop distributed machine learning pipelines for training and inference using Argo Workflows, Spark, and AKS.
- Build highly scalable backend REST APIs that serve data from the Data Lake and support other use cases and scenarios.
- Deploy applications to Azure Kubernetes Service using GitLab CI/CD, Jenkins, Docker, kubectl, Helm, and Kubernetes manifests.
- Manage branching, tagging, and version maintenance across different environments in GitLab.
- Review code developed by other developers and provide feedback to ensure best practices, accuracy, testability, and efficiency.
- Debug, track, and resolve issues by analyzing the sources of issues and their impact on application, network, or service operations and quality.
- Conduct functional, benchmark, and performance testing and tuning of the workflows you build.
- Assess, design, and optimize resource capacities (e.g., memory, GPU) for resource-intensive ML workloads; an illustrative workflow sketch with explicit resource limits follows this list.
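As an illustrative sketch only (this posting does not prescribe an implementation): submitting an Argo Workflow to AKS with explicit memory and GPU requests, using the official Kubernetes Python client. The namespace, image, and resource sizes are hypothetical placeholders.

```python
# Sketch: submit an Argo Workflow with explicit resource requests/limits.
# Assumes Argo Workflows is installed in the cluster; all names are placeholders.
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when running in-cluster

workflow = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Workflow",
    "metadata": {"generateName": "ml-inference-"},
    "spec": {
        "entrypoint": "infer",
        "templates": [
            {
                "name": "infer",
                "container": {
                    "image": "example.azurecr.io/ml-infer:latest",  # hypothetical image
                    "command": ["python", "infer.py"],
                    "resources": {
                        "requests": {"memory": "8Gi", "cpu": "2"},
                        "limits": {"memory": "16Gi", "nvidia.com/gpu": "1"},
                    },
                },
            }
        ],
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="argoproj.io",
    version="v1alpha1",
    namespace="ml-pipelines",  # hypothetical namespace
    plural="workflows",
    body=workflow,
)
```

Right-sizing the requests and limits above is exactly the capacity work described in the last responsibility.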
Required Skills/Technologies:
- Bachelor’s/Master’s degree in Computer Science or Data Science.
- 5 to 8 years of experience in software development, including data structures and algorithms.
- 5 to 7 years of experience with programming languages such as Python or Java, database languages (e.g., SQL), and NoSQL databases.
- 5 years of experience developing large-scale infrastructure, distributed systems, or networks, including compute technologies and storage architecture.
- Strong understanding of microservices architecture and experience building and deploying REST APIs using Python, Flask, and Django (a minimal Flask sketch follows this list).
- 5 years of experience writing unit and functional test cases with pytest and unittest, including mocking external services, for functional and non-functional requirements (see the test sketch after this list).
- Strong understanding of and experience with Kubernetes for application availability and scalability in Azure Kubernetes Service.
- Experience building and deploying applications on Azure using third-party tools (e.g., Docker, Kubernetes, and Terraform).
- Experience with cloud tools like Azure and Google Cloud Platform.
- Experience with development tools and CI/CD pipelines such as GitLab CI/CD, Artifactory, CloudBees, and Jenkins.
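To make the REST API expectation concrete, a minimal Flask sketch of an endpoint serving Data Lake query results; query_data_lake is a hypothetical stand-in for a real Data Lake client, not an API from this posting.

```python
# Sketch: a backend REST endpoint over a (hypothetical) Data Lake client.
from flask import Flask, jsonify, request

app = Flask(__name__)

def query_data_lake(dataset: str, limit: int) -> list[dict]:
    """Hypothetical placeholder for a real Data Lake query call."""
    return [{"dataset": dataset, "row": i} for i in range(limit)]

@app.get("/datasets/<dataset>")
def get_dataset(dataset: str):
    # Cap the response size via a query parameter, e.g. /datasets/sales?limit=10
    limit = request.args.get("limit", default=100, type=int)
    return jsonify(query_data_lake(dataset, limit))

if __name__ == "__main__":
    app.run(port=8080)
```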
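And a matching test sketch with pytest, mocking the external Data Lake call so the test runs without any infrastructure; it assumes the Flask sketch above lives in a module named api (hypothetical).

```python
# Sketch: unit test for the endpoint above, mocking the external service.
from unittest.mock import patch

import api  # hypothetical module containing the Flask sketch above

def test_get_dataset_mocks_data_lake():
    with patch("api.query_data_lake", return_value=[{"row": 0}]) as mock_query:
        resp = api.app.test_client().get("/datasets/sales?limit=1")
    assert resp.status_code == 200
    assert resp.get_json() == [{"row": 0}]
    mock_query.assert_called_once_with("sales", 1)
```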
Preferred Skills/Attributes:
- Python, Kubernetes, Argo Workflows, Argo Events, Hive, SQL, NoSQL, REST APIs, Helm, Docker, Jenkins.
This job description outlines the expectations, qualifications, and responsibilities of the Cloud Infrastructure Manager role, focusing on expertise in cloud technologies, data science, and infrastructure development.