Cloud/DevOps Engineering
Proficient in scripting (e.g., Python, Bash, or PowerShell); Go/Rust is a plus
Strong expertise in Terraform, Terragrunt, Helm, Kubernetes, and Docker
Groundup.ai is a Singapore-based AI startup that helps companies reduce unplanned downtime of industrial assets without a steep learning curve or high-risk on-site deployments.
● Cloud Infrastructure & Automation
○ Architect and manage scalable, secure infrastructure on GCP, Azure, and
occasionally OCI/AWS.
○ Implement and manage Infrastructure as Code (IaC), primarily with Terraform
and occasionally with Terragrunt and Helm.
● CI/CD Pipelines
○ Design and optimize CI/CD workflows using GitHub Actions, Jenkins, and
GitHub Enterprise (reusable workflows, OIDC federation).
○ Ensure seamless deployment pipelines from code commit to production for
microservices and AI workloads.
● Container Orchestration
○ Manage Docker containers, Docker image repositories, and Kubernetes
clusters (including GPU node infrastructure for AI workloads), using tools
such as Portainer.
○ Support canary releases, blue-green deployments, and auto-scaling
strategies.
○ Implement and manage serverless deployments on Google Cloud Platform
(Cloud Functions, Cloud Run).
● Resource Planning & Hardware Estimation
○ Assist in hardware estimation for both on-premises and cloud environments,
based on resource requirements such as the number of sensors and storage
needs.
○ Ensure robust backup strategies and data redundancy for all infrastructure
components.
○ Assist the team in auditing cloud and on-premises resources.
● Security & Compliance
○ Enforce cloud security best practices: image hardening, secret management,
IAM least privilege, SBOMs, and vulnerability scanning.
○ Collaborate on compliance requirements (SOC 2, ISO 27001), and respond to
audits and incidents proactively.
○ Configure and manage Cloudflare for enhanced security and performance.
● Monitoring & Observability
○ Build and maintain observability stacks using Grafana, Prometheus, Loki,
Tempo, Datadog, OpenTelemetry, and Sentry.
○ Diagnose and resolve performance bottlenecks across compute, storage, and
networking layers.
○ Monitor and optimize cloud spending to ensure cost-efficiency.
○ Develop and implement disaster recovery plans, conducting regular drills to
ensure business continuity.
● Team Collaboration
○ Partner with engineers to embed DevOps best practices.
○ Establish and enforce documentation standards for infrastructure, processes,
and troubleshooting guides.
○ Use Plane for sprint planning, incident tracking, and delivery visibility.
For more information about this company and its career opportunities, visit https://groundup.ai/careers/