Lead AI System & Infra Architect
Date: 9 Dec 2025
Location: Singapore, Singapore
Company: Singtel Group
Powering the Future with AIDA
To lead the next phase of our AI evolution, we’ve launched a new business unit, AIDA (Artificial Intelligence & Data Analytics), a strategic engine designed to drive our transformation and scale our AI ambitions with precision and purpose. This marks a pivotal shift in how we operate, innovate, and serve, embedding intelligence into every layer of our business.
At Singtel, this is more than a technology upgrade. It’s a strategic transformation that redefines how value is created across the enterprise core, augmenting human capabilities and unlocking entirely new potential. It is a journey that aligns people, platforms, and processes under one cohesive strategy. Our mission is to build AI literacy and foster a culture where intelligence empowers people.
We welcome you to join us on a transformational journey that’s reshaping the telecommunications industry — and redefining what’s possible with AI at its core. Grow with us in a workplace that champions innovation, embraces agility, and puts human potential at the heart of everything we do.
Be a Part of Something BIG!
We are seeking an experienced Lead AI Infrastructure Architect to design and oversee the enterprise-wide AI infrastructure that supports model training, fine-tuning, inference and agentic AI workloads. You will be responsible for shaping the overall architecture for on-premise and cloud-based AI services, including platform components, GPU infrastructure, container orchestration, networking, security and operational readiness.
This role requires deep expertise in AI system architecture, hands-on familiarity with open-source AI infrastructure components, and a strong capability to translate AI and business requirements into scalable and secure platform designs. You will work closely with solution architects, AI engineers, platform teams, cyber and system security teams, and cloud teams to ensure the organisation has a robust foundation for current and future AI capabilities.
Make an Impact by:
- Lead the design of end-to-end AI infrastructure architecture covering on-premise and cloud environments for model training, inference, RAG pipelines and agentic AI workloads.
- Define the technical blueprint for AI platform services including container orchestration, data pathways, network topology, GPU clusters, storage and observability.
- Architect and guide the setup of Red Hat OpenShift environments (or equivalent) for AI workloads, including Ray, Kubeflow, MLflow or similar distributed ML frameworks, and vLLM or other inference and serving engines.
- Design and integrate cloud-based AI infrastructure (e.g. Azure, AWS or GCP), including compute, GPU architecture, networking, IAM and data access patterns.
- Oversee infrastructure capacity planning for GPU and CPU clusters, including utilisation monitoring, cost modelling and optimisation.
- Lead design reviews for AI infrastructure proposals from platform teams and vendors to ensure compliance with enterprise architecture principles.
- Work with networking teams to define connectivity requirements, zero-trust boundaries, firewall rules, load balancing and traffic engineering for AI systems.
- Collaborate with cyber and information security teams to ensure platform hardening, identity management, data protection and secure use of open-source components.
- Oversee observability and operational readiness for AI infrastructure including logging, metrics, tracing, GPU health, versioning and rollback strategy.
- Support engineering teams by designing deployment patterns for Ray clusters, Kubeflow pipelines, fine-tuning services and high-availability model serving.
- Drive alignment across data platform, cloud engineering, DevSecOps and solution architects on the AI infrastructure roadmap and integration approach.
- Advise leadership on new technologies, open-source adoption, performance benchmarks and emerging AI infrastructure patterns.
- Guide the migration of existing systems to a modern AI platform architecture while maintaining performance and minimising operational disruption.
- Define the disaster recovery and business continuity strategy for AI platform components.
- Ensure AI infrastructure designs support future scale, reliability, security and extensibility for enterprise use.
- Run proof-of-concept studies to validate new infrastructure solutions, evaluate GPU stack performance and compare open-source frameworks.
Skills for Success:
- Bachelor's or Master's degree in Computer Science, Engineering, Information Systems or a related technical field.
- Candidates with a specialisation in infrastructure architecture, cloud platforms or distributed systems are preferred.
- More than 8 years of experience in infrastructure, on-premise and cloud architecture with strong exposure to distributed systems, container orchestration and AI platform design.
- At least 3 years in a senior architect role designing large-scale platforms or AI infrastructure.
- Strong knowledge of AI infrastructure including model hosting, distributed inference, GPU clusters and LLM performance optimisation.
- Hands-on experience with Red Hat OpenShift (or equivalent) for container orchestration and AI workload deployment.
- Familiarity with Ray, Kubeflow, MLflow or similar distributed ML frameworks, and experience with vLLM or other inference and serving engines.
- Deep understanding of cloud computing architecture including compute, storage, networking, load balancing, autoscaling and IAM.
- Strong understanding of network architecture including routing, security zones, firewall rules and service mesh patterns.
- Experience designing secure API integration through MCP, API gateways or service mesh.
- Strong capability in infrastructure as code, DevSecOps practices, CI/CD and automation frameworks.
- Experience designing observability stacks for AI infrastructure, including logs, metrics, events and tracing.
- Familiarity with enterprise standards for data access, encryption, compliance and model safety controls.
- Strong analytical and systems-thinking capability, with the ability to simplify complex architecture for stakeholders.
- Collaborative mindset and ability to work with cross-functional teams across cloud, platform, cybersecurity and engineering.
- Strong communication and documentation skills when presenting architecture decisions and design principles.
- Ability to guide engineers, influence platform direction and provide clear technical leadership.
- Passion for staying up to date with AI infrastructure innovations and new open-source technologies.
Are you ready to say hello to BIG Possibilities?
Take the leap with Singtel to unlock new opportunities and accelerate your growth. Apply now and start your empowering career!