
SAI TEJA T

Richmond

Summary

Results-driven Sr. DevOps/Cloud Engineer and BI Operations Specialist with over 12 years of experience delivering scalable infrastructure, automation, and analytics solutions across AWS, Azure, and GCP. Proven expertise in implementing end-to-end CI/CD pipelines using tools like Azure DevOps, Jenkins, GitHub Actions, Terraform, Ansible, and Chef, with strong capabilities in containerization (Docker, Kubernetes), and infrastructure as code.

Skilled in BI platform support and automation, optimizing Tableau, Looker, and Qlik Sense environments to drive data-informed decisions. Adept at managing cloud-native services, serverless architecture, and security compliance frameworks. Hands-on experience in scripting with Python, Shell, and PowerShell for automating deployments, monitoring, and configuration management.

Demonstrated success in leading infrastructure modernization efforts, migrating databases to AWS RDS, integrating observability tools like Datadog, Grafana, and Splunk, and supporting high-availability platforms in Agile/DevOps environments. Trusted for cross-functional collaboration, process optimization, and mentoring DevOps teams to operational excellence.

Overview

11 years of professional experience
1 Certification

Work History

SRE/DevOps/Cloud Engineer/BI Operations Engineer

Expedia Group
Seattle
10.2020 - Current
  • Monitored and managed Tableau extract refreshes and Looker PDT schedules to provide timely and accurate delivery of dashboards to business stakeholders.
  • Troubleshot Looker dashboards by identifying broken LookML references, permission issues, or stale PDTs, and worked with data teams to resolve root causes.
  • Provided day-to-day support for Tableau and Looker users, including access requests, report performance issues, and dashboard version control.
  • Performed weekly Qlik Sense platform maintenance, including reload monitoring, license audits, resource usage tracking, and log review, ensuring 99.9% platform uptime.
  • Automated Terraform and AWS CloudFormation deployments using GitHub Actions to enable reliable, version-controlled infrastructure provisioning.
  • Integrated OpenStack with CI/CD pipelines using Jenkins and GitLab for automated infrastructure provisioning.
  • Created Python automation to auto-discover AWS public endpoints and scan them using Rapid7 InsightVM.
  • Administered AWS and automated infrastructure using Terraform and CloudFormation.
  • Used Terraform to set up AWS infrastructure, including EC2 instances, S3 buckets, VPCs, and subnets.
  • Used Terraform in AWS Virtual Private Cloud (VPC) to automatically set up and modify settings by interfacing with control layers, and to create and compose the components necessary to run applications.
  • Implemented centralized secrets management using AWS Secrets Manager and Parameter Store, integrated with CI/CD pipelines, and Lambda functions to eliminate hardcoded credentials.
  • Used PagerDuty analytics to drive SRE improvements, identifying high-noise services, and reducing alert fatigue by tuning thresholds and deduplicating events.
  • Created system alerts using various Datadog tools, and alerted application teams based on the escalation matrix.
  • Centralized security logging with Elasticsearch, and maintained 45 AWS.
  • Configured SSL and secure endpoints for OpenStack APIs to ensure encrypted communication.
  • Implemented centralized secrets management using HashiCorp Vault, eliminating plaintext secrets across CI/CD pipelines, Terraform deployments, and container workloads.
  • Used Kubernetes to manage containerized applications through nodes, ConfigMaps, namespaces, service meshes, selectors, and services, and deployed application containers as pods.
  • Evaluated OpenStack integrations with Kubernetes (Magnum) and Docker containers.
  • Managed Docker containerization and orchestration using Kubernetes to scale and manage Docker containers.
  • Integrated Python-based scanning tools, like Bandit, Trivy, and custom linters, into CI/CD for early vulnerability detection.
  • Created Datadog dashboards for various applications, and monitored real-time and historical metrics.
  • Set up Zabbix and Grafana for server monitoring and dashboards.
  • Created multiple GitLab pipelines with Terraform automation for different Java applications.
  • Managed users, groups, permissions, and sudo policies, ensuring secure access controls in multi-user Linux environments.

Sr. Azure DevOps Engineer/SRE

Comcast
Denver
04.2018 - 09.2020
  • Configured Azure VPN Point-to-Site, virtual networks, and custom security solutions.
  • Developed build and release pipelines for .NET Core and Java projects using Azure DevOps.
  • Designed nested templates to automate Azure resource creation across multiple environments.
  • Worked with Azure Site Recovery, Operations Management Suite, PowerShell scripts, and ARM templates.
  • Managed hosting plans for Azure infrastructure, and implemented and deployed workloads on Azure virtual machines (VMs).
  • Monitored Kubernetes clusters by deploying sidecar Prometheus exporters as a data aggregator, and Grafana as a data visualization platform.
  • Deployed Azure Kubernetes Service (AKS) using Resource Manager Templates and Terraform, and created Python scripts for deploying Big Data clusters in Kubernetes and OpenShift environments.
  • Developed microservices onboarding tools using Python and Azure DevOps, allowing easy creation and maintenance of build jobs, Kubernetes deployments, and services.
  • Managed serverless services, created and configured HTTP triggers in Azure Functions with Application Insights for monitoring, and performed load testing on the applications using Azure DevOps Services.
  • Installed and configured Pivotal Cloud Foundry (PCF) Application Manager, configured LDAP for authorization, and configured the log generator for logs in PCF (Splunk).
  • Converted existing Terraform modules that had version conflicts to utilize ARM during Terraform deployments to enable more control, or to address missing capabilities.
  • Worked with Terraform templates and modules to automate Azure IaaS virtual machines and deploy virtual machine scale sets in a production environment.
  • Controlled and automated application deployments, updates, and orchestrated deployments using Kubernetes.
  • Deployed workload and configuration builds using Docker, Kubernetes, and Azure CLI.
  • Managed Dynatrace Guardian to obtain, debug, and update custom monitors and plug-ins.
  • Improved orchestration of the Dynatrace deployment to reduce upgrade time.
  • Integrated Dynatrace with Active Directory, email servers, and event management.
  • Worked on Helm charts to configure the CI/CD pipeline and install relevant plugins.
  • Created and managed a Docker deployment pipeline for custom application images in the cloud using Jenkins.
  • Managed local deployments in Kubernetes, created a local namespace, and deployed application containers.
  • Maintained configuration management automation tools such as Chef, along with continuous integration, deployment, and monitoring solutions.
  • Managed Chef cookbooks with Chef roles, using file resources in recipes to copy and remove files on nodes.
  • Responsible for installing Jenkins master and slave nodes, and configuring Jenkins builds for continuous integration and delivery.
  • Created Jenkins pipelines for several downstream/upstream job configurations based on dependencies from other applications, and based on release methodologies.
  • Developed custom solutions in C# and PowerShell to validate the availability, consistency, and compliance of environments.
  • Worked with GitLab for source control management, creating Git repositories with specified branching strategies.
  • Developed new Splunk apps to monitor application log volume (event count), indexing volume, missing events, and missing hosts, sources, or source types.
  • Wrote PowerShell scripts to archive and move older log files to Azure Storage, and automation scripts using Python Boto3.
  • Implemented database deployment using the CI/CD process on Azure SQL Database.
  • Deployed a LAMP server from the command line and migrated the MySQL database and PHP code from Windows Server to CentOS (Red Hat).

DevOps/AWS/GCP Cloud Engineer

RMS
Newark
01.2017 - 04.2018
  • Built and configured virtual data centers in AWS for Data Warehouse support, including VPC, subnets, security groups, and Elastic Load Balancer.
  • Developed Terraform scripts to automate AWS infrastructure for web servers, ELB, CloudFront, databases, EC2, and S3 buckets.
  • Managed automation design and implementation in AWS using CloudFormation for efficient data center infrastructure.
  • Migrated MSSQL Server database from Rackspace to AWS while providing daily operational support.
  • Created Ansible playbooks in YAML for user management, application installation, and Jenkins integration for CI automation.
  • Utilized Kubernetes to deploy and manage Docker containers across multiple namespaces for scalable applications.
  • Conducted high availability and scalability testing on Kubernetes clusters under simulated loads to ensure optimal performance.
  • Oversaw daily administration tasks in AWS and GCP environments, including monitoring setup using Stackdriver.
  • Utilized Ansible Tower to optimize scheduling for multiple configurations, and scale cluster run times.
  • Virtualized servers with Docker for test and development environments, configuring automation via containers.
  • Developed container support for cloud environments, deploying applications within Docker containers at an enterprise level.
  • Conducted Kubernetes cluster testing for pod availability, component health, and security considerations.
  • Executed high availability and scalability testing on multiple Kubernetes clusters under heavy loads.
  • Created jobs in Jenkins for build, integration testing, and deployment automation.
  • Configured the Jenkins CI tool for regression testing automation using the Selenium Plugin with test cases.
  • Developed cross-cloud automation scripts in Python using boto3 (AWS) and Google Cloud SDK, enabling unified resource management and reporting across AWS and GCP environments.
  • Built Python-based GCP workflows using Cloud Functions and Cloud Scheduler, automating storage lifecycle tasks, VM snapshots, and billing alerts.
  • Migrated databases to AWS RDS, covering Oracle, Postgres, and MySQL engines, including views and stored procedures.
  • Developed infrastructure automation scripts using Python and Ruby for improved efficiency.
  • Managed the release and deployment processes of large-scale Java/J2EE web applications.
  • Configured NoSQL databases, including HBase, MongoDB, and Cassandra, to enhance data handling capabilities.
  • Utilized Apache Mesos and Marathon with CloudFormation templates on Ubuntu for resource management.

Site Reliability Engineer

Citibank
Irving
12.2016 - 12.2017
  • Utilized AWS Elastic Beanstalk for deploying and scaling Java-based web applications and services.
  • Authored test cases and modules, integrating with the CI tool, Bamboo.
  • Rewrote ARM templates and Azure CLI scripts into Terraform and AWS CLI, standardizing infrastructure as code across multi-cloud teams.
  • Translated IaC templates from CloudFormation and Terraform (AWS) to ARM templates (Azure), standardizing deployment processes.
  • Configured Jenkins pipelines for microservices builds, deploying to the Docker registry and Kubernetes.
  • Installed Nagios for performance monitoring, managing alerts for server disk space.
  • Monitored hosts and services daily using the Nagios monitoring tool.
  • Authored multiple playbooks, and created roles in Ansible for enhanced environment management.
  • Implemented Ansible alongside Ansible Tower to automate software development processes.
  • Established tagging standards for the identification and ownership of EC2 instances and AWS resources.
  • Deployed the ExtraHop Monitoring tool on AWS to identify workloads for migration and optimize performance.

Build and Release Engineer

Sorenson Media Group
Salt Lake City
09.2015 - 11.2016
  • Created Chef cookbooks and Ansible playbooks for Linux and Windows server administration.
  • Managed Ansible playbooks to generate images in a private cloud environment.
  • Oversaw the installation, configuration, and maintenance of Jenkins Continuous Integration (CI) processes.
  • Established CI for new branches, automated builds, managed plugins, and secured Jenkins environment.
  • Implemented branching and build/release strategies using Git for version control.
  • Authored Python scripts with CloudFormation to automate the deployment of EC2 and VPC services.
  • Automated daily tasks and deployments by writing Shell and YAML scripts to eliminate manual processes.
  • Converted Java projects into Maven projects by creating POM files and managing dependencies.
  • Deployed SQL scripts in Oracle and Ab Initio tags across multiple test environments.
  • Supported Linux and Windows environments in lab and hybrid cloud setups.

Unix/Linux System Administrator

App Tree Solutions
Hyderabad
08.2014 - 07.2015
  • Assisted in the installation and configuration of Linux/Unix operating systems (e.g., Ubuntu, CentOS, Red Hat).
  • Used SSH to remotely log into servers and run basic maintenance tasks.
  • Developed automation scripts in Python with Puppet to deploy and manage Java applications across Linux servers.
  • Troubleshot Linux network and security-related issues, and captured packets, using tools such as iptables, firewalls, TCP wrappers, and Nmap.
  • Hosted ASP, ASP.NET, and HTML web applications on Windows Server 2003.
  • Worked with basic Linux/Unix commands (ls, cd, mkdir, rm, cp, mv) for file and directory management.
  • Navigated and managed the Linux/Unix file system structure.

Education

Master of Science - Computer Science

San Francisco Bay University
Fremont, CA
12-2016

MBA - Computer Science

Campbellsville University
Campbellsville, KY

Skills

  • Kubernetes Management
  • CI/CD Automation (Jenkins, GitHub Actions, Azure DevOps, and GitLab)
  • Docker Orchestration
  • Cloud (AWS, Azure, and GCP)
  • Scripting Languages (Python, Java, PowerShell, .NET)
  • BI tools (Tableau, Looker, and Qlik Sense)
  • Monitoring Tools (Datadog, Splunk, and Nagios)
  • Version Control Tools (GitHub, Bitbucket)
  • Infrastructure as Code (Terraform, CloudFormation, and ARM templates)
  • Operating Systems (Linux, Unix, and Windows servers)
  • Database (MySQL, Cassandra, MongoDB, DynamoDB)

Certification

  • AWS Certified Developer
  • Kubernetes Administrator

References

References available upon request.
