Devarshi Pathak

Aldie

Summary

Experienced DevSecOps and Site Reliability Engineer with over 20 years of expertise in cloud infrastructure, security automation, and system reliability. Proficient in AWS, Kubernetes, and CI/CD tools, with a strong focus on automating security controls, building resilient systems, and ensuring high availability. Adept at incident management, performance optimization, and cost-effective cloud operations. Passionate about building secure, scalable, and reliable systems, and mentoring teams to adopt best practices for security and operational excellence.

Overview

17 years of professional experience
1 Certification

Work History

Sr SRE/DevOps Consultant

Apex Systems / CapitalOne
10.2024 - Current
  • Part of CapitalOne’s CORE team, responsible for the entire Consumer Identity platform supporting more than 25 ASVs (application services)
  • Providing first line of support in on-call rotation once a week
  • Leading a team of SREs, taking ownership of large-scale system reliability initiatives
  • Designing and implementing complex system architectures with a focus on scalability and resilience
  • Focusing on driving DevOps culture within the organization, promoting automation and collaboration
  • Developing and maintaining automation scripts for deployment, monitoring, and remediation tasks using tools like Ansible, Terraform, and custom scripts
  • Implementing robust monitoring systems to proactively identify potential issues and trigger alerts for timely response
  • Analyzing system performance and making capacity adjustments to anticipate future demand
  • Contributing to codebases for building and maintaining infrastructure components, including writing clean, maintainable code
  • Performing SLO analysis to track, measure, and improve service quality and meet service commitments, using tools like New Relic, Splunk, Observe, and other APM tools
  • Own the availability, reliability, and performance of critical services and systems
  • Define and implement Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets to measure and ensure system health
  • Ensure that services meet agreed-upon SLAs (Service Level Agreements) by driving reliability engineering practices across the organization
  • Proficient in AWS services like EC2, S3, RDS, Lambda, and CloudFormation, with a strong focus on reliability, scalability, and cost optimization
  • Skilled in implementing monitoring solutions using CloudWatch, automating deployments through CI/CD pipelines, and driving performance improvements for mission-critical services
  • Experienced in incident management, disaster recovery, and security best practices on AWS
  • Using custom in-house tools like Cloud Doctro, AION, Smart Ops, Counslr, PagerDuty, Hawkeye, Ozone, Cloud Radar and many more.

Subject Matter Expert

CompTIA
10.2017 - Current
  • Helped develop exam objectives, validate content, and write questions for the CompTIA Linux+ exam
  • Helped develop exam objectives, validate content, and write questions for the CompTIA Cloud+ CV0-003 exam.

Adjunct Associate Professor

University of Maryland Global Campus
08.2017 - Current
  • Teaching Linux Systems Administration (CMIT), SDEV-400 Secure Cloud Programming

DevSecOps Consultant/Engineer

Department
06.2024 - 10.2024
  • Helping DOL secure and enhance DevSecOps initiatives for Appian and WCMS projects
  • Creating a modern and secure DevSecOps pipeline to automate and ease deployment of various components and applications
  • Using various tools like GitLab, Jenkins, SonarQube, Trivy, Pally, Qualys, Argo CD and EKS to rapidly build and deploy the code in Dev/Stage and Production environments
  • Created a secure CI pipeline and generated various vulnerability reports like SAST, DAST and SBOM
  • Participated in daily scrums and biweekly client sprint planning
  • Participated in various sprint demos to achieve MVP goals
  • Worked with CloudOps, Development and Security teams to create collaborative CI/CD pipelines.

Lead DevSecOps Engineer/Architect

Karthikconsulting, USCG
09.2023 - 06.2024
  • Helping the United States Coast Guard’s HERMN team manage a secure software factory platform running in an AWS GovCloud EKS environment
  • Managing an EKS-based managed Kubernetes cluster that hosts the Coast Guard’s major software development
  • Daily activity involves managing secure CI/CD pipelines and scanning containers using tools like Trivy, Snyk and OpenSCAP
  • Used Confluence for documentation management, creating detailed project documentation
  • Attending SOC and ATO compliance meetings to provide support for security reviews and solutions
  • Deploying microservices on EKS platforms
  • Developed and maintained large-scale applications utilizing advanced TypeScript features such as generics, decorators, and type inference to ensure type safety and improve code maintainability
  • Reviewing Gitlab SAST and DAST compliance report and fixing open vulnerabilities in microservices platforms and software libraries
  • Utilized XML for data exchange between different systems and applications, ensuring structured and well-formed data communication
  • Conducting IT security risk assessments in accordance with FedRAMP, FISMA and NIST 800-53A compliance requirements
  • Developing security controls, threat models, threat analysis and risk mitigations
  • Designed and implemented RESTful APIs in PHP, enabling robust backend services for various client applications
  • Performing vulnerability assessments using various tools like Nessus, Nikto, Red Hat Clair and Kubescape/Kyverno
  • Implementing firewalls using AWS WAF and DoD-approved tools and technologies
  • Using secure container registries from DoD Iron Bank and Harbor
  • Implementing Security Information and Event Management (SIEM) and incident handling using tools like Wazuh, Nessus and Qualys
  • Performing monthly patching on cloud resources and the EKS-based managed Kubernetes platform
  • Implementing security policies, training and standards across all departments and IT resources including cloud services, Linux servers and web applications
  • Conducting security audits on a weekly basis.

DevSecOps and Cloud Engineering Director

Universal
08.2022 - 09.2023
  • Managing the team of DevOps and Security engineers in the US and abroad
  • Helping secure Google Cloud, Kubernetes clusters and cloud resources
  • Leveraged advanced PostgreSQL features such as partitioning, indexing, and full-text search to improve application performance and scalability
  • Developing SOC, DevOps and production release strategies and plans
  • Supporting health care products like uVAx, uConsult, uWellness and mobile apps for the organization
  • Supporting government agencies like DHS (CBP), FEMA and many more in a HIPAA-compliant environment
  • Created DR and BC plans along with IT security assessment strategies and plans
  • Created RESTful and GraphQL APIs in Node.js, enabling efficient communication between frontend and backend services
  • Used open-source tools like Wazuh, Nessus and Qualys for internal vulnerability assessment and file integrity monitoring
  • Integrated TypeScript with popular frameworks like Angular and React, enhancing the robustness and scalability of front-end applications
  • Performed security audits of web applications and cloud computing resources
  • Created security policies and implemented the same for all IT resources.

Director of DevSecOps and Cloud Technologies

DTIS, Digital Trusted Identity Service
10.2019 - 08.2022
  • Leading DevOps and IT-ops team within the organization
  • Helping DBA and Dev teams by providing robust, automated builds and deployments across the stages of the software development lifecycle
  • Managing large portfolios in both commercial and public sectors
  • Used Confluence for documentation and knowledge management, creating detailed project documentation, user guides, and technical specifications to facilitate team collaboration and information sharing
  • Helping government agencies like TSA, FBI, and NASA hire employees securely
  • Providing 24x7 on-call support for many websites, applications, and cloud resources
  • Migrated legacy applications to cloud providers like AWS and Azure, saving more than $30M in operating costs
  • Managing a portfolio of large commercial accounts worth more than $250M
  • Supporting high-transaction products like IdentityX that help large international financial institutions
  • Created architectural design for EKS cluster for some of the commercial applications
  • Developed scalable and high-performance server-side applications using Node.js, employing asynchronous programming and event-driven architecture
  • Built PagerDuty and Icinga2 dashboards for the executive team to visualize platform health
  • Built HashiCorp Vault for better management of secrets
  • Built Icinga2 monitoring cluster for applications and servers monitoring from the ground up
  • Created multiple production environments on AWS for various applications
  • Created a centralized Docker registry for the entire organization to manage Docker images
  • Created custom CentOS-based images to be used as AMIs for AWS as well as on-prem server buildouts
  • Installed and configured the following products
  • Managed incidents, changes, and service requests in ServiceNow, ensuring timely resolution and adherence to ITIL best practices
  • Supported applications and infrastructure required to comply with FISMA, FedRAMP, DoD, and NIST standards
  • Supported Microsoft Project Management Server (EPM) and Primavera
  • Supported applications are:
  • Application Based Transport – UBER/Wingz/LYFT Q management systems for large airports
  • AAAE clearing house - Employee badging systems for airports
  • FBI proxy - FBI channelling partner service for criminal background check
  • QStartr – Taxicab Q system for large airports
  • Access/Preenrollment for local Sheriff’s offices across the nation
  • NASA/FINRA employee background check services
  • IdentityX – Fingerprint authentication for financial institutes
  • Technical responsibilities:
  • Created a custom operating system for microservices deployment
  • Docker and Kubernetes compatible images
  • CI/CD pipeline on OKD clusters (OpenShift 3.x and 4.x versions)
  • HAProxy HA cluster for various environments
  • ELK cluster (15+ nodes)
  • Hashicorp Vault for secret management
  • Site24x7 and working dashboards
  • Icinga2 HA cluster and dashboards
  • PagerDuty and dashboards
  • Multiple AWS environment setup from scratch
  • Set up a multi-node EKS cluster from scratch with Filebeat, Metricbeat and Logstash
  • On-site Kubernetes (50 nodes cluster) on VMWare
  • Created Xwiki application environment for the company documentation
  • Automated infrastructure as code with tools like Ansible and Salt
  • Performed vulnerability scans using Tenable's enterprise Nessus scanner
  • Fixed CIS and STIG benchmark findings in an automated way using Ansible
  • Patched Linux systems and created dashboards using the Patchman tool
  • Built and managed a production RabbitMQ cluster (3 nodes) on an internal VMware-based environment to store fingerprint image metadata
  • Built and managed a production Kafka cluster (3 nodes) on an internal VMware-based environment to store fingerprint images and converted PDF reports.

Enterprise Cloud Solutions Lead/Manager

Fannie Mae
07.2019 - 10.2019
  • Supported applications are:
  • Single Family loan processing system
  • Portfolio transaction services
  • Technical responsibilities:
  • Worked on building enterprise container solutions on private and public clouds
  • Wrote custom Dockerfiles, converted legacy applications to Docker containers, and deployed them with Helm charts on Kubernetes
  • Built a mid-size cluster on OpenShift and Rancher/RKE
  • Also created a small cluster on AWS EKS
  • Created CI/CD pipelines using OpenShift's built-in solution
  • Converted Spring Boot applications to Docker-based images in test and staging environments
  • Created a JupyterHub/JupyterLab cluster on top of OpenShift for the development team.

DevOps Tech Lead

Alarm.com
06.2017 - 07.2019
  • Supported applications are:
  • Alarm.com dealer onboarding system
  • AWS cloud
  • 911 extended module
  • ADT pulse connected applications
  • Technical responsibilities:
  • Completed large migration project of legacy containerized application to highly secure, scalable Kubernetes environment that saved millions of dollars
  • Completed RKE-based cluster on VMWare Virtual environment running docker containers on Kubernetes managed platform
  • Created dockerized Kubernetes environment from scratch and deployed PHP/MySQL-based application
  • Built a large Kubernetes cluster with Rancher/RKE technology and integrated it with an F5 load balancer for high-traffic, critical customer-facing applications
  • Automated CI/CD build pipeline using GitLab runners running on Kubernetes
  • Integrated application metrics and monitoring with Icinga2, PagerDuty, SumoLogic, Grafana, Prometheus, InfluxDB and Wavefront
  • Managed a large Hadoop cluster based on Cloudera Express running Spark jobs for business intelligence data
  • Built two large Hadoop clusters in the Staging and Production environment
  • Built secure, scalable centralized docker registry with Nexus OSS front-end from scratch
  • Created multiple dashboards with TICK (Telegraf, Influxdb, Chronograf, and Kapacitor) stack for Docker, Linux hosts statistics, analysis, and monitoring
  • Supported Microsoft Project Management Server (EPM) and Primavera
  • Created a VMware vRA automation blueprint to deploy Dockerized VMware hosts, integrated with the internal Docker registry, to build a .NET and Docker-enabled VMware environment throughout the company
  • Built open-source Kafka clusters in testing and production environments for IoT devices.

Comcast
01.2013 - 07.2017

Lead DevOps Engineer, Sr. System Engineer

Corpus Inc Washington
02.2009 - 01.2013
  • Worked as a 24x7 Production support engineer for Comcast’s UDB team which manages a highly available cluster application environment that provides Entitlements to Comcast customers
  • The team also manages other applications like Title availability service, digital Locker, Resume point, Streams, Caller ID, Data Ingest, and many more
  • As a team member, managed a large number of physical, virtual, and microservice instances in a production environment
  • Providing on-call support for all customer-facing applications and services, metrics, monitoring, scripting, etc
  • Hands-on experience on backend systems like Hadoop, Cassandra, MySQL, Hazelcast, and Riak
  • Working knowledge of Ansible playbooks, puppet configuration management, and Jenkins for automated deployments
  • Have worked on many projects that involved heavy scripting using Perl, Python, Ruby, and PHP
  • Created many monitoring scripts and a configuration management interface called RMI to ease Nagios configuration management
  • Hands-on experience with virtualization technology like VMware, KVM, and Xen
  • Also worked on messaging systems like RabbitMQ and Redis
  • Installed, configured, and maintained ELK (Elasticsearch, Logstash, and Kibana) stack
  • Performed security patches to Linux (OpenSSH/OpenSSL/Glibc) and VMWare (ESXi/VCenter) servers
  • Provide 24x7 on-call support once a month
  • Supporting large-scale Java applications providing Video-on-demand and linear schedule data
  • Installing and Managing security patches, OS upgrades, etc
  • Deployment automation using Jenkins, Ansible, Puppet, and Docker
  • Created a few Nagios plugins in Perl and Python to provide monitoring and metrics for the production applications.

Sr. Production Operation Engineer

Comcast
05.2008 - 12.2012
  • Worked as a UNIX production support engineer for the TVSearch, Video Search, TVPlanner, and Video on Demand teams
  • Have implemented more than 200 Servers on ESXi Virtualization technology
  • Helped plan and execute the VMware ESXi to Xen migration on the Dell R900 platform
  • Implemented numerous Red Hat Enterprise Linux 5.3 64-bit kickstart server installations
  • Also installed and configured PXE boot for diskless and auto boot process
  • Developed custom Perl/Python Nagios monitoring code for system and software downtime alerts
  • Upgraded JBoss/JDK in production and non-production environments
  • Installed, configured, and managed webMethods 7.1.2 on Red Hat Linux to integrate a video-on-demand metadata hub with a Java-based VOD search application system running on WebLogic 8.1
  • Used WebLogic Workshop control to seamlessly connect with webMethods
  • Developed MySQL/PHP-based Video on Demand assets tracking system as well as Xen host and Guest dynamic search system
  • Provide 24x7 on-call support once a month
  • Supporting large-scale Java applications providing Video-on-demand and linear schedule data
  • Installing and Managing OS patch, Java security patch
  • Using Xymon, Nagios, Cacti, and Bamboo tools for builds and system monitoring
  • Providing support for Jboss, Apache Tomcat, Hadoop, and Solr products
  • Installation and configuration of Redhat, Xen, and KVM.

UNIX Lead Developer

Duke University Health Technology Systems
09.2008 - 12.2008
  • Worked on Solaris OS migration from 8 to 10 and RHEL 3 and 4 to RHEL 5.2
  • Managing 50+ Servers of HP, IBM, and Sun
  • Installed, and configured Veritas FS 5.0 and Netbackup on Solaris and Linux Servers
  • Wrote custom script in PERL/PHP/Shell for OS, Database migration
  • Writing Perl modules for Nagios and Cacti for Java-based applications hosted on JBoss, Apache, and Tomcat on Linux servers
  • Installed and configured Veritas Cluster system for OS and Java application availability
  • Writing technical documents for upgrades, applications, and log monitoring
  • System log management using Splunk on Linux servers
  • Major work involved Nagios, Cacti, Splunk, JBoss, and Apache products
  • Code design with PERL, PHP, MySQL, Python, and Dtrace for application testing, monitoring, and debugging
  • Use Control-M for scheduling jobs
  • Used Linux LVM + Xen to create high availability Linux Environments.

Infrastructure Management Specialist

Fannie Mae
05.2008 - 08.2008
  • Environment: UNIX, Veritas, ClearCase, ClearQuest, Subversion, Oracle, Weblogic, SunOne, Web server, AutoSys-Remedy, PERL-PHP, and Shell programming
  • Worked as an Infrastructure Management Specialist for Disaster Recovery system setup and integration, built scripts using Perl and shell scripting, and upgraded ClearCase Server version 7 in a UNIX environment
  • Installing, configuring, and Administration of Global & Sparse Zones (LDOMS) on Solaris 10 Servers
  • Support provided for Central Log analysis and monitoring team for Log storage for different teams within an organization
  • Daily activities included Backup and Analysis of Logs in a Production environment
  • Java builds deployments in Development System testing and Production environments
  • Updated documents on SharePoint sites
  • LDAP and Main server management
  • Configure JDBC connection pool in config.xml
  • Responsible for understanding application environment and changes to tune WebLogic parameters in support of application stability
  • Worked closely with the development team, infrastructure DBA, application architect, and performance test engineer to tune the application adjusting thread pool sizes, bean pool sizes, database connection pools, and JVM heap size
  • Weblogic, SunOne Web server, Sun Cluster 3.1 on Solaris 10, Veritas Volume Manager support, administration, and troubleshooting
  • Installation and administration experience with HP OVO 8.X
  • Used Dtrace for OS / Users logs
  • Digital Certificates installation for Weblogic Servers for Production and non-Production Environments
  • Different job automation, Calling System commands, scanning networks, file manipulation using UNIX Shell and PERL/PHP scripting
  • Installation, configuration, and integration of Nagios on RHEL 5 servers
  • Nagios Plugin development using perl, php, python and shell scripting
  • ClearCase to Subversion migration of a large Java-based application codebase for a health system.

Education

Master of Science - Information Security

Strayer University
Washington
09-2015

Skills

  • Automation and CI/CD pipeline
  • Infrastructure automation (IaC)
  • Monitoring, logging and Observability
  • Incident management
  • DevSecOps implementations
  • Cloud Infrastructure Management
  • Capacity Planning & Performance Tuning
  • Disaster Recovery (DR) & Business Continuity
  • Cost optimization
  • Container Orchestration & Management
  • Version Control & Collaboration

Certification

  • CKA (Certified Kubernetes Administrator)
  • CKAD (Certified Kubernetes Application Developer)
  • KCNA (Kubernetes and Cloud Native Associate)
  • CompTIA Linux+
  • CompTIA PenTest+

Timeline

Sr SRE/DevOps Consultant

Apex Systems / CapitalOne
10.2024 - Current

DevSecOps Consultant/Engineer

Department
06.2024 - 10.2024

Lead DevSecOps Engineer/Architect

Karthikconsulting, USCG
09.2023 - 06.2024

DevSecOps and Cloud Engineering Director

Universal
08.2022 - 09.2023

Director

DTIS, Digital Trusted Identity Service
10.2019 - 08.2022

Enterprise Cloud Solutions Lead/Manager

Fannie Mae
07.2019 - 10.2019

Subject Matter Expert

CompTIA
10.2017 - Current

Adjunct Associate Professor

University of Maryland Global Campus
08.2017 - Current

DevOps Tech Lead

Alarm.com
06.2017 - 07.2019

Comcast
01.2013 - 07.2017

Lead DevOps Engineer, Sr. System Engineer

Corpus Inc Washington
02.2009 - 01.2013

UNIX Lead Developer

Duke University Health Technology Systems
09.2008 - 12.2008

Sr. Production Operation Engineer

Comcast
05.2008 - 12.2012

Infrastructure Management Specialist

Fannie Mae
05.2008 - 08.2008

Master of Science - Information Security

Strayer University