Codex GitHub MCP Jobs in USA
This isn't a traditional AppSec role. It sits right at the intersection of AI-driven development, SaaS security, and financial-grade risk management—and they need someone who can help shape how security evolves alongside modern engineering.
Why this role stands out:
- Ownership of AppSec across 20+ SaaS applications in a highly regulated financial environment
- Direct involvement in securing AI-assisted development (Copilot, Cursor, Codex)
- Hands-on with AI-powered pentesting tools + modern SAST/DAST pipelines
- Opportunity to define secure AI coding guardrails (this is a big focus area for them)
- High collaboration with engineering, risk, and compliance—this is a true partner role, not a silo
What they're looking for:
- Strong background in application security + secure SDLC (SAST, DAST, SCA)
- Experience with code review (Python, C#, Java, or JavaScript)
- Exposure to AI-driven development environments and their security implications
- Ability to balance technical risk with business impact (this is key in their environment)
- Bonus if you've worked with SSPM tools or SaaS security at scale
- Location: Boston (4 days onsite)
Compensation: $130-155k
Responsibilities:
- Participate in full lifecycle development of SDLC and implement all DevOps procedures to manage and support the CI/CD process including the automation of the build, test, deploy pipelines and configuration management.
- Employ best practices for designing automation processes and utilities that can be easily used by the development teams.
- Design and develop a best practice release management process that employs separation of control and proper approvals.
- Closely partner with the security and infrastructure teams to incorporate corporate standards into the CI/CD and provisioning processes.
- Maintain the source control management system and integrate it with software build and deployment.
- Own the build environment: resolve build issues and help coordinate complex software test environments and software releases.
- Monitor application operational processes, escalating and facilitating failure resolution as appropriate.
Qualifications:
Required
- 5+ years of professional experience working with the full software development life cycle and designing/developing best-practice CI/CD pipelines, GitHub Actions, Ansible (IaC), Terraform/CloudFormation, K8s, test automation, static code analysis, Artifactory, and release management processes.
- Proficient in at least two of the following: Windows batch/PowerShell, Bash, Python.
- Knowledgeable about networking (TCP, UDP, ICMP, ARP, DNS, TLS, HTTP, SSH, NAT, firewalls, load balancing, etc.).
- Strong experience managing and supporting Windows/Linux servers.
- Good understanding of deployment of various platforms such as web/REST API, messaging bus/queue, application services, Microservices and Cloud Serverless components/managed platform.
- Experience working with relational (SQL) and NoSQL databases; other database technologies are a plus.
- A curiosity concerning technology and the ability to learn new systems and tools quickly.
- Excellent communication skills and the ability to work in a collaborative environment.
Preferred
- Experience with cloud solutions, e.g., Azure (VNet, Private Link, Blob Storage, Azure SQL, Web App, Data Factory, Client, AKS, ARO, SQL Server/Cosmos) or AWS (VPC, EC2, S3, Route53, ECS, EKS, RDS, ALB/NLB).
- Experience with code-quality tools (SonarQube, GitHub Enterprise Advanced Security/CodeQL, JFrog Artifactory + Xray).
- Experience with containers and orchestration technologies (Docker, K8s, OpenShift).
- Experience with application telemetry, monitoring and alerting solutions (Splunk, LogicMonitor, AWS CloudWatch, Azure Insight or similar).
Job description
Role: Senior Framework Architect - Angular (x1)
Location: Irving, TX
The Senior Framework architect will lead the development of the Angular codebase for our internal design system, strongly contributing to the development and strategic technical direction of internal frameworks, products and systems. You will ensure stability and scalability of the framework, and work closely with the rest of our framework development team, and with the CSS lead.
The framework architect will be integrated within the Design team to produce code that aligns to the standards defined in our internal design library. Your primary task is to help build and maintain the internal Angular framework, which is used to create innovative and intuitive digital products that deliver best-in-class user experience and usability to our clients, both internally and externally. In this role, you will have opportunities to partner with Technology colleagues to provide support for onboarding to the Design System and to better understand how your work fits into the strategic objectives of the organization.
Responsibilities
- Lead the development of the Angular Framework that is aligned to our internal Design System components.
- Be familiar with, and help support, the React UI Library
- Work with the team to understand priority and urgency, while escalating blockers or delays
- Investigate bugs, and provide support to reduce risk for our users
- Handle framework upgrades and feature requests
- Ensure a clear migration path for applications to remain on the latest technology and design standards
- Follow internal standards for build processes and publishing to ensure stability of the framework
- Keep the framework current with the latest trends both internally and externally
- Provide technical analysis, solutions to issues, and technical direction
Required Skills:
- 8-10 years of experience writing professional-quality shared component libraries, with expertise in TypeScript and Angular and a solid foundational understanding of HTML/CSS
- Expertise in working with reusable code that is integrated with modern design systems
- Write high-quality code that is well-documented and easy to maintain
- Quality of work and speed of execution are crucial for success in this role.
- A growth mindset and willingness to learn and adapt in a fast-paced environment
- Strong attention to detail & analytical skills
- Experience delivering with an agile methodology and using Bitbucket/GitHub and Jira to manage development
- Experience developing end-to-end and unit tests
- Strong communication skills, and ability to raise and escalate concerns when appropriate
- Stay up to date on the latest software development trends and technologies
- Support for developers looking to onboard and contribute to the design system
- Interest in working with Design Systems at scale, and developing within the structures of a design-driven framework
Desirable Skills & Experience
- Interest in Design, methodologies of design systems
- Interest in enablement of AI in conjunction with maintenance and alignment to Design Systems
- Keen interest in, or knowledge of, banking or finance
Education:
Bachelor's/University degree or equivalent experience
Skills
Mandatory Skills: Design systems
location: Irving, Texas
job type: Permanent
work hours: 8am to 4pm
education: Bachelors
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
At Randstad Digital, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact
Pay offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad Digital offers a comprehensive benefits package, including: medical, prescription, dental, vision, AD&D, and life insurance offerings, short-term disability, and a 401K plan (all benefits are based on eligibility).
This posting is open for thirty (30) days.
Senior Interactive Experience Developer / Creative Coder
Location: Portland, Oregon | Hybrid (3 days in office)
Tandem Talent is partnering with an innovative, globally active creative technology company to recruit a Senior Interactive Experience Developer / Creative Coder. This role is ideal for a developer who enjoys combining strong programming skills with creativity to build immersive digital experiences that exist beyond traditional screens.
You will collaborate with designers, UX strategists, and fellow developers to create interactive environments used in corporate spaces, museums, universities, sports venues, and cultural institutions worldwide. The work focuses on developing experiences that blend digital and physical environments through interactive displays, sensing technologies, and responsive systems.
This is an opportunity to work with advanced technology while helping bring ambitious creative concepts to life in real-world environments.
The Role
As a Senior Interactive Experience Developer, you will play a key role in designing and building innovative front-end and interactive systems. You will work closely with multidisciplinary teams to develop engaging experiences, prototype new ideas, and help shape technical best practices.
Key responsibilities include:
- Leading front-end development for client projects and internal innovation initiatives
- Experimenting with emerging technologies and frameworks to create new digital experiences
- Defining and maintaining coding standards and development best practices
- Mentoring junior developers and supporting collaborative problem-solving
- Conducting project reviews to ensure technical performance and creative quality
- Producing documentation that supports both technical and non-technical stakeholders
- Working within development tools including Atlassian, GitHub, MS Teams, Visual Studio, and Figma
- Supporting installations and client projects, including occasional travel for site visits (approximately 2–3 per year)
What We’re Looking For
The ideal candidate combines strong programming capability with an interest in creative technology and immersive environments.
Required experience:
- Strong programming foundation with experience in creative coding and visual development
- Experience with digital creation platforms such as TouchDesigner, Notch, Pixera, Unreal, or Unity
- Programming experience with languages and APIs including Qt/QML, JavaScript (Three.js, WebGL, Canvas), Python, or Unreal Blueprint/C++
- Strong browser-based development experience, particularly building creative in-browser experiences
- Portfolio demonstrating engaging digital work beyond standard web applications
- Graphics programming experience using OpenGL/GLSL, Vulkan, or DirectX and understanding of the graphics pipeline
- Experience using Git/GitHub for collaborative development
- Experience designing touch interfaces or other natural user interaction systems
- Ability to rapidly prototype concepts and develop them into production-ready code
- Understanding of UX principles and how technical decisions influence user experience
- Strong communication and collaboration skills across technical and non-technical teams
- Curiosity, creativity, and enthusiasm for exploring new technologies
Desirable experience:
- Experience working with interactive hardware, sensors, or immersive technologies.
The Opportunity
This position offers the chance to work on highly creative and technically challenging projects that reach audiences globally. Developers in this team build experiences that appear on interactive display walls, projection-mapped environments, and sensor-driven installations that respond to people and environments in real time.
You will be working within a collaborative, multidisciplinary team where ideas are encouraged and technical experimentation is part of the culture.
Location
Hybrid role based in Portland, Oregon, with three days per week in the office.
Please note that visa sponsorship is not available for this position.
If you are interested in combining technical expertise with creative problem solving to build immersive digital experiences, Tandem Talent would be pleased to hear from you.
Job Title: SDET / QA Automation Engineer
Location: Mount Laurel, NJ
Duration: Long Term
Job Description:
Job Summary:
We are seeking a highly skilled and experienced SDET / QA Automation Engineer with 8 to 10 years of expertise in Python, JavaScript, and modern automation frameworks. This position involves developing automation solutions, microservices, and test scripts while validating end‑to‑end network components and their behavior. The candidate should have strong domain knowledge in networking and cable technologies, with the ability to collaborate effectively with clients and cross‑functional teams.
Key Responsibilities:
- Develop microservices using Python, NodeJS, and Golang as part of automation and service validations.
- Develop standalone Python/NodeJS scripts to simulate network traffic and validate performance across different endpoints.
- Create Proof of Concepts (POCs) based on client needs and actively participate in client demos and technical discussions.
- Lead the creation of test strategies and manage test environments with both physical and virtual device setups.
- Create comprehensive test scenarios and automated test scripts using MochaJS, ensuring robust coverage of functional, integration, and regression test cases.
- Design reusable test components, validate API and microservice behavior, and integrate MochaJS test suites into the existing automation framework to enhance reliability and execution efficiency.
- Collaborate with cross‑functional teams to refine requirements, improve test coverage, and ensure smooth integration with CI/CD pipelines.
- Gather requirements and perform detailed analysis for new automation scenarios and test case development.
- Support manual and automation testing across applications, devices, and servers as required.
- Ensure code quality using tools like SonarQube and adhere to strict QA standards.
- Provide technical guidance, troubleshooting support, and mentorship to team members on tasks and issues raised by the client.
- Maintain version control and branching strategies using GitHub, ensuring high code integrity and traceability.
- Monitor automation execution, analyze failures, and drive root‑cause investigations to improve overall product quality.
- Document technical workflows, automation processes, and test scenarios to ensure long-term maintainability and knowledge sharing.
Required Skills & Experience:
- 8-10 years of experience in QA/SDET automation roles.
- Strong programming knowledge with Python and JavaScript.
- Good hands-on experience with Golang and NodeJS.
- Hands-on experience with MochaJS for scripting and automated testing.
- Excellent knowledge of web technologies such as REST, SOAP, XML, and JSON.
- Proficiency in API testing using Bruno/Postman.
- Familiarity with GitHub for version control and Jira for project tracking.
- Excellent knowledge of the network and cable domain.
- Should be familiar with IMS architecture and SIP protocols.
- Good problem-solving and debugging skills.
- Should have good communication and client interaction skills.
Are you an experienced Full Stack Developer with a desire to excel? If so, then Talent Software Services may have the job for you! Our client is seeking an experienced Full Stack Developer to work at their company in Minneapolis, MN.
Position Summary: The team builds and maintains scalable microservices and batch-processing platforms that ingest, enrich, store, and serve user-generated content for clients' eCommerce and enterprise systems. Our culture is highly collaborative, prioritizing agility, code simplicity, operational excellence, and consistently high-quality software delivery.
Primary Responsibilities/Accountabilities:
- Delivers complex, well-tested, and reliable product features with minimal oversight.
- Excels at breaking down large problems and demonstrates depth across software development lifecycle phases, including concept, design, testing, and deployment.
- Develops solutions and optimizations that improve performance across the full application stack.
- Comfortable independently triaging complex issues across multiple environments in a fast-paced, dynamic setting.
- Actively engages in pair programming, daily standups, sprint retrospectives, backlog grooming, and user story mapping.
Qualifications:
- 5+ years of experience building highly scalable, high-performing applications using Java, Spring Boot, and Gradle with strong object-oriented design skills.
- Experience with Test Driven Development (TDD), including writing unit and integration tests using JUnit, Mockito, and/or the Spock Framework.
- Experience with streaming and messaging platforms such as Kafka, RabbitMQ, or Google Pub/Sub.
- Strong experience with CI/CD pipelines using tools such as Jenkins or GitHub Actions.
- Experience designing distributed application architectures that leverage NoSQL data stores such as Apache Cassandra for high throughput at scale.
- Experience with search and indexing systems such as Apache Solr for large-scale data access and query performance.
Preferred:
- Strong communicator and collaborator who works effectively across cross-functional teams, proactively brings ideas to the table, and takes initiative rather than waiting to be directed.
- Experience with front-end technologies, including JavaScript, ReactJS and NodeJS.
- Experience with container platforms such as Docker.
- Experience designing, testing, and deploying scalable solutions on Google Cloud Platform utilizing services such as BigQuery, Cloud Functions, Cloud Run, and Dataflow.
- Experience with off-heap caching solutions such as Memcached.
- Experience leveraging AI-assisted development tools such as GitHub Copilot to accelerate development workflows.
- Ability to triage and manage complex, production issues
Project description
This technology engineer is responsible for ensuring the reliability, supportability, and continuous improvement of key infrastructure monitoring and management platforms, with primary ownership of tools such as SolarWinds and Azure Sentinel. This role requires a developer mindset. The engineer will also provide hands-on operations and systems administration support for Linux and Windows systems. This role partners closely with internal teams across operations, monitoring, and security to strengthen platform health, improve signal quality, and enable effective incident response workflows. The engineer will support a hybrid environment with strong emphasis on Microsoft Azure monitoring and logging, contribute to platform lifecycle activities (patching, upgrades, onboarding, documentation), and continuously learn and apply modern capabilities, including analytics and emerging AI features, across event management, observability, and SIEM tooling to reduce operational friction and shorten time to value.
Responsibilities
Platform Ownership
Network & Monitoring Tools (must have)
- Familiar with tools such as SolarWinds (including NetPath). As a platform owner, ensure platform stability, upgrades, patching, and day to day support.
- Has knowledge about network centric monitoring capabilities including SNMP polling, traps, and device visibility etc. Ensure new sites and devices are properly onboarded
- Partner with platform and cloud teams to ensure migrated workloads meet monitoring standards.
Systems Administration (must have)
- Provide sysadmin support for Linux and Windows servers, including:
  - Agent deployment and upgrades (SolarWinds, Datadog, Dynatrace)
  - OS level troubleshooting and configuration
  - Monitoring and logging enablement
- Support hybrid environments spanning on prem and Azure infrastructure.
- A developer mindset with experience in Dev workflow, GitHub, PowerShell etc.
Observability & Event Management Support (should have)
- Has experience with tools such as Datadog and Dynatrace. The person will be responsible for collaborating with platform owners to support integrations, data quality, and alerting hygiene.
- Assist with event management workflows, ensuring alerts are actionable and routed correctly.
- Participate in efforts to reduce alert noise and repeat incidents.
SIEM & Security Visibility (nice to have)
- Develop a working understanding of SIEM concepts and platforms such as Azure Sentinel and CRIBL.
- Support log ingestion, troubleshooting, and collaboration with security and incident response teams.
- Ensure infrastructure and network telemetry supports security detection requirements.
Cloud Monitoring & Azure Integration (should have)
- Has experience with Azure cloud platform. Have either directly supported or is familiar with Azure based monitoring and logging, including:
  - Azure Monitor and Log Analytics integrations
  - Observability for Azure hosted workloads
Automation, AI & Continuous Improvement (nice to have)
- Explore and apply AI assisted features within monitoring, event management, and SIEM tools to:
  - Improve signal quality / reduce alert fatigue
  - Support faster incident triage
- Contribute to documentation, runbooks, and operational improvements focused on small, incremental wins.
Knowledge Transfer & Operational Resilience
- Participate in knowledge transfer activities related to platform transitions and retirements. Maintain documentation.
- Support on call or escalation rotations as needed.
Skills
Must have
- Minimum 4-5 years of experience in infrastructure operations, monitoring, observability, or platform operations roles, supporting enterprise environments.
- Hands-on experience with systems administration for Linux and Windows servers, including troubleshooting, configuration, and deployment of monitoring or management agents (e.g., SolarWinds, Datadog, Dynatrace).
- Foundational networking knowledge, including concepts such as SNMP, network monitoring, LAN/WAN fundamentals, firewalls, and telemetry collection, sufficient to support network-centric monitoring platforms like SolarWinds.
- Experience with a platform like StruxureWare is nice to have but not required.
- Experience with observability or monitoring platforms, such as SolarWinds, Datadog, Dynatrace, or similar tools, with an understanding of alerting, dashboards, and signal quality.
- Exposure to cloud environments, preferably Microsoft Azure, including familiarity with monitoring and logging concepts (e.g., cloud-based telemetry, logs, metrics, and integrations).
- Basic understanding of incident and event management practices, including alert triage, escalation, and collaboration with incident response or operations teams.
- Demonstrated willingness and ability to learn new technologies quickly, with examples of picking up new platforms, tools, or domains outside of prior core expertise.
- Familiarity with Agile or SAFe ways of working, including collaboration in sprint-based delivery models and cross-functional team engagement, is a plus.
- Strong communication and collaboration skills, with the ability to work effectively with platform owners, operations teams, security teams, and external stakeholders.
- Experience working in a modern Dev workflow using GitHub (branches, pull requests, code reviews, and CI/CD) to manage and deploy scripts/automation used for platform operations.
- Working proficiency in scripting languages such as PowerShell, Python, Bash, or similar.
- Knowledge of Azure, Azure Active Directory (AD), and hybrid cloud environments is a plus.
- Exposure to SIEM concepts or platforms such as Azure Sentinel, CRIBL, or similar is a plus.
- Experience with change management practices in an enterprise IT environment is beneficial.
Job Title: Jenkins Administrator Engineer
Location: Phoenix, AZ
Duration: 12 Months
Experience: 6-9 years
Job Requirement:
• Hands-on software development experience in Java, Python, NodeJS, or Go
• Hands-on experience using and administering Jenkins and Artifactory is a must
• Work experience with multiple DevOps and collaboration tools, including Jenkins, GitHub, GitHub Actions, SonarQube, Slack, Confluence, and Jira
• Hands-on experience with Linux server administration
• Good communication skills and an aptitude for learning new technology.
Location: Alpharetta, GA (3 days a week onsite)
Duration: 6 months
Job Description:
We are seeking a skilled Site Reliability Engineer to join our team and help build, maintain, and scale our cloud-native infrastructure. You will work closely with development and operations teams to ensure our systems are reliable, scalable, and efficient. The ideal candidate is passionate about automation, observability, and infrastructure-as-code, and thrives in a collaborative, fast-paced environment.
Key Responsibilities
- Design, implement, and manage cloud infrastructure on Azure using Terraform and Terragrunt.
- Maintain and optimize Kubernetes clusters on Azure Kubernetes Service (AKS).
- Build and manage CI/CD pipelines using GitHub Actions/Workflows and ArgoCD for GitOps deployments.
- Enhance system reliability by implementing monitoring, alerting, and observability solutions with Grafana.
- Automate operational tasks to reduce toil and improve team efficiency.
- Participate in on-call rotations, incident response, and post-mortem analysis.
- Collaborate with development teams to improve application performance, scalability, and resilience.
- Implement and advocate for SRE best practices, including SLIs, SLOs, and error budgets.
- Continuously improve system performance, cost efficiency, and security.
Required Skills & Qualifications
- 3+ years of experience in an SRE, DevOps, or cloud infrastructure role.
- Strong experience with Azure cloud services and infrastructure.
- Hands-on experience with Java, and with Terraform and Terragrunt for infrastructure-as-code.
- Proficiency with Kubernetes (preferably AKS) and container orchestration.
- Experience with CI/CD tools, especially GitHub Workflows/Actions and ArgoCD.
- Solid understanding of observability tools like Grafana (Prometheus, Loki, Tempo experience is a plus).
Education Requirements: Bachelor's degree required (Master's preferred)