Alibaba Cloud Linux Ubuntu Jobs in USA
Bhagyashree Yewle, Principal Lead Recruiter - YOH SPG
ODI Developer - Hybrid Onsite in Boston MA - USC OR GC ONLY (No Visas)
- Location: Boston, MA
- Hybrid: 3 days on site
- Potential Convert: Yes, USC/GC ONLY no exceptions. WILL NOT SPONSOR
- ETL/ELT
- ODI
- PL/SQL coding
- 7 years’ experience
- Knowledge of the admin side of things (not a day-to-day duty, but able to do it when needed)
- Scripting – Python & Unix Scripting
Seeking a highly skilled and experienced Sr. ODI Developer to join our Private Banking Systems team. The ideal candidate will possess expertise in a range of technologies, including ODI (Oracle Data Integrator), Oracle Data Warehouse, Linux, and Python scripting; a deep understanding of the Banking domain is a big plus. As a Data Engineer, you will play a pivotal role in designing, developing, and maintaining data solutions.
Key Responsibilities:
- Build ODI mappings/interfaces, packages, procedures, scenarios, topology configuration, ODI Agent and load plans to integrate data from multiple enterprise systems.
- Build PL/SQL queries, procedures, and data loading processes, ensuring high performance and scalability to meet the evolving data needs of the various applications.
- Design, develop, and maintain ETL/ELT pipelines using Oracle Data Integrator (ODI).
- Collaborate effectively with cross-functional teams, including other data engineers, DBA group, analysts, and business stakeholders, to understand data requirements and deliver solutions.
- Monitor and troubleshoot RMJ jobs, ODI workflows, sessions, agents, and data pipelines on Linux environments.
- Perform root cause analysis for failures related to ODI workflows, RMJ jobs, network connectivity, API integrations, and file transfers.
- Optimize ETL workflows to improve reliability, performance, and scalability.
- Use scripting and automation tools to support data processing and operational workflows.
- Work in Linux/Unix environments, using command-line tools and shell scripts for job automation and troubleshooting.
- Maintain comprehensive documentation of data processes, configurations, and best practices.
- Participate in walk-throughs which review program specifications, source code, and all technical supporting documentation, including screens/reports. Provide feedback in accordance with team standards and guidelines.
- Participate in implementation of changes, enhancements, and newly developed programs.
- Conduct technical research and provide recommendations, develop proofs of concept or prototypes, contributing to technical design of applications.
- Help identify coding patterns and anti-patterns, and enforce the patterns through code reviews.
- Quickly resolve issues encountered by business lines in the production environment, maintaining a helpful, "high touch" approach to working with business users, and perform root cause analysis, technology evaluation, and performance tuning.
Desired Qualifications:
- Degree in Computer Science, Engineering or related technical area
- 7+ years of extensive hands-on experience in ODI, Oracle Data Warehouse, Oracle PL/SQL, Linux, Python scripting, and the ODI admin module (ODI Agent setup, logs configuration, certificate installation).
- Must have experience in building PL/SQL queries for Oracle Server (incl. stored procedures, functions…) and must understand basic principles of data modeling
- Excellent collaborative and communication skills, particularly in high-stress situations
- Experience with Python and Linux scripting, CLE, and networking fundamentals (API, IP/ports, SFTP/FTP connectivity)
- High proficiency in development practices: unit testing, Continuous Integration (CI/CD), refactoring, clean code
- Experience with Bitbucket/GIT source control management
- Problem solving skills, able to determine upcoming risks & issues and address them accordingly.
- Ability to interpret and troubleshoot applications using logs.
- Pro-active approach and good communication skills.
- Experience with agile methodologies (Scrum, Kanban) and tools (Jira)
- Private Banking domain experience.
- Working experience in the financial services industry
- Knowledge of financial applications such as FIS AddVantage, CRD, CRM Pivotal.
- Experience with Apache Airflow for workflow orchestration.
- Knowledge of dbt (Data Build Tool) for modern data transformations.
- Exposure to cloud data platforms or hybrid data architectures.
Key Competencies:
- Strong analytical and problem-solving skills
- Ability to work with large-scale enterprise data environments
- Excellent collaboration and communication skills
- Ability to manage multiple priorities in a fast-paced environment
- Commitment to continuous learning and technology innovation
Estimated Min Rate: $55.00
Estimated Max Rate: $72.00
What’s In It for You?
We welcome you to be a part of one of the largest and most legendary global staffing companies to meet your career aspirations. Yoh’s network of client companies has been employing professionals like you for over 65 years in the U.S., UK and Canada. Join Yoh’s extensive talent community, which will give you access to Yoh’s vast network of opportunities, including this exclusive opportunity. Benefit eligibility is in accordance with applicable laws and client requirements. Benefits include:
- Medical, Prescription, Dental & Vision Benefits (for employees working 20+ hours per week)
- Health Savings Account (HSA) (for employees working 20+ hours per week)
- Life & Disability Insurance (for employees working 20+ hours per week)
- MetLife Voluntary Benefits
- Employee Assistance Program (EAP)
- 401K Retirement Savings Plan
- Direct Deposit & weekly epayroll
- Referral Bonus Programs
- Certification and training opportunities
Note: Any pay ranges displayed are estimations. Actual pay is determined by an applicant's experience, technical expertise, and other qualifications as listed in the job description. All qualified applicants are welcome to apply.
Yoh, a Day & Zimmermann company, is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.
Visit to contact us if you are an individual with a disability and require accommodation in the application process.
For California applicants, qualified applicants with arrest or conviction records will be considered for employment in accordance with the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. All of the material job duties described in this posting are job duties for which a criminal history may have a direct, adverse, and negative relationship potentially resulting in the withdrawal of a conditional offer of employment.
It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.
By applying and submitting your resume, you authorize Yoh to review and reformat your resume to meet Yoh’s hiring clients’ preferences. To learn more about Yoh’s privacy practices, please see our Candidate Privacy Notice: working/work at home options are available for this role.
Job Description
Must Have Technical/Functional Skills
Must have Windchill Admin Experience covering below aspects
- Windchill installation and configuration
- Hands-on experience managing Windchill environments on Linux, Windows & AWS.
- Installation / Configuration of integrations
- Windchill Upgrade experience.
- Deployment of Windchill on IaaS cloud model.
- 5-8 years of hands-on Windchill customization experience including (but not limited to) Windchill data utilities, form validators, Windchill soft typing (Type and Attribute Manager), Windchill loader mechanism, ACLs, Workflow & LifeCycle configuration with necessary customization, JDBC connections and writing the respective query specs, following PTC-led best practices of API development and usage.
- Experience with scripting languages like Shell/Perl/Python/TCL for WC admin activities.
Roles & Responsibilities
- Install and configure Windchill and its integrations in the cloud environment according to the migration plan.
- Install and configure Windchill components in the cloud environment according to the Upgrade Plan.
- Execute the Windchill upgrade process, including applying patches, hotfixes, and performing major version upgrades.
- Support the bulk migration of data from the on-premises system to the cloud, ensuring data integrity throughout the process.
- Thoroughly test all deployed solutions, including workflows, configurations, and integrations, to ensure they function as expected.
- Propose solutions and fix issues raised in Windchill cloud migration and Windchill upgrade.
- Diagnose and resolve system errors, performance issues, and data integrity anomalies in the cloud, using application logs and other tools.
- Continuously monitor application and database performance in the cloud and optimize configurations for efficiency.
- Maintain up-to-date documentation for the new cloud environment, including processes, procedures, and configurations.
- Act as a key liaison, coordinating with PTC support, other IT teams (network, database), and product owners to resolve issues and meet project objectives.
- Lead and groom the Windchill team in cloud migration and upgrade activities.
Generic Managerial Skills, If any
- Collaborate with business analysts, solution architects and project managers to understand requirements and translate them into technical solutions.
- Work closely with cross-functional teams such as IT, PLM specialists, business users and other required teams to ensure proper execution of Teamcenter migration and upgrade project.
- Assist in training and mentoring junior WC Admins and team members.
Job Description
At Boeing, we innovate and collaborate to make the world a better place. We're committed to fostering an environment for every teammate that's welcoming, respectful and inclusive, with great opportunity for professional growth. Find your future with us.
The Boeing Company is looking for a Senior Digital Engineer – Full Stack & Systems Architecture to join our team in Charleston, SC; El Segundo, CA; Huntsville, AL; Mesa, AZ; Oklahoma City, OK; Philadelphia, PA; or Berkeley, MO.
Boeing Test & Evaluation (BT&E) generates enormous volumes of data, but data alone does not create insight. We are building a Digital Engineering capability focused on transforming test intent into reusable knowledge through intuitive applications, scalable systems, and thoughtful architecture.
As a Digital Engineer – Full Stack & Systems Architecture, you will sit at the intersection of engineering workflows, software systems, and cloud platforms. Your mission is to empower BT&E engineers to work differently by designing and delivering digital products that shorten feedback loops, reduce friction, and accelerate learning from every test event.
This is not a traditional backend or data‐engineering role. You will design end‐to‐end solutions, working from initial understanding of engineer needs, to shaping application architecture, to implementing full‐stack solutions that integrate data, automation, and cloud services. You will partner closely with BT&E, BCA, BDS, and Wisk/Autonomy engineers to modernize how test data is accessed, explored, and operationalized across Boeing.
If you enjoy systems thinking, building products engineers actually want to use, and architecting platforms that scale across programs and clouds, this role is for you.
Position Responsibilities:
Digital Product & Application Development
- Design and develop full stack applications that improve test and evaluation workflows, decision making, and engineering productivity
- Translate ambiguous engineering problems into clear digital solutions, balancing usability, performance, and scalability
- Develop front end and backend services using modern frameworks (e.g., React, Node.js, Python, .NET)
- Design and implement APIs and service interfaces that enable integration across test systems, analytics platforms, and enterprise tools
Systems Architecture & Cloud Engineering
- Architect end to end systems spanning applications, data services, and cloud infrastructure
- Evaluate and select cloud services across AWS and Azure based on cost, usability, scalability, and long term maintainability
- Implement infrastructure as code using Terraform, CloudFormation, ARM, or Bicep to support repeatable, secure deployments
- Design solutions that support multi cloud and hybrid environments as required by program needs
Data Enabled Engineering (as a Platform Capability)
- Design data models and storage solutions that support both transactional systems and analytical workloads
- Build and integrate data services that allow engineers to discover, explore, and reuse test data efficiently
- Collaborate with data scientists and analysts to enable analytics, visualization, and ML workflows without burdening users with infrastructure complexity
DevOps, Reliability & Security
- Build CI/CD pipelines to support rapid iteration, testing, and safe deployment of applications
- Apply SRE principles to ensure reliability, observability, and operational excellence
- Build and maintain observability capabilities—including logs, metrics, and traces—to enable rapid diagnosis, performance optimization, and reliable operation of digital engineering systems
- Partner with security and compliance teams to ensure solutions meet Boeing security, data governance, and regulatory requirements
- Contribute to operational documentation, runbooks, and continuous improvement efforts
Collaboration & Technical Leadership
- Work closely with engineers, product owners, and stakeholders to shape digital roadmaps and technical direction
- Influence architecture and design decisions across programs through systems thinking and engineering judgment
- Collaborate with peers and contribute to a growing Digital Engineering community within BT&E
Basic Qualifications (Required Skills/Experience):
- Bachelor of Science degree in Engineering, Engineering Technology (including Manufacturing Technology), Computer Science, Data Science, Mathematics, Physics, Chemistry or non-US equivalent qualifications directly related to the work statement
- 5+ years of experience developing full stack applications with modern frameworks
- Strong systems thinking skills with experience designing end to end software solutions
- Proficiency in one or more programming languages (JavaScript/TypeScript, Python, C#, or Go)
- Experience deploying and operating applications in cloud environments (Azure and/or AWS)
- Hands on experience with infrastructure as code (Terraform, CloudFormation, ARM/Bicep)
- Working knowledge of CI/CD pipelines, Git, Docker, and Linux
- Experience designing and working with relational databases (e.g., PostgreSQL), including schema design and performance considerations
- Familiarity with security best practices (IAM, secrets management, network controls)
Preferred Qualifications (Desired Skills/Experience):
- Experience designing developer platforms or internal engineering tools
- Background in Digital Engineering, Model Based Systems Engineering (MBSE), or engineering workflow automation
- Cloud certifications (AWS and/or Azure)
- Experience with Kubernetes, serverless architectures, or event driven systems
- Exposure to data pipelines, analytics platforms, or data enabled applications
- Experience working in regulated or safety critical environments
- Understanding of aerospace, test & evaluation, or large scale engineering programs
- Familiarity with ITAR, EAR, DFARS, or similar compliance frameworks
Drug Free Workplace:
Boeing is a Drug Free Workplace where post offer applicants and employees are subject to testing for marijuana, cocaine, opioids, amphetamines, PCP, and alcohol when criteria is met as outlined in our policies.
Conflict of Interest:
Successful candidates for this job must satisfy the Company's Conflict of Interest (COI) assessment process.
Pay & Benefits:
At Boeing, we strive to deliver a Total Rewards package that will attract, engage and retain the top talent. Elements of the Total Rewards package include competitive base pay and variable compensation opportunities.
The Boeing Company also provides eligible employees with an opportunity to enroll in a variety of benefit programs, generally including health insurance, flexible spending accounts, health savings accounts, retirement savings plans, life and disability insurance programs, and a number of programs that provide for both paid and unpaid time away from work.
The specific programs and options available to any given employee may vary depending on eligibility factors such as geographic location, date of hire, and the applicability of collective bargaining agreements.
Pay is based upon candidate experience and qualifications, as well as market and business considerations.
Summary Pay Range: $127,500 – $197,800
Applications for this position will be accepted until Mar. 21, 2026
Export Control Requirements:
This position must meet U.S. export control compliance requirements. To meet U.S. export control compliance requirements, a "U.S. Person" as defined by 22 C.F.R. §120.62 is required. "U.S. Person" includes U.S. Citizen, U.S. National, lawful permanent resident, refugee, or asylee.
Export Control Details:
US based job, US Person required
Education
Bachelor's Degree or Equivalent Required
Relocation
Relocation assistance is not a negotiable benefit for this position.
Visa Sponsorship
Employer will not sponsor applicants for employment visa status.
Shift
This position is for 1st shift
Equal Opportunity Employer:
Boeing is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, national origin, gender, sexual orientation, gender identity, age, physical or mental disability, genetic factors, military/veteran status or other characteristics protected by law.
Skills / Technology : OS/Infrastructure (OS, Load Balancer, DNS, Storage, Firewall)
The Challenge
Client is seeking a UNIX Administrator with excellent technical, process and automation skills to be part of a High-Performance Cloud Operations Team. As an Infrastructure Administrator, this person is responsible for the daily administration of Linux and Unix servers in a business application environment. This includes general system administration tasks, software and hardware support, system configuration, and system monitoring. This person must have excellent Linux/Unix administration experience, along with customer relations skills. Candidate should be able to work with business application administrators, helping troubleshoot their applications and guiding them with standard methodologies. Candidate must be able to express thoughts clearly and be capable of working in a team or as a sole contributor. Individual should be self-motivated with very good communication skills. This person is the main point of responsibility for the overall operability, resiliency, performance, and capacity of owned production services.
What you'll do
- System Administration - This person would be responsible for the day-to-day administration of all Linux based servers. This includes monitoring the trouble ticket queue, system troubleshooting, hardware and software system changes, scripting, patching, system performance monitoring, system sizing, system integration, upgrade implementation, and hardware diagnostics.
- Application support – This person would work with application administrators to help fix and fine-tune applications and also if required guide application administrators in standard processes related to using the underlying UNIX infrastructure.
- Documentation – Maintain all system documentation.
What you need to succeed
- Unix/Linux System Administration: In-depth experience with Unix/Linux servers (especially Suse, AIX, RHEL, CentOS) for installation, configuration, patching, and troubleshooting.
- Automation & Scripting: Proficiency in scripting (Bash, Python) and automation tools (Ansible, etc.) to streamline deployments and manage configurations.
- Demonstrable ability to perform UNIX builds.
- Understanding of RedHat Satellite, IBM NIM, or SUSE Manager for patch management.
- Networking Knowledge: Strong grasp of networking (TCP/IP, DNS, SSH, etc.) and system connectivity for effective troubleshooting in distributed environments.
- Working knowledge of virtual machine management (VMware, OpenShift), TCP/IP functionality, networking, remote administration, cloning, migration, etc.
- Security Best Practices: Expertise in system security – user access controls, OS hardening, patch management, and compliance.
- Soft Skills: Strong communication, teamwork, and problem-solving skills to collaborate across teams and resolve complex issues efficiently.
- Operational experience with Ansible and Terraform is beneficial.
Job Description
At Boeing, we innovate and collaborate to make the world a better place. We’re committed to fostering an environment for every teammate that’s welcoming, respectful and inclusive, with great opportunity for professional growth. Find your future with us.
The Boeing Company is currently seeking a Lead Software Engineer – DevSecOps to support our Phantom Works Virtual Warfare Center team located in Berkeley, MO. This position will focus on supporting the Boeing Defense, Space & Security (BDS) business organization.
The DevSecOps Lead Engineer will architect and implement secure development and execution environments for the rapid prototyping and experimentation we use to answer our customers’ toughest questions about future technologies and capabilities. The Virtual Warfare Center executes far-reaching analysis to address military capability gaps in and across multiple warfighting domains in the face of accelerating adversary capabilities. In DevSecOps you will be part of a team modernizing our approach to software development and enhancing our security posture.
As the Virtual Warfare Center’s DevSecOps Team Lead you will lead a team of engineers designing, implementing, and monitoring software development infrastructure across multiple networks and physical locations across the United States. You will build and maintain cross-functional relationships with multiple teams to coordinate the selection, approval, deployment, and maintenance of a consistent set of software tools in all locations. Your work will guarantee our development and deployment infrastructure and processes are reliable, efficient, consistent, and secure. Your team will partner with relevant stakeholders to create processes, design cloud-based solutions, support deploying applications in cloud environments, evaluate solution performance and implement enhancements. You will guide the team through the update of a legacy software development infrastructure to use modern technologies including containers, cloud, high performance computing, AI/ML, and automation. This position requires mentoring early-career employees on DevSecOps design, implementation, maintenance, communication, and leadership skills. Your team will track required software updates and drive the process to eliminate known vulnerabilities including monitoring systems, tools, and software packages for security vulnerabilities. You will contribute to a collaborative, cross-functional team managing software security approvals and automate the integration of security into all phases of the software development lifecycle. Your work with an array of software development, IT, and cybersecurity teams will address emergent issues while improving the efficiency and usability of our systems and software products.
Position Responsibilities:
- Lead a team of engineers responsible for designing, installing, configuring, and maintaining a consistent, secure software development toolchain across multiple networks and physical locations.
- Spearhead the approval and implementation of continuous integration and continuous deployment pipelines into collateral secret and program spaces.
- Coordinate between software development, IT, and security teams on vulnerability tracking and mitigation, driving efforts forward.
- Architect and implement the transition of a multi-site, multi-network software development environment into a cloud-based approach.
- Lead trade studies and tool selection to upgrade and modernize software development processes and operational infrastructure.
- Lead implementation of best practices and methodologies for provisioning, platform scaling, configuration management, monitoring and troubleshooting
- Maintain the DevSecOps vision and roadmap, track status, and communicate progress to stakeholders.
- Mentor and coach the team, provide technical leadership, foster a culture of knowledge sharing and continuous learning, and grow their skills.
Basic Qualifications (Required Skills/ Experience):
- Bachelor’s Degree in an engineering discipline or 17+ years equivalent related experience
- 10+ years’ experience with software engineering
- 3+ years’ experience with scripting languages such as Bash or Python
- 3+ years’ experience with containerized software development
- 3+ years’ experience supporting DevSecOps lifecycle
- Experience with Agile development practices using continuous integration and deployment
- 3+ years of experience performing automation, implementation and deployments in both Windows and Linux systems
- Active Secret clearance
Preferred Qualifications (Desired Skills/Experience):
- Active Top Secret SCI clearance
- Experience with GitLab
- Experience with Jenkins
- Experience with JIRA
- 3+ years’ experience supporting cloud development environments
- Experience with cloud computing in classified environments
- CompTIA Security+
- Bachelor of Science degree from an accredited course of study in engineering, engineering technology (includes manufacturing engineering technology), chemistry, physics, mathematics, data science, or computer science.
Travel: 10%
Drug Free Workplace:
Boeing is a Drug Free Workplace (DFW) where post offer applicants and employees are subject to testing for marijuana, cocaine, opioids, amphetamines, PCP, and alcohol when criteria is met as outlined in our policies.
CodeVue Coding Challenge:
To be considered for this position you will be required to complete a technical assessment as part of the selection process. Failure to complete the assessment will remove you from consideration.
Pay & Benefits:
At Boeing, we strive to deliver a Total Rewards package that will attract, engage and retain the top talent. Elements of the Total Rewards package include competitive base pay and variable compensation opportunities.
The Boeing Company also provides eligible employees with an opportunity to enroll in a variety of benefit programs, generally including health insurance, flexible spending accounts, health savings accounts, retirement savings plans, life and disability insurance programs, and a number of programs that provide for both paid and unpaid time away from work.
The specific programs and options available to any given employee may vary depending on eligibility factors such as geographic location, date of hire, and the applicability of collective bargaining agreements.
Pay is based upon candidate experience and qualifications, as well as market and business considerations.
Summary Pay Range for Lead: $136,850 - $185,150
Applications for this position will be accepted until Mar. 25, 2026
Export Control Requirements:
This position must meet U.S. export control compliance requirements. To meet U.S. export control compliance requirements, a “U.S. Person” as defined by 22 C.F.R. §120.62 is required. “U.S. Person” includes U.S. Citizen, U.S. National, lawful permanent resident, refugee, or asylee.
Export Control Details:
US based job, US Person required
Relocation
This position offers relocation based on candidate eligibility.
Security Clearance
This position requires an active U.S. Secret Security Clearance (U.S. Citizenship Required). (A U.S. Security Clearance that has been active in the past 24 months is considered active)
Visa Sponsorship
Employer will not sponsor applicants for employment visa status.
Shift
This position is for 1st shift
Equal Opportunity Employer:
Boeing is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, national origin, gender, sexual orientation, gender identity, age, physical or mental disability, genetic factors, military/veteran status or other characteristics protected by law.
Business Area: Engineering
Seniority Level: Associate
Job Description:
At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world's largest enterprises.
At Cloudera, our Data Services Pillar is the heart of data innovation. We don't just work with technology; we build it. Our mission is to empower data practitioners by creating seamless, enterprise-grade experiences for data engineering, warehousing, streaming, operational databases, and AI.
You will be a key member of the NFQE (Non Functional QE) team that drives the performance reliability of Cloudera's Kubernetes-hosted data services. The role blends deep technical knowledge of performance testing, distributed data workloads, and container orchestration with a data-driven mindset. You'll design, automate, run, and analyze performance tests for Cloudera's flagship services, ensuring they meet or exceed customer-defined SLOs/SLAs at scale.
As a Performance Engineer, you will:
- Work with internal development teams and the open source community to proactively drive performance improvements/optimizations across our data warehouse and Data Engineering stack.
- Work with product managers, developers and the field team to understand performance and scale requirements, and develop benchmarks based on these requirements.
- Develop automation to execute benchmarks, collect and aggregate metrics and profiles, and report results, trends, and regressions.
- Analyze performance and scalability characteristics to identify bottlenecks in large-scale distributed systems.
- Perform root cause analysis of performance issues identified by internal testing and from customers and suggest corrective actions.
- Evaluate performance of systems and provide related guidance to the team.
We are excited about you if you have:
3 + years of industry experience in performance-related work, ideally on large-scale distributed systems
Understanding of DBMS algorithms and data structure fundamentals.
Understanding of hardware trends and full-stack systems performance: CPU, RAM, storage, network, Linux kernel, JVM, and distributed systems performance.
Understanding of performance analysis tools and techniques.
Strong design, coding skills, and test automation skills (Java/C++/Golang/Python preferred)
Knowledge of relevant frameworks, cloud provider knowledge, K8s, etc.
Ability to work in a distributed setting with team members spread in multiple geographies
Demonstrated ability to work on large cross-functional projects, including strong written communication skills and a collaborative mindset, as you will be working with many teams inside and outside of Cloudera.
Experience with benchmark and performance test design. You eshould understand basic concepts of performance testing including different types of performance tests (microbenchmarks, end-to-end benchmarks, concurrency and scale testing), how to reduce (or deal with) noise in test results, etc.
Experience designing performance tests that provide useful insights into specific aspects of performance.
Solid understanding of basic performance theory - in particular a very good understanding of latency, throughput, and concurrency and how they relate to each other.
Strong understanding of the types of workloads you'll be testing. Ideally, you should have specific experience creating performance tests for the product area you'll be working on (SQL, ML, etc.).
B.S. or M.S. in Computer Science or equivalent experience.
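For readers unfamiliar with the relationship between latency, throughput, and concurrency named in the requirements above, it is commonly summarized by Little's Law: average concurrency equals throughput multiplied by average latency. The sketch below is illustrative only and is not taken from the posting; the function names are hypothetical.

```python
def concurrency(throughput_per_s: float, latency_s: float) -> float:
    """Little's Law: average in-flight requests = throughput * average latency."""
    return throughput_per_s * latency_s


def max_throughput(concurrency_limit: float, latency_s: float) -> float:
    """Upper bound on throughput for a fixed concurrency limit and average latency."""
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return concurrency_limit / latency_s


if __name__ == "__main__":
    # 400 queries/s at 0.25 s average latency keeps about 100 queries in flight.
    print(concurrency(400, 0.25))
    # With only 32 execution slots and the same latency, throughput is capped.
    print(max_throughput(32, 0.25))
```

In practice this means a benchmark that fixes the number of concurrent clients cannot observe throughput above the concurrency limit divided by the latency, which is one reason concurrency and scale tests are designed separately from microbenchmarks.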
You might also have:
Experience with the Hadoop ecosystem (e.g., Hive, Impala, Spark), specifically prior work on large-scale data lakehouse or data warehouse performance.
Hands-on experience with containerization, Kubernetes, public cloud infrastructure (AWS, Azure and/or GCP) and mesh-networks
Certifications: CKA/CKAD, AWS Solutions Architect, GCP Cloud Architect, Azure Solutions Architect, or equivalent.
Security & Compliance: Experience writing performance tests that also verify data privacy and audit compliance (e.g., GDPR, HIPAA).
Why this role matters:
This is your opportunity to build cloud-native solutions that are deployable anywhere, whether in massive clusters on any cloud provider or in private data centers. You'll work with cutting-edge technologies like Trino, Spark, Airflow, and advanced AI inferencing systems to shape the future of analytics. Your code will directly influence how data engineers, analysts, and developers worldwide find value in their data.
We believe in the power of open source. You'll collaborate with project committers, contributing upstream to keep technologies like Apache Hive and Impala evolving. You'll harden these engines for rock-solid security, optimize them for peak performance, and make them effortlessly run across all environments. Join us and help build the trusted, cloud-native platform that powers insights for the most data-intensive companies on the planet.
This position is not eligible for sponsorship.
The expected base salary range for this role in:
California is $124,000 - $155,000
The salary will vary depending on your job-related skills, experience and location.
What you can expect from us:
Generous PTO Policy
Support work life balance with Unplugged Days
Flexible WFH Policy
Mental & Physical Wellness programs
Phone and Internet Reimbursement program
Access to Continued Career Development
Comprehensive Benefits and Competitive Packages
Paid Volunteer Time
Employee Resource Groups
EEO/VEVRAA
Location: Atlanta, Georgia
Full/Part Time: Full-Time
Regular/Temporary: Regular
About Us
Overview
Georgia Tech prides itself on its technological resources, collaborations, high-quality student body, and its commitment to building an outstanding and diverse community of learning, discovery, and creation. We strongly encourage applicants whose values align with our institutional values, as outlined in our Strategic Plan. These values include academic excellence, diversity of thought and experience, inquiry and innovation, collaboration and community, and ethical behavior and stewardship. Georgia Tech has policies to promote a healthy work-life balance and is aware that attracting faculty may require meeting the needs of two careers.
About Georgia Tech
Georgia Tech is a top-ranked public research university situated in the heart of Atlanta, a diverse and vibrant city with numerous economic and cultural strengths. The Institute serves more than 45,000 students through top-ranked undergraduate, graduate, and executive programs in engineering, computing, science, business, design, and liberal arts. Georgia Tech's faculty attracted more than $1.4 billion in research awards this past year in fields ranging from biomedical technology to artificial intelligence, energy, sustainability, semiconductors, neuroscience, and national security. Georgia Tech ranks among the nation's top 20 universities for research and development spending and No. 1 among institutions without a medical school.
Georgia Tech's Mission and Values
Georgia Tech's mission is to develop leaders who advance technology and improve the human condition. The Institute has nine key values that are foundational to everything we do:
- Students are our top priority.
- We strive for excellence.
- We thrive on diversity.
- We celebrate collaboration.
- We champion innovation.
- We safeguard freedom of inquiry and expression.
- We nurture the wellbeing of our community.
- We act ethically.
- We are responsible stewards.
Over the next decade, Georgia Tech will become an example of inclusive innovation, a leading technological research university of unmatched scale, relentlessly committed to serving the public good; breaking new ground in addressing the biggest local, national, and global challenges and opportunities of our time; making technology broadly accessible; and developing exceptional, principled leaders from all backgrounds ready to produce novel ideas and create solutions with real human impact.
About the College of Computing at the Georgia Institute of Technology
The College of Computing has been a leader in defining modern computing as a paradigm that combines the foundations of theoretical mathematics and information science, the force of invention in computational systems and processes, and interdisciplinary practice that integrates innovation in computing with all facets of life. Today, the college comprises five schools that offer unique academic programs and conduct research specifically related to their concentration areas: Computer Science, Computing Instruction, Cybersecurity and Privacy, Interactive Computing, and Computational Science and Engineering.
About the Technology Services Organization (TSO)
TSO is responsible for the overall computing, networking and physical infrastructure, as well as technical and building support necessary to sustain the College's programs in research, instruction and administration for faculty, staff and students. Services are focused into groups: Enterprise Systems Support (Help Desk, Web Services), Research Program Support, Instruction Support, and Infrastructure.
Location
Atlanta, GA
Job Summary
The Technology Services Organization (TSO) at the Georgia Institute of Technology is seeking to hire a Research Technologist I to work with the TSO research team in the technical support of research activity within the College of Computing.
This is a hybrid working position.
Responsibilities
Work with the TSO research team in the technical support of research activity, including:
- Provide point of contact and customer interface regarding TSO's support of college research technology resources.
- Work with faculty, researchers and graduate students with the acquisition, research, development and implementation of research resources, and integration of advanced technologies involving work in research data center and lab environments.
- Work with Research Lab Managers in support of strategic research facilities.
- Manage technical problems, requests, and projects to assure quick resolution, best practices, and excellent customer service.
- Coordinate the provisioning of OIT and TSO services to meet faculty and student needs.
- Implement large-scale, automated approaches to operating system and configuration management and control of systems and HPC resources.
- Implement CoC technical plans, infrastructure, policies and procedures in research facilities.
- Monitor and maintain the general health and life cycle of research systems.
- Implement Institute computing and networking security policies and procedures, as well as develop written internal computing and networking policies and documentation.
- Develop and coordinate technical aspects of research grant proposals.
- Provide bid and proposal costing for strategic technologies.
- Be aware of and discuss technology trends and new products that could be used to enhance research facilities.
- Act as liaison to GT/OIT, other GT units and external research partners regarding research computing capabilities.
Required Qualifications
- Bachelor's degree
Preferred Qualifications
- Master's degree in Computer Science or related field.
- Two or more years of job-related experience.
- Higher education experience preferred; experience managing technology in a highly distributed, multi-vendor environment supporting research.
- Be able to conceptualize immediate and future research technologies.
- Linux systems administration experience, preferably in a heterogeneous environment that includes Linux/UNIX, Mac OS X, and Windows client systems accessing physical and virtual Linux servers.
- Knowledge of Linux server installation and administration, system monitoring, hardware maintenance and troubleshooting, configuration and patch management, virtualization, containerization, private cloud, storage solutions, IP networking concepts, information security, identity and access management, and backup & recovery.
- Knowledge of a UNIX scripting language (e.g., Python or Bash).
- Exemplary customer service skills and the agility to handle unusual technical requests.
- Possess excellent communication skills, the ability to make technical presentations, the ability to create technical documentation, a strong service orientation, and the ability to work well with other professionals in providing technical solutions in an advanced computing environment.
Contact Information
Requests for information may be directed to David Mercer:
USG Core Values
The University System of Georgia is comprised of our 26 institutions of higher education and learning as well as the System Office. Our USG Statement of Core Values are Integrity, Excellence, Accountability, and Respect. These values serve as the foundation for all that we do as an organization, and each USG community member is responsible for demonstrating and upholding these standards. More details on the USG Statement of Core Values and Code of Conduct are available in USG Board Policy 8.2.18.1.2 and can be found on-line at policymanual/section8/C224/#p8.2.18_personnel_conduct.
Additionally, USG supports Freedom of Expression as stated in Board Policy 6.5 Freedom of Expression and Academic Freedom found on-line at policymanual/section6/C2653.
Equal Employment Opportunity
The Georgia Institute of Technology (Georgia Tech) is an Equal Employment Opportunity Employer. The University is committed to maintaining a fair and respectful environment for all. To that end, and in accordance with federal and state law, Board of Regents policy, and University policy, Georgia Tech provides equal opportunity to all faculty, staff, students, and all other members of the Georgia Tech community, including applicants for admission and/or employment, contractors, volunteers, and participants in institutional programs, activities, or services. Georgia Tech complies with all applicable laws and regulations governing equal opportunity in the workplace and in educational activities.
Georgia Tech prohibits discrimination, including discriminatory harassment, on the basis of race, ethnicity, ancestry, color, religion, sex (including pregnancy), sexual orientation, gender identity, gender expression, national origin, age, disability, genetics, or veteran status in its programs, activities, employment, and admissions. This prohibition applies to faculty, staff, students, and all other members of the Georgia Tech community, including affiliates, invitees, and guests. Further, Georgia Tech prohibits citizenship status, immigration status, and national origin discrimination in hiring, firing, and recruitment, except where such restrictions are required in order to comply with law, regulation, executive order, or Attorney General directive, or where they are required by Federal, State, or local government contract.
More information on these policies can be found here: policymanual/section6/c2714 (Board of Regents Policy Manual | University System of Georgia).
Background Check
The candidate of choice will be required to pass a pre-employment background screening. employment/pre-employment-screening.
Job Description
At Boeing, we innovate and collaborate to make the world a better place. We’re committed to fostering an environment for every teammate that’s welcoming, respectful and inclusive, with great opportunity for professional growth. Find your future with us.
The Boeing Company is currently seeking a Lead Software Engineer – DevSecOps to support our Phantom Works Virtual Warfare Center team located in Berkeley, MO. This position will focus on supporting the Boeing Defense, Space & Security (BDS) business organization.
The DevSecOps Lead Engineer will architect and implement secure development and execution environments for the rapid prototyping and experimentation we use to answer our customers’ toughest questions about future technologies and capabilities. The Virtual Warfare Center executes far-reaching analysis to address military capability gaps in and across multiple warfighting domains in the face of accelerating adversary capabilities. In DevSecOps you will be part of a team modernizing our approach to software development and enhancing our security posture.
As the Virtual Warfare Center’s DevSecOps Team Lead you will lead a team of engineers designing, implementing, and monitoring software development infrastructure across multiple networks and physical locations across the United States. You will build and maintain cross-functional relationships with multiple teams to coordinate the selection, approval, deployment, and maintenance of a consistent set of software tools in all locations. Your work will guarantee our development and deployment infrastructure and processes are reliable, efficient, consistent, and secure. Your team will partner with relevant stakeholders to create processes, design cloud-based solutions, support deploying applications in cloud environments, evaluate solution performance and implement enhancements. You will guide the team through the update of a legacy software development infrastructure to use modern technologies including containers, cloud, high performance computing, AI/ML, and automation. This position requires mentoring early-career employees on DevSecOps design, implementation, maintenance, communication, and leadership skills. Your team will track required software updates and drive the process to eliminate known vulnerabilities including monitoring systems, tools, and software packages for security vulnerabilities. You will contribute to a collaborative, cross-functional team managing software security approvals and automate the integration of security into all phases of the software development lifecycle. Your work with an array of software development, IT, and cybersecurity teams will address emergent issues while improving the efficiency and usability of our systems and software products.
Position Responsibilities:
- Lead a team of engineers responsible for designing, installing, configuring, and maintaining a consistent, secure software development toolchain across multiple networks and physical locations.
- Spearhead the approval and implementation of continuous integration and continuous deployment pipelines into collateral secret and program spaces.
- Coordinate between software development, IT, and security teams on vulnerability tracking and mitigation, driving efforts forward.
- Architect and implement the transition of a multi-site, multi-network software development environment into a cloud-based approach.
- Lead trade studies and tool selection to upgrade and modernize software development processes and operational infrastructure.
- Lead implementation of best practices and methodologies for provisioning, platform scaling, configuration management, monitoring, and troubleshooting.
- Maintain the DevSecOps vision and roadmap, track status, and communicate progress to stakeholders.
- Mentor and coach the team, provide technical leadership, foster a culture of knowledge sharing and continuous learning, and grow their skills.
Basic Qualifications (Required Skills/ Experience):
- Bachelor’s Degree in an engineering discipline or 17+ years equivalent related experience
- 10+ years’ experience with software engineering
- 3+ years’ experience with scripting languages such as Bash or Python
- 3+ years’ experience with containerized software development
- 3+ years’ experience supporting the DevSecOps lifecycle
- Experience with Agile development practices using continuous integration and deployment
- 3+ years of experience performing automation, implementation and deployments in both Windows and Linux systems
- Active Secret clearance
Preferred Qualifications (Desired Skills/Experience):
- Active Top Secret SCI clearance
- Experience with GitLab
- Experience with Jenkins
- Experience with JIRA
- 3+ years’ experience supporting cloud development environments
- Experience with cloud computing in classified environments
- CompTIA Security+
- Bachelor of Science degree from an accredited course of study in engineering, engineering technology (includes manufacturing engineering technology), chemistry, physics, mathematics, data science, or computer science.
Travel: 10%
Drug Free Workplace:
Boeing is a Drug Free Workplace (DFW) where post offer applicants and employees are subject to testing for marijuana, cocaine, opioids, amphetamines, PCP, and alcohol when criteria are met as outlined in our policies.
CodeVue Coding Challenge:
To be considered for this position you will be required to complete a technical assessment as part of the selection process. Failure to complete the assessment will remove you from consideration.
Pay & Benefits:
At Boeing, we strive to deliver a Total Rewards package that will attract, engage and retain the top talent. Elements of the Total Rewards package include competitive base pay and variable compensation opportunities.
The Boeing Company also provides eligible employees with an opportunity to enroll in a variety of benefit programs, generally including health insurance, flexible spending accounts, health savings accounts, retirement savings plans, life and disability insurance programs, and a number of programs that provide for both paid and unpaid time away from work.
The specific programs and options available to any given employee may vary depending on eligibility factors such as geographic location, date of hire, and the applicability of collective bargaining agreements.
Pay is based upon candidate experience and qualifications, as well as market and business considerations.
Summary Pay Range for Lead: $136,850 - $185,150
Applications for this position will be accepted until Mar. 25, 2026
Export Control Requirements:
This position must meet U.S. export control compliance requirements. To meet U.S. export control compliance requirements, a “U.S. Person” as defined by 22 C.F.R. §120.62 is required. “U.S. Person” includes U.S. Citizen, U.S. National, lawful permanent resident, refugee, or asylee.
Export Control Details:
US based job, US Person required
Relocation
This position offers relocation based on candidate eligibility.
Security Clearance
This position requires an active U.S. Secret Security Clearance (U.S. Citizenship Required). (A U.S. Security Clearance that has been active in the past 24 months is considered active)
Visa Sponsorship
Employer will not sponsor applicants for employment visa status.
Shift
This position is for 1st shift
Equal Opportunity Employer:
Boeing is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, national origin, gender, sexual orientation, gender identity, age, physical or mental disability, genetic factors, military/veteran status or other characteristics protected by law.
The Technical Project Manager (TPM) has three main responsibilities:
- Project Manage all technical tasks during implementation and upgrades.
- Install and configure servers and the Care Logistics applications in Amazon Web Services (AWS) and on premise.
- Perform technical operations and oversee availability, performance, and supportability of our observability infrastructure.
The TPM acts as the project manager and liaison between Care Logistics and the customer for all technical activities. The TPM is responsible for coordinating the system configuration, sizing, ordering, and installation while technically engineering and managing the integration of Care Logistics solutions. They work closely with Solutions Delivery and customer resources in support of organizational objectives. Solutions Delivery functions include project delivery tasks such as solution sizing, technical project planning, customer guidance, system installation, system validation, system testing, technical training, and support of technical onsite events. The TPM facilitates DevOps functions between development and the Solutions Delivery teams to ensure technical operations are correctly executed, effectively communicated, and continuously improved.
ESSENTIAL RESPONSIBILITIES:
Solutions Delivery Functions
- Delivery components of customer project tasks, which include:
  - Assist with the design and implementation of new technologies
  - Assist with the sizing of customer systems
  - Train new employees on all aspects of the role
  - Considered a Subject Matter Expert for all aspects of the technology and project delivery
  - Install and troubleshoot software, hardware, and services necessary to support Care Logistics solutions
  - Lead the engineering of hospital customer’s technical solutions
  - Lead, plan, organize and drive the design, testing, and implementation of Care Logistics software solutions and related advisory services
  - Educate customer on technical aspects of the Care Logistics system
  - Interface with service and hardware system vendors to build and configure systems
  - Participate in onsite customer events, including technical go-live
- Technical Operations and Observability:
  - Manage alert and monitoring configuration
  - Collect, aggregate, and visualize metrics to provide actionable insights
  - Advise right-sizing of AWS infrastructure resources to optimize cost and performance
  - Manage incident response
  - Provide insight to Cloud Center of Excellence
- Additional tasks, which include:
  - Provide primary technical support for project team members
  - Provide Tier 2 level support for Care Logistics Support team
  - Create and maintain internal environments for use by Care Logistics Client Engagement team
  - Create Knowledge Base articles and other technical documentation for use by Care Logistics employees and customers
  - Define and maintain a clear, concise documented process for the implementation and integration of the system
  - Collaborate with teammates to troubleshoot and maintain existing application modules
  - Participate in DevOps initiatives to improve products and operations
QUALIFICATIONS – EDUCATION, WORK EXPERIENCE, CERTIFICATIONS:
REQUIRED
- Bachelor’s degree in Computer Information Systems or equivalent experience
- PMP certification and/or equivalent experience
- 2-4 years of hands-on experience using Amazon Web Services (AWS) services such as EC2, RDS, Systems Manager, VPN, CloudWatch
- 2-4 years of monitoring systems experience using tools such as AWS CloudWatch, Datadog, New Relic, SolarWinds, Dynatrace, etc.
- 4-6 years demonstrated project management experience
- Advanced operation and maintenance of Linux (Red Hat Operating System)
- Demonstrated advanced analytical and troubleshooting skills
- 3+ years integrating software/hardware systems in client-server and cloud environments
- Proven organizational and delivery skills
DESIRED
- AWS certification desired
- Automating and configuring Amazon Web Services (AWS) such as EC2, RDS, VPN
- Operational best practices related to systems operation and maintenance in on-premises and AWS production environments
- Industry standard application/applet containers such as Tomcat
- PostgreSQL and Aurora Databases (installation, configuration, and operation)
- Production High availability server environments
- Complex hardware and software installations
- Management of enterprise reporting tools and/or related technologies
- Project delivery, operations, and support using DevOps and/or Agile methods
- Support leadership experience
- Use of ticketing systems such as JIRA and/or related incident management tools such as OpsGenie
- Comprehension of related scientific and technical journals, abstracts, financial reports, and legal documents.
- Preparation of articles, abstracts, editorials, journals, manuals, and critiques.
- Preparation and delivery of comprehensive presentations, participation in formal debate, extemporaneous communication, and professional communication before an audience.
- Professional certifications in related industry skills such as DBMS, CISSP, ITIL, Agile, and Lean are a plus
KNOWLEDGE, SKILLS, AND ABILITIES:
- Develop strong and productive working relationships with others
- Form strong team bonds and enhance team performance
- Strong organizational and quality management skills with ability to handle multiple, competing tasks and priorities
- Cope with rapidly changing information in a fast-paced environment
- Proven communication, interpersonal, analytical, and organizational skills
- Proven ability to properly communicate with customers (in person and via phone) and manage expectations during a project
- Work both independently and as a member of the implementation and support team
- Manage multiple concurrent activities, all with fluctuating deadlines, by working with other departments, both internal and external
- Quickly identify and resolve issues
- Quickly understand complex concepts
- Excellent oral and written communication skills
- Excellent customer management skills
- Above average observational skills to collect data and validate information
- Outstanding analytical skills with the ability to critically evaluate the information gathered from multiple sources, reconcile conflicts, relate high-level information to details, and distinguish user requests from underlying business problems/needs.
- Effectively represent Jackson Healthcare/Care Logistics values and principles in decision-making and actions
- Support leadership and/or project management
- Excellent troubleshooting skills
- Excellent organizational and delivery skills
- Install, configure, and manage hardware and software in AWS and on-premises environments
- Provide specifications for system hardware and AWS service requirements
- Implement complex system solutions involving multiple technologies
- Control and implement complex system and application feature configurations
- Troubleshoot complex system and technical issues
- Read and understand system and application logs
- Proven ability to communicate and teach complex technical concepts to less technical resources
- Excellent communications and interpersonal skills, as well as analytical and problem-solving skills
- Excellent documentation skills
REQUIRED KNOWLEDGE
- Amazon Web Services (AWS) services such as EC2, RDS, Systems Manager, VPN, CloudWatch
- Monitoring systems such as AWS CloudWatch, Datadog, New Relic, SolarWinds, Dynatrace, etc.
- In-depth knowledge of Linux (Red Hat Operating System) concepts and operations in a production environment
- VMware, Web servers, DBMS, Reporting and analytic tools
- Project Management Methodologies
- Advanced PC knowledge including proficiency with MS Outlook, Word, Excel, and PowerPoint
DESIRED KNOWLEDGE
- Knowledge automating and configuring Amazon Web Services (AWS) such as EC2, RDS, VPN
- Understanding of high availability server environments
- Hardware and software installation techniques
- Healthcare Information Systems
- Enterprise reporting tools
- DevOps and Agile methodologies related to project delivery, operations, and support
- Ticketing systems such as JIRA and related incident management tools (such as OpsGenie)
TRAVEL REQUIREMENTS & WORKING CONDITIONS:
- 10-80% travel required
- The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job
- Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions
- While performing the duties of this job, the employee is frequently required to stand; walk; sit; use hands to finger, handle, or feel; write; type; reach with hands and arms; climb or balance; stoop, kneel, crouch, or crawl; talk or hear; and smell
- The employee must frequently lift and/or move up to 50 pounds
- Specific vision abilities required by this job include close vision, distance vision, color vision, peripheral vision, depth perception, and ability to adjust focus
Site Reliability Engineer
Description and Requirements
About Our Team
We are building Qira, a next‑generation hybrid AI platform that spans Windows, Android, and cloud. As part of this initiative, we are growing the reliability engineering organization that powers cross‑device Personal AI.
We are hiring Site Reliability Engineers (SREs) to strengthen the reliability, observability, and operational excellence of Qira’s AI systems across device, edge, and cloud. Depending on your strengths, you may be aligned to areas such as Observability, Operations, or Service Reliability.
The team works with the speed and creativity of a startup; you’ll help build foundational systems with clarity, ownership, and modern engineering practices.
Location: On-site in Chicago, IL. Hybrid (3 days on-site, 2 days remote)
What You Might Work On
As an SRE, you may be responsible for a subset of the following, depending on team placement and skill alignment:
Reliability & Systems Engineering
- Support the reliability, availability, and performance of distributed systems across cloud, edge, and device environments.
- Help define, measure, and monitor SLIs and SLOs for core services.
- Identify reliability risks and collaborate with senior engineers on mitigation plans.
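As a rough illustration of the SLI and SLO work described above, the sketch below computes an availability SLI from request counts and the share of the error budget still unspent against an SLO target. It is a minimal example under assumed definitions (a request-based availability SLI, hypothetical function names), not a description of the team's actual tooling.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Request-based availability SLI: fraction of events that were good."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget_remaining(sli: float, slo: float) -> float:
    """Fraction of the error budget left: 1.0 is untouched, 0.0 or less is exhausted."""
    budget = 1.0 - slo  # allowed failure fraction under the SLO
    if budget <= 0:
        raise ValueError("slo must be below 1.0")
    return 1.0 - (1.0 - sli) / budget


if __name__ == "__main__":
    # Hypothetical numbers: 9,995 good requests out of 10,000 against a 99.9% SLO.
    sli = availability_sli(9_995, 10_000)
    print(f"SLI: {sli:.4f}")
    print(f"Error budget remaining: {error_budget_remaining(sli, 0.999):.0%}")
```

A negative result from `error_budget_remaining` indicates the SLO has been missed for the measured window, which is typically the signal for slowing down risky changes.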
Operational Excellence
- Participate in on‑call rotations and assist with incident response and post‑incident reviews.
- Contribute improvements to runbooks, automation, and tooling that reduce alert noise and operational toil.
- Help enhance detection, alerting, and response workflows.
Observability & Insight
- Implement and improve telemetry using OpenTelemetry, Grafana, and related tools.
- Build dashboards and tools that improve visibility into system health and AI service behavior.
- Ensure observability data is complete, accurate, and actionable.
Deployments & Change Safety
- Support safe, reliable deployment workflows including canaries, staged rollouts, and automated rollbacks.
- Assist in improving CI/CD systems and deployment tooling.
Collaboration & Best Practices
- Work closely with senior SREs, DevOps engineers, AI/ML teams, and platform engineers.
- Contribute to reliability reviews, operational readiness checks, and cross‑team projects.
- Advocate for modern SRE and DevOps practices within the organization.
Basic Qualifications
- 4+ years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or production systems operations.
- Bachelor’s Degree in Computer Science, Engineering, or related technical field (or equivalent practical experience).
- Foundational experience supporting distributed systems in production.
- Ability to write scripts or tools in Python, Go, Bash, or similar languages.
- Solid understanding of Linux systems, networking basics, and system performance fundamentals.
- Experience with cloud platforms (Azure preferred, AWS or GCP acceptable).
- Familiarity with monitoring/observability (metrics, logs, tracing).
- Experience with containers and Kubernetes.
Preferred Qualifications
- Experience with OpenTelemetry instrumentation and telemetry pipelines.
- Hands‑on experience with Grafana, Prometheus, Loki, or Tempo.
- Exposure to AI/ML systems, inference services, or data‑intensive workloads.
- Experience contributing to CI/CD processes and deployment automation.
- Familiarity with hybrid architectures spanning device, edge, and cloud.
- Passion for automation, reliability, and operational excellence.
What Success Looks Like
- Systems become easier to operate, observe, and trust.
- Alerts are more accurate and actionable.
- On‑call load decreases through thoughtful automation and improvements.
- Deployment workflows become more reliable and repeatable.
- You grow toward deeper ownership and technical leadership within the reliability engineering organization.