Prometheus Relabelconfigs Example Jobs in Usa

S

REO Resiliency Engineering and Quality Leader (Hybrid)

✦ New

Salary not disclosed

Saint Paul, MN, Hybrid 1 day ago

*At Securian Financial the internal position title is Infrastructure Dir."

Mission

"To lead the engineering discipline that ensures Securian's technology platforms and cloud services are built and operated with uncompromising resilience, performance, and quality. This role drives the design and automation of fault-tolerant, high-availability architectures across AWS, Azure, and GCP-ensuring the enterprise meets resiliency, scalability, and efficiency expectations at every layer of technology."

Positioning

The Director of Resilience Engineering and Quality Leader is both a strategic peer and technical counterpart to the Infrastructure & Reliability Engineering Leader.

This role provides bench depth and succession coverage for REO's most technically complex domains while driving innovation in reliability, resilience, and performance practices.

Strategic influence: Shapes cloud reliability, quality engineering, and resilience strategy across REO and Architecture domains.
Operational authority: Leads Sr. Managers and Managers who own the execution of quality, resilience, and performance engineering capabilities.
Enterprise collaboration: Works hand-in-hand with Technology, Solution, Business, Data, and Enterprise Architects to embed reliability and resilience as core architecture principles.

Scope of Accountability

Resilience Engineering & Cloud Reliability

Architect and validate fault-tolerant, regionally resilient architectures across AWS, Azure, and GCP.
Own resilience automation, chaos testing, and IaC-based recovery validation.
Lead cross-cloud reliability design reviews and failure-mode analyses for critical systems.

Quality Engineering & Continuous Testing

Define enterprise-wide quality engineering strategy integrated into CI/CD pipelines.
Drive automation-first testing (functional, non-functional, performance, resilience).
Embed observability-driven quality validation and contract testing across services.

Performance, Capacity & Efficiency Engineering

Oversee predictive capacity planning, scaling automation, and cost/efficiency optimization (FinOps/GreenOps).
Partner with Platform & Infrastructure teams to tune performance across application and platform layers.
Measure and report on performance SLIs/SLAs aligned to REO's Reliability Metrics framework.

Cross-Domain Architecture Collaboration

Partner with Enterprise Architects to codify resilience and reliability standards in technology blueprints.
Collaborate with Technology & Solution Architects to design service reliability into delivery architectures.
Engage Data Architects for data resilience, replication, and pipeline reliability.
Work with Business Architects to align technical reliability goals with critical business outcomes.

Leadership & Talent Development

Lead a team of Sr. Managers and Managers, fostering a high-performance, hands-on engineering culture.
Build and mentor top-tier technical talent in cloud reliability, resilience, and quality automation.
Partner with HR and REO Enablement to develop succession plans and technical competency frameworks.

Core Technical Competencies

AWS (primary) - Multi-account design, HA architecture, region failover, resilience automation, Terraform/CDK/CloudFormation.
Azure & GCP (secondary) - Compute, networking, and reliability constructs; hybrid cloud design and failover integration.
Infrastructure as Code (IaC) - Deep proficiency in Terraform, policy-as-code (OPA/Conftest), drift detection, pipeline integration.
Reliability & Chaos Engineering - AWS Fault Injection Simulator, Gremlin, steady-state hypothesis design.
Observability & Quality Automation - OpenTelemetry, Prometheus, CloudWatch, K6, Gatling; CI/CD quality gates and dashboards.
Performance Engineering - Load, stress, and soak testing automation; performance profiling and SLO alignment.
Disaster Recovery Automation - Cross-region orchestration, IaC-driven DR runs, replication validation.
FinOps/GreenOps - Cloud cost and efficiency automation, carbon-aware scaling policies.

Leadership Competencies

Strategic Technical Leadership: Operates at the intersection of deep engineering and executive strategy.
Multi-Domain Collaborator: Integrates reliability and resilience across architecture, operations, and business domains.
Talent Multiplier: Develops and empowers senior managers, fostering engineering mastery and innovation.
Credible Technical Authority: Trusted peer to Infrastructure & Reliability Engineering; capable of leading architecture reviews and executive briefings.
Change Champion: Drives transformation of reliability practices across platforms, pipelines, and teams.

Qualifications & Experience

12+ years in cloud engineering, reliability, or platform leadership roles.
5+ years leading Sr. Managers/Managers in technical domains.
Proven expertise across AWS, with working knowledge of Azure and GCP.
Experience with multi-cloud governance, DR design, IaC at scale, and reliability automation.
Strong understanding of observability, SRE principles, and REO/ITIL-aligned reliability frameworks.
Certifications:
- Required: AWS Certified Solutions Architect - Professional
- Preferred: AWS DevOps Engineer, Azure Solutions Architect Expert, Google Professional Cloud Architect

Success Metrics

99.9% availability maintained for Tier-1 workloads.
100% coverage of DR automation for Tier-1 services.
25% annual increase in automated quality/test coverage.
15% annual improvement in resource efficiency and cost performance.
Documented resilience participation across all enterprise architecture blueprints.
Positive "technical peer readiness" and succession rating from Head of REO.

Summary Value Proposition

This Director role blends deep AWS reliability engineering expertise, multi-cloud technical breadth, and leadership scale.

It ensures REO maintains both technical depth and leadership redundancy, and it strengthens the bridge between engineering execution and enterprise architecture alignment.

#LI-hybrid **This position will be in a hybrid working arrangement.**

Securian Financial believes in hybrid work as an integral part of our culture. Associates get the benefit of working both virtually and in our offices. If you're in a commutable distance (90 minutes), you'll join us 3 days each week in our offices to collaborate and build relationships. Our policy allows flexibility for the reality of business and personal schedules.

The estimated base pay range for this job is:

$145,000.00 - $267,000.00

Pay may vary depending on job-related factors and individual experience, skills, knowledge, etc. More information on base pay and incentive pay (if applicable) can be discussed with a member of the Securian Financial Talent Acquisition team.

Be you. With us. At Securian Financial, we understand that attracting top talent means offering more than just a job - it means providing a rewarding and fulfilling career. As a valued member of our high-performing team, we want you to connect with your work, your relationships and your community. Enjoy our comprehensive range of benefits designed to enhance your professional growth, well-being and work-life balance, including the advantages listed here:

Paid time off:

We want you to take time off for what matters most to you. Our PTO program provides flexibility for associates to take meaningful time away from work to relax, recharge and spend time doing what's important to them. And Securian Financial rewards associates for their service by providing additional PTO the longer you stay at Securian.
Leave programs: Securian's flexible leave programs allow time off from work for parental leave, caregiver leave for family members, bereavement and military leave.
Holidays: Securian provides nine company paid holidays.

Company-funded pension plan and a 401(k) retirement plan: Share in the success of our company. Securian's 401(k) company contribution is tied to our performance up to 10 percent of eligible earnings, with a target of 5 percent. The amount is based on company results compared to goals related to earnings, sales and service.

Health insurance: From the first day of employment, associates and their eligible family members - including spouses, domestic partners and children - are eligible for medical, dental and vision coverage.

Volunteer time: We know the importance of community. Through company-sponsored events, volunteer paid time off, a dollar-for-dollar matching gift program and more, we encourage you to support organizations important to you.

Associate Resource Groups: Build connections, be yourself and develop meaningful relationships at work through associate-led ARGs. Dedicated groups focus on a variety of interests and affinities, including:

Mental Wellness and Disability
Pride at Securian Financial
Securian Young Professionals Network
Securian Multicultural Network
Securian Women and Allies Network
Servicemember Associate Resource Group

For more information regarding Securian's benefits, please review our Benefits page.

This information is not intended to explain all the provisions of coverage available under these plans. In all cases, the plan document dictates coverage and provisions.

Securian Financial Group, Inc. does not discriminate based on race, color, religion, national origin, sex, gender, gender identity, sexual orientation, age, marital or familial status, pregnancy, disability, genetic information, political affiliation, veteran status, status in regard to public assistance or any other protected status. If you are a job seeker with a disability and require an accommodation to apply for one of our jobs, please contact us by email at , by telephone (voice), or 711 (Relay/TTY).

To view our privacy statement click here

To view our legal statement click here

Remote working/work at home options are available for this role.

Not Specified

Y

W2 Role: Senior Site Reliability Engineer

✦ New

🏢 Yochana

Salary not disclosed

Charlotte, North Carolina 8 hours ago

Job Title : Senior Site Reliability Engineer

Location : Charlotte, NC/ Columbus, OH – Hybrid (3 days onsite a week)

Duration : Contract role (W2)

In-person Interview required in NJ or NC on 21st Saturday March

Job Description:

Tech Stack: Java/J2EE (Spring, Spring Boot, Python, Shell Scripting, Kafka, Oracle, MongoDB etc.).

10+ years of Software Engineering experience
5+ years of experience in Site Reliability Engineering teams with continued focus on improving Platform health
Familiar with Agile or other rapid application development practices
Hands-on expertise in building dashboards using APM tools.
Experience with distributed (multi-tiered) systems, algorithms, relational databases, and NoSQL databases.
Knowledge & Exposure caching tools (Redis, memcache) or messaging tools such as MQ, Kafka.
Must have working knowledge of APM tools such as splunk, GCL, ELK, Grafana, Prometheus etc.
Able to create Dashboards using GCL/Splunk/ELK and setup alerts.
Working knowledge of CICD is a plus – Source control like Git, Continuous Integration – Jenkins / UCD Release etc. .
Ability to work with Engineering teams across the ecosystem such as Security, Networking & Infrastructure challenges which can impact platform health & resiliency.
Shell Scripting / DevOps tools like Ansible with good knowledge of yaml file to write playbooks .
Experience with distributed storage technologies like NFS as well as dynamic resource management frameworks PCF, Kubernetes / OpenShift, AWS or Azure.
A proactive approach to spotting problems, areas for improvement, and performance bottlenecks

Not Specified

V

RedHat OpenShift & Kubernetes SME

✦ New

🏢 VDart

Salary not disclosed

Princeton, New Jersey 8 hours ago

Job Title: RedHat OpenShift & Kubernetes SME

Location: Princeton - NJ - 08540

Mode : Contract (6+ Months) – Onsite

Min 15 Years of experience required.

Qualifications:

Design, deploy and maintain Red Hat OpenShift and Rancher Managed Kubernetes Clusters

Architect Highly available, scalable, and secure container platforms

Install, configure, upgrade and patch OpenShift and Rancher Clusters

Implement logging, monitoring, and alerting (Prometheus, Grafana, EFK etc.)

Troubleshoot Cluster, Networking, Storage, and application issues

Perform root cause analysis and provide performance optimization

Act as an SME for OpenShift and Rancher Technologies

Provide guidance to Customer and application teams

Create documentation, standards, and operational runbooks

Strong Hands-on experience with RedHat Open Shift and Rancher (RKE, RKE2)

Expert knowledge of Kubernetes architecture and Operations

Experience supporting mixed OS environments (Windows and Linux).

Excellent communication skills, able to explain complex concepts to technical and non-technical audiences.

Demonstrated ability to work independently and as part of a team.

Relevant certifications (RHCA, CKA, CKAD, etc.) and active participation in the Kubernetes community are a plus.

Experience with CI/CD Pipelines

Not Specified

Agentic AI Engineer

✦ New

🏢 Unisys

Salary not disclosed

Rockville, Maryland 8 hours ago

Overview

Architects and builds the infrastructure and tooling that powers AI agent development across the Software Development Lifecycle (SDLC). Develops production-grade agentic systems, orchestration frameworks, and observability solutions that enable teams to build, deploy, and monitor reliable AI agents at scale. Plays a key role in defining and implementing the next generation of SDLC through AI-first innovation and comprehensive instrumentation.

What We're Looking For

You demonstrate sharp product sense for high-impact automation opportunities, technical taste in implementation decisions, and the ability to clearly articulate trade-offs. You know when to apply AI agent solutions versus simpler approaches and can explain the \"why\" behind architectural choices.

You excel at 0-to-1 (and 1-to-100) product development, comfortable operating in ambiguous environments where requirements emerge through experimentation and iteration rather than upfront specification.

Key Responsibilities

AI Agent Development & Automation:

• Develop production-grade AI agents that eliminate manual handoffs across the SDLC

• Create custom integrations and CLI tools that give agents deep understanding of internal systems and codebases

• Design comprehensive testing strategies to ensure agent reliability and output quality

• Implement \"Golden Path\" scaffolding that embeds organizational standards into new projects

• Build AI solutions that improve codebase navigation, documentation, and developer workflows

• Identify workflow bottlenecks and deliver measurable impact through intelligent automation

• Shape SDLC evolution by identifying AI-first opportunities and proving outcomes through experimentation

Agent Infrastructure & Platform:

• Architect and maintain production infrastructure supporting agent deployment, lifecycle management, and scaling

• Develop agent frameworks, templates, and SDKs that accelerate agent development

• Create governed Model Context Protocol (MCP) catalog enabling compliant agent-to-agent and agent-to-MCP communication

• Implement governance controls for agent behavior, permissions, and system access

Observability & Performance Analytics:

• Design and implement metrics, monitoring, and logging infrastructure for AI agents and development workflows

• Build dashboards that provide actionable insights into developer productivity, tool adoption, and agent performance

• Establish KPIs and measurement frameworks to quantify the impact of AI-powered automation

• Create alerting and anomaly detection systems to ensure reliability of agents and tooling

• Analyze telemetry data to identify optimization opportunities and guide strategic investment decisions

Collaboration & Impact:

• Partner across teams to drive adoption of AI-powered tooling and process transformation

• Stay current with LLM technologies and coach colleagues on AI-assisted development and automation best practices

• Rapidly prototype solutions to validate use cases and prove value quickly

• Communicate data-driven insights to stakeholders through clear visualizations and reports

Preferred Qualifications:

• 5-7+ years of software engineering experience building production systems

• Proven experience building agentic systems using LLM orchestration frameworks

• Hands-on expertise with AI-powered development tools (code assistants, AI-enhanced editors)

• Strong foundation in SDLC, system design, and internal tooling development

• Experience with observability tools and practices including metrics collection, logging frameworks, and dashboard development

• Full-stack technical proficiency:

• Languages: Java, Python, JavaScript/TypeScript

• Frameworks: Angular, Spring Boot

• CI/CD platforms and cloud infrastructure (AWS)

• Monitoring/observability tools (e.g., Prometheus, Grafana, CloudWatch)

• Passion for transforming software development through AI innovation and data-driven decision making

# LI-CGTS

# TS-2505

Not Specified

V

Sr. Cloud Engineer

✦ New

🏢 Venteon

Salary not disclosed

Rochester, Michigan 8 hours ago

Position Summary

Our client is building a modern, cloud-native platform that powers connected, data-driven manufacturing operations. Their technology sits at the center of increasingly automated factories, integrating equipment, software systems, and real-time production data into a scalable SaaS platform used by global manufacturers.

To support rapid growth and platform scale, they are seeking a Senior Cloud Operations Engineer to own the reliability, performance, and operational excellence of their cloud infrastructure. This is a highly impactful role responsible for ensuring the platform remains highly available, secure, and scalable as adoption continues to grow.

This position is ideal for engineers who thrive in modern cloud environments, enjoy solving complex reliability challenges, and prefer automating everything possible. The right person will combine deep technical expertise with strong operational discipline, helping build a world-class cloud platform supporting real industrial environments.

Key Responsibilities

Cloud Operations & Reliability

• Maintain and optimize production, staging, and development environments running in Kubernetes on AWS

• Implement and manage monitoring, logging, alerting, and observability frameworks

• Lead incident response efforts and drive post-incident reviews focused on continuous improvement

• Own backup, disaster recovery, and business continuity processes

• Perform system capacity planning and performance tuning

Automation & Infrastructure Management

• Build and maintain Infrastructure-as-Code using tools such as Terraform or Pulumi

• Automate provisioning, configuration management, and environment lifecycle processes

• Identify and eliminate operational inefficiencies through automation

• Manage secrets, environment configuration, and version control across infrastructure environments

Security & Compliance

• Implement and maintain least-privilege access models and cloud security guardrails

• Support vulnerability management, patching workflows, and dependency maintenance

• Assist with compliance readiness efforts including SOC 2, ISO 27001, or similar frameworks

• Ensure proper logging, retention, and audit practices across cloud environments

FinOps / Cost Optimization

• Monitor and optimize cloud spend across services and environments

• Implement tagging standards, budget alerts, and cost visibility frameworks

• Recommend architectural improvements to balance performance and cost efficiency

Collaboration & Leadership

• Partner closely with engineering teams to improve reliability, deployment pipelines, and system architecture

• Mentor engineers on operational best practices and cloud platform management

• Develop runbooks, documentation, and operational standards

• Champion reliability engineering principles, operational maturity, and risk reduction practices

Technical Environment

Candidates should be comfortable working in modern cloud-native environments and familiar with:

• Kubernetes clusters, autoscaling, Helm charts, and service mesh concepts

• AWS cloud services including compute, networking, storage, and cost management

• Infrastructure-as-Code frameworks such as Terraform

• Observability platforms such as Datadog, CloudWatch, Prometheus, or New Relic

• CI/CD tools such as GitHub Actions, Bitbucket Pipelines, or Bamboo

• Linux systems administration and troubleshooting

• SRE practices including SLIs, SLOs, MTTR, RTO/RPO, and incident management

Not Specified

V

Teradata Infrastructure DBA

✦ New

🏢 Ventures Unlimited Inc

Salary not disclosed

Plano, Texas 8 hours ago

Must have

Teradata platform expertise

• Deep knowledge of Teradata architecture: parsing, BYNET, AMP, vproc, fallback, hashing, PDCR, fallback, and spool management.

• Data distribution and primary index design; collecting statistics and understanding optimizer behavior.

• Experience with recent Teradata versions and releases migration/upgrade planning: TD 16.XX, TD 17.XX and preferably TD 20.XX.

System administration

• Provisioning and managing Teradata nodes and clusters (physical and virtual).

• OS-level skills: Linux administration (SLES/RHEL/CentOS/Oracle Linux) for Teradata on Linux, including kernel tuning, package management, user and permissions management.

• Storage subsystem knowledge: SAN, NAS, Fibre Channel, LUNs, RAID, and how storage impacts Teradata I/O and spool.

Performance tuning and troubleshooting

• SQL query and plan analysis; collecting and interpreting Explain plans.

• Workload management (WLM) and resource allocation: query prioritization, throttling, and KRI/SLAs.

• Monitoring and diagnostics: using Teradata tools and logs to analyze spool, CPU, memory, disk I/O, network, BYNET contention.

Backup, recovery & high availability

• Best practices for backups restore procedures, and disaster recovery (DR) planning and testing.

• Knowledge of fallback, AMP resilience, replication methods and physical vs logical protection.

Security & compliance

• DB and platform-level security: roles, privileges, LDAP/Kerberos integration, encryption (at rest/in transit), auditing and compliance (SOx and Others as applicable).

• Secure configuration and hardening practices.

Networking & infrastructure

• Network architecture for Teradata clusters, VLANs, link aggregation, low-latency requirements, and BYNET tuning.

• Integration with enterprise infrastructure: DNS, NTP, monitoring stacks, and identity providers.

Automation, scripting & tools

• Scripting languages: Bash, Python, Perl for automation, maintenance, and custom monitoring. – one of them

• Configuration management and automation tools: Ansible, Terraform, Chef, or Puppet (as used in the enterprise). – one of them

• Familiarity with Teradata utilities and tools: BTEQ, FastLoad, MultiLoad, TPT (Teradata Parallel Transporter), DBSControl, Viewpoint, Teradata Studio/SQL Assistant. – one of them

Observability & tooling

• Use of monitoring/alerting tools (Viewpoint, Prometheus, Grafana, Splunk, Nagios, etc.) and designing dashboards and alerts. One of them, View point is mandatory

• Capacity planning, trending, and forecasting for CPU, disk, spool, and concurrency.

Soft skills & organizational capabilities

• Incident management and on-call experience

• Leading postmortems, RCA (root-cause analysis), implementing corrective actions.

• Communication and stakeholder management: vendors, management and applications

• Translate technical impacts to business stakeholders; coordinate with DBAs, developers, network/storage teams, and vendors.

Role and Responsibilities

Installs, configures and upgrades Teradata software and related products.

• Backup, restore, migrate Teradata data and objects

• Establish and maintain backup and recovery policies and procedures.

• Manages and monitor system performance. proactively monitor the database systems to ensure secure services with minimum downtime

• Implements and maintains database security.

• Sets up and maintains documentation and standards.

• Supports multiple Teradata Systems including independent marts/ enterprise warehouse.

• Work with the team to ensure that the associated hardware resources allocated to the databases and to ensure high availability and optimum performance.

• Responsible for improvement and maintenance of the databases to include rollout and upgrades.

• Responsible for implementation and release of database changes as submitted by the development team, Working with end customer

• Teradata, customer, datacenter, vendor co-ordinations

• Forecast data, security audits

• User account and access management

• Teradata active system management and customer requests and system allocation

• Backup and recovery

• SOX compliance and audits

• DB support from 3rd party vendors

• Product evaluations

• On call support and major incidents

• Backup restore, frequency and retention

• Disaster recovery

• Create long r

Not Specified

M

Senior Software Deployment & Customer Operations Engineer

🏢 Metric Bio

Salary not disclosed

Boston, MA 6 days ago

Senior Software Engineer – Deployment & Reliability (Digital Pathology / Medical Imaging)

A fast-growing technology company operating in the digital pathology and medical imaging space is seeking a Senior Software Engineer to support the deployment, configuration, and long-term reliability of advanced imaging and AI-driven software systems.

This role sits at the intersection of software deployment, infrastructure engineering, and site reliability, ensuring complex software platforms are successfully installed, integrated with customer IT environments, and maintained at high levels of performance and stability.

You will work closely with engineering, customer support, and monitoring teams to ensure a smooth transition from system deployment to ongoing operational support while contributing to improvements that make deployments more scalable and reliable over time.

Key Responsibilities

Deployment & Configuration

Lead end-to-end deployments of imaging, AI, and data management software systems at customer environments
Configure and integrate servers, clusters, and storage systems within hospital or laboratory IT infrastructures
Work with networking, authentication, storage, and security configurations to ensure successful installations
Collaborate with field engineering teams during system installation and commissioning
Develop standardized deployment playbooks, documentation, and validation checklists

System Reliability & Upgrades

Manage software version rollouts, upgrades, and patching across deployed customer environments
Work with monitoring and observability teams to track system performance and health
Troubleshoot complex issues across multi-component systems including imaging software, AI inference pipelines, and storage layers
Improve automation around upgrades, rollbacks, and maintenance processes

Engineering Collaboration & Continuous Improvement

Identify recurring deployment or performance challenges and work with R&D teams to design long-term solutions
Provide structured feedback from field deployments to improve product architecture and deployment workflows
Validate new deployment tools, frameworks, and configuration approaches prior to wider rollout
Contribute to improving the scalability and resilience of the overall platform

Customer IT & Cross-Functional Collaboration

Serve as a technical liaison with customer IT teams regarding networking, infrastructure, security, and data access
Ensure deployments comply with institutional IT policies and healthcare regulatory requirements
Collaborate closely with support and monitoring teams to align escalation processes and root cause investigations
Participate in post-deployment reviews to improve operational processes and reliability

Documentation & Knowledge Sharing

Maintain detailed installation and configuration documentation
Develop deployment guides, troubleshooting documentation, and internal knowledge resources
Support and mentor field teams on standardized deployment and configuration practices

Requirements

Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or related discipline
5+ years of experience in software deployment, DevOps, infrastructure engineering, or systems engineering
Strong Linux (Ubuntu) administration and scripting skills
Experience with containerization and orchestration technologies (Docker, Kubernetes)
Experience with database technologies such as PostgreSQL or MongoDB
Familiarity with web service configuration (Nginx or Apache)
Solid understanding of networking concepts including VPNs, firewalls, and authentication systems
Ability to troubleshoot complex distributed systems across software, infrastructure, and data layers
Strong communication and collaboration skills when working with cross-functional teams and customer IT stakeholders

Preferred Experience

Exposure to medical imaging systems, digital pathology, or healthcare technology environments
Familiarity with DICOM or PACS systems
Experience deploying or supporting AI/ML models in production environments
Experience with observability and monitoring tools (Prometheus, Grafana, ELK)
Knowledge of regulated environments and healthcare compliance frameworks (HIPAA, GDPR, IVDR)
Experience supporting hardware and software integrated systems

Why This Role

This position offers the opportunity to work on advanced digital pathology and imaging technologies that support clinical diagnostics and research globally. The role combines hands-on technical deployment with the chance to influence how complex systems are designed, automated, and scaled across a growing global customer base.

Not Specified

C

Observability and AI Enterprise Architect

✦ New

🏢 ClifyX

Salary not disclosed

Edison, NJ 1 day ago

Key Responsibilities:

Design and deploy observability frameworks leveraging tools such as Grafana, Dynatrace, Prometheus, ELK, Splunk, etc. Define best practices for monitoring, alerting, and visualization across hybrid and multi-cloud environments.
Develop strategies for monitoring KPIs tied to business outcomes (e.g., sales performance, supply chain efficiency, customer experience).
Collaborate with business and IT teams to identify key metrics and integrate them into dashboards and alerting systems.
Implement AIOps solutions using industry-leading platforms like OpenAI, AWS Bedrock, Google Gemini, Anthropic, and similar technologies.
Develop predictive analytics and anomaly detection models to proactively identify and resolve operational issues.
Integrate observability tools with ITSM platforms and automation workflows. Enable automated root cause analysis and remediation using AI/ML models.
Provide observability strategies for infrastructure (servers, storage, cloud), applications (microservices, APIs), and networks (LAN/WAN, SD-WAN). Collaborate with DevOps, SRE, and IT operations teams to ensure end-to-end visibility and reliability.
Establish observability standards, KPIs, and SLAs for performance and availability. Ensure compliance with security and regulatory requirements in monitoring solutions.
Develop scalable architecture using LLMs, agentic frameworks, and multi-modal AI technologies.
Build AI-powered analytics platforms for IT operations analysis, anomaly detection, and predictive insights.
Architect and deploy intelligent chatbots for IT support and self-service capabilities.
Integrate AI solutions with existing IT operations tools and workflows.
Implement automated remediation and root cause analysis using AI/ML models.

Qualifications:

10-13 years of relevant experience
Hands-on experience with Grafana, Dynatrace, and other monitoring platforms.
Practical experience implementing AI-based solutions for anomaly detection, predictive maintenance, and automated remediation. Familiarity with OpenAI, Bedrock, Gemini, Anthropic, or similar AI platforms.
Strong understanding of infrastructure, application architectures, and networking. Experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Kubernetes).
Proficiency in Python, Bash, or similar scripting languages for automation and integration.
Strong experience with LLMs (OpenAI, Anthropic, Gemini, Bedrock) and agentic AI solutions.
Hands-on experience in designing AI architectures for enterprise IT environments.
Proficiency in Python or similar languages for AI model integration and automation.

Not Specified

P

Regional Sales Executive

🏢 Prometheus Materials

Salary not disclosed

Longmont, CO 1 week ago

Company Description

Prometheus Materials is at the forefront of sustainable innovation, providing cutting-edge building materials that drive the transition to a carbon-negative future. Drawing inspiration from nature, our solutions utilize microalgae in the creation of our ProZERO™ line of carbon-negative supplemental cement blends. These blends are optimized for ready-mix concrete applications, manufactured products, and licensed material solutions tailored to the needs of existing concrete manufacturers. Prometheus Materials is dedicated to reshaping the construction industry with environmentally friendly and high-performance materials.

Role Description

This is a full-time, on-site role for a Sales Executive, based in Longmont, CO. The Sales Executive will be responsible for driving revenue growth by identifying and pursuing sales opportunities, building and nurturing client relationships, and developing sales strategies. Key responsibilities include generating leads, delivering presentations, negotiating contracts, closing transactions and achieving sales targets. Collaboration with internal teams to align sales strategies with business objectives will also be an integral part of the role. The Sales Executive is responsible for identifying, developing, selling and closing customers in Colorado, Arizona, New Mexico, Wyoming, So. California and Texas. You will evaluate and execute new business opportunities which align with Prometheus Materials’ overall market growth strategies. This position will work closely with building owners, architects, distributors, general contractors, cement manufacturers, and ready mix concrete providers.

Qualifications

Strong sales and negotiation skills, with the ability to build and maintain client relationships.
Proficiency in creating sales strategies, delivering effective presentations, and closing transactions.
Excellent communication and interpersonal skills to engage effectively with clients and internal teams.
Knowledge of sustainable building materials or the construction industry is an advantage.
Self-motivated, results-driven, and organized, with the ability to meet sales targets and deadlines.
Proficiency in relevant sales and CRM tools is preferred
Minimum of 5 years of experience in sales in the cement and/or concrete related industries
Experience within the building materials industry preferred (e.g., sand and gravel, cement, ready mix, or admixtures)
Proven experience collaborating with industry experts (Architects and Engineers)
Working knowledge of key high-level industry standards relating to cement, concrete, and aggregates
Strong understanding of business-to-business sales cycles, sales strategies, and key performance metrics (KPIs)
Demonstrated experience developing, managing, and executing sales strategies to drive revenue growth
Knowledge or experience with sustainability initiatives, LEED certification, and carbon reduction targets
Strong negotiation, presentation, and facilitation skills

Responsibilities

This is a summary of activities and is not intended to be all-inclusive of all responsibilities:

Meet or exceed agreed upon sales attainment goals
Develop, maintain, and track product backlog and bid activity
Create and manage key account plans, including defined goals, activities, strategies, and timelines
Communicate regular updates of key performance indicators, including volume, revenue, and strategic initiatives
Identify, secure, grow, and manage key licensing opportunities across multiple industries
Monitor and maintain competitive intelligence, including competitor products, pricing strategies, and development activities
Regularly review the sales cycle and implement continuous improvement strategies
Travel up to 40% as required

Please send resume and cover letter to

Not Specified

C

Linux System Administrator

🏢 Cellcom

Salary not disclosed

Green Bay, WI 1 week ago

The Linux Systems & Automation Engineer is responsible for designing, deploying, automating, and operating enterprise Linux infrastructure and supporting applications running there. This role focuses on infrastructure-as-code, automation tooling, monitoring, and reliability engineering across on-prem and hybrid environments. The engineer will collaborate with network, platform, and application teams to deliver scalable, secure, and repeatable infrastructure.

Key Responsibilities

Linux Systems Engineering

Design, deploy, and maintain Linux systems across bare metal and virtual environments.
Develop and enforce OS baseline standards, hardening, and patching processes.
Manage system lifecycle: provisioning, configuration, upgrades, and decommissioning.

Automation & Infrastructure as Code

Build and maintain automation pipelines using Ansible, Terraform, cloud-init, or equivalent tools.
Develop Ruby/Python/Bash tooling to automate operational workflows.
Create standardized system images, templates, and deployment frameworks.

Monitoring, Observability, and Reliability

Design and maintain monitoring and telemetry platforms and data pipelines. (Zabbix, Prometheus, Grafana, OpenSearch, etc.).
Analyze metrics and logs to improve system reliability and performance.
Participate in on-call and incident response.
Conduct root cause analysis and drive corrective actions.

DevOps & Tooling

Develop and Maintain Git-based workflows and CI/CD pipelines for infrastructure code.
Develop internal tools to improve provisioning, validation, and operational efficiency.
Collaborate in architecture reviews and technical design discussions.

Requirements:

Bachelor’s degree in Computer Science, Engineering, Information Systems, or equivalent experience.
5+ years of experience administering Linux systems in enterprise environments.
Strong Linux fundamentals (processes, networking, storage, security, kernel concepts).
Proficiency in scripting/programming (Python, Bash, Ruby, etc.).
Experience with Git and collaborative development workflows.
Experience with automation/configuration management tools (Ansible, Terraform, Puppet, Chef, etc.).

Preferred Qualifications

Experience with containers and orchestration (Docker, Kubernetes).
Virtualization experience (VMware, KVM, OpenStack).
Cloud platform experience (AWS, Azure, GCP) or hybrid architectures.
Monitoring/observability tooling experience (Prometheus, Grafana, Zabbix, ELK/OpenSearch).
Security experience (SSH hardening, PAM, SELinux, CIS benchmarks).
Experience supporting telecom, financial, healthcare, or other regulated environments.

Not Specified