Prometheus Relabel Example Jobs in USA

AWS DevOps Engineer

Salary not disclosed

Charlotte, NC 1 week ago

LTIMindtree is an equal opportunity employer that is committed to diversity in the workplace. Our employment decisions are made without regard to race, color, creed, religion, sex (including pregnancy, childbirth or related medical conditions), gender identity or expression, national origin, ancestry, age, family-care status, veteran status, marital status, civil union status, domestic partnership status, military service, handicap or disability or history of handicap or disability, genetic information, atypical hereditary cellular or blood trait, union affiliation, affectional or sexual orientation or preference, or any other characteristic protected by applicable federal, state, or local law, except where such considerations are bona fide occupational qualifications permitted by law.

A little about us...

Role: AWS DevOps Engineer

Location: Charlotte, NC

Salary: Market Rate

Job Description:

We are seeking a highly skilled Senior DevOps Engineer with strong expertise in AWS cloud infrastructure, automation, databases, and modern containerized environments. The ideal candidate will have experience designing, implementing, and maintaining scalable, secure, and reliable systems while enabling fast and efficient development workflows. You will work closely with development, architecture, and operations teams to build robust CI/CD pipelines, automate infrastructure provisioning, and ensure high availability of business-critical applications.

Key Responsibilities:

Design, implement, and manage AWS cloud infrastructure (EC2, S3, Lambda, ECS/EKS, etc.) with scalability and security in mind
Develop and maintain Infrastructure as Code (IaC) using Terraform
Build, manage, and optimize Docker base images and containerized application stacks
Orchestrate and maintain Kubernetes (EKS) clusters for production and staging environments
Set up, manage, and optimize CI/CD pipelines in GitLab to support fast, reliable deployments
Manage MCP servers and ensure reliable operations for critical services
Automate operational tasks and workflows using Python and JavaScript
Support full-stack teams (React, Node.js) by providing containerized environments and deployment strategies
Manage and optimize databases (SQL, PostgreSQL) for performance, security, and scalability
Integrate and manage AWS streaming services (Kinesis, MSK, Kafka, or similar) for real-time data pipelines
Implement container image security scanning, governance, and lifecycle management
Monitor system performance, availability, and cost, implementing proactive improvements
Ensure compliance with security and governance standards across cloud infrastructure and database layers
Collaborate with developers and architects to improve application delivery, scalability, and resilience

Required Skills & Qualifications:

8 years of experience in DevOps/Cloud Infrastructure
Strong hands-on experience with AWS services (EC2, S3, ECS/EKS, Lambda, VPC, IAM, CloudWatch, Kinesis, MSK)
Proficiency in Terraform for infrastructure automation
Expertise with Docker (including base image creation) and Kubernetes orchestration
Strong scripting/programming skills in Python and JavaScript
Experience with GitLab CI/CD for pipelines, automation, and environment management
Strong database experience with SQL and PostgreSQL (setup, scaling, replication, performance tuning)
Exposure to streaming architectures (AWS Kinesis, Kafka, MSK, or similar)
Experience supporting React-based applications from a DevOps perspective
Familiarity with MCP servers and containerized service deployments
Knowledge of cloud cost optimization and security best practices
Strong problem-solving, troubleshooting, and communication skills

Preferred Qualifications:

AWS certifications (e.g., AWS Certified Solutions Architect, DevOps Engineer Professional)
Experience with monitoring/observability tools (Prometheus, Grafana, ELK, Datadog)
Knowledge of networking, load balancing, and distributed system design
Familiarity with Agile/Scrum methodologies

Skills

Mandatory Skills : AWS Lambda, Docker, Python
Good to Have Skills : Ansible, Git, Kubernetes

Not Specified

DevOps Engineer

🏢 LTIMindtree

Salary not disclosed

Berkeley Heights, NJ 1 week ago

A little about us...

Role: Azure DevOps Engineer

Location: Berkeley Heights, NJ

Job Description:

1. Extensive hands-on experience with GitHub Actions, writing workflows in YAML using reusable templates

2. Extensive hands-on experience with application CI/CD pipelines, both for Azure and on-prem, for different frameworks

3. Hands-on experience with Azure DevOps and with CI/CD pipeline migration programs, preferably from Azure DevOps to GitHub Actions

4. Proficiency in integrating and consuming REST APIs to achieve automation through scripting

5. Hands-on experience with at least one scripting language, including out-of-the-box automations for platforms like PeopleSoft, SharePoint, MDM, etc.

6. Hands-on experience with CI/CD of databases

7. Good to have: experience with infrastructure as code, including ARM templates, Terraform, Azure CLI, and Azure PowerShell modules

8. Exposure to monitoring tools like ELK, Prometheus, and Grafana

Not Specified

Agentic QA Engineer

🏢 Astir IT Solutions, Inc.

Salary not disclosed

Dallas, TX 1 week ago

Hi

I hope you’re doing well.

My name is Sai, and I’m an Account Manager with Astir IT Solutions. We are currently working with our client on a senior-level opportunity for an Agentic AI QA Engineer in Dallas, TX (need locals)!

Based on your background, I believe this role could be a strong fit.

Job Title: Agentic AI QA Engineer

Location: Dallas, TX (Need Locals)

Experience: 7+ years

Position type: Contract W2/C2C

Required Qualifications

• 7+ years in Software QA/Testing, with 2+ years in AI/ML or LLM-based systems; hands-on experience testing agentic/multi-agent architectures.

• Strong programming skills in Python or TypeScript/JavaScript; experience building test harnesses, simulators, and fixtures.

• Experience with LLM evaluation (exact/soft match, BLEU/ROUGE, BERTScore, semantic similarity via embeddings), guardrails, and prompt testing.

• Expertise in distributed systems testing: latency profiling, resiliency patterns (circuit breakers, retries), chaos engineering, and message queues.

• Familiarity with orchestration frameworks (LangChain, LangGraph, LlamaIndex, DSPy, OpenAI Assistants/Actions, Azure OpenAI orchestration, or similar).

• Proficiency with CI/CD (GitHub Actions/Azure DevOps), observability (OpenTelemetry, Prometheus/Grafana, Datadog), and feature flags/canaries.

• Solid understanding of privacy/security/compliance in AI systems (PII handling, content policies, model safety).

• Excellent communication and leadership skills; proven ability to work cross-functionally with Ops, Data, and Engineering.

Preferred Qualifications

• Experience with multi-agent simulators, agent graph testing, and tooling latency emulation.

• Knowledge of MLOps (model versioning, datasets, evaluation pipelines) and A/B experimentation for LLMs.

• Background in cloud (AWS), serverless, containerization, and event-driven architectures.

• Prior ownership of cost/latency/SLAs for AI workloads in production.

If you are currently open to new opportunities, I would appreciate the chance to connect and discuss this role in more detail. Please let me know a convenient time for a quick call, or feel free to share your updated resume.

Looking forward to hearing from you.

Thanks & Regards.

Sai

Sr. Account Manager

Astir IT Solutions, Inc.

ID: , Contact: 732-694-6000 * 795

Not Specified

Platform Engineer

🏢 Soho Square Solutions

Salary not disclosed

San Jose, CA 1 week ago

We are seeking an experienced Cloud Platform Engineer with deep expertise in Red Hat OpenShift and a strong Linux systems engineering background. This role will be responsible for designing, building, and operating large-scale OpenShift platforms within on-premises datacenter environments.

The ideal candidate will work closely with SRE teams and Program Management to drive the successful implementation, scaling, and operationalization of enterprise-grade OpenShift infrastructure.

Key Responsibilities

1. Platform Engineering

Design, deploy, and manage enterprise-scale Red Hat OpenShift clusters in on-prem datacenter environments.
Architect highly available, scalable, and secure OpenShift platforms.
Implement cluster lifecycle management (installation, upgrades, patching, scaling).
Configure networking, storage, ingress, and security components for OpenShift.

2. Infrastructure Build & Automation

Build and automate infrastructure in datacenter environments (compute, storage, networking).
Integrate OpenShift with virtualization platforms (VMware/other hypervisors as applicable).
Develop Infrastructure-as-Code (IaC) solutions using tools such as Terraform, Ansible, or similar.
Implement CI/CD pipelines for platform deployments and updates.

3. Linux Systems Engineering

Provide deep Linux system administration and troubleshooting support.
Optimize OS-level configurations for performance, reliability, and security.
Automate system configuration and compliance management.
Diagnose and resolve complex kernel, networking, and storage issues.

4. Reliability & Operations

Partner closely with the SRE team to establish SLOs, SLIs, monitoring, and alerting.
Drive observability implementation (logging, metrics, tracing).
Participate in incident management, root cause analysis (RCA), and remediation.
Ensure platform resiliency, performance tuning, and capacity planning.

5. Program & Cross-Functional Collaboration

Work with Program Management to drive large-scale OpenShift implementation milestones.
Provide technical input into roadmap planning, timelines, and risk mitigation.
Collaborate with security, networking, storage, and application teams.
Document architecture, standards, and operational procedures.

6. Security & Compliance

Implement RBAC, security policies, and compliance controls within OpenShift.
Harden clusters according to enterprise security standards.
Support vulnerability management and patch governance processes.

Required Qualifications

5+ years of experience in Linux systems engineering (RHEL preferred).
3+ years of hands-on experience with Red Hat OpenShift (OCP 4.x preferred).
Proven experience building infrastructure in on-prem datacenter environments.
Strong understanding of:
  Kubernetes architecture
  Networking (DNS, load balancing, firewalls, SDN)
  Storage (SAN, NAS, CSI drivers)
  Virtualization platforms (VMware, etc.)
Experience with automation tools (Terraform, Ansible, GitOps).
Strong troubleshooting and problem-solving skills.

Preferred Qualifications

Red Hat certifications (RHCE, OpenShift Certification).
Experience implementing OpenShift at enterprise scale (multi-cluster environments).
Experience working in SRE-driven environments.
Knowledge of DevOps/GitOps practices.
Experience with monitoring tools (Prometheus, Grafana, ELK, etc.).

Not Specified

Windows & Vulnerability Management Engineer

🏢 TEKVANA INC.

Salary not disclosed

Chicago, IL 1 week ago

Job Title: Windows SRE – Vulnerability Management & PowerShell

Location: Onsite

Experience: 8+ Years

Job Summary:

Looking for a Windows SRE with strong experience in managing enterprise Windows environments, vulnerability remediation, and automation using PowerShell. The role focuses on improving system reliability, security, and operational efficiency.

Main Skills Required:

Windows Server Administration (2016/2019/2022)
Vulnerability Management (Qualys / Tenable / Nessus / Rapid7)
PowerShell Scripting & Automation
Patch Management (SCCM / WSUS / Intune)
Active Directory & Group Policy
SRE / Production Support Experience
Monitoring Tools (Splunk / Datadog / Prometheus)
Incident Management & Root Cause Analysis
Security Hardening & Compliance (CIS / NIST)
Cloud Exposure (Azure / AWS)
Infrastructure Automation (Ansible / Terraform)

Not Specified

Rotating Equipment Engineer

🏢 Belcan

Salary not disclosed

Baytown, TX 1 week ago

Job Title: Rotating Equipment Planner

Location: Baytown TX

Duration: indefinite

Rate: $50-$60 per hour DOE

Description:

Position Summary

The Rotating Equipment Planner specializes in planning, scheduling, and coordinating maintenance activities for critical rotating equipment (pumps, compressors, turbines, motors, gearboxes, cooling towers, etc.). This role prepares detailed plans for non-emergency maintenance work selected through the Risk Based Work Selection (RBWS) process, ensuring optimal equipment reliability and performance while minimizing production downtime.

Key Responsibilities

• Planning: Develop detailed work plans for rotating equipment maintenance, including precision alignments, vibration analysis, and bearing replacements with appropriate man-hour and cost estimates

• Technical Expertise: Apply specialized knowledge of rotating equipment mechanics, tolerances, and failure modes to develop effective maintenance strategies and troubleshooting procedures

• Materials Management: Ensure critical rotating equipment spare parts (bearings, seals, couplings) are properly inventoried and available; create and maintain Bills of Material

• Work Coordination: Coordinate with Contractor Management Coordinator for resource requirements; prioritize maintenance activities between crews and production teams to minimize process disruption

• Documentation & Systems: Create and maintain task lists for repetitive jobs; outline detailed work instructions with safety advice, resources, and tools; close out jobs by entering notification history

• Reliability Improvement: Collaborate with production and technical teams to establish preventive/predictive maintenance plans, including vibration monitoring programs and lubrication schedules

• Backlog Management: Review and purge backlog weekly, distributing 'ready-to-schedule' work; identify and communicate repetitive equipment problems to Asset Engineer

Required Qualifications

• High school diploma or equivalent

• 12 years of heavy industrial maintenance experience OR 7 years with an associate's degree OR 4 years with a bachelor's degree

• Certification from Vocational or Technical school in millwright or verifiable millwright experience

• Demonstrated experience in equipment planning for rotating equipment and cooling towers

• Minimum 2 years planning/scheduling experience

• In-depth knowledge of SAP-PM maintenance transactions and Prometheus

• Experience using Microsoft Office Products (Word, Excel, Outlook etc.)

• The eligibility to apply for and obtain a Transportation Worker Identification Credential (TWIC) within a reasonable timeframe

Physical Requirements

• Ability to climb stairs and work at heights up to 100+ feet

• Ability to climb vertical ladders

• Sufficient physical strength to perform requirements safely

• Ability to work at computer workstation for extended periods

Success Metrics

• Performance is measured by quality of planning and meeting established KPIs

Not Specified

Redis Admin

🏢 Net2Source (N2S)

Salary not disclosed

New York 1 week ago

Job Title: Redis Admin

Location: NYC, NY (3 days onsite minimum)

Duration: 6 months

The ideal candidate will be responsible for designing, deploying, maintaining, and scaling Kafka clusters in mission-critical environments, while also supporting the Linux-based infrastructure that forms the foundation of our real-time data platform.

Responsibilities

Manage and maintain Redis instances, ensuring high availability and optimal performance.
Be well versed in Redis administration and management, e.g., a strong understanding of data structures, caching mechanisms, and performance tuning in Redis.
Monitor system health, troubleshoot issues, and implement backup and recovery strategies for Redis clusters.
Configure Redis caching, session management, and data storage.
Develop and maintain Python scripts for data manipulation, integration, and automation related to Redis.
Create efficient data processing pipelines to ingest and process data from various sources.
Write Python scripts for database interactions and automation tasks, and optimize them for performance, scalability, and maintainability.
Work closely with development teams to design and implement Redis-based solutions that meet business requirements.
Provide technical support and training to team members on Redis functionalities and Python scripting best practices.
Document Redis configurations, Python scripts, and integration workflows for knowledge sharing and compliance.
Generate performance reports and dashboards to monitor Redis usage and efficiency.

Qualifications

BE/B.Tech/MCA
Excellent written and verbal communication skills

Preferred Qualifications/ Skills

Experience with Redis clustering, caching strategies, and distributed systems
Familiarity with monitoring tools like Prometheus and ELK Stack and cloud solutions like AWS ElastiCache
Preferred experience running Redis on Kubernetes and familiarity with Redis modules like RedisJSON
Working experience with OpenShift/Kubernetes cloud services to deploy Redis clusters using vendor-provided Docker/Helm charts
Redis cluster monitoring & alerting
Optimizing Redis cluster performance using JVM tuning & profiling.

Not Specified

Middleware Engineer

🏢 InfoVision Inc.

Salary not disclosed

Jersey City, New Jersey 1 week ago

Build and scale enterprise Kafka infrastructure using Confluent Cloud and Platform across hybrid environments. Design event-driven architectures, automate deployments with Terraform/CI/CD, optimize performance, ensure security compliance, and troubleshoot distributed streaming systems at scale.

Must Have:

5+ years Kafka (2+ years Confluent Cloud/Platform, Kafka Connect, Schema Registry, ksqlDB)
Expertise in hybrid cloud Kafka deployments (AWS/Azure/GCP + on-prem)
Strong automation (Terraform, Ansible, Jenkins) and programming (Java, Python, Scala)
Experience with monitoring/troubleshooting distributed systems (Splunk, Datadog, Prometheus)
Security expertise (Kerberos, SSL, RBAC) and compliance knowledge (GDPR, SOC, PCI)

You'll Build: Scalable Kafka clusters • Event-driven architectures • Automated CI/CD pipelines • Observability frameworks • Secure, compliant streaming platforms.

Not Specified

Site Reliability Engineer

🏢 Yochana

Salary not disclosed

Columbus, Ohio 1 week ago

Please find the job description below.

Role: Site Reliability Engineer

Location: Columbus, OH (Onsite)

8+ years of Software Engineering experience
4+ years of experience in Site Reliability Engineering teams with continued focus on improving Platform health
Familiar with Agile or other rapid application development practices
Hands-on expertise in building dashboards using APM tools.
Experience with distributed (multi-tiered) systems, algorithms, relational databases, and NoSQL databases.
Knowledge of and exposure to caching tools (Redis, Memcached) or messaging tools such as MQ and Kafka.
Must have working knowledge of APM tools such as Splunk, GCL, ELK, Grafana, Prometheus, etc.

Gopi Pabbu

Resource Specialist

Yochana IT Solutions Inc

Mail Id :

Not Specified

Senior Infrastructure Engineer - Cloud and Network

🏢 Motion Recruitment

Salary not disclosed

Palo Alto, California 1 week ago

Seeking Senior Cloud Infrastructure Engineers - Hybrid Roles in Palo Alto, CA (Remote Option Not Available)

Duration: 5+ months with possibility of longer term extensions

Pay rate: $55/hr on W2

If interested, please email me your resume at

Please Note: Client is not open to C2C, H1B, TN Visa, 1099, F1 – CPT & OPT at this time.

*Must be located in and authorized to work in the US without visa sponsorship or transfer, now or in the future. No C2C inquiries, please.

Role and Responsibilities:

Manage AWS environment using Control Tower, EKS, EC2, S3, IAM, and related services.

Administer and troubleshoot Kubernetes (EKS) and EC2 instances, including patching, lifecycle management, and performance optimization.

Provision and manage infrastructure as code using Terraform (CloudFormation a plus).

Triage and resolve ServiceNow tickets for OS and cloud issues, including vulnerability remediation, ensuring compliance with SLAs.

Automate operational tasks using Python scripting.

Drive cloud resource lifecycle activities, including commissioning/decommissioning, backups, patching, DR activities, and cost optimization.

Implement and maintain monitoring and observability (CloudWatch, Prometheus, Grafana, etc.); build/run operational runbooks and playbooks.

Design, deploy, and securely manage networks (LAN/WAN/VPN/firewalls/routers/switches) across AWS and on-premise hybrid environments.

Integrate, secure, and troubleshoot Windows and Linux systems, with expertise in Active Directory, clustering, and Hyper-V.

Collaborate with security, operations, and application teams; participate in on-call and incident response rotations.

Work with security tools (AWS WAF, Inspector, Macie); experience with vulnerability remediation workflows.

Deep knowledge of AWS cost management and optimization practices.

Proven skills in documentation, process improvement, and compliance in regulated environments.

Experience with container orchestration (Docker/Kubernetes), CI/CD tooling, and automation pipelines.

Support and manage Zscaler Cloud proxy, policies, certificate management, and API integrations.

Excellent communication, collaboration, and problem-solving abilities.

Required Qualifications:

Bachelor's degree in Computer Science/IT preferred.

Minimum 7 years' experience in cloud and network engineering (focused on AWS, Windows, and Linux).

Expert knowledge of AWS core services (EC2, S3, EKS, RDS, IAM) and networking protocols (TCP/IP, BGP, OSPF).

Experience in Terraform and Python.

Strong background in cloud and network security, compliance, and automation.

Track record of supporting production environments in high-growth, fast-paced organizations.

Not Specified