Prometheus Metrics Query Examples Jobs in USA

2,796 positions found — Page 4

REO Resiliency Engineering and Quality Leader (Hybrid)
✦ New
Salary not disclosed

*At Securian Financial, the internal position title is Infrastructure Dir.

Mission

"To lead the engineering discipline that ensures Securian's technology platforms and cloud services are built and operated with uncompromising resilience, performance, and quality. This role drives the design and automation of fault-tolerant, high-availability architectures across AWS, Azure, and GCP-ensuring the enterprise meets resiliency, scalability, and efficiency expectations at every layer of technology."

Positioning

The Director of Resilience Engineering and Quality Leader is both a strategic peer and technical counterpart to the Infrastructure & Reliability Engineering Leader.

This role provides bench depth and succession coverage for REO's most technically complex domains while driving innovation in reliability, resilience, and performance practices.

  • Strategic influence: Shapes cloud reliability, quality engineering, and resilience strategy across REO and Architecture domains.

  • Operational authority: Leads Sr. Managers and Managers who own the execution of quality, resilience, and performance engineering capabilities.

  • Enterprise collaboration: Works hand-in-hand with Technology, Solution, Business, Data, and Enterprise Architects to embed reliability and resilience as core architecture principles.

Scope of Accountability

Resilience Engineering & Cloud Reliability

  • Architect and validate fault-tolerant, regionally resilient architectures across AWS, Azure, and GCP.

  • Own resilience automation, chaos testing, and IaC-based recovery validation.

  • Lead cross-cloud reliability design reviews and failure-mode analyses for critical systems.

Quality Engineering & Continuous Testing

  • Define enterprise-wide quality engineering strategy integrated into CI/CD pipelines.

  • Drive automation-first testing (functional, non-functional, performance, resilience).

  • Embed observability-driven quality validation and contract testing across services.

Performance, Capacity & Efficiency Engineering

  • Oversee predictive capacity planning, scaling automation, and cost/efficiency optimization (FinOps/GreenOps).

  • Partner with Platform & Infrastructure teams to tune performance across application and platform layers.

  • Measure and report on performance SLIs/SLAs aligned to REO's Reliability Metrics framework.

Cross-Domain Architecture Collaboration

  • Partner with Enterprise Architects to codify resilience and reliability standards in technology blueprints.

  • Collaborate with Technology & Solution Architects to design service reliability into delivery architectures.

  • Engage Data Architects for data resilience, replication, and pipeline reliability.

  • Work with Business Architects to align technical reliability goals with critical business outcomes.

Leadership & Talent Development

  • Lead a team of Sr. Managers and Managers, fostering a high-performance, hands-on engineering culture.

  • Build and mentor top-tier technical talent in cloud reliability, resilience, and quality automation.

  • Partner with HR and REO Enablement to develop succession plans and technical competency frameworks.

Core Technical Competencies

  • AWS (primary) - Multi-account design, HA architecture, region failover, resilience automation, Terraform/CDK/CloudFormation.

  • Azure & GCP (secondary) - Compute, networking, and reliability constructs; hybrid cloud design and failover integration.

  • Infrastructure as Code (IaC) - Deep proficiency in Terraform, policy-as-code (OPA/Conftest), drift detection, pipeline integration.

  • Reliability & Chaos Engineering - AWS Fault Injection Simulator, Gremlin, steady-state hypothesis design.

  • Observability & Quality Automation - OpenTelemetry, Prometheus, CloudWatch, K6, Gatling; CI/CD quality gates and dashboards.

  • Performance Engineering - Load, stress, and soak testing automation; performance profiling and SLO alignment.

  • Disaster Recovery Automation - Cross-region orchestration, IaC-driven DR runs, replication validation.

  • FinOps/GreenOps - Cloud cost and efficiency automation, carbon-aware scaling policies.

Leadership Competencies

  • Strategic Technical Leadership: Operates at the intersection of deep engineering and executive strategy.

  • Multi-Domain Collaborator: Integrates reliability and resilience across architecture, operations, and business domains.

  • Talent Multiplier: Develops and empowers senior managers, fostering engineering mastery and innovation.

  • Credible Technical Authority: Trusted peer to Infrastructure & Reliability Engineering; capable of leading architecture reviews and executive briefings.

  • Change Champion: Drives transformation of reliability practices across platforms, pipelines, and teams.

Qualifications & Experience

  • 12+ years in cloud engineering, reliability, or platform leadership roles.

  • 5+ years leading Sr. Managers/Managers in technical domains.

  • Proven expertise across AWS, with working knowledge of Azure and GCP.

  • Experience with multi-cloud governance, DR design, IaC at scale, and reliability automation.

  • Strong understanding of observability, SRE principles, and REO/ITIL-aligned reliability frameworks.

  • Certifications:

    • Required: AWS Certified Solutions Architect - Professional

    • Preferred: AWS DevOps Engineer, Azure Solutions Architect Expert, Google Professional Cloud Architect

Success Metrics

  • 99.9% availability maintained for Tier-1 workloads.

  • 100% coverage of DR automation for Tier-1 services.

  • 25% annual increase in automated quality/test coverage.

  • 15% annual improvement in resource efficiency and cost performance.

  • Documented resilience participation across all enterprise architecture blueprints.

  • Positive "technical peer readiness" and succession rating from Head of REO.

Summary Value Proposition

This Director role blends deep AWS reliability engineering expertise, multi-cloud technical breadth, and leadership scale.

It ensures REO maintains both technical depth and leadership redundancy, and it strengthens the bridge between engineering execution and enterprise architecture alignment.

This position will be in a hybrid working arrangement.


Securian Financial believes in hybrid work as an integral part of our culture. Associates get the benefit of working both virtually and in our offices. If you're in a commutable distance (90 minutes), you'll join us 3 days each week in our offices to collaborate and build relationships. Our policy allows flexibility for the reality of business and personal schedules.

The estimated base pay range for this job is:

$145,000.00 - $267,000.00

Pay may vary depending on job-related factors and individual experience, skills, knowledge, etc. More information on base pay and incentive pay (if applicable) can be discussed with a member of the Securian Financial Talent Acquisition team.

Be you. With us. At Securian Financial, we understand that attracting top talent means offering more than just a job - it means providing a rewarding and fulfilling career. As a valued member of our high-performing team, we want you to connect with your work, your relationships and your community. Enjoy our comprehensive range of benefits designed to enhance your professional growth, well-being and work-life balance, including the advantages listed here:

Paid time off:

  • We want you to take time off for what matters most to you. Our PTO program provides flexibility for associates to take meaningful time away from work to relax, recharge and spend time doing what's important to them. And Securian Financial rewards associates for their service by providing additional PTO the longer you stay at Securian.

  • Leave programs: Securian's flexible leave programs allow time off from work for parental leave, caregiver leave for family members, bereavement and military leave.

  • Holidays: Securian provides nine company paid holidays.

Company-funded pension plan and a 401(k) retirement plan: Share in the success of our company. Securian's 401(k) company contribution is tied to our performance up to 10 percent of eligible earnings, with a target of 5 percent. The amount is based on company results compared to goals related to earnings, sales and service.

Health insurance: From the first day of employment, associates and their eligible family members - including spouses, domestic partners and children - are eligible for medical, dental and vision coverage.

Volunteer time: We know the importance of community. Through company-sponsored events, volunteer paid time off, a dollar-for-dollar matching gift program and more, we encourage you to support organizations important to you.

Associate Resource Groups: Build connections, be yourself and develop meaningful relationships at work through associate-led ARGs. Dedicated groups focus on a variety of interests and affinities, including:

  • Mental Wellness and Disability

  • Pride at Securian Financial

  • Securian Young Professionals Network

  • Securian Multicultural Network

  • Securian Women and Allies Network

  • Servicemember Associate Resource Group

For more information regarding Securian's benefits, please review our Benefits page.

This information is not intended to explain all the provisions of coverage available under these plans. In all cases, the plan document dictates coverage and provisions.

Securian Financial Group, Inc. does not discriminate based on race, color, religion, national origin, sex, gender, gender identity, sexual orientation, age, marital or familial status, pregnancy, disability, genetic information, political affiliation, veteran status, status in regard to public assistance or any other protected status. If you are a job seeker with a disability and require an accommodation to apply for one of our jobs, please contact us by email at , by telephone (voice), or 711 (Relay/TTY).



Remote working/work at home options are available for this role.
Not Specified
Agentic AI Engineer
✦ New
🏢 Unisys
Salary not disclosed
Rockville, Maryland 14 hours ago

Overview

Architects and builds the infrastructure and tooling that powers AI agent development across the Software Development Lifecycle (SDLC). Develops production-grade agentic systems, orchestration frameworks, and observability solutions that enable teams to build, deploy, and monitor reliable AI agents at scale. Plays a key role in defining and implementing the next generation of SDLC through AI-first innovation and comprehensive instrumentation.

What We're Looking For

You demonstrate sharp product sense for high-impact automation opportunities, technical taste in implementation decisions, and the ability to clearly articulate trade-offs. You know when to apply AI agent solutions versus simpler approaches and can explain the "why" behind architectural choices.

You excel at 0-to-1 (and 1-to-100) product development, comfortable operating in ambiguous environments where requirements emerge through experimentation and iteration rather than upfront specification.

Key Responsibilities

AI Agent Development & Automation:

• Develop production-grade AI agents that eliminate manual handoffs across the SDLC

• Create custom integrations and CLI tools that give agents deep understanding of internal systems and codebases

• Design comprehensive testing strategies to ensure agent reliability and output quality

• Implement "Golden Path" scaffolding that embeds organizational standards into new projects

• Build AI solutions that improve codebase navigation, documentation, and developer workflows

• Identify workflow bottlenecks and deliver measurable impact through intelligent automation

• Shape SDLC evolution by identifying AI-first opportunities and proving outcomes through experimentation

Agent Infrastructure & Platform:

• Architect and maintain production infrastructure supporting agent deployment, lifecycle management, and scaling

• Develop agent frameworks, templates, and SDKs that accelerate agent development

• Create a governed Model Context Protocol (MCP) catalog enabling compliant agent-to-agent and agent-to-MCP communication

• Implement governance controls for agent behavior, permissions, and system access

Observability & Performance Analytics:

• Design and implement metrics, monitoring, and logging infrastructure for AI agents and development workflows

• Build dashboards that provide actionable insights into developer productivity, tool adoption, and agent performance

• Establish KPIs and measurement frameworks to quantify the impact of AI-powered automation

• Create alerting and anomaly detection systems to ensure reliability of agents and tooling

• Analyze telemetry data to identify optimization opportunities and guide strategic investment decisions

Collaboration & Impact:

• Partner across teams to drive adoption of AI-powered tooling and process transformation

• Stay current with LLM technologies and coach colleagues on AI-assisted development and automation best practices

• Rapidly prototype solutions to validate use cases and prove value quickly

• Communicate data-driven insights to stakeholders through clear visualizations and reports

Preferred Qualifications:

• 5-7+ years of software engineering experience building production systems

• Proven experience building agentic systems using LLM orchestration frameworks

• Hands-on expertise with AI-powered development tools (code assistants, AI-enhanced editors)

• Strong foundation in SDLC, system design, and internal tooling development

• Experience with observability tools and practices including metrics collection, logging frameworks, and dashboard development

• Full-stack technical proficiency:

• Languages: Java, Python, JavaScript/TypeScript

• Frameworks: Angular, Spring Boot

• CI/CD platforms and cloud infrastructure (AWS)

• Monitoring/observability tools (e.g., Prometheus, Grafana, CloudWatch)

• Passion for transforming software development through AI innovation and data-driven decision making


Not Specified
Data Analyst Manager
✦ New
Salary not disclosed
Hickory, NC 14 hours ago

Who We Are

At Feetures, movement is our business. And we believe that a meaningful business begins with authentic values—and our values were forged by the bonds of family.

What started as a bold idea around a kitchen table has grown into a fast-moving, purpose-driven brand redefining performance. As a family-owned company in North Carolina, we’re fueled by the belief that better is always possible—and that energy drives both our products and our culture.

Movement is at the heart of everything we do. From our socks to our team and to our communities, we are always pushing forward. If you are ready to grow, challenge the status quo, and help shape the next chapter of a brand that is always in stride, come move with us. Feetures is Meant to Move. Are you?


Role Summary:

The Data Analytics Manager is responsible for owning and optimizing the organization’s end-to-end data ecosystem, ensuring that data infrastructure, governance, and analytics processes effectively support business operations. This role leads the design and management of the data stack—from source system integrations and NetSuite Analytics Warehouse to reporting and business intelligence tools—while establishing strong data governance standards, quality monitoring, and documentation practices. The manager also oversees and mentors analytics team members, prioritizes analytics requests, and coordinates cross-functional data workflows. Acting as the central authority for data reliability and insights, the role ensures consistent metric definitions, scalable data models, and accurate reporting while translating complex data into clear, actionable insights for business stakeholders.


Responsibilities:

Data Architecture & Tooling

  • Own the end-to-end data stack — from source system integrations and the NetSuite Analytics Warehouse to downstream reporting layers
  • Evaluate, select, and implement tools that improve data accessibility, reliability, and performance
  • Ensure alignment between data infrastructure and evolving business needs across distribution operations
  • Design and maintain scalable data models, SuiteQL queries, and saved searches within NetSuite

Data Governance & Quality

  • Define and enforce data standards, metric definitions, and naming conventions across all business domains
  • Establish data ownership, lineage documentation, and access governance policies
  • Implement monitoring and alerting for data quality issues across source systems and the warehouse
  • Build and maintain a data dictionary that serves as the single source of truth for the organization

Orchestration of Analysts & Systems

  • Manage and mentor the Data Analyst and Business Analyst — prioritizing requests, unblocking work, and validating outputs
  • Triage and prioritize the analytics request queue in alignment with business stakeholders and IT leadership
  • Coordinate cross-functional data workflows and ensure handoffs between systems and analysts are clean and documented
  • Serve as the escalation point for data discrepancies, report failures, and analytical questions from the business


Qualifications:

Required

  • 3-5 years of experience in data analytics, business intelligence, or data engineering
  • 2+ years in a lead or management role overseeing analysts or data team members
  • Strong proficiency in SQL; experience with SuiteQL or similar ERP query languages
  • Hands-on experience with NetSuite, including Analytics Warehouse, saved searches, and reporting
  • Proven track record establishing data governance standards and documentation practices
  • Experience integrating and managing multiple data sources across SaaS and ERP platforms
  • Demonstrated ability to translate complex data into clear, actionable insights for non-technical stakeholders

Preferred

  • Experience in distribution, wholesale, or supply chain environments
  • Familiarity with SaaS BI platforms (e.g., Tableau, Power BI, Looker, or embedded analytics)
  • Exposure to scripting or automation (JavaScript, Python, or similar) for data workflows
  • Background working within IT-led or hybrid IT/Analytics teams


Benefits:

  • Health insurance
  • Dental insurance
  • Vision insurance
  • Life & Disability insurance
  • 401(K) with company match


Company Paid holidays and PTO:

  • Feetures offers 20 PTO Days which are available to you on day one of employment and are available to all employees, no matter your role. After working at Feetures for 5 years, your PTO days will increase to 25 days. Days can be used for vacations, appointments and sick days.
  • We offer 10 company paid holidays and 1 floating holiday per year.


Perks:

  • Parking provided (Charlotte office and onsite at Hickory office)
  • Employee Engagement team
  • Monthly stipend to pursue an active lifestyle


Feetures is an Equal Opportunity Employer that welcomes and encourages all applicants to apply regardless of age, race, sex, religion, color, national origin, disability, veteran status, sexual orientation, gender identity and/or expression, marital or parental status, ancestry, citizenship status, pregnancy or other reasons protected by law.

Not Specified
Physician Advisor - Strategic Quality Performance
Salary not disclosed
Lakeland, FL 2 days ago

Position Details


Lakeland Regional Health is a leading medical center located in Central Florida. With a legacy spanning over a century, we have been dedicated to serving our community with excellence in healthcare. As the only Level 2 Trauma center for Polk, Highlands, and Hardee counties, and the second busiest Emergency Department in the US, we are committed to providing high-quality care to our diverse patient population. Our facility is licensed for 910 beds and handles over 200,000 emergency room visits annually, along with 49,000 inpatient admissions, 21,000 surgical cases, 4,000 births, and 101,000 outpatient visits.


Lakeland Regional Health is currently seeking motivated individuals to join our team in various entry-level positions. Whether you're starting your career in healthcare or seeking new opportunities to make a difference, we have roles available across our primary and specialty clinics, urgent care centers, and upcoming standalone Emergency Department. With over 7,000 employees, Lakeland Regional Health offers a supportive work environment where you can thrive and grow professionally.


Work Hours per Biweekly Pay Period: 80.00

Shift:

Location: 1324 Lakeland Hills Blvd Lakeland, FL

Pay Rate: Min $161,200.00 Mid $215,300.80


Position Summary


The Physician Advisor serves as a liaison between the clinical document improvement (CDI) team, which includes hospital coders; members of the Hospital's administration; the Medical Staff of the hospital; and the hospital's Utilization Management to facilitate the development and implementation of clinical documentation improvement initiatives. The Physician Advisor is pivotal in leveraging his or her clinical position to demonstrate the association of care delivery with specificity in documentation. The Physician Advisor is responsible for conducting clinical reviews referred by the Utilization Management, Coding and Clinical Documentation Improvement departments. The Physician Advisor will assist with reviews and appeals of DRG and medical necessity denials.

Position Responsibilities


People At The Heart Of All We Do

  • Fosters an inclusive and engaged environment through teamwork and collaboration.
  • Ensures patients and families have the best possible experiences across the continuum of care.
  • Communicates appropriately with patients, families, team members, and our community in a manner that treasures all people as uniquely created.


Stewardship

  • Demonstrates responsible use of LRH's resources including people, finances, equipment and facilities.
  • Knows and adheres to organizational and department policies and procedures.


Safety And Performance Improvement

  • Behaves in a mindful manner focused on self, patient, visitor, and team safety.
  • Demonstrates accountability and commitment to quality work.
  • Participates actively in process improvement and adoption of standard work.


Supervisor/Team Lead Capabilities

  • Demonstrates accountability for shift/team operations and care/service delivery to support achievement of organizational priorities.
  • Coaches front line team members to support ongoing professional development and hardwire technical and professional capabilities.
  • Creates a high performing team by building strong relationships, delegating work and nurturing commitment and engagement.
  • Manages team conflict/issues implementing appropriate corrective actions, improvement plans and regular performance evaluations.
  • Applies change management best practices and standard work to support departmental changes and ensure effective team transition.
  • Promotes a healthy and safe culture to advance system, team and service experience.


Standard Work: Physician Advisor

  • Acts as a liaison between the CDI professionals, Health Information Management, and the hospital's medical staff to facilitate accurate and complete documentation for coding and abstracting of clinical data, capture of severity, acuity and risk of mortality, HCC/risk adjustment in addition to Diagnosis Related Group (DRG) assignment.
  • Performs concurrent and retrospective reviews of selected health records as they pertain to CDI and coding validation, and participates in the development of clinically appropriate and compliant provider queries to further clarify documentation.
  • Educates individual hospital staff physicians about International Classification of Diseases (ICD) coding guidelines and clinical terminology to improve their understanding of severity, acuity, risk of mortality, HCC/risk adjustment and DRG assignments on their individual patient records.
  • Assists with the evaluation and appeal of concurrent and retrospective denials and retrospective DRG downgrades. May perform peer-to-peer meetings as required.
  • Participates in the coding and CDI programs and identifies potential areas for improved documentation of services. Also participates in the Coding and CDI meetings and provides ongoing education to the team members.
  • Provides peer-to-peer communication to effect the appropriate response in cases where the physician fails to respond or questions the need for queries.
  • Responsible for writing and submitting appeals (multiple levels as needed) specifically around medical necessity, non-covered services, authorizations, and inpatient/observation stay related denials. May perform peer-to-peer meetings as required.
  • The Physician Advisor is pivotal in leveraging his or her clinical position to demonstrate the association of care delivery with specificity in documentation through effective communication and education of the respective parties.
  • Provides his or her expert opinion in relation to clinical validity assessments, and, furthermore, the development of clinically robust and appropriate queries.
  • Serves as second level reviewer for UM, providing guidance on appropriate/alternate levels of care based on InterQual guidelines and other appropriate criteria.


Competencies & Skills


Essential:

  • Broad knowledge base of clinical medicine across all specialties.
  • Basic coding guidelines regarding the selection of the principal diagnosis and reporting additional diagnoses and procedures; understanding the DRG system; levels of comorbidities; and concepts of risk adjustment, severity of illness, risk of mortality, case mix index, prospective payment, hospital acquired conditions, patient safety indicators.
  • Ability to organize tasks effectively and efficiently and to act independently through the application of critical thinking skills.
  • Computer skills appropriate to the position.
  • Excellent written and verbal communication skills.


Qualifications & Experience


Essential:

  • Medical Degree

Essential:

  • Licensed to practice medicine in the state of Florida, shall be board certified in internal medicine, and shall meet any other reasonable professional criteria established by LRH or the hospital.

Other information:

Experience Essential:

- Minimum of two years of experience in conducting coding and CDI reviews.

- Knowledge of coding guidelines and how they translate from clinical documentation.

- Knowledge of DRGs, Risk of Mortality, Severity of Illness, Mortality Rate, HCC/risk adjustment, CMI and the impact of clinical documentation/coding in relation to these metrics.

- Excellent computer skills, with prior exposure to the Microsoft Office suite.

Not Specified
Data Analyst
✦ New
Salary not disclosed
Des Moines, IA 8 hours ago

This is a full-time position that requires onsite presence in Des Moines, Iowa. Candidates must be authorized to work in the United States without sponsorship now or in the future.


P3+Uplift is partnering with a local insurance company to find a SQL-driven Data Analyst who enjoys working directly with business stakeholders to turn data questions into clear insights and reporting. This role is highly hands-on with SQL and data extraction, working across multiple data sources to support reporting, analysis, and data-driven decision making. The ideal candidate is both analytical and consultative—able to understand business needs, write efficient queries, and deliver clear, actionable insights.


The company offers a flexible schedule, hybrid work environment, casual dress code, and a collaborative culture, plus a comprehensive benefits package.


Key Responsibilities

  • Write and optimize SQL queries to pull and analyze data from multiple sources.
  • Partner with business teams to clarify questions, define metrics, and deliver actionable insights.
  • Build and maintain interactive reports and dashboards to support decision-making (Power BI preferred).
  • Ensure data accuracy through validation, cleansing, and reconciliation.
  • Document data sources, definitions, and analysis logic to create repeatable, reliable reporting processes.
  • Identify opportunities to streamline data workflows, improve automation, and enhance reporting efficiency.
  • Communicate findings and trends in clear, business-friendly language to stakeholders.
  • Contribute to ad-hoc analysis projects, providing insights to guide business strategy.


5+ years of experience:

  • Strong SQL experience required with the ability to query and analyze large datasets.
  • Experience working with data structures, relational databases, and multiple data sources.
  • Experience with data validation, cleansing, and quality assurance.
  • Experience with Power BI or other data visualization tools preferred.
  • Ability to translate complex data into clear, business-friendly insights.
  • Strong communication skills and a consultative approach with stakeholders.


Education: Bachelor’s degree in Business, Analytics, Statistics, or a related field, or equivalent experience

Not Specified
Semantic/Ontology Engineer
✦ New
Salary not disclosed
East Hanover, NJ 1 day ago

Position: Senior Semantic Engineer / Ontology Engineer

Location: East Hanover, NJ (Hybrid - 3x/week onsite)

Duration: 6 Months (extendable)

DESCRIPTION:

We are hiring a Senior Semantic Engineer / Ontology Engineer to lead the design of healthcare-grade ontologies and semantic layers that power trusted analytics, interoperable data products, and AI-ready knowledge systems. You will apply metrics-first semantic modeling and ontology engineering practices aligned to principles such as clear semantics, reusable meaning, governance-by-design, and measurable business outcomes. You’ll work across RDF and property graph paradigms and the Snowflake semantic layer.


Key Responsibilities:

• Design and evolve healthcare ontologies and semantic models to standardize meaning across domains (clinical, patient, provider, claims, access, quality, outcomes).

• Design data products that are AI-ready and leverage ontologies and semantic models

Build metrics-first semantic layers:

• Define canonical metric definitions, dimensions, hierarchies, and calculation rules.

• Ensure metrics are explainable, auditable, and consistently implemented across products and teams.

• Model knowledge in both RDF (RDFS/OWL), for formal semantics and interoperability, and property graphs, for traversal-heavy use cases and relationship analytics.

Develop and maintain semantic artifacts:

• Concept schemes, entity models, vocabularies, mappings, and documentation.

• Alignment patterns between ontologies, data products, and downstream analytics/AI use cases.

Implement semantic integration patterns:

• Entity identity resolution, entity linking, terminology harmonization, and enrichment workflows.

Partner with platform teams to operationalize semantics in Snowflake:

• Enable semantic access patterns that support analytics and AI applications.

• Contribute to solutions that leverage Snowflake Cortex for semantic enrichment and assisted discovery (within established governance constraints).

Collaborate with governance and architecture stakeholders to embed:

• Versioning, stewardship workflows, quality checks, and change management for semantic assets.

• Guide best practices and mentor engineers/analysts on ontology engineering, graph modeling, and metrics-first design.


Required Qualifications

• 8+ years in semantic engineering, ontology engineering, knowledge graph development, or closely related roles.

• Demonstrated experience in healthcare data domains (payer/provider, clinical, claims, RWE, quality, outcomes, etc.).

• Strong hands-on ontology engineering experience: RDF, RDFS, OWL, SPARQL and/or graph query experience

• Ontology modularization, alignment, and lifecycle management

• Experience with property graph modeling (e.g., Neo4j-style patterns) and translating between RDF and property graph representations when needed.

• Proven delivery of a metrics-first approach:

• Canonical KPIs/metrics definitions, dimensional modeling alignment, semantic consistency across BI and data products.

• Experience working with modern cloud data platforms, especially Snowflake, and exposure to Snowflake Cortex for AI-enabled workflows.

• Strong stakeholder communication skills: able to translate clinical/business intent into precise semantic definitions and usable artifacts.

Preferred Qualifications

• Familiarity with healthcare interoperability and terminology standards (e.g., HL7/FHIR, SNOMED CT, LOINC, ICD-10) and how to map/align them to enterprise semantics.

• Experience with semantic tooling and practices, validation rules, ontology testing, and CI/CD for semantic assets.

• Experience deploying semantic context layers

Not Specified
Site Reliability Engineer
✦ New
Salary not disclosed
Chicago, IL 14 hours ago

Description and Requirements

About Our Team

We are building Quantum, a next‑generation hybrid AI platform that spans Windows, Android, and cloud. As part of this initiative, we are growing the reliability engineering organization that powers cross‑device Personal AI.

We are hiring Site Reliability Engineers (SREs) to strengthen the reliability, observability, and operational excellence of Qira’s AI systems across device, edge, and cloud. Depending on your strengths, you may be aligned to areas such as Observability, Operations, or Service Reliability.

We work with the speed and creativity of a startup, and you’ll help build foundational systems with clarity, ownership, and modern engineering practices.




Location: Chicago, IL. Hybrid (3 days on-site, 2 days remote).


What You Might Work On

As an SRE, you may be responsible for a subset of the following, depending on team placement and skill alignment:

Reliability & Systems Engineering

  • Support the reliability, availability, and performance of distributed systems across cloud, edge, and device environments.
  • Help define, measure, and monitor SLIs and SLOs for core services.
  • Identify reliability risks and collaborate with senior engineers on mitigation plans.

Operational Excellence

  • Participate in on‑call rotations and assist with incident response and post‑incident reviews.
  • Contribute improvements to runbooks, automation, and tooling that reduce alert noise and operational toil.
  • Help enhance detection, alerting, and response workflows.

Observability & Insight

  • Implement and improve telemetry using OpenTelemetry, Grafana, and related tools.
  • Build dashboards and tools that improve visibility into system health and AI service behavior.
  • Ensure observability data is complete, accurate, and actionable.

Deployments & Change Safety

  • Support safe, reliable deployment workflows including canaries, staged rollouts, and automated rollbacks.
  • Assist in improving CI/CD systems and deployment tooling.

Collaboration & Best Practices

  • Work closely with senior SREs, DevOps engineers, AI/ML teams, and platform engineers.
  • Contribute to reliability reviews, operational readiness checks, and cross‑team projects.
  • Advocate for modern SRE and DevOps practices within the organization.


Basic Qualifications

  • 4+ years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or production systems operations.
  • Bachelor’s Degree in Computer Science, Engineering, or related technical field (or equivalent practical experience).
  • Foundational experience supporting distributed systems in production.
  • Ability to write scripts or tools in Python, Go, Bash, or similar languages.
  • Solid understanding of Linux systems, networking basics, and system performance fundamentals.
  • Experience with cloud platforms (Azure preferred, AWS or GCP acceptable).
  • Familiarity with monitoring/observability (metrics, logs, tracing).
  • Experience with containers and Kubernetes.


Preferred Qualifications

  • Experience with OpenTelemetry instrumentation and telemetry pipelines.
  • Hands‑on experience with Grafana, Prometheus, Loki, or Tempo.
  • Exposure to AI/ML systems, inference services, or data‑intensive workloads.
  • Experience contributing to CI/CD processes and deployment automation.
  • Familiarity with hybrid architectures spanning device, edge, and cloud.
  • Passion for automation, reliability, and operational excellence.


What Success Looks Like

  • Systems become easier to operate, observe, and trust.
  • Alerts are more accurate and actionable.
  • On‑call load decreases through thoughtful automation and improvements.
  • Deployment workflows become more reliable and repeatable.
  • You grow toward deeper ownership and technical leadership within the reliability engineering organization.
Not Specified
Observability and AI Enterprise Architect
✦ New
🏢 ClifyX
Salary not disclosed
Edison, NJ 1 day ago

Key Responsibilities:

  • Design and deploy observability frameworks leveraging tools such as Grafana, Dynatrace, Prometheus, ELK, Splunk, etc. Define best practices for monitoring, alerting, and visualization across hybrid and multi-cloud environments.
  • Develop strategies for monitoring KPIs tied to business outcomes (e.g., sales performance, supply chain efficiency, customer experience).
  • Collaborate with business and IT teams to identify key metrics and integrate them into dashboards and alerting systems.
  • Implement AIOps solutions using industry-leading platforms like OpenAI, AWS Bedrock, Google Gemini, Anthropic, and similar technologies.
  • Develop predictive analytics and anomaly detection models to proactively identify and resolve operational issues.
  • Integrate observability tools with ITSM platforms and automation workflows. Enable automated root cause analysis and remediation using AI/ML models.
  • Provide observability strategies for infrastructure (servers, storage, cloud), applications (microservices, APIs), and networks (LAN/WAN, SD-WAN). Collaborate with DevOps, SRE, and IT operations teams to ensure end-to-end visibility and reliability.
  • Establish observability standards, KPIs, and SLAs for performance and availability. Ensure compliance with security and regulatory requirements in monitoring solutions.
  • Develop scalable architecture using LLMs, agentic frameworks, and multi-modal AI technologies.
  • Build AI-powered analytics platforms for IT operations analysis, anomaly detection, and predictive insights.
  • Architect and deploy intelligent chatbots for IT support and self-service capabilities.
  • Integrate AI solutions with existing IT operations tools and workflows.
  • Implement automated remediation and root cause analysis using AI/ML models.


Qualifications:

  • 10-13 years of relevant experience
  • Hands-on experience with Grafana, Dynatrace, and other monitoring platforms.
  • Practical experience implementing AI-based solutions for anomaly detection, predictive maintenance, and automated remediation. Familiarity with OpenAI, Bedrock, Gemini, Anthropic, or similar AI platforms.
  • Strong understanding of infrastructure, application architectures, and networking. Experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Kubernetes).
  • Proficiency in Python, Bash, or similar scripting languages for automation and integration.
  • Strong experience with LLMs (OpenAI, Anthropic, Gemini, Bedrock) and agentic AI solutions.
  • Hands-on experience in designing AI architectures for enterprise IT environments.
  • Proficiency in Python or similar languages for AI model integration and automation.
Not Specified
Linux Cloud Engineer
Salary not disclosed
Charlotte 3 days ago
A financial firm is looking for a Linux Cloud Engineer with OpenShift/AKS experience to join their team in Charlotte, NC.

Compensation: $150-195k

Responsibilities:

• Design, deploy, and manage container orchestration platforms using OpenShift and AKS.

• Administer and optimize Linux-based systems in hybrid and multi-cloud environments.

• Automate infrastructure provisioning and configuration using Ansible Automation Platform.

• Develop and maintain Infrastructure as Code (IaC) using Terraform, Helm, and GitOps workflows.

• Collaborate with DevOps and application teams to implement CI/CD pipelines and DevSecOps practices.

• Monitor system performance, troubleshoot issues, and ensure high availability and disaster recovery.

• Implement security best practices for containerized workloads and cloud environments.

• Provide technical leadership and mentorship to junior engineers.

• Stay current with emerging technologies and contribute to strategic cloud initiatives.

• Assist with migrations to cloud, ensuring best practices are followed and architecture is compliant with company standards.

Qualifications:

Required:

• Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience).

• 5+ years of professional experience in Linux system administration and cloud engineering.

• 3+ years of hands-on experience with OpenShift and AKS in production environments.

• Strong proficiency in scripting languages (e.g., Bash, Python).

• Experience with CI/CD tools (e.g., Jenkins, GitLab CI, ArgoCD).

• Deep understanding of Kubernetes architecture, networking, and security.

• Familiarity with cloud platforms (Azure, AWS, GCP) and hybrid cloud strategies.

• Knowledge of monitoring and logging tools (Prometheus, Grafana, ELK stack).

• Excellent problem-solving and communication skills.

• Linux Administration: Deep expertise in RHEL environment.

• Container Platforms: 3+ years of hands-on experience with OpenShift and AKS.

• Automation: Proficiency with Ansible, Ansible Tower/AAP, and scripting (Bash, Python).

• Infrastructure as Code: Experience with Terraform, Helm, and GitOps tools (e.g., ArgoCD, Flux).

• CI/CD: Familiarity with Jenkins, GitLab CI, Azure DevOps, or similar tools.

• Cloud Platforms: Strong knowledge of Azure, with exposure to AWS or GCP a plus.

• Monitoring & Logging: Experience with Prometheus, Grafana, ELK/EFK, and Azure Monitor.

• Security: Understanding of container security, RBAC, network policies, and compliance frameworks.

• Networking: Solid grasp of Kubernetes networking, service mesh (e.g., Istio), and ingress controllers.

Preferred:

• Red Hat Certified Specialist in OpenShift Administration.

• Microsoft Certified: Azure Kubernetes Service Specialist.

• Experience with service mesh technologies (e.g., Istio, Linkerd).

• Experience in regulated industries (e.g., finance, healthcare) is a plus.
Not Specified
Sr Platform Architect
✦ New
Salary not disclosed
Dunwoody, GA 1 day ago

Senior Platform Architect

Reports To: Director of Engineering

Department: Engineering

Location: Hybrid - Atlanta, GA


What makes MTech different:


Purpose-Driven Work – Build technology that solves real problems for the world

Casual & Collaborative – No corporate bureaucracy, direct access to senior leadership

Innovation-Focused – Healthy innovation pipeline expanding into new segments and technologies

Transparent & Data-Driven – Clear metrics, objectives, and visibility into company performance

Modern Development – Robust development tools, training programs, and technical excellence

Flexibility & Balance – Flexible work environment that values results over presenteeism



Job Summary

The Senior Platform Architect will lead the technical architecture, design, and modernization of large-scale, multi-tenant enterprise SaaS platforms built on Azure and the .NET stack. This role requires mastery of distributed systems, cloud-native design, and advanced engineering practices to deliver highly available, performant, and secure solutions for global consumer-facing SaaS and Agentic AI products.


Responsibilities and Duties


Architectural Design & Transformation

  • Lead migration from monolithic systems to modular monolith and microservices architectures using domain-driven design, bounded contexts, and decomposition strategies.
  • Design multi-tenant SaaS platforms with advanced tenant isolation, resource partitioning, and elastic scaling using Azure services.
  • Define and enforce architectural standards for .NET (C#), TypeScript, Angular, SQL Server, and Azure, including dependency injection, SOLID principles, asynchronous programming, and reactive patterns.
  • Design and implement distributed systems: service orchestration, API gateway management, IoT, edge computing, distributed transactions, eventual consistency, CQRS, and event sourcing.
  • Architect for cloud-native resiliency: circuit breakers, bulkheads, retries, failover, geo-redundancy, and disaster recovery using Azure App Services, Azure Functions, Service Bus, Cosmos DB, and Azure SQL.
  • Develop and maintain architecture documentation, reference models, and decision records using industry frameworks (TOGAF, Zachman, C4 Model).


Performance Engineering & Observability

  • Establish and monitor platform SLOs (latency, throughput, error rates, availability) mapped to customer SLAs.
  • Architect and implement advanced caching strategies, indexing, and query optimization for SQL Server and NoSQL stores in coordination with the Senior Data Architect, Data Engineers, and Database Admins.
  • Design and implement telemetry pipelines: distributed tracing (OpenTelemetry), structured logging, metrics collection, and real-time dashboards for system health and diagnostics.
  • Conduct performance profiling, load testing, and capacity planning for backend services and frontend applications.


Automation, Quality, and DevOps

  • Architect and implement CI/CD pipelines with automated build, test, security scanning, and deployment workflows.
  • Integrate static code analysis, code coverage, and quality gates into the development lifecycle.
  • Design and enforce automated testing strategies: unit, integration, contract, and end-to-end tests for backend and frontend components.
  • Develop infrastructure as code (IaC) solutions for repeatable, scalable cloud provisioning.
  • Create incident response playbooks for rollback, failover, and recovery; drive down MTTR and automate remediation where possible.


Security, Compliance, and Governance

  • Architect for multi-tenant security: authentication/authorization (OAuth2, OpenID Connect), encryption at rest and in transit, secrets management, and compliance with SOC 1, SOC 2, GDPR, and other regulatory standards.
  • Implement secure software development lifecycle (SSDLC) practices, threat modeling, and vulnerability management, including ZDR, DLP, and no-model-training policies for AI models.
  • Ensure architectural governance and alignment with enterprise frameworks (TOGAF, Zachman), maintain architecture decision records, and participate in architecture review boards.


Technical Leadership & Collaboration

  • Mentor engineering teams in advanced architectural concepts, distributed systems, cloud-native development, and best practices.
  • Collaborate with Data Architect, DevOps, IT Services, Engineering and Product Management teams to ensure platform extensibility, integration, and support for complex business requirements.
  • Evaluate and integrate AI/ML services, advanced analytics, and developer productivity tools to enhance platform capabilities.
  • Champion a culture of technical excellence, continuous improvement, and innovation.


Required Experience & Skills

  • 10+ years in software/platform engineering, with at least 8 years in platform architecture for enterprise SaaS on the Azure and .NET tech stack.
  • Proven experience architecting and delivering large-scale, multi-tenant SaaS platforms for global consumer-facing products.
  • Deep expertise in .NET (C#), Azure cloud services (App Services, Functions, Service Bus, Cosmos DB, SQL Server), Azure OpenAI, Microsoft Agent Framework, TypeScript, Angular, CI/CD, automated testing, and observability.
  • Mastery of distributed systems, cloud-native patterns, event-driven architectures, and microservices.
  • Demonstrated success in technical debt reduction, performance engineering, and architectural modernization.
  • Experience with architectural frameworks (TOGAF, Zachman, C4 Model), architectural governance, and compliance.
  • Strong understanding of platform security, regulatory compliance, and multi-tenant SaaS challenges.


Success Metrics (First 12 Months)

  • Reduction in platform-related incidents/support tickets.
  • Improvement in deployment speed and release velocity.
  • Reduction in MTTR for platform incidents.
  • Achievement of modularization milestones (monolith decomposition, service rollout, platform observability in production).
  • Increase in automated test coverage, code quality, and system performance metrics.


Preferred Skills & Certifications

  • TOGAF, Zachman, or similar architecture certification.
  • Advanced knowledge of event sourcing, CQRS, service mesh, and cloud-native security.
  • Familiarity with semantic technologies, knowledge graphs, and AI/ML integration.
  • Hands-on experience with infrastructure as code, automated testing tools, and modern DevOps practices.
  • Strong background in platform security, compliance, and multi-tenant SaaS challenges.


EEO Statement

Integrated into our shared values is MTech’s commitment to diversity and equal employment opportunity. All qualified applicants will receive consideration for employment without regard to sex, age, race, color, creed, religion, national origin, disability, sexual orientation, gender identity, veteran status, military service, genetic information, or any other characteristic or conduct protected by law. MTech aims to maintain a global inclusive workplace where every person is regarded fairly, appreciated for their uniqueness, advanced according to their accomplishments, and encouraged to fulfill their highest potential.

Not Specified