Full time
Remote

Senior Architect & SRE Innovator

Overview:

If you've seen the good, the bad, and the ugly and want your turn to build it right, come join us.

We're a startup (with excellent benefits) that provides a Platform-as-a-Service for industrial customers. Key to success is automating everything, quickly learning from problems, and continually innovating better ways to work.

We are seeking a talented individual with experience architecting, developing, deploying, monitoring and managing cloud and on-premises workloads at scale. We need someone who knows what to do, has the skills to build it, and feels the ownership to support it.

You might be a good fit if:

  • You are hands-on, deep in Kubernetes, curious, and talented.
  • You thrive in the excitement, camaraderie, empowerment, responsibility, and flexibility of a startup environment.
  • You consistently focus on what matters most, shifting effort away from lower-value tasks.
  • You are both opinionated and open to new ideas.
  • You appreciate that the customer experience is paramount.
  • You value and contribute to a culture of psychological safety.

As part of this job, you will:

  • Design, build, and maintain infrastructure-as-code, app deployment, and system update solutions for customers around the world, both in the cloud and on-prem.
  • Develop and maintain automation tools and processes for deployment, monitoring, and configuration management with tools such as k8s, Azure, and Pulumi.
  • Define our Site Reliability Engineering (SRE) strategy and determine appropriate SLOs and SLIs.
  • Develop and implement best practices for system reliability, operability, and security.
  • Reduce toil and boring work by scripting routine tasks and automating self-repair.
  • Collaborate with the team to determine functional and non-functional requirements, define reliability strategies, and influence the product roadmap.
  • Solve problems relating to production issues and create solutions to prevent problem recurrence.
  • Apply troubleshooting skills, use debugging tools, and examine logs, telemetry, and other data sources to verify assumptions and assess customer impact. Proactively address findings.
  • Stay current with industry trends, emerging technologies, and best practices in site reliability engineering and cloud/edge computing.

Required Skills and Qualifications:

  • Certified Kubernetes Application Developer (CKAD), or experience demonstrating k8s expertise
  • 5+ years of experience with infrastructure-as-code tools
  • 3+ years of DevSecOps experience
  • 3+ years of Azure experience
  • Experience managing high-availability, stateful workloads

Bonus Points for:

  • Pulumi experience
  • Azure Arc experience
  • Azure certifications
  • Experience with CI/CD and GitOps (Flux, ArgoCD, or Rancher Fleet)
  • Experience with Inductive Automation's Ignition Platform
  • Experience managing workloads on-prem
  • Knowledge of configuration management best practices and tools (e.g., Azure App Configuration, Ansible, Chef, Puppet)
  • Experience with OT and industrial/manufacturing customers

What We Offer:

  • A dynamic and fast-paced work environment with opportunities for rapid growth and development.
  • A competitive salary and benefits package (medical, dental, vision, and 401(k)).
  • An opportunity to work with a talented team of engineers and developers on a cutting-edge product.
  • The chance to shape the future of our company and its platform.
  • Flexible work arrangements, including remote work options.
  • A culture that values innovation, collaboration, openness, and continuous learning.

Contact Us

Contact Us to Apply