13 Jul 2021

Blue Code International AG: Remote Site Reliability Engineer

Job Description

Headquarters: Zürich, Zurich, Switzerland

URL: https://bluecode.com

At Bluecode we’re building the first European mobile payment scheme that enables cashless payments combined with value-added services. It’s a solution from and for Europe; accepted at 20’000 locations, from large and small department stores and supermarket chains to famous events like Oktoberfest (watch our CEO pitching at NOAH18 for more).

We just got additional funding, and we are making a strong push to evolve our AWS-based infrastructure orchestrated by Pulumi to the next level, automate more manual work, and implement the SRE methodology more faithfully.

To that end we’re looking for a Site Reliability Engineer to help us achieve and surpass our goals regarding infrastructure and automation.

Engineering @ Bluecode

We build new features on top of a modern stack, consisting of web apps (in a mix of Vue, Svelte, and TypeScript) and native SDKs communicating through APIs with Elixir services, backed by Postgres, all deployed in Docker containers in a continuous delivery cycle to Kubernetes on AWS EKS, orchestrated by Pulumi.
You don’t need to have experience in any of our specific technologies: we’re great at teaching good engineers how to use our modern SaaS stack.
We don’t follow rigid Scrum or Kanban, but we do work in an agile, iterative way, and try to continuously improve and implement what works for us.

What will you do?

We’re currently in the process of building out an SRE team, and firmly in the Kitchen Sink stage. That means lots of opportunities and influence, but also lots of responsibility and different hats to wear. If you expect that everything has already been neatly structured, don’t apply. Our culture is closer to a startup than to a large corporation.
You will join the team that owns infrastructure, automation, tooling, and automated acceptance testing: a small, collaborative team of senior engineers. You will mainly work on our infrastructure-as-code (IaC) codebase, written in TypeScript, which uses Pulumi to orchestrate all infrastructure in AWS, the Kubernetes clusters running on top of it (EKS), and integrations with GitHub, Datadog, and other SaaS providers.
While we encourage all engineers to contribute to the IaC codebase, they tend to do so at the level of their own service’s deployment; pure infrastructure concerns are with us (the Kubernetes clusters, security, all non-functional aspects, library code for the software engineers to consume and leverage, all infrastructure upgrade/migration concerns, etc.).
No Terraform, no CloudFormation, no YAML, only Pulumi. The C in IaC is code, not configuration, with everything that a full programming language brings with it: better abstraction, better tooling, more flexibility, more automation, but also more skill and responsibility required. We prefer powerful tools that give highly skilled individuals more leverage to simpler tools that contain fewer foot-guns (we also believe in automated testing to avoid the foot-guns…).
The above also means that you need a good grasp of software engineering: coding, delivering new features, code reviews, fixing bugs, pair programming, etc. You will do a lot of that, in small, actionable chunks that deliver the most value in the least time. You are also responsible for managing the rollout of your changes across our different environments/AWS accounts.
You will participate in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement, and collaborate with software engineers and product managers to help plan timely delivery of required new infrastructure.
You will maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
You will own, operate, and be accountable for the different infrastructure pieces and any other high-leverage coding/automation/tooling projects you take on.
You will help evolve our systems by pushing for changes that improve reliability, reduce friction, and improve scale.
You’ll be doing sustainable incident response, postmortems, and post-launch reviews.
You will be learning from and teaching other engineers; sharing your knowledge to enable them to contribute to the IaC codebase, reducing knowledge silos and making the team more resilient and effective in the process. As a senior engineer you’re also encouraged to take on mentoring of junior engineers.

Requirements

Who are you?

You thrive in identifying high-leverage, high-value work and direct your attention to those opportunities without explicit direction.
You aim to always be learning new things and working in new spaces, and you share this passion with those around you.
You’ve worked to automate and remove repetitive and manual tasks because inefficiency is one of your least favorite things.
You love to design, implement, and improve tools, frameworks, metrics, and processes.
You believe that unless you can quantify or measure something, you probably can’t improve it.
You love to work, collaborate, and lead cross-functionally.

What do you need to bring?

You have previous experience in Site Reliability or relevant operations & development in a SaaS organization, including both successes and failures (“scars”), which help you make better decisions.
You have experience building and operating service-based systems on AWS using Pulumi or Terraform.
You can write reliable and understandable code in TypeScript or other languages.

What would be nice to have?

Experience automating complex, multi-step infrastructure rollouts (using a workflow solution like Azure Logic Apps).
Experience with Pulumi’s Automation API.
Experience with the Nix package manager.

Benefits

Why do you want to work at Bluecode?

You can help build an amazing product in a company big enough for growth but lean enough to make a genuine impact.
The experience of being able to pay in real life with the system you’ve helped build is priceless.
We are a startup you’d be proud to use: we put consumers’ privacy first.
We will provide you with opportunities to develop your career.
You will be offered a competitive salary.
The work is fully remote and we have flexible working hours (though you’re expected to overlap with Europe/Zurich for 5h per day).
Your equipment needs get covered by a recurring equipment stipend.
You will have a yearly budget for attending conferences.

At Bluecode, we foster an inclusive, supportive, fun yet challenging team environment. We value a team made up of people with diverse backgrounds, and we respect the healthy expression of diverse opinions. We embrace experimentation and the examination of all kinds of ideas through reasoning and testing. Come join us as we continue to change the world of mobile payment.

If you have any questions, please feel free to reach out to (paste into a Bourne-compatible shell):

echo "moc tod edoceulb ta relleum tod d" | rev

Interested? We’d love to hear from you.

To apply: https://weworkremotely.com/remote-jobs/blue-code-international-ag-remote-site-reliability-engineer
