This month - Remote Puppet jobs
  • Surge
    PROBABLY NO LONGER AVAILABLE. Must be located: North America. Preferred timezone: UTC +8

    We are looking for people who have worked remotely and can continue to do so, living in the US and/or Canada, to work on long-term projects.

    Our immediate opening is for a Senior DevOps Engineer.

    SKILLSET REQUIRED:

    Build/Release senior engineer with systems architecture skills
    • Demonstrated experience working with various tools like Jenkins, Docker, Chef, etc. 
    • Experience in automated builds, deployments, rollbacks and troubleshooting.
    • Expertise with source code control and configuration management. 
    • Experience with Windows and/or Linux administration, and Infrastructure management
    • Experience with Databases (SQL Server, MongoDB)

    FOR IMMEDIATE CONSIDERATION, EMAIL RESUME WITH TECH STACKS BY EACH JOB, FIRST AND LAST NAME, CONTACT INFORMATION DIRECTLY ON THE RESUME. Sorry, NO Visas.

  • Ahrefs
    PROBABLY NO LONGER AVAILABLE. Preferred timezone: UTC -7 to UTC -3

    What We Need

    Ahrefs is looking for a Site Reliability Engineer to help take care of its distributed crawler powered by 2,000 servers and ensure all systems are up and running 24/7. If you possess a healthy desire to automate everything while being able to quickly resolve urgent issues manually, then we want you! We strive to keep humans away from doing repetitive jobs that can be done by computers and focus instead on foreseeing problems and defining programmatic means to handle them.

    Our system is in large part custom OCaml code and also employs third-party technologies - Debian, ELK, Puppet, ClickHouse, and anything else that will solve the task at hand. In this role, be prepared to deal with a 25-petabyte storage cluster, 2,000 bare-metal servers, experimental large-scale deployments and all kinds of software bugs and hardware deviations on a daily basis.

    Basic Requirements:

    • Deep understanding of operating systems and networks fundamentals
    • Practical knowledge of Linux userspace and kernel internals

    The ideal candidate is expected to:

    • Understand the whole technology stack at all levels: from network and user-space code to OS internals and hardware
    • Independently investigate and resolve infrastructure issues on live production systems, including dealing with hardware problems and interacting with datacenters
    • Develop internal automation - monitoring, setup, statistics
    • Have the ability to foresee potential problems and prevent them from happening. Apply first-aid reaction to infrastructure failures when necessary
    • Help developers with deployment and integration
    • Participate in on-call rotation
    • Make well-reasoned technical choices and take responsibility for them
    • Approach problems with a practical mindset and suppress perfectionism when time is a priority
    • Set up automatic systems to control infrastructure
    • Possess a healthy detestation for complex shell scripts
  • Platform.sh
    PROBABLY NO LONGER AVAILABLE. Preferred timezone: UTC -8 to UTC -4

    If you’re looking for an exciting, high-growth opportunity with an award-winning, cutting-edge company, this could be just the job for you.

    For its groundbreaking PaaS solution, https://platform.sh is looking for a Pythonian Cloud Engineer with a taste for Go, good Linux system understanding, and a real hunger for the challenges of building robust, distributed systems.

    Platform.sh is a PaaS shrouded in a lot of black magic (we can consistently clone a whole running cluster, with its state, databases, indexes in a matter of seconds). We want to get this down to the hundreds of milliseconds domain. Interested? There is more…

    Our external API is pure Hypermedia REST + OAuth on top of Pyramid. It mechanizes the Git layer and needs more features.

    We can consistently generate from the same manifest a Docker container, an LXC one, or VM disk images (AWS, Azure, OpenStack); we want more targets.

    We probably have the highest industry container density. We need to get it higher.

    We support any Python, Ruby, NodeJS, PHP, Java and .NET; time to roll out Elixir, of course Elixir, and Rust.

    We need to have more auto-healing on the high-availability clusters. We need more performance out of our multi-protocol SSH proxy. We need work on our Ceph implementation. We need to get the Debian package generation streamlined and faster. We need… great ideas on how to make Platform.sh even better. Interested? Join us!

    This is a remote position and occasional travel to cool places like Paris, France, may be required.

    Skills & requirements:

    Required:

    • Be a really really good dev that likes testing, understands how an OS works, knows networking, how git works, and the constraints of a distributed system.
    • Be proficient in Python (2 out of 3 of our dev team learned it while on the job, but we'd prefer someone who has already mastered it.)

    Would be really great if you had:

    • Great Golang experience.
    • Experience with C (we contribute to a bunch of upstream projects, like LXC) is a plus; love not required.
    • Great knowledge of Git
    • Good Networking background (routing/protocols)
    • Good grasp of practical security and cryptography
    • Experience with other programming languages (Haskell anyone? Java, JavaScript, Ruby, PHP? Common Lisp?)
    • Good knowledge of how the Web works (Hacking Nginx with Lua a plus).
    • Good understanding of how database systems and search engines work
    • A good notion of distributed systems (consensus protocols like Raft/Paxos, eventual consistency models, gossip protocols)
    • Working knowledge of Puppet
    • Mad Debian Skills. Sporting a Debian plaid cloth during the interview is not frowned upon.
Older - Remote Puppet jobs
  • SemanticBits
    PROBABLY NO LONGER AVAILABLE. Preferred timezone: UTC -5

    SemanticBits is seeking a DevOps Engineer to support the automation and deployment needs of a range of projects. You will work hand-in-hand with development teams to implement automation solutions using technologies like Amazon Web Services (AWS), CloudFormation, Ansible, Terraform, Elastic Compute Cloud, and Jenkins to automatically build, test, integrate, and deploy applications in the healthcare and life sciences domains. You will leverage the full power of the cloud to configure highly resilient and scalable applications that can handle hundreds of thousands of users. This is a remote position.

    SemanticBits is a leading company specializing in the design and development of digital health services, and the work we do is just as unique as the culture we’ve created. We develop cutting-edge solutions to complex problems for commercial, academic, and government organizations. The systems we develop are used in finding cures for deadly diseases, improving the quality of healthcare delivered to millions of people, and revolutionizing the healthcare industry on a nationwide scale. There is a meaningful connection between our work and the real people who benefit from it; and, as such, we create an environment in which new ideas and innovative strategies are encouraged. We are an established company with the mindset of a startup and we feel confident that we offer an employment experience unlike any other and that we set our employees up for professional success every day.

    REQUIREMENTS

    We are looking for a DevOps Engineer who is well versed in the following key technologies:

    • Solid hands-on working experience with configuring and maintaining resources on AWS
    • Experience with the majority of EC2, ELB, CloudFormation, S3, Glacier, CodeDeploy, SNS, SQS, RDS, IAM
    • Hands-on understanding of virtualization and experience with Docker
    • Deep, hands-on experience with Linux and administration
    • Expertise with production deployments, and CI/CD tools such as Jenkins
    • Experience automating cloud infrastructure, such as with CloudFormation
    • Expertise with cloud security, such as managing users, roles, and privileges through IAM
    • Experience managing Atlassian tooling such as Jira and Confluence preferred
    • Experience deploying and managing a wide range of components that support web applications, such as nginx, Apache HTTP Server, git, scripting (bash, Perl, Python, etc.), databases (MongoDB, PostgreSQL, etc.)
  • Surge
    PROBABLY NO LONGER AVAILABLE. Must be located: North America. Preferred timezone: UTC -7

    SURGE is looking for smart, self-motivated, experienced, senior automated test engineers who enjoy the freedom of telecommuting and flexible schedules, to work as long-term, consistent (40 hrs/week) independent contractors (no W2) on a variety of software development projects.

    Experience Required: 

    Senior DevOps Cloud Engineer, AWS Required

    Must be located in the US to be considered for this role. Sorry, No Visas.

    For immediate consideration, email resume with tech stack under each job where it was done and include your cell phone number, email address and start date.

  • O'Reilly Auto Parts
    PROBABLY NO LONGER AVAILABLE. Preferred timezone: UTC -6

    Have you ever heard of O, O, O, O'Reilly Auto Parts…Ow?! This is not your standard System Engineer position and we are not your standard brand! We are the dominant auto parts retailer in all our market areas.

    Our infrastructure teams work on projects adding directly to the O’Reilly Auto Parts bottom line and we are looking for exceptional Engineers and Admins to help us succeed! Some of the tools we use to implement our projects are Linux, Puppet, Git, Jenkins, Ansible, and other open source tools and technologies. We also utilize collaboration tools such as Jira and Confluence.

    What we look for in our Team Members:

    • Love solving complex problems related to serving our customers better – both internal & external customers
    • Enjoy working with teams
    • Senior-level experience with Linux and automation
    • Experience with documentation
    • An ambition to always learn and grow

    About our team:

    • We are a “work family”! We have fun together and support each other
    • We respect a healthy work-life balance
    • We are responsible for maintaining our Linux infrastructure, which consists of over 2,000 servers
    • The team keeps open communication through different outlets – video conferencing, team messaging applications, and daily stand-up meetings
    • Our managers really value collaboration between team members and encourage them to bring forth creative problem-solving ideas from both a technical and functional aspect

    Growth within our teams at O’Reilly Auto Parts:

    • We have several career paths, whether you want to be a supervisor, manager, or architect – there’s a documented growth plan to help you follow the path you choose
    • We want to grow our people – we help to make you better by providing training for both technical and professional development
    • We look to promote from within – O’Reilly is diligent about promoting qualified team members from within our organization
  • Platform.sh
    PROBABLY NO LONGER AVAILABLE. Preferred timezone: UTC -8 to UTC -4

    This role is open to full-time remote candidates.

    Platform.sh is a groundbreaking hosting and development tool for web applications. We’re a European VC-Backed startup with a host of blue-chip Enterprise clients and a string of awards and grants (including €2m from the EU Horizon 2020 program).

    To reinforce our technical prowess, we are looking to grow our operations team. If you’re looking for an exciting, high-growth opportunity with an award-winning, cutting-edge company, this could be just the job for you.

    For its PaaS solution, https://platform.sh is looking for an Operations and Service Reliability Engineer with a taste for Python and Go, great Linux system understanding, and a real hunger for the challenges of building robust, distributed systems.

    Platform.sh is a PaaS shrouded in a lot of black magic (we can consistently clone a whole running cluster, with its state, databases, indexes in a matter of seconds). We want to get this down to the hundreds of milliseconds domain. Interested? There is more…

    We can consistently generate from the same manifest a Docker container, an LXC one, or VM disk images (AWS, Azure, OpenStack); we want more targets.

    We probably have the highest industry container density. We need to get it higher.

    We support any Python, Ruby, NodeJS, PHP, Java and .NET.

    Directly reporting to our Director of Infrastructure and in close interaction with our Engineering and Customer Support teams, you will be responsible for:

    • cloud operations: configure clusters, deploy stuff, follow-up on alerts, help customer support debug issues, all in Microsoft Azure
    • automating all of the above so they can instead drink margaritas (or non-alcoholic beverages, of course)
    • creating systems, tools & processes that will enhance our support and operations efficiency
    • improving service quality, discipline and reliability throughout lifecycle
    • monitoring operating objectives, streamlining and automating intervention
    • continuous learning from Operations experience, modeled as software

    The ideal candidate:

    • has proven successful experience in an operations role
    • has demonstrated the ability to successfully manage cloud-based infrastructure for a fast growing organization
    • has experience with containerization technologies
    • has had exposure to cloud services (Azure)
    • understands how an OS works, knows networking, how git works, and the constraints of a distributed system
    • has Puppet experience
    • is proficient in Python (Golang a plus)

    Nice to have:

    • knowledge of Magento Ecommerce, Symfony, Drupal, eZ Platform, or Typo3
    • relational database skills

    Note: We don't like stress, so we build everything to be robust and resilient, but stuff does break. This is a role with on-call duties. If page-duty fills you with dread… well, this might not be a fit.

  • Wikimedia Foundation, Inc.
    PROBABLY NO LONGER AVAILABLE.

    Location: San Francisco, CA or Remote

    Summary

    We are looking for a Site Reliability Engineer to directly support our application platform serving the world’s favorite encyclopædia to millions of people around the globe. Wikipedia and its sister projects are powered strictly by Free and Open Source software with MediaWiki at its core, surrounded by an ecosystem of microservices in PHP, NodeJS, Python, Go and Java.

    We are a distributed and diverse team of engineers with a drive to explore, experiment and embrace new technologies. During the past few years we have been transitioning our platform from a monolith to a hybrid, microservices architecture, and started migrating our microservices onto Kubernetes. We’ve adopted Elastic Stack and Prometheus as our de facto logging and monitoring platforms and are improving our automation (we ❤️ automation).

    If you find what we do interesting, if you are up to the challenge of improving the reliability and delivery of one of the Internet’s top 10 websites, and you enjoy the idea of working with a globally distributed team, you might be just the person we need. Come as you are!

    Responsibilities

    • Ensure smooth and reliable operation of the MediaWiki application platform, the surrounding ecosystem of microservices, and their dependencies (Memcached, Redis, Kafka, etcd, …)

    • Perform platform transformations and migrations towards modernized infrastructure (HHVM to Zend PHP7, bare metal deployments to Kubernetes clusters, active/active multi-data center support, etc.)

    • Bring your creativity to improve our current infrastructure and introduce new automation where needed

    • Support new code/feature deployments when required

    • Troubleshoot, debug and follow-up on emerging issues in our application stack and its surroundings

    • Perform day-to-day operational/DevOps tasks on Wikimedia’s wider public facing infrastructure (deployment, maintenance, configuration, troubleshooting), as well as reduction of manual, repetitive, automatable tasks (toil).

    • Implement and utilize configuration management and deployment tools (Puppet, Kubernetes)

    • Assist in the architectural design of new services and making them operate at scale

    • Monitoring of systems, services and service clusters, optimization of performance and resource utilization

    • Incident response, diagnosis and follow-up on system outages or alerts across Wikimedia’s production infrastructure

    • Share our values and work in accordance with them

    Qualifications

    • 3+ years experience in an SRE/Operations/DevOps role as part of a team
    • Experience in supporting complex web applications running highly available and high traffic infrastructure based on Linux
    • Comfortable with configuration management and orchestration tools (Puppet, Ansible, Chef, SaltStack, etc.), and modern observability infrastructure (monitoring, metrics and logging)
    • Aptitude for automation and streamlining of tasks
    • Comfortable with shell and scripting languages used in an SRE/Operations engineering context (e.g. Python, Go, Bash, Ruby, etc.)
    • Good understanding of Linux/Unix fundamentals and debugging skills
    • Strong English language skills and ability to work independently, as an effective part of a globally distributed team
    • B.S. or M.S. in Computer Science or equivalent in related work experience

    Pluses

    • Experience managing MediaWiki installations is a major plus
    • Track record of open source contributions is highly appreciated
    • Experience running PHP/LAMP stack applications is a plus, especially in geographically distributed environments
    • Familiarity with modern distributed container cluster management systems (Kubernetes, Docker Swarm, Mesos, …)
    • Low level systems troubleshooting and debugging (CPU/memory profiling, C/C++ experience, in-depth Linux knowledge)
    • Experience with advanced distributed storage and database systems (Swift, Ceph, Cassandra, etc.)
    • Familiarity with RFC2549 or similar protocols

    The Wikimedia Foundation is… 

    …the nonprofit organization that hosts and operates Wikipedia and the other Wikimedia free knowledge projects. Our vision is a world in which every single human can freely share in the sum of all knowledge. We believe that everyone has the potential to contribute something to our shared knowledge, and that everyone should be able to access that knowledge, free of interference. We host the Wikimedia projects, build software experiences for reading, contributing, and sharing Wikimedia content, support the volunteer communities and partners who make Wikimedia possible, and advocate for policies that enable Wikimedia and free knowledge to thrive. The Wikimedia Foundation is a charitable, not-for-profit organization that relies on donations. We receive financial support from millions of individuals around the world, with an average donation of about $15. We also receive donations through institutional grants and gifts. The Wikimedia Foundation is a United States 501(c)(3) tax-exempt organization with offices in San Francisco, California, USA.

    The Wikimedia Foundation is an equal opportunity employer, and we encourage people with a diverse range of backgrounds to apply.

    U.S. Benefits & Perks*

    • Fully paid medical, dental and vision coverage for employees and their eligible families (yes, fully paid premiums!)
    • The Wellness Program provides reimbursement for mind, body and soul activities such as fitness memberships, baby sitting, continuing education and much more
    • The 401(k) retirement plan offers matched contributions at 4% of annual salary
    • Flexible and generous time off - vacation, sick and volunteer days, plus 19 paid holidays - including the last week of the year.
    • Family friendly! 100% paid new parent leave for seven weeks plus an additional five weeks for pregnancy, flexible options to phase back in after leave, fully equipped lactation room.
    • For those emergency moments - long and short term disability, life insurance (2x salary) and an employee assistance program
    • Pre-tax savings plans for health care, child care, elder care, public transportation and parking expenses
    • Telecommuting and flexible work schedules available
    • Appropriate fuel for thinking and coding (aka, a pantry full of treats) and monthly massages to help staff relax
    • Great colleagues - diverse staff and contractors speaking dozens of languages from around the world, fantastic intellectual discourse, mission-driven and intensely passionate people

    *Eligible international workers' benefits are specific to their location and dependent on their employer of record