- Time management
- Software deployment
- Shell Scripting
As a Senior Datacenter Engineer, you will be responsible for leading the day-to-day Tesla Datacenter engineering operations for the new Gigafactory Texas build in Austin, Texas, as well as for some off-site locations on the East Coast. Our global team performs all the on- and off-premises datacenter work that supports the production and engineering efforts that make Tesla a world leader in self-driving EV, energy storage, and solar power technology. Continuous deployment, monitoring, maintenance, and improvement, along with rapid turnaround on service requests from across the organization, are imperative to a successful production environment in the datacenter.
You’ll be the highly engaged, hands-on regional representative for a closely integrated, cross-functional, and versatile team that performs most racking, stacking, and wiring, and that designs, implements, and maintains all Tesla datacenter resources globally. As the need for data and compute keeps growing, both locally and in remote locations, datacenter operations need to follow suit and scale through more automated processes for deployment, monitoring, and alerting. You will be responsible for ensuring greatly improved processes for precision deployments of production systems by leveraging the combined resources your team provides.
Leverage and improve upon existing datacenter deployments to ensure continuous operation.
Work with engineering teams to identify useful metrics to collect, and implement the corresponding monitoring and alerting with existing monitoring solutions at the datacenter level.
Organize and document implemented solutions for long-term information retention in our internal ticketing and documentation system.
Work closely with involved parties to develop automated workflows that can be easily implemented by remote hands with little or no understanding of internal systems.
As part of the team, respond to and document submitted support tickets relating to the functionality of the various systems present in the datacenter.
Help develop automated tools that collect information users can apply directly when creating root cause analyses for reported issues.
BS in Computer Science, Electrical Engineering, or a related field, or a Bachelor’s degree with 3 years of additional equivalent experience
5+ years of experience with:
Computer deployment and operations (CPU / GPU)
Networking infrastructure deployment and operation
Linux operating system flavors (CentOS/RHEL, Ubuntu)
Systems monitoring and alerting (Ganglia, Telegraf, Splunk, etc.)
3+ years of experience with:
Storage systems (on-prem and/or in-cloud)
DCIM-type software for monitoring, alerting, and automation
Working knowledge of power and cooling infrastructure at datacenter scale, and of planning for it
Working knowledge of datacenter, network, and compute deployments at scale
Working knowledge of programming and/or scripting with Python, Bash, or similar
Excellent time management and communication skills are absolute musts
Ability to step up and take ownership to bring complex tasks to completion
Nice to have:
Experience with multi-site, hybrid on-prem and in-cloud software and hardware deployments
Familiarity with public cloud compute and storage resource orchestration
Interest in and knowledge of energy-efficient high-performance computing, including planning for and an understanding of liquid cooling
Tesla participates in the E-Verify Program