Systems Engineer, YouTube

Systems Engineer, YouTube - Mountain View

San Bruno, CA

The area: YouTube and Video

The first video posted on YouTube was a 19-second clip called "Me at the Zoo." Today, more than 48 hours of video are uploaded every minute. The YouTube and Video team helps budding filmmakers and musicians build careers, creates products like Google TV and YouTube Live, and runs collaborative projects like Life in a Day and the YouTube Symphony Orchestra. We are leading a change in how we entertain, inform and share with one another, whether through cat videos or footage of a revolution in progress.

The role: Systems Engineer, YouTube

As a Systems Engineer working on Google's critical production applications and infrastructure, your mission will be to ensure Google is always fast, available, scalable and engineered to withstand unparalleled demand. You will be in the thick of solving the often unexpected problems of systems at scale in a way most engineers never experience. Your scope runs from the kernel level to the continent level. This position requires the flexibility and aptitude to zoom in to fine-grained detail, and the agility to zoom right back out and up the stack. Delve into how software performs, packets flow, and hardware and code interact, in support of managing services, steering global traffic, and predicting and preventing failures, all in a day's work.

You will design and develop systems to run Google Search, Gmail, YouTube, Maps, Voice, AppEngine, and more. You'll manage, automate, and make data-based decisions and judgment calls which influence globally distributed applications. You'll own the production services which comprise *.google.com, and critical infrastructure like GFS, BigTable, MapReduce and large-scale cloud computing clusters. You will also be driving performance and reliability from software and infrastructure at massive scale, where dealing in petabytes and gigabits and shifting by orders of magnitude is routine.
You will tackle challenging, novel situations every day and work with just about every other engineering and operations team at Google. You will be looked upon as an expert and advocate to fellow engineers on making design and reliability trade-offs in running large-scale services and engineering complex systems that fail gracefully and transparently to users.

As a successful candidate for this role you will have strong analytical and troubleshooting skills, fluency in coding and systems design, solid communication skills and a desire to tackle the complex problems of scale which are uniquely Google. We are particularly interested in software engineers familiar with aspects of running web services at scale; depth in either networking technologies or Unix system calls is a strong plus.

Responsibilities:
* Manage availability, latency, scalability and efficiency of Google services by engineering reliability into software and systems.
* Respond to and resolve emergent service problems; write software and build automation to prevent problem recurrence.
* Participate in service capacity planning and demand forecasting, software performance analysis and system tuning.
* Review and influence ongoing design, architecture, standards and methods for operating services and systems.

Minimum Qualifications:
* BA/BS degree in Computer Science or related field (in lieu of a degree, 4 years of relevant work experience).
* 3 years of relevant work experience, including with Unix/Linux systems requiring the use of languages like Python, C, C++, Java, Perl, Shell or PHP.
* Technical troubleshooting and performance tuning experience.

Preferred Qualifications:
* 6 years of relevant work experience, including in a high-volume or critical production service environment, as well as experience leading short projects involving outside teams.
* Experience coordinating or leading small cross-team technical projects.
* Experience in OSes and systems (e.g. UNIX internals, device drivers, FreeBSD), open source tools (e.g. dtrace, ktrace), web service components (e.g. load balancing, LAMP stack), storage and clustering (e.g. column stores, Hadoop), scripting and programming languages (e.g. Erlang, Haskell, Scala or Scheme).
* Knowledge of IP networking, network analysis, performance and application issues using standard tools like tcpdump.
* Ability to handle periodic on-call duty as well as out-of-band requests.
* Strong written and spoken English language skills.

Department: YouTube and Video