Software Development Engineer-Data Mining/Text Analysis/Machine Learning Job in Seattle 98119, Washington US

The Catalog Quality organization drives continuous improvement in the accuracy and completeness of Amazon.com product catalog data to directly improve the shopping experience for Amazon's customers. We focus on the data that matters most to customers, we build high-leverage software systems that greatly extend the reach of human judgments, and we devote much effort to measuring and improving the results.

The Catalog Quality team is looking for a passionate, results-oriented software engineer to be part of a new product line and business. You'll work with a talented and nimble team of engineers to create innovative ways to improve analytics and to apply data mining approaches to catalog data, with the goal of helping customers find what they need. Our problems lie in applying machine learning techniques at enormous scale: processing billions of data elements each year, while the Amazon product catalog grows into different markets and new categories and adds products every day. Theoretical approaches may work fine with small datasets, but applying new ideas in a fast-moving business context is a unique challenge that few teams can offer. The team will build a distributed data pipeline that can adapt to the large volume of data elements it must process while maintaining high quality and operating efficiency. Our operational load is low, and there is a lot of new software development on open-ended business problems.

We are a highly motivated, cooperative, and fun-loving team that thrives on solving challenging problems with innovation. As part of this team you will analyze data, develop new algorithms, and build large-scale distributed software systems in Java, using open source technologies such as Hadoop, Lucene, and JBoss as well as Amazon.com proprietary technologies.

The ideal candidate will have the following qualifications:

• Bachelor's Degree in Computer Science or related field with 4+ years relevant work experience
• Strong design and coding skills in Java/C++ on Unix platforms
• Familiarity with Perl or other scripting languages and an understanding of SQL
• Computer Science fundamentals in object-oriented design
• Computer Science fundamentals in data structures
• Computer Science fundamentals in algorithm design, problem solving, and complexity analysis
• Proficiency in at least one modern programming language such as C, C++, or Java

Preferred qualifications:

• PhD/Master's degree in Computer Science, Math, or a related field with 5+ years of relevant work experience
• Experience in Perl, Java, Object Oriented Design and familiarity with application and database programming under UNIX/Linux
• Past experience in at least one of the following areas: Information Retrieval, Data Mining, Text Analysis, or Machine Learning
• Experience building high-performance, highly available, and scalable distributed systems
• Experience building complex software systems that have been successfully delivered to customers
• Experience with large database-driven applications and/or distributed computing
• Proficiency with HTTP, REST, XML, J2EE, JavaScript, and AJAX
• Highly innovative, flexible, and self-directed

Amazon is an Equal Opportunity Employer