MTS 2, Data Platform Engineer Job in San Jose, California, US

Team

This position is part of the Trust Science - Applied Research group. We are responsible for applying advanced technology research to solve Trust business problems. Trust applications proactively prevent fraud, catch fraud, and enforce eBay policies, and they collect and mine data that will help build future Trust and Safety strategies. We build real-time machine learning applications that process hundreds of millions of transactions a day and learn from terabytes of historical data.

Responsibilities - Data Platform Engineer/Architect

  1. Design and build the next-generation eBay Trust Science Data Platform.
  2. Design scalable and robust platforms and applications for eBay-scale data.
  3. Seamlessly operate on and move data between Hadoop, relational databases, and NoSQL storage.
  4. Analyze various data sources statistically to discover fraud patterns.
  5. Design and conduct machine learning experiments and implement the resulting models to detect fraud and risk.
  6. Evaluate the performance of the models, investigate false positives and false negatives, and implement continuous improvements to the prediction models.
  7. Build predictive models and features using one or more advanced techniques such as Text Mining, NLP, Graph Mining, and Relevance Ranking.
  8. To deliver on the responsibilities stated above, you may also need to work on one or more of the following:

    1. Design and develop data preparation components and processes that extract and transform data across many DB tables and log files to prepare it for machine learning experiments.
    2. Build high-performance, scalable data mining processes to sort, merge, join, and aggregate large data files.
    3. Define complex SQL and other data extraction schemes to gather and filter needed data.
    4. Work with a large cross-functional team consisting of scientists and engineers from eBay research and engineering teams, as well as analysts and business leaders from TnS and other business teams.

Requirements

  1. A strong data analysis background with some exposure to statistical data analysis.
  2. 2+ years of experience with large data set processing and data mining.
  3. Strong programming skills (one or more of C/C++, Java, Perl, Python, R) with application to data mining.
  4. Parallel and/or distributed computing experience (MPI or MapReduce frameworks).
  5. 2+ years of hands-on SQL experience.
  6. Good communication skills and the ability to work with large cross-functional teams.
  7. Specialized skills such as Text Mining, NLP, Graph Mining, and Relevance Ranking are a plus.