HDP Analyst: Data Science

HDP Analyst: Data Science Course Description

Duration: 3.00 days (24 hours)

Price: $2,295.00

This course provides instruction on the processes and practice of data science, including machine learning and natural language processing. It covers tools and programming languages (Python, IPython, Mahout, Pig, NumPy, pandas, SciPy, scikit-learn), the Natural Language Toolkit (NLTK), and Spark MLlib.

Next Class Dates

Contact us to customize this class with your own dates, times and location. You can also call 1-888-563-8266 or chat live with a Learning Consultant.

Intended Audience for this HDP Analyst: Data Science Course

  • Architects, software developers, analysts and data scientists who need to apply data science and machine learning on Hadoop

Course Prerequisites for HDP Analyst: Data Science

  • Students must have experience with at least one programming or scripting language, knowledge in statistics and/or mathematics, and a basic understanding of big data and Hadoop principles. Students new to Hadoop are encouraged to attend the HDP Overvi

HDP Analyst: Data Science Course Objectives

  • Describe supervised and unsupervised learning differences
  • List the six machine learning tasks
  • Use Mahout to run a machine learning algorithm on Hadoop
  • Describe the data science life cycle
  • Use Pig to transform and prepare data on Hadoop
  • Write a Python script
  • Use NumPy to analyze big data

HDP Analyst: Data Science Course Outline

      1. Describe supervised and unsupervised learning differences
      2. List the six machine learning tasks
      3. Use Mahout to run a machine learning algorithm on Hadoop
      4. Describe the data science life cycle
      5. Use Pig to transform and prepare data on Hadoop
      6. Write a Python script
      7. Use NumPy to analyze big data
      8. Use the data structure classes in the pandas library
      9. Write a Python script that invokes SciPy machine learning
      10. Describe options for running Python code on a Hadoop cluster
      11. Write a Pig User-Defined Function in Python
      12. Use Pig streaming on Hadoop with a Python script
      13. Write a Python script that invokes scikit-learn
      14. Use the k-nearest neighbor algorithm to predict values
      15. Run a machine learning algorithm on a distributed data set
      16. Describe use cases for Natural Language Processing (NLP)
      17. Perform sentence segmentation on a large body of text
      18. Perform part-of-speech tagging
      19. Use the Natural Language Toolkit (NLTK)
      20. Describe the components of a Spark application
      21. Write a Spark application in Python
      22. Run machine learning algorithms using Spark MLlib
      23. Take data science into production
      24. Labs:
        1. Setting Up a Development Environment
        2. Using HDFS Commands
        3. Using Mahout for Machine Learning
        4. Getting Started with Pig
        5. Exploring Data with Pig
        6. Using the IPython Notebook
        7. Data Analysis with Python
        8. Interpolating Data Points
        9. Define a Pig UDF in Python
        10. Streaming Python with Pig
        11. K-Nearest Neighbor and K-Means Clustering
        12. Using NLTK for Natural Language Processing
        13. Classifying Text using Naive Bayes
        14. Spark Programming and Spark MLlib
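
For a flavor of outline item 14 (using the k-nearest neighbor algorithm to predict values), here is a minimal sketch in plain Python. It is an illustration only and not taken from the course materials: the function name `knn_predict`, the toy data set, and the choice of majority-vote classification with Euclidean distance are all assumptions. The course itself lists scikit-learn for this topic.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest points.

    `train` is a list of (features, label) pairs; `query` is a feature tuple.
    (Hypothetical helper written for this illustration.)
    """
    # Order the training points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Take the labels of the k closest points and return the most common one.
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labeled "a" and "b".
train = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.2, 4.9), "b"),
    ((4.8, 5.1), "b"),
]

prediction = knn_predict(train, (1.1, 1.0), k=3)
print(prediction)
```

To predict a numeric value instead of a class label, the same neighbor search could be paired with an average of the neighbors' values rather than a vote.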

Do you have the right background for HDP Analyst: Data Science?

Skills Assessment

We ensure your success by asking all students to take a FREE Skill Assessment test. These short, instructor-written tests are an objective measure of your current skills that help us determine whether or not you will be able to meet your goals by attending this course at your current skill level. If we determine that you need additional preparation or training in order to gain the most value from this course, we will recommend cost-effective solutions that you can use to get ready for the course.

Our required skill assessments ensure that:

  1. All students in the class are at a comparable skill level, so the class runs smoothly and beginners do not slow it down for everyone else.
  2. NetCom students enjoy one of the industry's highest success rates, and one of the highest pass rates when a certification exam is involved.
  3. We stay committed to providing you real value. Again, your success is paramount; we will register you only if you have the skills to succeed.

This assessment is for your benefit and is best taken without any preparation or reference materials, so your skills can be objectively measured.

Take your FREE Skill Assessment test »

Award-winning, world-class instructors

Our instructors are passionate about teaching and are experts in their respective fields. The average NetCom instructor has many years of real-world experience and shares that knowledge with our students every day. See our world-class instructors.

Client Testimonials & Reviews about their Learning Experience

We are passionate about delivering the best learning experience for our students, and they are happy to share their experiences with us.
Read what students had to say about their experience at NetCom.
