Introduction to Data Science: Building Recommender Systems
Data scientists build information platforms to ask and answer previously unimaginable questions. Learn how data science helps companies reduce costs, increase profits, improve products, retain customers, and identify new opportunities.
This three-day course helps participants understand what data scientists do and the problems they solve. Through in-class simulations, participants apply data science methods to real-world challenges in different industries and, ultimately, prepare for data scientist roles in the field.
- » This course is suitable for Developers, Data Analysts, Data Scientists, and Statisticians with basic knowledge of Apache Hadoop: HDFS, MapReduce, Hadoop Streaming, and Apache Hive. Students should be proficient in a scripting language; Python is strongly preferred, but familiarity with Perl or Ruby is sufficient.
- » The role of data scientists, vertical use cases, and business applications of data science
- » Where and how to acquire data, methods for evaluating source data, and data transformation and preparation
- » Types of statistics and analytical methods and their relationship
- » Machine learning fundamentals and breakthroughs, the importance of algorithms, and data as a platform
- » How to implement and manage recommenders using Apache Mahout and how to set up and evaluate data experiments
- » Steps for deploying new analytics projects to production and tips for working at scale
- Data Science
- What is Data Science?
- Growing Need for Data Science
- Role of a Data Scientist
- Use Cases
- Defense and Intelligence
- Telecommunications and Utilities
- Healthcare and Pharmaceuticals
- Project Life Cycle
- Steps in the Project Life Cycle
- Data Acquisition
- Where to Source Data
- Acquisition Techniques
- Evaluating Input Data
- Data Formats
- Data Quantity
- Data Quality
- Data Transformation
- File Format Conversion
- Joining Datasets
- Data Analysis and Statistical Methods
- Relationship Between Statistics and Probability
- Descriptive Statistics
- Inferential Statistics
- Fundamentals of Machine Learning
- Three Cs of Machine Learning
- Spotlight: Naïve Bayes Classifiers
- Importance of Data and Algorithms
- What is a Recommender System?
- Types of Collaborative Filtering
- Limitations of Recommender Systems
- Fundamental Concepts
- Apache Mahout
- What Apache Mahout is (and is not)
- History of Mahout
- Availability and Installation
- Demonstration: Using Mahout's Item-Based Recommender
- Implementing Recommenders with Apache Mahout
- Similarity Metrics for Binary Preferences
- Similarity Metrics for Numeric Preferences
- Experimentation and Evaluation
- Measuring Recommender Effectiveness
- Designing Effective Experiments
- Conducting an Effective Experiment
- User Interfaces for Recommenders
- Production Deployment and Beyond
- Deploying to Production
- Tips and Techniques for Working at Scale
- Summarizing and Visualizing Results
- Considerations for Improvement
- Next Steps for Recommenders
We ensure your success by asking all students to take a FREE Skill Assessment test.
These short, instructor-written tests are an objective measure of your current skills that help us determine whether or not you will be able to meet your goals by attending this course at your current skill level. If we determine that you need additional preparation or training in order to gain the most value from this course, we will recommend cost-effective solutions that you can use to get ready for the course.
Our required skill-assessments ensure that:
- All students in the class are at a comparable skill level, so the class runs smoothly and beginners do not slow it down for everyone else.
- NetCom students enjoy one of the industry's highest success rates and, when a certification exam is involved, pass rates.
- We stay committed to providing you real value. Again, your success is paramount; we will register you only if you have the skills to succeed.
This assessment is for your benefit and best taken without any preparation or reference materials, so your skills can be objectively measured.
Jose Marcial Portilla has a BS and MS in Mechanical Engineering from Santa Clara University. He is skilled at analyzing data, specifically using Python and a variety of modules and libraries. He hopes to use his experience in teaching and data science to help other people learn the power of the Python programming language and its ability to analyze data, as well as present the data in clear and beautiful visualizations. He is the creator of some of the most popular Python courses on Udemy, including "Learning Python for Data Analysis and Visualization" and "The Complete Python Bootcamp". With almost 30,000 enrollments, Jose has been able to teach Python and its data science libraries to thousands of students. Jose is also a published author, having recently written "NumPy Succinctly" for Syncfusion's series of e-books.