About this Course:

Data Scientists enjoy one of the top-paying jobs, with an average salary of $120,000 according to Glassdoor and Indeed. That's just the average! And it's not just about money - it's interesting work too!

If you've got some programming or scripting experience, this course will teach you the techniques used by real data scientists and machine learning practitioners in the tech industry - and prepare you for a move into this hot career path.

If you do not have any programming experience, a learn-to-code Python bootcamp runs at the start of the course. It costs $100 and teaches the fundamentals of Python so you can keep up with the class. All students who do not know Python are required to attend.

Each concept is introduced in plain English, avoiding confusing mathematical notation and jargon. It’s then demonstrated using Python code you can experiment with and build upon, along with notes you can keep for future reference. You won't find academic, deeply mathematical coverage of these algorithms in this course - the focus is on practical understanding and application of them. At the end, you'll be given a final project to apply what you've learned!

What am I going to get from this course?

• Develop using IPython notebooks
• Understand statistical measures such as standard deviation
• Visualize data distributions, probability mass functions, and probability density functions
• Visualize data with matplotlib
• Apply conditional probability for finding correlated features
• Use Bayes' Theorem to identify false positives
• Make predictions using linear regression, polynomial regression, and multivariate regression
• Understand complex multi-level models
• Use train/test and K-Fold cross validation to choose the right model
• Build a spam classifier using Naive Bayes
• Use decision trees to predict hiring decisions
• Cluster data using K-Means clustering and classify data using Support Vector Machines (SVM)
• Build a movie recommender system using item-based and user-based collaborative filtering
• Predict classifications using K-Nearest-Neighbor (KNN)
• Apply dimensionality reduction with Principal Component Analysis (PCA) to classify flowers
• Understand reinforcement learning - and how to build a Pac-Man bot
• Clean your input data to remove outliers
• Implement machine learning, clustering, and search using TF/IDF at massive scale with Apache Spark's MLlib
• Design and evaluate A/B tests using T-Tests and P-Values
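
To give a feel for the plain-English-then-Python style described above, here is a minimal sketch of the Bayes' Theorem item from the list. It is not taken from the course materials: the function name and the numbers (a 1% base rate, a 99% sensitive test, a 5% false positive rate) are hypothetical and chosen only for illustration.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive test) using Bayes' Theorem."""
    # Total probability of a positive result:
    # P(pos) = P(pos | cond) * P(cond) + P(pos | no cond) * P(no cond)
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    # Bayes' Theorem: P(cond | pos) = P(pos | cond) * P(cond) / P(pos)
    return sensitivity * prior / p_positive


# Hypothetical numbers, for illustration only.
posterior = posterior_given_positive(
    prior=0.01,                # 1% of people have the condition
    sensitivity=0.99,          # the test catches 99% of true cases
    false_positive_rate=0.05,  # 5% of healthy people still test positive
)
print(f"P(condition | positive test) = {posterior:.3f}")
```

Under these assumed numbers, only about one positive result in six is a true positive, even though the test looks accurate - most positives are false positives because the condition is rare.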