Data Science Introduction (Python)

The course bridges the gap between programming and data science, introducing the core data science libraries, web development and basic statistics.

  • 3rd week of June, 2020
  • 6 sessions
  • 2.5 hours/session
  • Data Science
Data Science Introduction (Python) Skillsfuture Courses / skills future courses - pay with skillsfuture credit
U.P. $600
$200

What is Virtual Classroom?

This course will be taught using a combination of video presentation, video online conferencing, screen sharing and virtual technology. We replicate the complete experience of a classroom right inside your living room!

During the virtual classroom lesson, you will be given a link to call in and speak to the instructor at the scheduled times. The instructor will guide the lesson through prepared videos and hands-on teaching, so that you have a complete learning experience.

Learn from the comfort and safety of your own home during the COVID-19 pandemic!


A level up from our Python Development course, our Data Science Introduction (Python) course will teach students to build simple yet functional web applications. Working within the parameters set by the individual instructors, students will build three web applications in this fun project: a spam email classifier, a taxi availability predictor and an air pollution index predictor.


15h 00m Virtual Class
11h 45m Assignment

Welcome to the Data Science Introduction course! To prepare for this course, you need to install Anaconda (which includes Jupyter Notebook) and the Visual Studio Code editor.

Students will be taught how to use Python to do file processing, including reading in tabular (CSV, Excel) data.
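As a rough illustration of reading tabular data, the sketch below loads CSV text with Pandas. The column names and values are made up for this example, and an in-memory string stands in for a real file on disk; the course's own files may look different.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk; columns and values are made up for illustration.
csv_text = """name,age,city
Alice,34,Singapore
Bob,29,Jakarta
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # number of rows and columns
print(df.dtypes)  # the type Pandas inferred for each column
```

With a real file you would pass its path instead of the `StringIO` object. Pandas also provides `pd.read_excel` for Excel workbooks, which needs an extra engine package such as openpyxl installed.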

You will analyse historical Titanic data sets to predict the probability of a person surviving the tragedy. We will use Python to explore how gender, age, the number of children travelling with a passenger and other factors relate to the chance of survival.

In this exercise, you will calculate the probability of a Titanic passenger surviving based on the passenger's cabin class. You will use Python and Pandas to derive the probability and visualise the data in a Jupyter notebook.
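One way to approach this kind of calculation is a group-by on the class column. The sketch below uses a tiny made-up sample, and the column names `Pclass` and `Survived` are assumptions borrowed from a commonly used version of the Titanic data set; the data set used in class may name them differently.

```python
import pandas as pd

# Tiny made-up sample; the real exercise uses a full Titanic data set.
passengers = pd.DataFrame({
    "Pclass":   [1, 1, 2, 2, 3, 3, 3, 3],
    "Survived": [1, 1, 1, 0, 0, 0, 1, 0],
})

# The mean of a 0/1 column is the fraction of 1s,
# so this gives the survival rate within each class.
survival_by_class = passengers.groupby("Pclass")["Survived"].mean()

print(survival_by_class)
```

In a Jupyter notebook, `survival_by_class.plot(kind="bar")` would draw these rates as a bar chart, provided Matplotlib is installed.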


Data Engineering is an important part of Data Science. In this exercise, you are given a set of data that has missing fields. Your task is to perform data engineering to approximate and fix the missing fields.
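The exercise does not say which technique to use, so the sketch below shows one common option only: filling numeric gaps with the column median and categorical gaps with the most frequent value. The column names and values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({
    "age":  [22.0, np.nan, 30.0, 40.0],
    "port": ["S", "C", None, "S"],
})

# Numeric column: approximate the gap with the median of the known values.
df["age"] = df["age"].fillna(df["age"].median())

# Categorical column: approximate the gap with the most frequent value.
df["port"] = df["port"].fillna(df["port"].mode()[0])

print(df)
```

The median is less sensitive to outliers than the mean, which is why it is often preferred for skewed columns such as age or fare.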

In this class, we will use Python to dive deep into data sets and carry out analysis. We will cover basic statistics such as mean, median and mode, look at linear and polynomial regression models, and go into probability theory, including Bayes' theorem (conditional probability).
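To make these terms concrete, here is a small sketch using only Python's standard library. The numbers, and the helper name `bayes_posterior`, are made up for illustration.

```python
import statistics

scores = [2, 3, 3, 5, 7]

mean = statistics.mean(scores)      # sum divided by count
median = statistics.median(scores)  # middle value once sorted
mode = statistics.mode(scores)      # most frequent value


def bayes_posterior(prior, likelihood, false_positive_rate):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    # P(B) by the law of total probability over A and not-A.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Made-up figures: 20% of emails are spam; the word "free" appears
# in 60% of spam emails and in 5% of non-spam emails.
p_spam_given_free = bayes_posterior(prior=0.2, likelihood=0.6,
                                    false_positive_rate=0.05)

print(mean, median, mode)
print(p_spam_given_free)
```

Under these made-up figures, seeing the word "free" raises the probability that an email is spam from 20% to 75%.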

We will use Singapore's GDP and birth rate data to build a regression model that predicts the country's performance based on its birth rate.
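A minimal sketch of fitting a regression model with NumPy follows. The numbers are synthetic and follow an exact quadratic; they are not real GDP or birth rate figures, and the course may use a different library for the fit.

```python
import numpy as np

# Synthetic data that follows y = 2x^2 + 1 exactly (not real GDP figures).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2 * x**2 + 1

# Least-squares fits: degree 1 is linear regression, degree 2 is polynomial.
linear_coeffs = np.polyfit(x, y, deg=1)
quad_coeffs = np.polyfit(x, y, deg=2)

# poly1d turns the coefficients into a callable prediction function.
predict = np.poly1d(quad_coeffs)

print(linear_coeffs)
print(quad_coeffs)
print(predict(5.0))
```

Because the synthetic data is exactly quadratic, the degree-2 fit recovers coefficients close to 2, 0 and 1, while the straight line cannot follow the curve. Real data is noisy, so the fit there will only be approximate.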

In this exercise, you will perform a polynomial regression using Python to predict the GDP of Singapore using data provided by

Analysing, visualising and predicting data are great, but a modern Data Scientist also needs to understand front-end interface design and back-end HTTP web technology. In this class, you will learn full-stack web development so that you can build your data science project in a way that lets other people interact with your code.

Digressing a little from statistics, we will look at how to build web servers using Python. This lesson will teach us HTML, CSS and JS, served by Python Flask web servers. You will be able to build a functional web application and let users interface with your code.

As a modern data scientist, it is important to understand web technology and use it to present and interface with our data science projects.

We will be building a web application where a user can submit an email, and the application will decide whether the email is spam or ham (non-spam).

In Part 1 of this exercise, we will be building a web application to interface with our data science application. We will build the code using Flask as a web server and create a front-end interface to submit details. Students are expected to create the correct routing with redirection, and a form that collects parameters from the user.
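The sketch below shows roughly what routing, a form and a redirect look like in Flask. The route names, the form field `email_text` and the result message are all hypothetical; the assignment's actual layout is set by the instructor.

```python
from flask import Flask, redirect, request, url_for

app = Flask(__name__)

# Minimal front end: a form that posts its text to /submit.
FORM_HTML = """
<form method="post" action="/submit">
  <textarea name="email_text"></textarea>
  <button type="submit">Check</button>
</form>
"""


@app.route("/")
def index():
    return FORM_HTML


@app.route("/submit", methods=["POST"])
def submit():
    # Form parameters sent by the browser arrive in request.form.
    text = request.form.get("email_text", "")
    # Redirect to the result page, passing the text length as a query parameter.
    return redirect(url_for("result", length=len(text)))


@app.route("/result")
def result():
    # type=int rejects anything that is not a number and falls back to 0.
    length = request.args.get("length", default=0, type=int)
    return f"Received {length} characters."
```

Assuming the code is saved as `app.py` (a hypothetical filename), `flask --app app run` starts a local development server. Redirecting after a POST means that refreshing the result page does not resubmit the form.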

A huge issue that companies and Data Scientists face is the lack of viable, ready-made data sets. In this lesson, we teach students how to build and clean up a data set for their own use.

Building on last week's lesson, you will learn how to write scripts that mine web data by crawling HTML pages and use it to build a database of information. Working with IMDb's website, we will scrape useful information and data for our own project. We will also explore other ways to get data sets.
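As a sketch of what extracting data from HTML involves, the example below pulls titles out of a page using only Python's standard library. The HTML snippet and its `title` class are made up and do not reflect the structure of any real website; the course may well use a dedicated scraping library instead.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collects the text inside every <h2 class="title"> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "h2" and ("class", "title") in attrs:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title and data.strip():
            self.titles.append(data.strip())


# Made-up page content standing in for a downloaded HTML page.
html = """
<div><h2 class="title">Movie A</h2><p>1999</p></div>
<div><h2 class="title">Movie B</h2><p>2004</p></div>
"""

parser = TitleCollector()
parser.feed(html)
print(parser.titles)
```

Before scraping a real site, check its terms of use and robots.txt to confirm that automated access is permitted.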

In Part 2, you will integrate the email spam classifier code into the back end of our web application. By the end of the exercise, you should have a fully functional web application that shows whether a text is spam or ham.
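The course page does not say how the classifier works. As an assumption for illustration, the sketch below uses a very small naive Bayes classifier built on word counts, with made-up training messages and hypothetical function names.

```python
import math
from collections import Counter


def train(messages):
    """messages is a list of (text, label) pairs; label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        n_words = sum(word_counts[label].values())
        # Start from the log prior, then add one log likelihood per word.
        score = math.log(label_counts[label] / total_messages)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing keeps unseen words from zeroing the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training messages for illustration only.
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with the team", "ham"),
]
model = train(training)

print(classify("free money", *model))
print(classify("team meeting tomorrow", *model))
```

In a web application, a function like `classify` would be called from the route that handles the submitted form, and its return value shown on the result page.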

For this lesson, we will look at how websites build recommendation services, e.g. Amazon's "You might also be interested in purchasing ...".

We will be working on a movie database and learning how to use collaborative filtering to make recommendations. At the end of the class, we will be able to correctly recommend movies based on their ratings.

For example, when a user rates "Star Wars: A New Hope" 10/10, our code will be able to recommend "Star Wars: The Empire Strikes Back".
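One simple form of collaborative filtering compares movies by how similarly users rated them. The sketch below uses item-to-item cosine similarity on a made-up ratings matrix; the third title and all ratings are invented, and the course may teach a different variant.

```python
import numpy as np

movies = [
    "Star Wars: A New Hope",
    "Star Wars: The Empire Strikes Back",
    "Notting Hill",
]

# Rows are users, columns are movies; all ratings are made up.
ratings = np.array([
    [10, 9, 2],
    [9, 10, 1],
    [2, 1, 9],
    [8, 9, 3],
], dtype=float)


def most_similar(movie_index):
    """Return the title whose rating column is closest by cosine similarity."""
    norms = np.linalg.norm(ratings, axis=0)
    # Dot product of the chosen column with every column, scaled by the norms.
    sims = ratings.T @ ratings[:, movie_index] / (norms * norms[movie_index])
    sims[movie_index] = -1.0  # never recommend the movie itself
    return movies[int(np.argmax(sims))]


print(most_similar(0))
```

Two movies score as similar when the same users rate both high or both low, which is why the two Star Wars titles pair up in this made-up data.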

For the last assignment, we will combine all that we learnt so far to build two different projects:

1. We will create a Singapore Pollutant Standards Index (PSI) value calculator to predict the air quality. Using historical data, we will use regression modelling to predict the PSI value of the country on a given day.

2. A taxi availability predictor. We will use Python to visualise Singapore's taxi availability data set and apply probability to it. Students will be able to predict the availability of taxis on the road at a given time.
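For the taxi predictor, one very simple probabilistic approach is to estimate, for each hour, how often the number of available taxis reached some threshold in the past. The observations, the threshold and the function name below are all made up for illustration.

```python
from collections import defaultdict

# Made-up observations: (hour of day, number of taxis available).
observations = [
    (8, 120), (8, 80), (8, 150),
    (18, 60), (18, 40), (18, 110), (18, 90),
]

# Hypothetical cut-off for calling taxis "readily available".
THRESHOLD = 100


def availability_probability(observations, threshold):
    """Fraction of past observations per hour that met the threshold."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for hour, count in observations:
        totals[hour] += 1
        if count >= threshold:
            hits[hour] += 1
    return {hour: hits[hour] / totals[hour] for hour in totals}


probs = availability_probability(observations, THRESHOLD)
print(probs)
```

With this made-up data, taxis were readily available in two out of three observations at 8 am but only one out of four at 6 pm. The PSI calculator can reuse the regression approach from the earlier lessons on historical readings.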

Cohort - 15 Jun, 2020
Current: 2
Max: 25

About this course

This course pays special attention to specialised Python libraries for data retrieval and visualisation techniques. Statistical knowledge will also be taught to help students understand the advanced concepts of Data Science.

As part of the course, students will also be taught key data libraries, namely Pandas, NumPy and Matplotlib; these libraries are responsible for data loading, mathematical computations and data visualisation respectively.

Students can also look forward to receiving course materials that will serve as useful reference for post-course revision and practice. At the end of the course, students will be taught the basics of web service development and will be equipped with the basic tools needed to start their career in data science.


This course requires a basic understanding of Python. You should either have sat and completed our Python Development Course or already have intermediate-level understanding of Python.

If you have not sat our Python Development course, a Student Affairs Officer will reach out to you upon registration to confirm your mastery of Python.
UpCode Academy

Attend coding classes taught by true experts working in the industry. Get practical instructions and interact with these practitioners during the classes.

  • 32 Courses
  • 1,362 Students
  • 32 Instructors
