Data Science Introduction (Python)

This course bridges the gap between programming and data science, introducing the core data science libraries, web development and basic statistics.

  • TBD
  • 6 sessions
  • 2.5 hours/session
  • Data Science

About this course

This course focuses on specialised Python libraries for data retrieval and visualisation. Statistical concepts are also taught to help students understand the more advanced ideas in data science.

As part of the course, students will learn three key data libraries: pandas, NumPy and Matplotlib. These libraries handle data loading, mathematical computation and data visualisation respectively.

Students will also receive course materials that serve as a useful reference for post-course revision and practice. By the end of the course, students will have learnt the basics of web service development and will be equipped with the basic tools needed to start a career in data science.


A level up from our Python Development course, our Data Science Introduction (Python) course teaches students to build simple yet functional web applications. Working within the parameters set by their instructor, students will build three web applications in this project: a spam email classifier, a taxi availability predictor and an air pollution index predictor.

Course Plan

Students will be taught how to use Python to do file processing, including reading in tabular (CSV, Excel) data.
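
As a rough illustration of reading tabular data, the sketch below loads a small CSV with pandas. The column names and values are invented for the example, and the data is inlined so the code runs without a file on disk; it is not taken from the course materials.

```python
import io

import pandas as pd

# Hypothetical CSV content, inlined so the example needs no file on disk.
CSV_TEXT = """name,age,fare
Alice,29,72.5
Bob,,8.05
Carol,41,13.0
"""


def load_table(source):
    """Read CSV data from a path or file-like object into a DataFrame."""
    return pd.read_csv(source)


df = load_table(io.StringIO(CSV_TEXT))
print(df.shape)                 # (rows, columns)
print(df.dtypes)                # column types inferred by pandas
print(df["age"].isna().sum())   # number of missing ages
```

Passing a file path instead of the `StringIO` object works the same way. Excel files can be read with `pd.read_excel`, which needs an extra engine package such as openpyxl installed.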

You will analyse the historical Titanic data set to estimate a passenger's probability of surviving the disaster. We will use Python to explore how factors such as gender, age and the number of children travelling with a passenger relate to the chance of survival.
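
One possible way to start this kind of exploration is to compare survival rates across groups. The rows below are made up for illustration and are not real passenger records; the column names (`sex`, `age`, `survived`) are assumptions, and a real data set may label them differently.

```python
import pandas as pd

# Made-up rows for illustration only; these are not real passenger records.
passengers = pd.DataFrame(
    {
        "sex": ["female", "male", "female", "male", "male", "female"],
        "age": [29, 35, 4, 54, 22, 61],
        "survived": [1, 0, 1, 0, 1, 0],
    }
)

# The mean of a 0/1 column is the proportion of survivors in each group.
survival_by_sex = passengers.groupby("sex")["survived"].mean()

# Bucket ages into children and adults, then compare survival rates.
passengers["age_group"] = pd.cut(
    passengers["age"], bins=[0, 16, 120], labels=["child", "adult"]
)
survival_by_age = passengers.groupby("age_group", observed=True)["survived"].mean()

print(survival_by_sex)
print(survival_by_age)
```

Group proportions like these describe the sample; turning them into a prediction for an individual passenger needs a model, which later sessions build towards.
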

In this class, we will use Python to dive deep into data sets and carry out analysis. We will cover basic statistics such as mean, median and mode; linear and polynomial regression models; and probability theory, including Bayes' theorem (conditional probability).
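
The summary statistics and Bayes' theorem mentioned above can be sketched with the Python standard library alone. The sample values and the spam probabilities below are assumed numbers chosen for the example, and `bayes_posterior` is a hypothetical helper name.

```python
import statistics

# Hypothetical sample, e.g. the ages of seven passengers.
ages = [22, 25, 25, 31, 38, 40, 64]

mean_age = statistics.mean(ages)      # arithmetic average
median_age = statistics.median(ages)  # middle value of the sorted sample
mode_age = statistics.mode(ages)      # most frequent value


def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A | B) given P(A), P(B | A) and P(B | not A)."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Assumed numbers: 20% of emails are spam; the word "prize" appears in
# 50% of spam emails and in 5% of legitimate ones.
p_spam_given_prize = bayes_posterior(0.2, 0.5, 0.05)

print(mean_age, median_age, mode_age)
print(round(p_spam_given_prize, 3))
```

With these assumed inputs, seeing the word "prize" raises the probability of spam from the 20% prior to roughly 71%.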

We will use Singapore's GDP and birth rate data to build a regression model that predicts the country's performance from its birth rate.
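
A minimal sketch of linear and polynomial regression using NumPy's `polyfit` follows. The numbers are invented for illustration and are not real Singapore statistics; the variable names are assumptions as well.

```python
import numpy as np

# Made-up numbers for illustration; not real Singapore statistics.
birth_rate = np.array([8.0, 9.0, 10.0, 11.0, 12.0])
gdp_growth = np.array([2.1, 2.9, 4.2, 4.8, 6.0])

# Linear model: gdp_growth ≈ slope * birth_rate + intercept
slope, intercept = np.polyfit(birth_rate, gdp_growth, deg=1)

# Polynomial (degree 2) model fitted to the same points.
quad_coeffs = np.polyfit(birth_rate, gdp_growth, deg=2)


def predict(coeffs, x):
    """Evaluate a fitted polynomial (highest power first) at x."""
    return np.polyval(coeffs, x)


linear_prediction = predict([slope, intercept], 10.5)
quadratic_prediction = predict(quad_coeffs, 10.5)

print(slope, intercept)
print(linear_prediction, quadratic_prediction)
```

A higher-degree polynomial always fits the training points at least as closely as the straight line, but that does not mean it predicts new data better, so the two models are worth comparing on data held back from fitting.
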

While analysing, visualising and predicting data are valuable skills, a modern data scientist also needs to understand front-end interface design and back-end HTTP web technology. In this class, you will learn full-stack web development so that you can build your data science project in a way that lets other people interact with your code.

Digressing a little from statistics, we will look at how to build web servers using Python. This lesson covers HTML, CSS and JavaScript, served by a Python Flask web server. You will be able to build a functional web application that lets users interact with your code.
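
To give a feel for how a Flask server can expose Python code over HTTP, here is a minimal sketch. The routes, the `predict_score` function and the query parameter name are all hypothetical; `predict_score` merely stands in for whatever model a project would actually use.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_score(value):
    """Hypothetical stand-in for a trained model: doubles its input."""
    return 2 * value


@app.route("/")
def index():
    # A real project would render an HTML template styled with CSS here.
    return "<h1>Data Science Demo</h1><p>Try /predict?value=3</p>"


@app.route("/predict")
def predict():
    # Read ?value=... from the query string, defaulting to 0.
    value = request.args.get("value", default=0.0, type=float)
    return jsonify({"input": value, "prediction": predict_score(value)})
```

Saved as, say, `app.py`, this can be started with Flask's development server (`flask --app app run`) and opened in a browser. The development server is meant for local experiments only.
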

A major problem that companies and data scientists face is the lack of viable, ready-made data sets. In this lesson, students learn how to build, create and clean up data sets for their own use.

Building on last week's lesson, you will learn to write scripts that mine web data by crawling HTML pages and use the results to build a database of information. Working with IMDb's website, we will scrape useful information and data for our own project. We will also explore other ways to obtain data sets.
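
The idea of pulling structured records out of HTML can be sketched with the standard library's `HTMLParser`. The markup below is hypothetical and does not reflect IMDb's actual page structure; the class names `title` and `rating` are assumptions, and the course may well use a different scraping library.

```python
from html.parser import HTMLParser

# Hypothetical markup; a real page's structure will differ and can change.
SAMPLE_HTML = """
<ul>
  <li class="movie"><span class="title">Movie A</span><span class="rating">8.1</span></li>
  <li class="movie"><span class="title">Movie B</span><span class="rating">6.4</span></li>
</ul>
"""


class MovieParser(HTMLParser):
    """Collect (title, rating) pairs from <span class="title|rating"> tags."""

    def __init__(self):
        super().__init__()
        self.movies = []
        self._field = None   # which span we are currently inside, if any
        self._current = {}   # fields gathered for the current <li>

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("title", "rating"):
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            self.movies.append(
                (self._current["title"], float(self._current["rating"]))
            )
            self._current = {}


parser = MovieParser()
parser.feed(SAMPLE_HTML)
movies = parser.movies
print(movies)
```

For a live page, the HTML would first be downloaded over HTTP. Before scraping any site, check its terms of use and its `robots.txt`.
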

In this lesson, we will look at how websites build recommendation services, e.g. Amazon's "You might also be interested in purchasing ...".

We will work with a movie database and learn how to use collaborative filtering to make recommendations. By the end of the class, we will be able to recommend movies based on how users have rated them.

For example, when a user rates "Star Wars: A New Hope" 10/10, our code will be able to recommend "Star Wars: The Empire Strikes Back".
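
A simple form of this is item-based collaborative filtering: two movies count as similar when the same users rate them alike. The sketch below uses cosine similarity on a made-up ratings matrix. The ratings and the two non-Star-Wars titles are invented, and treating an unrated movie as a rating of 0 is a simplification for the example.

```python
import numpy as np

movies = [
    "Star Wars: A New Hope",
    "Star Wars: The Empire Strikes Back",
    "Romance X",
    "Comedy Y",
]

# Made-up ratings (rows = users, columns = movies); 0 means "not rated".
ratings = np.array(
    [
        [10, 9, 2, 0],
        [9, 10, 0, 3],
        [1, 2, 9, 8],
        [2, 0, 10, 9],
    ],
    dtype=float,
)


def item_similarity(matrix):
    """Cosine similarity between every pair of movie columns."""
    norms = np.linalg.norm(matrix, axis=0)
    return (matrix.T @ matrix) / np.outer(norms, norms)


def recommend(liked_title, top_n=1):
    """Return the top_n movies most similar to liked_title."""
    sim = item_similarity(ratings)
    idx = movies.index(liked_title)
    scores = sim[idx].copy()
    scores[idx] = -np.inf  # never recommend the movie itself
    best = np.argsort(scores)[::-1][:top_n]
    return [movies[i] for i in best]


print(recommend("Star Wars: A New Hope"))
```

In this toy matrix the users who rate one Star Wars film highly also rate the other highly, so the two columns point in nearly the same direction and the sequel comes out as the top recommendation.
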

For the last lesson, we will combine everything we have learnt so far to build three different projects:

1. A spam email classifier web application. Users will be able to paste an email into the web application we build, and the application will predict whether the email is spam.

2. A Singapore Pollutant Standards Index (PSI) calculator that predicts air quality. Using historical data, we will apply regression modelling to predict the country's PSI value on a given day.

3. A taxi availability predictor. We will use probability and Python to visualise Singapore's taxi availability data set. Students will be able to predict the availability of taxis on the road at a given time.
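
As one example of what the first project might involve, the sketch below classifies short messages as spam or not. The course description does not say which method is used; a naive Bayes classifier is assumed here because it builds directly on Bayes' theorem from the statistics session. The training messages are made up and far too few for real use.

```python
import math
from collections import Counter

# Tiny made-up training set; a real classifier needs far more data.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(examples):
    """Count words per label and messages per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the higher log-probability."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_messages)
        for word in text.lower().split():
            count = word_counts[label][word] + 1  # add-one (Laplace) smoothing
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, label_counts = train(TRAINING)
print(classify("claim your free prize", word_counts, label_counts))
print(classify("agenda for the team meeting", word_counts, label_counts))
```

In a web application, a function like `classify` would be called from a request handler with the text the user submitted.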


This course requires a basic understanding of Python. You should either have completed our Python Development course or already have an intermediate-level understanding of Python.

If you have not taken our Python Development course, a Student Affairs Officer will reach out to you upon registration to confirm your proficiency in Python.

There are no open runs for this course at the moment. If you're interested in taking this course, you may join the waitlist and you will be notified when there are vacancies.

UpCode Academy

Attend coding classes taught by true experts working in the industry. Get practical instructions and interact with these practitioners during the classes.

  • 30 Courses
  • 1,349 Students
  • 30 Instructors
