Data Analysis with Python

PYPDA

This course aims to equip delegates with a practical grounding in statistics and a substantial knowledge of Python libraries (NumPy, Pandas, Matplotlib and others), allowing them to engineer enterprise-level solutions in a data-driven environment.

This course is aimed at:

This course suits delegates who already have good programming experience and a sufficient mathematical background and want to develop a solid statistical grounding for data analysis in Python. It equally suits those with good statistical knowledge and some Python who want to develop extensive practical skills with Python data analysis packages. It will benefit anyone who requires a solid theoretical and practical foundation in Data Analysis or Data Science (Machine Learning and Artificial Intelligence) in Python.

Course Objectives

  • Perform numerical calculations and simulations using the Python NumPy library
  • Read, explore, manipulate and process tabular data from various sources, including Excel and CSV files, using Pandas and other libraries
  • Visualise and generally explore data using Matplotlib and Seaborn
  • Carry out descriptive statistical summaries on data in Python
  • Calculate inferential statistics, including point estimates and confidence intervals
  • Design and carry out relevant hypothesis tests in Python
  • Interpret graphs and statistical results correctly
  • Defend the methods used based on sound statistical principles

Course Content:

Statistics

  • Understand basic concepts of probability
  • Understand and apply conditional probability and Bayes' theorem in simple cases
  • Calculate basic descriptive statistical measures such as
    • Measures of Central Tendency: Mean, Median, Mode
    • Measures of Dispersion: Variance, Standard Deviation and Quantiles
  • Understand and use discrete and continuous probability distributions including
    • Binomial, Poisson, Normal, t-Distribution, Chi-Square, F-Distribution
  • For inferential statistics, understand and/or calculate
    • Sampling bias
    • Sampling distributions
    • Confidence intervals
    • Hypothesis testing
  • Understand and perform basic Linear Regression
  • Produce various visual representations (or plots) of data
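
As an illustration of the kind of exercise these statistics topics lead to, the following is a minimal sketch that computes descriptive measures, a 95% confidence interval and a one-sample t-test. The sample is simulated, and the hypothesised mean of 50 and the variable names are assumptions made for this example only; it is not taken from the course material.

```python
import numpy as np
from scipy import stats

# Hypothetical sample: simulated measurements, not real data.
rng = np.random.default_rng(seed=42)
sample = rng.normal(loc=50.0, scale=5.0, size=30)

# Descriptive statistics: central tendency and dispersion.
mean = sample.mean()
median = np.median(sample)
std = sample.std(ddof=1)  # ddof=1 gives the sample standard deviation

# 95% confidence interval for the mean, based on the t-distribution.
sem = stats.sem(sample)  # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, len(sample) - 1, loc=mean, scale=sem)

# One-sample, two-sided t-test of H0: population mean equals 50.
t_stat, p_value = stats.ttest_1samp(sample, popmean=50.0)

print(f"mean={mean:.2f}, median={median:.2f}, std={std:.2f}")
print(f"95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
print(f"t={t_stat:.3f}, p={p_value:.3f}")
```

The confidence interval and the test agree by construction: the hypothesised mean lies inside the 95% interval exactly when the two-sided p-value is at least 0.05.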

Python

  • Create and manipulate NumPy arrays
  • Use NumPy vectorized functions
  • Use random number generators and perform simple simulations
  • Read CSV, Excel and other format data into Pandas DataFrame objects
  • Clean, group, manipulate and summarise tabular data using Pandas data processing features
  • Use the scipy.stats module for statistical calculations
  • Plot bar and pie charts, line graphs, box plots, histograms and scatterplots using Matplotlib and Seaborn
  • Use Jupyter Notebook
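
To give a flavour of the Python topics listed above, here is a short sketch that reads CSV data into a Pandas DataFrame, cleans a missing value, applies a vectorized NumPy calculation and summarises by group. The column names and figures are invented for illustration, and the CSV is held in a string so the example runs without an external file.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV content; in practice this would be a file path.
csv_text = """region,units,price
North,10,2.5
South,4,3.0
North,6,2.5
South,,3.0
"""

# Read the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Clean: treat the missing unit count as zero (an assumption for this example).
df["units"] = df["units"].fillna(0)

# Vectorized calculation: element-wise product without an explicit loop.
df["revenue"] = np.multiply(df["units"].to_numpy(), df["price"].to_numpy())

# Group and summarise revenue per region.
summary = df.groupby("region")["revenue"].agg(["sum", "mean"])
print(summary)
```

Filling the missing value with zero is only one possible cleaning choice; dropping the row with `dropna` would change the South mean.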

£2,795.00 ex. VAT

Data sheet

Course Duration: 5 Days
Location: London