Virtual Open Day! 19 November @1PM GMT | Register Now!
10 Credits

Web Scraping, and end-to-end data pipelines

Please note: to take this course, you must first have completed Advanced Machine Learning & Programming in Python.

This course is an introduction to web scraping. It covers basic scraping techniques, reviews common Python libraries and frameworks, explains how to extract data from HTML and XML, and discusses more advanced techniques. It also covers the basics of data engineering, including data ingestion, cleaning, and transformation, as well as data storage and retrieval.
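As a rough illustration of the kind of end-to-end flow described above (extract from HTML, clean, store, retrieve), here is a minimal sketch. It is not taken from the course materials: the sample page, the `ItemParser` class, the `clean` and `run_pipeline` helpers, and the `items` table are all hypothetical, and it uses only the Python standard library rather than the libraries taught in the module.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this instead.
SAMPLE_HTML = """
<html><body>
<ul>
  <li class="item"><span class="name"> Widget </span><span class="price">$1,200.50</span></li>
  <li class="item"><span class="name">Gadget</span><span class="price"> $35.00 </span></li>
</ul>
</body></html>
"""


class ItemParser(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per <li class="item">
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "item":
            self.rows.append({})
        elif tag == "span" and attrs.get("class") in ("name", "price"):
            self._field = attrs["class"]

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None

    def handle_data(self, data):
        # Only keep text that sits inside a tracked <span>.
        if self._field and self.rows:
            row = self.rows[-1]
            row[self._field] = row.get(self._field, "") + data


def clean(row):
    """Cleaning/transformation step: trim whitespace, parse the price."""
    name = row["name"].strip()
    price = float(row["price"].strip().lstrip("$").replace(",", ""))
    return name, price


def run_pipeline(html, conn):
    """Ingest -> clean -> store -> retrieve, against a SQLite connection."""
    parser = ItemParser()
    parser.feed(html)
    records = [clean(r) for r in parser.rows]
    conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT, price REAL)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", records)
    conn.commit()
    return conn.execute(
        "SELECT name, price FROM items ORDER BY price"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(run_pipeline(SAMPLE_HTML, conn))
```

In practice, a library such as BeautifulSoup would typically replace the hand-written parser, and an orchestrator such as Apache Airflow would schedule and monitor each step.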

This module can be taken as part of a PG Certificate, PG Diploma or Full Masters Program.

Web Scraping, and end-to-end data pipelines
  • 10 Credits
  • 100 hours of study
  • 15 contact hours
  • 85 hours for private study
  • Qualifications accredited by Lancaster University
  • Buildable Qualifications
  • Learn Around Your Schedule
  • World-Class Faculty
  • Fully Online

Structure

Software

Python

Apache Airflow

Module Programme

Introduction to Web Scraping

Session Content
  • Definition and purpose
  • Legality and Ethical Considerations
  • Internet in a nutshell
  • Basics of HTML
  • Introduction to BeautifulSoup

Advanced web scraping

Session Content
  • Reading documents
  • Data extraction forms and logins
  • Introduction to Selenium
  • Dynamic websites and JavaScript
  • Web scraping best practices

Introduction to Data Engineering

Session Content
  • What is data engineering
  • Data engineering in organizations
  • Main data engineering architecture
  • The process of data transformation

Apache Airflow and data orchestration

Session Content
  • Introduction to orchestrators and Apache Airflow
  • Airflow main concepts
  • Create and run automated data pipelines
  • Monitoring and debugging pipelines

Prerequisites

English Language Requirements

Both Programmes are open to applicants anywhere in the world. We may ask applicants to provide a recognised English language qualification, dependent upon their nationality and where they have studied/worked previously.

The requirement is an IELTS (Academic) Test with an overall score of at least 6.5, and a minimum of 6.0 in each element of the test. We will also consider other English language qualifications. Applicants whose score is below our requirements may be eligible for one of Lancaster University's pre-sessional English language programmes.

Academic Requirements

Applicants to the Postgraduate Certificate of Achievement, Postgraduate Certificate, Postgraduate Diploma or full MSc in either programme require an upper second-class degree in economics, econometrics or a related subject.

Learning Outcomes

Key Skills
  • Data Engineering Basics: Fundamental understanding of data engineering principles
  • Data Pipeline Setup: Ability to set up data pipelines for efficient data processing
  • Web Scraping Techniques: Skills in collecting data from the web using scraping methods
  • Tool Proficiency: Familiarity with tools necessary for data engineering tasks
  • Data Collection from the Web: Competence in extracting data from web sources
  • Practical Application: Applying data engineering skills to real-world scenarios
  • Data Processing Skills: Handling and processing data efficiently within pipelines
  • Understanding Tools for Data Engineering: Knowledge of tools essential for data engineering tasks
Desired Skills
  • Understand the basics of web scraping and its applications
  • Extract data from HTML and XML using web scraping techniques
  • Use common libraries and frameworks for web scraping in Python, such as BeautifulSoup and others
  • Handle advanced web scraping challenges, such as dealing with dynamic websites and avoiding detection
  • Communicate effectively about web scraping techniques and their applications
  • Understand the basics of data engineering and the role of end-to-end data pipelines in data analytics
  • Design and implement data pipelines for data ingestion, cleaning, and transformation
  • Use common tools and technologies for building data pipelines, such as Apache Airflow
  • Monitor and troubleshoot data pipelines for performance and reliability
  • Communicate effectively about data engineering concepts and techniques.

Frequently Asked Questions

Are the courses within either programme conducted synchronously or asynchronously?

All sessions are conducted live and online at a scheduled time, and are also recorded. Students may attend live and watch the recordings afterwards to recap the material, or watch the recordings only if they are unable to attend live. We always advise students to attend live where possible, as this gives them the best opportunity to engage with the content and ask the lecturer questions.

Is all examination undertaken online or in-person?

All modules are examined through online coursework submissions. You will have the support of your module lecturer/tutor in this process.

Do I need to buy any statistical/econometric software?

No, all necessary software is provided to students.

What do I do if I can't attend a course live?

All courses are recorded and available on the LUMS internet platform throughout the current academic year. They can therefore be viewed 24 hours a day.

A Collaboration Like No Other

Timberlake Consultants and the Economics department of Lancaster University Management School (LUMS) have a longstanding partnership, combining 40+ years of industry expertise with over 50 years of academic excellence. We are delighted to build on this with our micro-credential postgraduate courses.

Apply Now
Timberlake Postgraduate Courses
Flexible learning
Online postgraduate courses
Specialist courses
Best-in-field experts
Tailored learning
Career-oriented education