This course is part of the UBC Micro-certificate in Natural Language Processing to Improve Patient Care. The program consists of three courses that can be taken individually or combined into the Micro-certificate.
Clinical notes, reports and assessments contain valuable insights but are often difficult to use at scale. This course helps you apply NLP techniques to classify, extract and summarize information from clinical documents to support research, reporting and patient care.
You build practical skills for working with real-world text, including text segmentation, information extraction, classification and summarization. You also learn how LLMs support clinical questions and assist with documentation. The course is designed for healthcare professionals and researchers seeking to use text data more effectively in clinical settings.
By the end of this course, you will be able to:
- Apply text classification techniques to label clinical documents
- Summarize healthcare documents using NLP methods
- Extract relevant information using LLM-based tools
- Use question-answering techniques for clinical document analysis
- Evaluate NLP use cases in healthcare contexts
- Assess the performance of NLP models for document extraction and analysis
For those who are new to NLP, we recommend starting with Overview of NLP and Large Language Models.
Course activities include videos, short quizzes, practical lab assignments and instructor-moderated discussions. Optional weekly office hours provide a space to review methods and troubleshoot challenges with clinical text.
Course outline
Module 1: NLP Workflows
This module focuses on real-world NLP applications in cancer-related clinical documentation, with particular attention to classification tasks, segmentation and privacy.
Module 2: Advanced Document Classification
This module focuses on text segmentation, binary/multi-class classification tasks and fine-tuning LLMs, with an emphasis on evaluation, optimization and real-world deployment of classifiers.
Module 3: Information Extraction and Clinical Question Answering
This module dives into structured and unstructured data extraction, followed by introductory QA techniques.
Module 4: Summarization, Decision Support & LLM Limitations
This module covers summarization, speech-to-text, hallucination risks, integrating NLP into decision-making and critically evaluating LLM performance in clinical settings.
How am I assessed?
You are assessed through quizzes, discussion posts and hands-on lab assignments. Multiple-choice quizzes confirm your understanding of lecture content. Discussion posts evaluate your critical thinking and engagement with weekly topics, with feedback from the instructor or TAs. Lab assignments, graded with scoring rubrics, assess your ability to apply techniques and explain your results.
A minimum grade of 70% is required to pass.
Expected effort
Expect to spend five to seven hours per week on readings, videos, quizzes and lab assignments, including any optional office hours you attend.
Technology requirements
- an email account
- a computer, laptop or tablet, using Windows, macOS or Linux
- the latest version of a web browser (or previous major version release)
- a Google account to access Google Drive
- a reliable internet connection
For virtual office hours, you’ll also need:
- a video camera and microphone
One day before the start of your course, we’ll email you step-by-step instructions for accessing your course.
Course format
This course is 100% online and instructor-supported, with weekly instructor office hours. You complete course work independently and at your own pace, within deadlines set by your instructor. Log in to your course anytime to access lessons as they become available.