Downloading Coursera Materials

Confidence building can’t replace real learning.
Lisa Simpson

A long time ago I wrote a post on quickly downloading Coursera videos. Looking for ways to do so today (well, at the end of January 2021), I stumbled upon coursera-dl, a Python tool.

Using it wasn’t easy on my MacBook: I needed a current version of Python, which then turned out to be too current to run the tool. But let’s go step by step, as far as I remember the steps.

A caveat before we begin: no warranty. I have no idea whether the tools or commands do anything other than what they are supposed to do. And yeah, I’m a sorcerer, not a wizard. I can make the computer do what I want, but I don’t completely understand the commands. It’s powerful, but there might be unwanted side effects.

1. Install Python 3

You can find the installer for macOS (Big Sur and earlier versions) on the python.org page, although there are apparently better ways to install it. If you have Homebrew installed,

brew install python

would have worked as well.

2. Make Python 3 the default

Using this page, I got the path for the installed version via

ls -l /usr/local/bin/python*

and I used

ln -s -f /usr/local/bin/python3.9 /usr/local/bin/python

to make it accessible via python commands.

IIRC the path information was updated, so the python command worked from any folder on the disk. Otherwise you have to update the path information (https://coolestguidesontheplanet.com/add-shell-path-osx/ might work).
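To check that the symlink and the path did what they should, you can ask Python itself. This is only a minimal sketch; the helper names `running_interpreter` and `resolve_command` are my own and not part of any tool mentioned here.

```python
import os
import shutil
import sys


def running_interpreter():
    """Return (path, (major, minor)) for the interpreter executing this script."""
    return sys.executable, (sys.version_info.major, sys.version_info.minor)


def resolve_command(name):
    """Return the real file behind a command on PATH, following symlinks.

    Returns None if the shell would not find the command at all.
    """
    found = shutil.which(name)
    if found is None:
        return None
    return os.path.realpath(found)


if __name__ == "__main__":
    path, version = running_interpreter()
    print("running interpreter:", path, "version:", version)
    for name in ("python", "python3", "pip"):
        print(name, "->", resolve_command(name))
```

If the line for python shows a path ending in the Python 3 version you installed, the link from step 2 is in place. If it shows None, the path information still needs updating.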

3. Update Pip

As far as I understand it, pip is the package installer for Python. It is already installed but needs to be upgraded (I forgot where I found that line):

python3 -m pip install --upgrade pip

4. Install coursera-dl

Now you can install coursera-dl via:

pip install coursera-dl
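If you are not sure whether the install landed in the Python you actually use, a short check like the following might help. It is a sketch that assumes Python 3.8 or newer (for `importlib.metadata`); the function names are made up for this example.

```python
import shutil
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a pip distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def command_available(name):
    """Return True if the shell would find a command with this name on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    # "coursera-dl" is both the name used with pip and the name of the command.
    print("coursera-dl version:", installed_version("coursera-dl"))
    print("coursera-dl on PATH:", command_available("coursera-dl"))
```

A version of None here, despite a successful pip run, usually means pip installed the package for a different Python than the one running the check.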

5. Adapt the download command to use cookie authentication

The normal download process as described on the GitHub page of coursera-dl did not work for me. Despite creating a coursera-dl.conf file in the directory in which I executed the command and putting:

--username MYEMAIL
--password MYPASSWORD
--subtitle-language en

in it, I got an authentication error (HTTPError: 400 Client Error: Bad Request) when using my email address and password. (I still keep the file; I’m not sure whether it is necessary for the downloads to work. Could remove it, but, nah.) There is a long thread on the issues page (tip: jump to the end and work your way upwards). The current (end of January 2021) solution is to skip the email and password and use cookie authentication instead. You get the cookie by logging into Coursera (best go to the course page) and then examining the cookie it sets.

If you use a Chrome-based browser (e.g., Chrome, Brave, Dissenter) you should be able to press cmd + option + i to open the Developer Tools of the browser (see also this guide). Open the Application tab, look under Storage > Cookies and select https://www.coursera.org. You need the CAUTH value, so double-click on this field, select all, and copy the whole cookie (mine had 281 characters, a mixture of letters, numbers and special characters). Do not simply use “copy” via right-click and the menu, because it will not get the whole cookie.


Now use the Terminal, go to the directory in which you want to download the files, and enter the command

coursera-dl -ca CAUTHCOOKIETEXT COURSENAME

CAUTHCOOKIETEXT is the CAUTH cookie value you copied.

COURSENAME is the part of the address you see when you enter a course in Coursera. Just audit a course, e.g., “U101: Understanding College and College Life”, and look for the part of the address that follows coursera.org/learn/. Here it is: college-life
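If you would rather not pick the course name out of the address bar by eye, the same lookup can be sketched in a few lines of Python. The function name `course_slug` is made up for this example, and the second address in the demo (with extra parts after the course name) is only an assumed shape, not something copied from Coursera.

```python
from urllib.parse import urlparse


def course_slug(url):
    """Return the path segment right after /learn/ in a course address, or None."""
    parts = [part for part in urlparse(url).path.split("/") if part]
    if "learn" in parts:
        index = parts.index("learn")
        if index + 1 < len(parts):
            return parts[index + 1]
    return None


if __name__ == "__main__":
    print(course_slug("https://www.coursera.org/learn/college-life"))
    # Assumed example of a longer address inside the same course:
    print(course_slug("https://www.coursera.org/learn/college-life/home/welcome"))
```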

So, if the CAUTHCOOKIETEXT were abcdefgh (no chance in hell, just an example) and the course were “U101: Understanding College and College Life”, the command would be:

coursera-dl -ca abcdefgh college-life

But for it to work under Python 3.9, you have to do another step (see point 6 below).

BTW, if you have never used Terminal, use

ls

to see the files and folders in the directory you are in (you should start with your home directory)

and

cd FOLDERNAME

to go to a directory you see, e.g.,

cd Documents

would mean changing to the Documents directory. With

cd ..

you move up a level.

6. Edit the utils.py file

You will likely get an AttributeError (‘HTMLParser’ object has no attribute ‘unescape’), as described on this issue page.

The issue page points to this solution, which requires you to change some text in the utils.py of coursera-dl.
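As far as I can tell, the error appears because Python 3.9 removed the old unescape method from HTMLParser objects, while the module-level function html.unescape is still there. The following sketch shows the difference; `parser_has_unescape` and `decode_entities` are names I made up for illustration.

```python
import html
import sys
from html.parser import HTMLParser


def parser_has_unescape():
    """Return True if HTMLParser instances still offer the old unescape() method."""
    return hasattr(HTMLParser(), "unescape")


def decode_entities(text):
    """Decode HTML entities with the module-level function available in Python 3."""
    return html.unescape(text)


if __name__ == "__main__":
    print("Python version:", sys.version_info[:2])
    print("HTMLParser().unescape available:", parser_has_unescape())
    print(decode_entities("Tom &amp; Jerry"))
```

On Python 3.9 and later the second line should report False, which is exactly the situation the AttributeError complains about.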

First find the coursera-dl directory. On my Mac, it was in:

/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/coursera/
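The directory may well be somewhere else on your machine (for example if you installed Python via Homebrew). Here is a small sketch that asks Python where a package lives. It assumes the package is imported under the name coursera, which is what the folder name above suggests; `package_dir` is a hypothetical helper name.

```python
import importlib.util
from pathlib import Path


def package_dir(name):
    """Return the directory of an installed package, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        # Not installed, or a plain module without a directory of its own.
        return None
    return Path(list(spec.submodule_search_locations)[0])


if __name__ == "__main__":
    location = package_dir("coursera")
    if location is None:
        print("coursera package not found for this Python")
    else:
        print("utils.py should be at:", location / "utils.py")
```

Run it with the same Python you used for pip, otherwise it will look in the wrong site-packages folder.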

Just switch to the Finder, select Go, then Go to Folder … (Shift + Cmd + G), paste the path into it, and select Go. Then open utils.py in a plain text editor like BBEdit or TextWrangler.

You need to comment out the line

from six.moves import html_parser

by just putting a # before it so it looks like

# from six.moves import html_parser

and add

import sys
if sys.version_info[0] >= 3:
    import html
else:
    from six.moves import html_parser
    html = html_parser.HTMLParser()

immediately below it.

Then you need to comment out any occurrence of

h = html_parser.HTMLParser()

again by putting a # in front of it, so it looks like

# h = html_parser.HTMLParser()

and put

h = html

immediately below it. I had to do it twice in the utils.py file.
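If you prefer not to edit by hand, the same changes can be sketched as a text transformation. This is an illustration under the assumption that your utils.py contains exactly the lines quoted above; `patch_source` and `SHIM` are names I invented, and the function only works on a string, so nothing on disk is touched until you decide to write the result back (after making a backup).

```python
SHIM = (
    "import sys\n"
    "if sys.version_info[0] >= 3:\n"
    "    import html\n"
    "else:\n"
    "    from six.moves import html_parser\n"
    "    html = html_parser.HTMLParser()\n"
)


def patch_source(text):
    """Return the utils.py text with the edits from this step applied."""
    if SHIM in text:
        # Already patched, so leave the text alone.
        return text
    patched = []
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        indent = line[: len(line) - len(line.lstrip())]
        if stripped == "from six.moves import html_parser":
            # Comment out the old import and add the replacement block below it.
            patched.append(indent + "# " + stripped + "\n")
            patched.append(SHIM)
        elif stripped == "h = html_parser.HTMLParser()":
            # Comment out the old parser and point h at the html module instead.
            patched.append(indent + "# " + stripped + "\n")
            patched.append(indent + "h = html\n")
        else:
            patched.append(line)
    return "".join(patched)


if __name__ == "__main__":
    sample = (
        "from six.moves import html_parser\n"
        "\n"
        "def clean(s):\n"
        "    h = html_parser.HTMLParser()\n"
        "    return h.unescape(s)\n"
    )
    print(patch_source(sample))
```

The demo at the bottom runs on a made-up five-line sample, not on the real file, so you can compare its output with the manual edits first.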

7. Download the course material

Now quit and reopen Terminal, go to the directory in which you want to download the files, and try the command again, e.g.,

coursera-dl -ca abcdefgh college-life

(replace abcdefgh with your CAUTH Value)

It should work now.

Note that the CAUTH value can change (e.g., when you delete your cookies/history and log in again), so if something does not work, re-check and update the value.

On the other hand, I might have forgotten a step or something else might have changed.
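Since the CAUTH value changes now and then, it may be handy to keep it out of the command itself. The sketch below reads it from an environment variable and builds the same command as above. The variable name COURSERA_CAUTH and the function `build_command` are my own inventions; only the coursera-dl -ca call comes from the steps above.

```python
import os
import subprocess


def build_command(cauth, course_name):
    """Return the argument list for coursera-dl with cookie authentication."""
    cauth = cauth.strip()
    if not cauth or any(ch.isspace() for ch in cauth):
        # A value with spaces or line breaks was most likely damaged while copying.
        raise ValueError("CAUTH value is empty or contains whitespace; copy it again")
    return ["coursera-dl", "-ca", cauth, course_name]


if __name__ == "__main__":
    # COURSERA_CAUTH is a hypothetical variable name chosen for this sketch.
    cauth = os.environ.get("COURSERA_CAUTH")
    if cauth is None:
        print("Set COURSERA_CAUTH to your CAUTH value first.")
    else:
        subprocess.run(build_command(cauth, "college-life"), check=True)
```

When the cookie changes, you only update the variable and run the script again from the directory in which you want the files.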

8. Learn

Of course, downloading is only one step, which you can do very quickly and easily now (hopefully). I did play around a bit and had 21 GB downloaded without realizing it. Ooops. But anyway, that is not important. What is much more important is actually learning what you want to learn.

But now, at least, you can do it offline on any device.

And yeah, I know, you can download files in the Coursera app, but I’d rather have the files outside specialized apps. Being able to watch them in GoodReader or VLC on any mobile device and jot down notes in Notability … me like.