Google Colab
Quick Primer on Colab Jupyter
Start
First, you need a Google account and access to Google Drive.
Create a new folder in your Google Drive to use as your workspace.
- If you already have a .ipynb file, double-click it (or right-click it > Open with > Google Colaboratory) to open it in Colab.
- If you want to create a new .ipynb file, right-click on a blank area > More > Google Colaboratory. Google Drive will create an Untitled.ipynb file and open it in Colab automatically.
Using Colab is very similar to using Jupyter Notebook.
Some skills
You can add ! before a command to run it in the shell. For example:
1 | ! pip list |
Running this command lists all the packages installed in Colab, which include most common packages.
1 | ... |
If you want to install another package, for example lightning, use the command below.
1 | ! pip install lightning |
Once the installation finishes, you can import the package in your Python code with this command.
1 | import lightning |
Note: Use the shortcut Ctrl + Enter to run the current code block quickly.
Mount Google Drive
1 | from google.colab import drive |
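Only the import line of the mount step is shown above. A minimal sketch of the whole step follows, assuming the usual mount point /content/drive (the same prefix used by the paths later in this post). The helper name mount_drive_if_colab is hypothetical, and the import is guarded so the snippet also runs outside Colab, where google.colab does not exist.

```python
def mount_drive_if_colab(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return True on success.

    Hypothetical helper: outside Colab the google.colab package is missing,
    so the function simply returns False instead of raising.
    """
    try:
        from google.colab import drive  # only available in a Colab runtime
    except ImportError:
        return False
    drive.mount(mount_point)  # asks for authorization the first time
    return True


mounted = mount_drive_if_colab()
print("Drive mounted:", mounted)
```

In Colab, your files then appear under /content/drive/MyDrive.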
Change working directory
Use os.getcwd() or !pwd to get the current working directory.
1 | import os |
1 | !pwd |
Use os.chdir('...') or the %cd '...' magic to change the working directory to your folder in Google Drive.
1 | os.chdir('/content/drive/MyDrive/Colab Notebooks') |
1 | %cd '/content/drive/MyDrive/Colab Notebooks' |
Note: !cd '...' does not work for this purpose because it runs in a temporary subshell, so the change is lost as soon as that command finishes.
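The difference can be checked from plain Python. The sketch below uses subprocess as a stand-in for what `!cd` does (an assumption about the mechanism, chosen because both start a separate shell) and a temporary directory instead of a Drive folder.

```python
import os
import subprocess
import tempfile

start = os.getcwd()
target = os.path.realpath(tempfile.mkdtemp())

# A cd inside a child shell ends together with that shell,
# so the notebook process stays where it was.
subprocess.run(f'cd "{target}"', shell=True, check=True)
after_shell_cd = os.getcwd()

# os.chdir changes the directory of the notebook process itself.
os.chdir(target)
after_chdir = os.path.realpath(os.getcwd())

os.chdir(start)  # restore the original directory
print(after_shell_cd == start, after_chdir == target)
```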
magic-cd | Ipython
Linux pwd command | Runoob
Linux cd command | Runoob
How to Change Python Version in Google Colab
1 | ! cat /etc/os-release # check the system version of Colab |
1 | ! python --version # check the default version of Python in Colab |
1 | ! ls /usr/bin/python* # check the available versions of Python in Colab (Ubuntu) |
How to Change Python Version in Google Colab
1 | !cat /etc/os-release |
Alternative reading
Update-alternatives Command: A Comprehensive Guide for Linux Users
1 | !sudo update-alternatives --install <link> <name> <path> <priority> |
unzip
Upload the .zip file to your Google Drive and open a .ipynb file in Colab, then mount your Google Drive in Colab and execute the following command.
The first path is where your archive is ('.../Inception-v4.zip' in my case), and the second path is the destination folder.
Note that the destination folder should be under '/content/drive/MyDrive/', which is your own Google Drive directory, so that the extracted files are kept after the runtime is recycled.
1 | !unzip '/content/drive/MyDrive/Colab Notebooks/Inception-v4.zip' -d '/content/drive/MyDrive/Colab Notebooks' |
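The same extraction can also be done from Python with the standard zipfile module. This is a sketch only: it builds a small throwaway archive in a temporary directory (the names example.zip and extracted are made up) so that it runs anywhere. In Colab you would pass your own archive path and destination folder instead.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "example.zip")    # stand-in for your .zip file
destination = os.path.join(workdir, "extracted")  # stand-in for the target folder

# Create a tiny archive so the example is self-contained.
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("example/readme.txt", "hello")

# Equivalent of: unzip <archive> -d <destination>
with zipfile.ZipFile(archive) as zf:
    zf.extractall(destination)

extracted_file = os.path.join(destination, "example", "readme.txt")
print(os.path.isfile(extracted_file))
```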
How to find the location of a package installed with pip
1 | !python -m pip show <package_name> |
Take timm for example
1 | !python -m pip show timm |
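If you prefer to stay inside Python, the standard library can answer the same question. The sketch below uses two hypothetical helpers, package_location and installed_version, and the stdlib module json as a stand-in so that it runs even when timm is not installed.

```python
import importlib.util
from importlib import metadata


def package_location(module_name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


json_path = package_location("json")  # stdlib module used as a stand-in
print(json_path)
print(installed_version("timm"))      # None unless timm is installed
```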
Reload your modified python file
Jupyter Notebook Reload Module: A Comprehensive Guide
If you have imported a python file and later make changes to it, you’ll need to reload it in your Jupyter Notebook (or Colab) to take advantage of any recent changes.
Here’s the scenario. You are working in a Jupyter Notebook and you’ve imported a custom python file. While working in your notebook, you make some changes to the python file and want to work with those new changes in your Jupyter Notebook. After saving your python file, you run your import some_file code again. However, your recent changes don’t get imported.
You could restart your entire kernel, or you can simply reload the file by running this code:
1 | import python_file |
Once you run the code, you’ll see that any changes you made to your python file are correctly loaded into your jupyter notebook.
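The scenario can be reproduced end to end with importlib.reload from the standard library. In this sketch the module python_file is written to a temporary directory, and bytecode caching is switched off so that the quick edit is always picked up. Both choices are only for the demonstration.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # keep a cached .pyc from hiding the quick edit

workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "python_file.py")
with open(module_path, "w") as f:
    f.write("VALUE = 1\n")

sys.path.insert(0, workdir)
importlib.invalidate_caches()
import python_file

before = python_file.VALUE

# Edit the file on disk, as you would in an editor.
with open(module_path, "w") as f:
    f.write("VALUE = 22\n")

import python_file  # a second import does nothing: the module is cached
after_plain_import = python_file.VALUE

importlib.reload(python_file)  # re-executes the file
after_reload = python_file.VALUE

print(before, after_plain_import, after_reload)
```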
Using the %load_ext and %autoreload Magic Commands
1 | %load_ext autoreload |
TypeError
1 | TypeError: reload() argument must be a module |
You get this error because the module has not been imported under its own name: either it was never imported, or you used from python_file import * rather than import python_file, so the name python_file does not refer to a module object.
There are two ways to solve this problem.
The first is to use import python_file instead of from python_file import *.
The second is to reload the module through sys.modules['python_file'], and then run your from python_file import ... statement again.
1 | from python_file import * |
1 | from python_file import xxx |
e.g.
1 | from geoseg.datasets.potsdam_dataset import PotsdamDataset |
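A small sketch of the second approach, using a throwaway module named star_demo (a made-up name) with a single function greet. It also shows why the from ... import statement has to be run again after the reload: the name already bound in the notebook still points at the old function.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # keep a cached .pyc from hiding the quick edit

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "star_demo.py")
with open(path, "w") as f:
    f.write("def greet():\n    return 'old'\n")

sys.path.insert(0, workdir)
importlib.invalidate_caches()
from star_demo import greet  # binds greet, but not the name star_demo

old_result = greet()

with open(path, "w") as f:
    f.write("def greet():\n    return 'newer'\n")

# The module object is still reachable through sys.modules.
importlib.reload(sys.modules["star_demo"])
stale_result = greet()       # still the old function object

from star_demo import greet  # re-import to pick up the new definition
new_result = greet()

print(old_result, stale_result, new_result)
```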
CUDA in Colab
In the menu at the top left, click Edit > Notebook settings, then choose:
1 | Hardware accelerator: None/GPU/TPU |
Use this command to test whether a GPU is available. You should get True if the setting is correct.
1 | import torch |
Get the number of GPUs
1 | torch.cuda.device_count() # 1 |
Get the index of current device
1 | torch.cuda.current_device() # 0 |
Get the name of device
1 | torch.cuda.get_device_name(0) # Tesla T4 |
Use this command to run nvidia-smi (NVIDIA System Management Interface) and see more information about your GPU.
1 | !/opt/bin/nvidia-smi |
By default, tensors are created on the CPU. You have to move them explicitly if you want an operation to run on the GPU.
Using .cuda() you can copy a tensor from the CPU to the GPU.
If there are multiple GPUs, use .cuda(i) to select the i-th GPU. If i is omitted, the current CUDA device is used, which is device 0 by default.
Besides, you can use .is_cuda or .is_cpu to check whether a tensor is on the GPU or the CPU.
1 | x = torch.tensor([1, 2, 3]) |
Using .cpu() you can copy a tensor back from the GPU to the CPU. You can also print a tensor’s device attribute to check whether it is on a CPU or a GPU.
1 | x = torch.tensor([1, 2, 3]) |
You can specify a tensor’s device attribute when you create it.
1 | x = torch.tensor([1, 2, 3], device="cuda:0") |
A device object together with .to(device) is commonly used when training on a GPU, and this is the recommended approach.
1 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") |
1 | x = torch.tensor([1, 2, 3], device=device) |
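Put together, the device-agnostic pattern might look like the sketch below. It falls back to the CPU when no GPU is available, so the same cell runs on any runtime type.

```python
import torch

# Pick the GPU when there is one, otherwise stay on the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

x = torch.tensor([1, 2, 3], device=device)  # created directly on the device
y = torch.tensor([4, 5, 6]).to(device)      # created on the CPU, then moved
z = x + y                                   # the result lives on the same device

print(z, z.device)
```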
When an operation is performed on tensors that live on a GPU, the result is stored on that same device:
1 | x = torch.tensor([1, 2, 3]).cuda() |
However, tensors on different devices cannot be combined directly in one operation: a tensor on the GPU cannot be used with a tensor on the CPU, and tensors on two different GPUs cannot be used together either:
1 | x = torch.tensor([1, 2, 3]).cpu() |
By default, your model is created on the CPU.
1 | model = torch.nn.Linear(3, 1) |
1 | for param in model.parameters(): |
You can move your model to a GPU with .cuda() or .to(device). Both move all model parameters and buffers to the GPU.
torch.nn.Module.cuda
1 | model = torch.nn.Linear(3, 1).cuda(0) |
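A short sketch of the .to(device) variant, which also checks where the parameters ended up. The input tensor has to be created on the same device as the model. The shapes (a batch of 2 samples with 3 features) are arbitrary.

```python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(3, 1).to(device)

# Every parameter should now report the chosen device type.
param_devices = {p.device.type for p in model.parameters()}

inputs = torch.ones(2, 3, device=device)  # inputs must live on the same device
outputs = model(inputs)

print(param_devices, outputs.shape)
```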
How to run a .py file rather than a .ipynb file on Colab
First, change the working directory to the folder in your Google Drive that contains the .py file.
1 | import os |
Then use the shell command to run your .py file.
1 | !python filename.py |
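If you need the script's output as a Python value rather than printed cell output, one option is to launch it with subprocess. This is a sketch under assumptions: the script filename.py is created in a temporary folder here, and cwd plays the role of the directory change described above.

```python
import os
import subprocess
import sys
import tempfile

workdir = tempfile.mkdtemp()  # stand-in for your Drive folder
with open(os.path.join(workdir, "filename.py"), "w") as f:
    f.write("print('hello from script')\n")

# sys.executable is the interpreter that runs the notebook itself.
result = subprocess.run(
    [sys.executable, "filename.py"],
    cwd=workdir,          # run the script from its own folder
    capture_output=True,
    text=True,
    check=True,           # raise if the script exits with an error
)
script_output = result.stdout.strip()
print(script_output)
```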
empty_cache
torch.cuda.empty_cache()
pandas
1 | import pandas as pd |
sklearn
sklearn.model_selection.train_test_split
1 | from sklearn.model_selection import train_test_split |
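A minimal usage sketch with made-up data: ten samples split 80/20. The random_state value is arbitrary and only makes the split reproducible.

```python
from sklearn.model_selection import train_test_split

samples = list(range(10))          # toy features
labels = [i % 2 for i in samples]  # toy labels

X_train, X_test, y_train, y_test = train_test_split(
    samples, labels, test_size=0.2, random_state=42
)

print(len(X_train), len(X_test))
```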
segmentation_models_pytorch
segmentation_models.pytorch | GitHub
segmentation_models_pytorch’s documentation
1 | import segmentation_models_pytorch as smp |
smp.Unet() HTTP Error 502
When you call smp.Unet() or another smp model, you may get an HTTP error like the following:
1 | Downloading: |
The reason for this is that the certificate of the website has expired, as shown in the following picture (the date is 2023-8-6).
You need to download the .pth file manually (for example inceptionv4-8e4777a0.pth in my case) from the website, and move or copy it to where it should be (check the terminal to get the website and the path it should go).
In the case of Colab
In the case of Colab, you first need to upload the .pth file to your Google Drive. Then use the following commands to copy the .pth file to the destination folder.
If you have not used torch hub before, you first need to create the folder for torch hub under this path (you can get the path from the error message).
1 | !mkdir -p '/root/.cache/torch/hub/checkpoints/' |
Otherwise, you may get this error if you execute the copy command directly.
1 | cp: cannot create regular file '/root/.cache/torch/hub/checkpoints/': No such file or directory |
Now you can use the following command to copy the .pth file to the destination folder. The first path is the path of your source file (…/inceptionv4-8e4777a0.pth in my case), and the second path is the path of your destination folder.
1 | !cp -r |
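The two shell steps (create the folder, then copy) can also be written in Python. The helper copy_checkpoint below is hypothetical, and the sketch works on a placeholder file in a temporary directory. In Colab you would pass the .pth path in your Drive and the checkpoint folder named in the error message.

```python
import os
import shutil
import tempfile


def copy_checkpoint(src_file, dest_dir):
    """Create dest_dir if needed (like mkdir -p), then copy src_file into it."""
    os.makedirs(dest_dir, exist_ok=True)
    return shutil.copy(src_file, dest_dir)  # returns the path of the new copy


workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "weights.pth")  # placeholder for the real .pth file
with open(src, "wb") as f:
    f.write(b"placeholder")

dest_dir = os.path.join(workdir, "cache", "torch", "hub", "checkpoints")
copied = copy_checkpoint(src, dest_dir)
print(copied)
```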
albumentations
1 | import albumentations as A |
torch.hub
Pytorch Hub
pytorch_vision_resnet
Module.train&eval
torch.save&load
torch.save
torch.load
Saving and loading pytorch tensors and module states
saving and loading tensors
1 | import torch |
By convention, PyTorch files are typically written with a ‘.pt’ or ‘.pth’ extension.
- You can also save multiple tensors as part of Python objects like tuples, lists, and dicts
1 | import torch |
- Saving and loading tensors preserves views
1 | import torch |
- Use .clone() to avoid unnecessary waste of storage
1 | import torch |
1 | import torch |
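The points in the list above can be checked with in-memory buffers instead of files. In this sketch, save_to_bytes is a hypothetical helper and the tensor sizes are arbitrary. It shows that a view saved together with its base tensor is still a view after loading, and that saving a small view drags the whole underlying storage along unless you clone it first.

```python
import io

import torch


def save_to_bytes(obj):
    """Serialize obj with torch.save into memory and return the raw bytes."""
    buffer = io.BytesIO()
    torch.save(obj, buffer)
    return buffer.getvalue()


# Views are preserved: evens shares storage with numbers, also after loading.
numbers = torch.arange(1, 10)
evens = numbers[1::2]
loaded_numbers, loaded_evens = torch.load(io.BytesIO(save_to_bytes([numbers, evens])))
loaded_evens *= 2  # the in-place change is visible through loaded_numbers too

# A view keeps the whole storage alive; clone() saves only what is needed.
large = torch.arange(1, 1000)
small = large[0:5]
view_size = len(save_to_bytes(small))
clone_size = len(save_to_bytes(small.clone()))

print(loaded_numbers.tolist(), view_size, clone_size)
```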
Saving and loading Modules
Tutorial: Saving and loading modules
- saving and loading state_dict()
1 | import torch |
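A compact round trip for the state_dict approach, sketched with an in-memory buffer in place of a .pt file. The layer size Linear(3, 1) is just an example. The new model must have the same architecture as the one that was saved.

```python
import io

import torch

model = torch.nn.Linear(3, 1)

# Save only the parameters, not the Python object.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# Rebuild the same architecture, then load the parameters into it.
restored = torch.nn.Linear(3, 1)
restored.load_state_dict(torch.load(buffer))
restored.eval()  # switch to inference mode before evaluating

same_weights = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same_weights)
```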
- saving and loading entire model
1 | # save: |
Load a model saved from a CUDA device to a CPU device
If you load a model that was saved on a CUDA device directly on a CPU-only machine, you will get a RuntimeError like the following:
1 | RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. |
To fix this, pass the map_location argument to torch.load so that the stored tensors are remapped to the CPU:
1 | torch.load('xxx.pt', map_location=torch.device('cpu')) |
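A runnable sketch of the same call: load_on_cpu is a hypothetical wrapper and the checkpoint is a toy dictionary held in memory. The tensor is saved from the GPU when one is available, and loading with map_location always places it on the CPU.

```python
import io

import torch


def load_on_cpu(file_or_path):
    """Load a checkpoint and remap every tensor in it to the CPU."""
    return torch.load(file_or_path, map_location=torch.device("cpu"))


# Save from the GPU if there is one, to mimic a checkpoint written during training.
source_device = "cuda" if torch.cuda.is_available() else "cpu"
buffer = io.BytesIO()
torch.save({"weight": torch.ones(2, 2, device=source_device)}, buffer)
buffer.seek(0)

checkpoint = load_on_cpu(buffer)
print(checkpoint["weight"].device)
```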
Train, Validation and Test Sets
About Train, Validation and Test Sets in Machine Learning
Swin-Transformer for Classification
Data Set: flower-photos
1 | # Folder Structure |
1 | # read_split_data(root: str, val_rate: float = 0.2) |
Structure
Input Image: 3x224x224
Number of Classes: 5
Patch Partition: 56x56x48, i.e. (224/4)x(224/4)x48
Stage 1
- Linear Embedding:
- Swin Transformer Block:
- Linear Embedding:
- Swin Transformer Block:
Stage 2
- Linear Embedding:
- Swin Transformer Block:
- Linear Embedding:
- Swin Transformer Block:
Stage 3
- Linear Embedding:
- Swin Transformer Block:
- Linear Embedding:
- Swin Transformer Block:
- Linear Embedding:
- Swin Transformer Block:
- Linear Embedding:
- Swin Transformer Block:
- Linear Embedding:
- Swin Transformer Block:
- Linear Embedding:
- Swin Transformer Block:
Stage 4
- Linear Embedding:
- Swin Transformer Block:
- Linear Embedding:
- Swin Transformer Block:
Stage 1: Linear Embedding + SwinTransformerBlock*2
Stage 2: PatchMerging + SwinTransformerBlock*2
Stage 3: PatchMerging + SwinTransformerBlock*6
Stage 4: PatchMerging + SwinTransformerBlock*2
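The feature-map size at each stage follows from simple arithmetic, sketched below. The patch size of 4 and the block counts (2, 2, 6, 2) come from the summary above. The embedding dimension of 96 is an assumption (the Swin-T setting), as is the usual rule that each PatchMerging halves the spatial size and doubles the channels. The function name swin_stage_shapes is made up for this example.

```python
def swin_stage_shapes(image_size=224, patch_size=4, in_chans=3,
                      embed_dim=96, depths=(2, 2, 6, 2)):
    """Return (height, width, channels) after patch partition and each stage.

    embed_dim=96 is an assumed value (Swin-T); change it for other variants.
    """
    side = image_size // patch_size
    shapes = {"patch_partition": (side, side, patch_size * patch_size * in_chans)}
    channels = embed_dim  # Stage 1: linear embedding projects 48 -> embed_dim
    for stage, depth in enumerate(depths, start=1):
        if stage > 1:
            side //= 2     # PatchMerging halves height and width
            channels *= 2  # and doubles the channel count
        shapes[f"stage_{stage}"] = (side, side, channels)
    return shapes


shapes = swin_stage_shapes()
for name, shape in shapes.items():
    print(name, shape)
```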