# /// script
# requires-python = ">=3.10"
# ///
# Standard library imports (no need to declare in dependencies)
import random
import statistics as stats
from datetime import date
Welcome to your first steps in Python!
This notebook is written like a small interactive book: you will read, explore, and do.
Who is this for?
Anyone who has Python or Jupyter installed and wants to move from “I can run cells” to “I can write programs.”
Why Python?
- Readability first – its syntax looks like pseudocode.
- A giant ecosystem – from data science (pandas, scipy) to machine learning (scikit-learn, pytorch, tensorflow) to hardware (pymmcore).
- Batteries included – the standard library gives you file I/O, HTTP clients, math, testing, and more.
In bioimage analysis, Python helps automate tasks like:
- loading and processing large image datasets,
- applying filters and segmentations,
- extracting features, and
- visualizing results.
You will learn core building blocks:
Chapter | Concept | Why it matters |
---|---|---|
0 | Introduction | Python, environment, and Jupyter notebooks |
1 | Variables and data types | Store and label information so programs remember things |
2 | Data structures | Understand the different types of data in Python |
3 | Functions | Package logic into reusable, testable pieces |
4 | Control Flow with if/else | Make decisions and branch logic |
5 | Control Flow with for | Repeat work without copy‑pasting |
6 | Mini project | Your turn! |
Each chapter has:
When choosing a programming language, context matters. Here's how Python stacks up against R and Java in bioimage analysis and scientific computing:
Feature | Python | R | Java |
---|---|---|---|
Learning Curve | Gentle – readable syntax, beginner-friendly | Steep for programming, but easy for statistics | Steep – verbose syntax, strong typing |
Primary Strengths | General-purpose, excellent for data analysis, scripting, and automation | Specialized for statistics and plotting | Fast performance, robust for large systems |
Bioimage Support | Strong (scikit-image, napari, cellpose, etc.) | Limited; mostly via third-party packages or Python bridges | Used in tools like ImageJ/Fiji, but not for prototyping |
Speed | Fast enough for most tasks; easy to optimize | Slower; often relies on calling C/C++ under the hood | High performance; suitable for computationally intensive tasks |
Community & Ecosystem | Massive, with libraries in AI, biology, and automation | Strong in statistics and epidemiology | Strong in engineering and enterprise |
Use Case Fit | Ideal for scripting analysis pipelines and integrating tools | Great for exploratory statistics and quick plots | Better for building plugins or standalone software tools |
In summary:
Analogy: Imaging Setup
Think of a Python environment as a virtual imaging setup:
Environments help keep these project-specific tools isolated and clean, so:
How to Create an Environment (Optional)
Not covered in this course, because we are using Google Colab.
Using uv:
uv venv --python 3.10
uv pip install package1 package2
Using conda:
conda create --name bioimage-env python=3.10
conda activate bioimage-env
Note: uv is generally much faster than conda or pip at resolving and installing packages.
Note: a requirements.txt file is typically used to specify the dependencies for a project, and can then be used to create an environment with uv, e.g. uv pip install -r requirements.txt.
What is Jupyter?
Jupyter Notebooks are like lab notebooks for code:
How to Open Jupyter?
Using Anaconda Navigator:
Using Terminal:
jupyter notebook
Concept.
A variable is a labeled box that can hold any Python object.
Because Python is dynamically typed, the label does not declare a type – the object itself knows its type.
┌─────────────────────────────┐
│ label: `pixel_intensity` │
└──────────────┬──────────────┘
↓
┌────────────────────┐
│ object: 3883.03 │
│ (float) │
└────────────────────┘
- Use lowercase words joined by underscores (number_of_cells or cell_number).
- Prefer descriptive names: temperature_c > t. Code is read by humans far more than by machines.

Variables can be rebound:
pixel_intensity = 3883.03
pixel_intensity = "high_intensity" # ↓ the label now points elsewhere and to another type!
But some objects themselves can change (lists) – we call this mutability.
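The difference between rebinding a label and mutating an object can be sketched as follows. The variable names are hypothetical and chosen only for illustration:

```python
# Rebinding: the label moves to a new object; the old object is untouched.
exposure_ms = 100
exposure_ms = 150

# Mutation: the list object itself changes in place.
channels = ["DAPI", "GFP"]
same_channels = channels          # a second label for the same list object
channels.append("CY5")

print(exposure_ms)                # 150
print(same_channels)              # ['DAPI', 'GFP', 'CY5']
print(same_channels is channels)  # True

# Strings are immutable: methods return a new string instead of changing the original.
stain = "gfp"
stain_upper = stain.upper()
print(stain, stain_upper)         # gfp GFP
```

Because both labels refer to one list, a change made through either label is visible through the other.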
Concept. Python's primitive (built‑in) data types are:
- int, float – numbers
  - int: Whole numbers like -1, 0, 42 (e.g. z_slice = 30)
  - float: Decimal numbers like 3.14, -0.001 (e.g. pixel_size = 0.25)
- str – text
- bool – truth values (True, False) (e.g. is_segmented = True)
- None – explicit "nothing"
Example:
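A minimal sketch of these types in use. The names z_slice, pixel_size, and is_segmented come from the list above; the string value and the mask variable are hypothetical:

```python
z_slice = 30            # int
pixel_size = 0.25       # float
channel = "GFP"         # str (hypothetical value)
is_segmented = True     # bool
mask = None             # None: no mask has been computed yet (hypothetical)

# type() reports the type of the object a label points to.
print(type(z_slice))       # <class 'int'>
print(type(pixel_size))    # <class 'float'>
print(type(channel))       # <class 'str'>
print(type(is_segmented))  # <class 'bool'>
print(mask is None)        # True
```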
Create two string variables:
- channel_name (the name of your favorite imaging channel) e.g. CY5
- stain_name (the name of your favorite stain) e.g. Ki67

Then print: e.g. “Ki67 is imaged using the CY5 channel.”
Hint: Use an f‑string. An f-string is a way to embed variables inside string literals, using curly braces {}.
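As a sketch of how f-strings work, here is an unrelated example that deliberately does not solve the exercise. The variable image_width_px and the µm unit are assumptions made for illustration:

```python
z_slice = 30
pixel_size = 0.25        # assumed to be in µm per pixel
image_width_px = 512     # hypothetical image width

# Anything inside {} is evaluated and inserted into the string.
message = f"Slice {z_slice} has a pixel size of {pixel_size} µm."
print(message)  # Slice 30 has a pixel size of 0.25 µm.

# Expressions and format specifiers are allowed too (.1f = one decimal place).
width_text = f"Field width: {image_width_px * pixel_size:.1f} µm"
print(width_text)  # Field width: 128.0 µm
```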
list, tuple, set, and dict
- list: Ordered, mutable sequences [1, 2, 3]
- tuple: Ordered, immutable sequences (1, 2, 3)
- set: Unordered collection of unique items {1, 2, 3}
- dict: Key-value pairs {"a": 1, "b": 2}

Example:
Lists – Ordered Collections: A list holds a collection of items like a channel stack.
Example:
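A minimal list sketch, assuming a hypothetical channel stack:

```python
channel_stack = ["DAPI", "GFP", "CY5"]

print(channel_stack[0])    # DAPI  (indexing starts at 0)
print(channel_stack[-1])   # CY5   (negative indices count from the end)

channel_stack.append("mCherry")   # add an item at the end
channel_stack[1] = "FITC"         # replace an item in place

print(channel_stack)       # ['DAPI', 'FITC', 'CY5', 'mCherry']
print(len(channel_stack))  # 4
```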
Tuples – Fixed-size Groupings: Tuples are like lists but immutable (can’t be changed). Useful for things like storing important information e.g. image shape.
Example:
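A minimal tuple sketch, assuming a hypothetical image shape of (height, width, channels):

```python
image_shape = (512, 512, 3)

# Unpacking assigns each element to its own name.
height, width, n_channels = image_shape
print(width)  # 512

# Tuples are immutable, so item assignment raises a TypeError.
try:
    image_shape[0] = 1024
except TypeError as err:
    print("Tuples cannot be modified:", err)
```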
Used for pixel dimensions, coordinates, etc.
A dictionary is an associative array (hash map) mapping keys → values. Hashmaps are a fundamental data structure in computer science, and are implemented in Python as dictionaries.
channel_colors = {"GFP": "green", "CY5": "red"}
Dictionaries map keys to values, like an image’s metadata.
Example:
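A minimal dictionary sketch. The metadata keys and values are hypothetical, loosely modeled on the mini project at the end of this notebook:

```python
metadata = {"magnification": 40, "exposure_time": 100, "staining": "DAPI"}

print(metadata["magnification"])   # 40  (look up a value by its key)

metadata["cell_type"] = "neuron"   # add a new key-value pair
metadata["exposure_time"] = 150    # overwrite an existing value

print(metadata.get("objective"))   # None  (.get avoids a KeyError for missing keys)
print("staining" in metadata)      # True  (membership tests check keys)
print(list(metadata.keys()))       # ['magnification', 'exposure_time', 'staining', 'cell_type']
```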
Predict what will be printed:
nums = [1, 2, 3]
alias = nums
alias.append(4)
print('nums:', nums)
print('alias:', alias)
Will the two lists differ? Why/why not? Is 4 added to the beginning of the list?
Follow up: what if we use nums = alias instead of alias = nums?
Follow up: how to get the number of elements in nums?
Hint: use the len function.
Create and print variables:
sample_name = "embryo_02.tif"
z_planes = 40
pixel_spacing = 0.32
is_noise_filtered = False
# print the depth of the imaged sample
Work with a list:
fluorophores = ["Hoechst", "GFP", "mCherry"]
# print the third fluorophore in the list
# add a new fluorophore to the list and print the updated list
# remove the first fluorophore from the list and print the updated list
# Hint: use the `pop` method to remove the fluorophore at index 0
# remove the fluorophore at index 1 and print the updated list
Concept.
A function groups statements, giving them a name, inputs (parameters), and output (return value).
Syntax of a function definition:
def function_name(parameters):
    """Docstring"""
    return value
Then call the function with function_name(arguments).
Note: A parameter is a variable named in the function or method definition. It acts as a placeholder for the data the function will use. An argument is the actual value that is passed to the function or method when it is called.
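The syntax above can be sketched with a small temperature converter. The function name fahrenheit_to_celsius appears in this chapter, but its body is not shown in the text, so the implementation below is an assumption based on the standard formula (°F − 32) × 5/9:

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    # temp_f is the parameter: a placeholder name used inside the function.
    return (temp_f - 32) * 5 / 9


# 98.6 is the argument: the actual value passed in this call.
body_temp_c = fahrenheit_to_celsius(98.6)
print(round(body_temp_c, 1))       # 37.0
print(fahrenheit_to_celsius(212))  # 100.0
```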
Why it matters:
Docstrings become the function’s documentation (try running help(fahrenheit_to_celsius) to see it). It's good practice to include a docstring for every function you write, as it helps you and others understand what the function does.
Example:
38.46153846153847
Predict what will be printed:
def foo(base):
    """What does this function do?"""
    base_map = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return base_map[base]

def foofoo(triplet):
    """What does this function do?"""
    return foo(triplet[0]) + foo(triplet[1]) + foo(triplet[2])

dna_list = ["GTA", "ACC", "TTT"]
result1 = foofoo(dna_list[0])
result2 = foofoo(dna_list[1])
result3 = foofoo("CGT")
print(result1)
print(result2)
print(result3)
What does the function do to DNA codons?
Output:
CAT
TGG
GCA
Write a function bmi(weight_kg, height_m) that returns the Body‑Mass Index, rounded to 1 decimal.
Then call it with (70 kg, 1.75 m).
Hint: use the round(value, ndigits) function.
# write your code here
def bmi(weight_kg, height_m):
    return round(weight_kg / (height_m ** 2), 1)

print(bmi(70, 1.75))
Concept: Control Flow
Control flow statements allow your program to make decisions and branch into different paths depending on conditions.
These statements let your code respond to data — like a GPS recalculating your route based on traffic or wrong turns.
Key Keywords
- if: the primary gate – only runs the code block if the condition is True
- elif: (else if) – tests an additional condition if the previous one was False
- else: fallback – runs only if all above conditions are False
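The three keywords can be sketched together as below. The variable name and the thresholds (200 and 100) are hypothetical and are not taken from the original example:

```python
mean_intensity = 120  # hypothetical mean pixel intensity

if mean_intensity > 200:
    label = "Bright image"
elif mean_intensity > 100:
    # Reached only when the first condition was False.
    label = "Moderately bright image"
else:
    # Reached only when every condition above was False.
    label = "Dim image"

print(label)  # Moderately bright image
```

Only one branch runs: Python checks the conditions from top to bottom and stops at the first one that is True.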
How it works
Example:
Moderately bright image
Truthiness in Python
In Python, more than just True and False matter; any object can be evaluated in a boolean context:
Value | Boolean Equivalent |
---|---|
0, 0.0, '', [], {} | False |
Non‑zero numbers, non‑empty strings/lists | True |
if []:
    print("This won't run.")
if [1, 2, 3]:
    print("This will!")  # Lists with items are truthy
Write a function that classifies cells based on their size and intensity.
The function should take two arguments:
- size: the size of the cell (in µm²)
- intensity: the intensity of the cell (in a.u., a fluorescence unit)

The function should return 4 possible outputs:
Try running the function with the following inputs:
print(classify_cell(120, 50)) # → Large & Active
print(classify_cell(50, 0.3)) # → Small & Inactive
print(classify_cell(130, 12)) # → Large & Inactive
print(classify_cell(80, 75)) # → Small & Active
Hint: use the if/elif/else structure to check the conditions.
Predict what will be printed:
def special_cell_classifier(size, intensity, roundness):
    """What does this function do?"""
    if size > 100 and intensity > 25:
        return "Proliferating"
    elif size <= 100 and roundness > 0.85:
        return "Resting"
    elif intensity < 0.2 or roundness < 0.2:
        return "Likely debris"
    else:
        size_label = "Large" if size > 100 else "Small"
        activity_label = "Active" if intensity > 25 else "Inactive"
        shape_label = "Round" if roundness > 0.85 else "Irregular"
        return size_label + " & " + activity_label + " & " + shape_label
What will the following code print?
print(special_cell_classifier(120, 50, 0.9))
print(special_cell_classifier(50, 0.3, 0.2))
print(special_cell_classifier(130, 0.4, 0.2))
print(special_cell_classifier(80, 125, 0.85))
Output:
Proliferating
Small & Inactive & Irregular
Large & Inactive & Irregular
Small & Active & Irregular
Concept.
for loops iterate over iterables: lists, strings, ranges, files, generators…
Why loops matter:
Pythonic looping favors iterating directly over items rather than over indices:
Example:
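A sketch contrasting index-based looping with direct iteration over items. The channels list is hypothetical; channel_colors reuses the dictionary from the data structures chapter:

```python
channels = ["DAPI", "GFP", "CY5"]

# Works, but manages indices by hand.
for i in range(len(channels)):
    print(i, channels[i])

# More idiomatic: iterate over the items directly.
for channel in channels:
    print(channel)

# enumerate() supplies the index when it is actually needed.
for index, channel in enumerate(channels):
    print(index, channel)

# Dictionaries: items() yields (key, value) pairs.
channel_colors = {"GFP": "green", "CY5": "red"}
for name, color in channel_colors.items():
    print(f"{name} is shown in {color}")
```

The first and third loops print the same lines; enumerate simply removes the manual index bookkeeping.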
Predict what will be printed:
def foo(lst):
    """What does this function do?"""
    new_lst = []
    for i in range(len(lst)-1, -1, -1):
        new_lst.append(lst[i])
    return new_lst

numbers = [1, 2, 3, 4, 5]
print(f"Original list: {numbers}")
new_numbers = foo(numbers)
print(f"New list: {new_numbers}")
Output:
Original list: [1, 2, 3, 4, 5]
New list: [5, 4, 3, 2, 1]
The original list itself is not modified.
Think about what the function does. How is the output achieved with the for loop?
Write a loop that goes through:
images = {'img1.tif': 1000, 'img2.tif': 2240, 'img3.tif': 3000}
And processes each image by checking if it's a large image:
Hint: use a for loop to iterate over the dictionary, and use the items() method to get the key-value pairs.
Let's create a program that analyzes metadata from microscopy images to help organize and validate your dataset.
You'll work with a dictionary of image metadata containing:
Tasks:
# dictionary of image metadata
image_metadata = {
    'img1.tif': {
        'magnification': 40,
        'exposure_time': 100,
        'cell_type': 'neuron',
        'staining': 'DAPI'
    },
    'img2.tif': {
        'magnification': 60,
        'exposure_time': 150,
        'cell_type': 'astrocyte',
        'staining': 'GFP'
    },
    'img3.tif': {
        'magnification': 40,
        'exposure_time': 200,
        'cell_type': 'neuron',
        'staining': 'DAPI'
    }
}
# solution to the mini project
Checking exposure times:
img1.tif has valid exposure time: 100
img2.tif has valid exposure time: 150
img3.tif has valid exposure time: 200
Images grouped by cell type:
neuron: ['img1.tif', 'img3.tif']
astrocyte: ['img2.tif']
Average exposure times by magnification:
40x: 150.0ms
60x: 150.0ms
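One possible way to produce output of this shape is sketched below. It is not the original solution: the valid exposure range (50 to 250 ms) and the helper dictionaries by_cell_type and exposures_by_mag are assumptions made for illustration.

```python
import statistics as stats

image_metadata = {
    'img1.tif': {'magnification': 40, 'exposure_time': 100,
                 'cell_type': 'neuron', 'staining': 'DAPI'},
    'img2.tif': {'magnification': 60, 'exposure_time': 150,
                 'cell_type': 'astrocyte', 'staining': 'GFP'},
    'img3.tif': {'magnification': 40, 'exposure_time': 200,
                 'cell_type': 'neuron', 'staining': 'DAPI'},
}

# Hypothetical valid range in ms; the original criterion is not stated in the text.
MIN_EXPOSURE_MS, MAX_EXPOSURE_MS = 50, 250

print("Checking exposure times:")
for name, meta in image_metadata.items():
    exposure = meta['exposure_time']
    if MIN_EXPOSURE_MS <= exposure <= MAX_EXPOSURE_MS:
        print(f"{name} has valid exposure time: {exposure}")
    else:
        print(f"{name} has invalid exposure time: {exposure}")

# Group image names by cell type.
by_cell_type = {}
for name, meta in image_metadata.items():
    by_cell_type.setdefault(meta['cell_type'], []).append(name)

print("Images grouped by cell type:")
for cell_type, names in by_cell_type.items():
    print(f"{cell_type}: {names}")

# Collect exposure times per magnification, then average them.
exposures_by_mag = {}
for meta in image_metadata.values():
    exposures_by_mag.setdefault(meta['magnification'], []).append(meta['exposure_time'])

print("Average exposure times by magnification:")
for magnification, exposures in exposures_by_mag.items():
    print(f"{magnification}x: {stats.mean(exposures):.1f}ms")
```

The setdefault call creates an empty list the first time a key is seen, so the grouping loops need no separate "is this key new?" check.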
numpy, matplotlib.

“Programs must be written for people to read, and only incidentally for machines to execute.”
— Harold Abelson