PyTorch · ~15 mins

Why tensors are PyTorch's core data structure - Why It Works This Way

Overview - Why tensors are PyTorch's core data structure
What is it?
Tensors are multi-dimensional arrays that store numbers. In PyTorch, tensors are the main way to hold and work with data. They can represent simple lists, tables, images, or even complex data like videos. PyTorch uses tensors to perform fast math and build machine learning models.
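A minimal sketch of what this looks like in code (the variable names and values are illustrative):

```python
import torch

# A 1D tensor: a simple list of numbers
scores = torch.tensor([90.0, 85.5, 77.0])

# A 2D tensor: a table (2 students x 3 scores)
table = torch.tensor([[90.0, 85.5, 77.0],
                      [60.0, 72.5, 88.0]])

print(scores.shape)  # torch.Size([3])
print(table.shape)   # torch.Size([2, 3])
print(table * 2)     # element-wise math on the whole table at once
```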
Why it matters
Tensors let PyTorch handle data efficiently and run calculations quickly on CPUs and GPUs. They allow PyTorch to process large amounts of data in parallel, which is essential for training models like neural networks. Without them, machine learning would be much slower and harder to do.
Where it fits
Before learning about tensors, you should understand basic programming and arrays or lists. After tensors, you can learn how PyTorch uses them to build models, run training loops, and perform automatic differentiation for learning.
Mental Model
Core Idea
Tensors are like flexible, powerful containers for numbers that let PyTorch do fast math and learn from data.
Think of it like...
Imagine tensors as Lego blocks that can be stacked in many shapes and sizes. Just like Lego blocks can build simple or complex structures, tensors can hold simple lists or complex data like images and videos, ready to be shaped by PyTorch's math tools.
Tensor (multi-dimensional array)
┌─────────────────┐
│ Dimension 0     │
│ ┌─────────────┐ │
│ │ Dimension 1 │ │
│ │ ┌─────────┐ │ │
│ │ │ Values  │ │ │
│ │ └─────────┘ │ │
│ └─────────────┘ │
└─────────────────┘

Each dimension adds a new level of depth, like layers in a cake.
Build-Up - 6 Steps
1
Foundation: Understanding arrays and lists
🤔
Concept: Learn what arrays and lists are as simple collections of numbers.
Arrays and lists are ways to store multiple numbers together. For example, a list of temperatures for a week is a 1D list. A table of students and their scores is a 2D array. These structures help organize data but may not be efficient for math operations.
Result
You can hold and access multiple numbers in an organized way.
Knowing arrays and lists helps you see why we need more powerful structures like tensors for math and machine learning.
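To make this concrete, here is a small sketch in plain Python (the values are made up):

```python
# A 1D list: temperatures for a week
temps = [21.5, 22.0, 19.8, 20.1, 23.4, 25.0, 24.2]

# A 2D list: scores for 2 students across 3 tests
scores = [[90, 85, 77],
          [60, 72, 88]]

print(temps[2])      # access by position -> 19.8
print(scores[1][0])  # row 1, column 0 -> 60

# Plain lists are awkward for math: `*` repeats, it doesn't multiply
print(temps[:2] * 2)  # [21.5, 22.0, 21.5, 22.0], not doubled values
```

The last line shows the limitation: lists organize data well but do not support element-wise math directly.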
2
Foundation: Introducing multi-dimensional arrays
🤔
Concept: Extend arrays to multiple dimensions to represent complex data.
A 2D array is like a table with rows and columns. A 3D array can represent a stack of images or a video (frames over time). Each added dimension lets you represent more complex data naturally.
Result
You can represent images, videos, or any data with multiple axes.
Understanding multi-dimensional arrays is key to grasping tensors, which generalize this idea.
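A quick sketch of how each added dimension maps to more complex data (the sizes are illustrative):

```python
import torch

# 1D: a vector of 4 values
vec = torch.zeros(4)

# 2D: a 3x4 table (rows x columns)
table = torch.zeros(3, 4)

# 3D: a stack of 10 grayscale "images", each 3x4 pixels
stack = torch.zeros(10, 3, 4)

# 4D: 10 frames of video with 3 color channels, 3x4 pixels each
video = torch.zeros(10, 3, 3, 4)

print(vec.ndim, table.ndim, stack.ndim, video.ndim)  # 1 2 3 4
```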
3
Intermediate: What makes tensors special in PyTorch
🤔 Before reading on: do you think tensors are just fancy arrays, or do they have extra powers? Commit to your answer.
Concept: Tensors are like arrays but with extra features for fast math and GPU use.
PyTorch tensors can do math directly on CPUs or GPUs, support gradients for learning, and integrate with PyTorch's tools. Unlike plain arrays, tensors can track operations to help models learn by adjusting numbers automatically.
Result
You get fast, flexible data containers that support learning and hardware acceleration.
Knowing tensors are more than arrays explains why PyTorch uses them as the core data structure.
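A minimal sketch of those extra features (device placement and operation tracking):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Unlike a plain array, a tensor knows where it lives...
print(x.device)         # cpu (by default)
# ...and whether PyTorch should track operations on it for learning
print(x.requires_grad)  # True

y = (x * 2).sum()
# The result remembers how it was computed
print(y.grad_fn is not None)  # True
```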
4
Intermediate: Tensors and automatic differentiation
🤔 Before reading on: do you think tensors can help calculate derivatives automatically? Commit to yes or no.
Concept: Tensors can track math operations to compute gradients automatically.
When you perform math with tensors, PyTorch remembers the steps. This lets it calculate how changing inputs affects outputs, which is essential for training models. This process is called automatic differentiation.
Result
You can train models by adjusting tensor values based on calculated gradients.
Understanding this feature shows why tensors are central to learning in PyTorch.
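A minimal worked example of automatic differentiation (the function is chosen for illustration):

```python
import torch

# y = x^2, so the derivative dy/dx = 2x
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2

# PyTorch recorded the squaring step; backward() replays it to get the gradient
y.backward()

print(x.grad)  # tensor(6.) -> dy/dx at x=3 is 2*3 = 6
```

Note that PyTorch computed the derivative without us ever writing the formula 2x; it recovered it from the recorded operations.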
5
Advanced: Tensors on CPUs and GPUs
🤔 Before reading on: do you think tensors can move between CPU and GPU easily? Commit to yes or no.
Concept: Tensors can be stored and computed on different hardware seamlessly.
PyTorch tensors can live on the CPU or GPU. You can move tensors between devices with simple commands. This flexibility lets you speed up math by using GPUs without changing your code much.
Result
You can run fast computations on GPUs while writing simple code.
Knowing tensors' device flexibility explains how PyTorch achieves speed and ease of use.
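A small sketch of device movement; the availability check makes the same code run with or without a GPU:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])  # lives on the CPU by default

# Move to the GPU only if one is actually available
device = "cuda" if torch.cuda.is_available() else "cpu"
x = x.to(device)

# The same code runs either way; only the device changes
y = x * 2
print(y.device)
```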
6
Expert: Memory and computation optimization in tensors
🤔 Before reading on: do you think PyTorch tensors always copy data when moved or sliced? Commit to yes or no.
Concept: Tensors save memory by sharing storage instead of copying.
PyTorch tensors often share memory when sliced, reshaped, or transposed, avoiding unnecessary copies. This saves memory and speeds up computation. PyTorch also uses efficient backend libraries to run tensor math in parallel.
Result
You get efficient memory use and fast math without extra effort.
Understanding these optimizations helps explain PyTorch's performance and why tensors are designed this way.
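You can observe the shared storage directly; a minimal sketch:

```python
import torch

x = torch.arange(6)   # tensor([0, 1, 2, 3, 4, 5])
view = x[2:5]         # a slice: no data is copied

# Both tensors point into the same underlying memory
print(view.data_ptr() == x[2].data_ptr())  # True

# Writing through the view changes the original
view[0] = 99
print(x)  # tensor([ 0,  1, 99,  3,  4,  5])
```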
Under the Hood
Underneath, a tensor is a contiguous block of memory storing numbers with metadata about shape, data type, and device (CPU/GPU). PyTorch uses this metadata to interpret the memory as multi-dimensional data. When you do math, PyTorch calls optimized C++ and CUDA libraries to perform operations in parallel. For gradients, PyTorch builds a computation graph dynamically, tracking operations on tensors to compute derivatives automatically.
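A small sketch of that metadata in code; the transpose shows that only the interpretation changes, not the buffer:

```python
import torch

x = torch.zeros(2, 3)

# Metadata PyTorch keeps alongside the raw memory buffer
print(x.shape)     # torch.Size([2, 3])
print(x.dtype)     # torch.float32
print(x.device)    # cpu
print(x.stride())  # (3, 1): step 3 elements to the next row, 1 to the next column

# A transpose changes only the metadata, not the memory buffer
t = x.t()
print(t.stride())  # (1, 3) -> same memory, reinterpreted
```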
Why designed this way?
Tensors were designed to unify data storage and computation for machine learning. Before PyTorch, frameworks separated data and computation or required static graphs. PyTorch's dynamic graph and tensor design allow flexibility, ease of debugging, and hardware acceleration. Alternatives like NumPy arrays lack GPU support and automatic differentiation, so tensors fill this gap.
┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│ Tensor Object │──────▶│ Memory Buffer │──────▶│ Raw Numbers   │
│ (shape, dtype,│       │ (contiguous)  │       │ (floats, ints)│
│  device)      │       └───────────────┘       └───────────────┘
│               │
│ Computation   │
│ Graph Tracker │
└───────┬───────┘
        │
        ▼
┌─────────────────────────────┐
│ Optimized Math Libraries    │
│ (CPU: BLAS, GPU: CUDA/cuDNN)│
└─────────────────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Do tensors only work with numbers, or can they hold any data type? Commit to your answer.
Common Belief:Tensors can hold any kind of data like strings or objects.
Reality:Tensors only hold numerical data types like floats, ints, or booleans.
Why it matters:Trying to store non-numeric data in tensors causes errors and confusion, limiting their use to math and learning tasks.
Quick: Do you think PyTorch tensors always copy data when you slice them? Commit to yes or no.
Common Belief:Slicing a tensor always creates a new copy of the data.
Reality:Slicing often creates a view sharing the same memory without copying data.
Why it matters:Assuming copies happen can lead to inefficient code and unexpected bugs when modifying slices.
Quick: Do you think tensors are just like NumPy arrays with GPU support? Commit to yes or no.
Common Belief:Tensors are basically NumPy arrays that can run on GPUs.
Reality:Tensors add automatic differentiation and dynamic computation graphs, which NumPy arrays do not have.
Why it matters:Ignoring these features misses why PyTorch is powerful for machine learning beyond just GPU acceleration.
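A minimal sketch of the difference (assuming NumPy is installed; the values are illustrative):

```python
import numpy as np
import torch

# The bridge: from_numpy shares memory with the NumPy array
a = np.array([1.0, 2.0, 3.0])
t = torch.from_numpy(a)

# What NumPy cannot do: track operations and compute gradients
g = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss = (g ** 2).sum()
loss.backward()
print(g.grad)  # tensor([2., 4., 6.]) -- derived automatically
```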
Quick: Do you think tensors always run faster than Python lists? Commit to yes or no.
Common Belief:Tensors are always faster than Python lists for any operation.
Reality:Tensors are faster for large numerical computations but slower or unnecessary for small or non-numeric tasks.
Why it matters:Using tensors for simple tasks wastes resources and complicates code unnecessarily.
Expert Zone
1
Tensors can share underlying storage with other tensors, enabling memory-efficient views and in-place operations that require careful management to avoid bugs.
2
The dynamic computation graph built by tensors allows flexible model architectures but requires understanding of when to detach or clone tensors to control gradient flow.
3
PyTorch's tensor backend switches between CPU and GPU libraries seamlessly, but performance depends on data layout and operation fusion, which experts optimize manually.
When NOT to use
Tensors are not ideal for non-numeric data like text or categorical variables without encoding. For symbolic math or exact arithmetic, specialized libraries like SymPy or arbitrary precision tools are better. Also, for very small datasets or simple scripts, plain Python lists or NumPy arrays may be simpler and sufficient.
Production Patterns
In production, tensors are used to batch data for efficient GPU processing, enable mixed precision training for speed and memory savings, and integrate with deployment tools like TorchScript or ONNX for optimized inference. Experts also use tensor hooks and custom autograd functions to extend PyTorch's capabilities.
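A minimal sketch of the batching pattern mentioned above (shapes and names are illustrative):

```python
import torch

# Three individual samples, e.g. feature vectors of length 4
samples = [torch.randn(4) for _ in range(3)]

# Stack them into one batch tensor so the hardware processes them together
batch = torch.stack(samples)
print(batch.shape)  # torch.Size([3, 4])

# One matrix multiply handles the whole batch at once
weights = torch.randn(4, 2)
out = batch @ weights
print(out.shape)  # torch.Size([3, 2])
```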
Connections
NumPy arrays
Tensors build on and extend NumPy arrays with GPU support and automatic differentiation.
Understanding NumPy arrays helps grasp tensor basics, but tensors add key features for machine learning.
Automatic differentiation
Tensors are the data structure that enable automatic differentiation by tracking operations.
Knowing how tensors track math operations clarifies how models learn by adjusting parameters.
Linear algebra
Tensors generalize vectors and matrices used in linear algebra to multiple dimensions.
Recognizing tensors as multi-dimensional linear algebra objects helps understand their role in computations.
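The linear-algebra connection can be sketched directly; the extra leading dimension is what "generalizing to multiple dimensions" means in practice:

```python
import torch

v = torch.tensor([1.0, 2.0])        # vector (1D tensor)
m = torch.tensor([[1.0, 0.0],
                  [0.0, 2.0]])      # matrix (2D tensor)

# Matrix-vector product, just like in linear algebra
print(m @ v)  # tensor([1., 4.])

# The same operation generalizes to a batch of matrices (a 3D tensor)
batch = m.expand(5, 2, 2)           # 5 copies of m, no data copied
print((batch @ v).shape)            # torch.Size([5, 2])
```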
Common Pitfalls
#1 Performing tensor operations on CPU tensors while expecting GPU speed.
Wrong approach:
x = torch.tensor([1, 2, 3])
y = x * 2  # runs on CPU by default
Correct approach:
x = torch.tensor([1, 2, 3], device='cuda')  # requires a CUDA-capable GPU
y = x * 2  # runs on GPU
Root cause:Not specifying a device creates CPU tensors by default, missing GPU acceleration; moving to 'cuda' only works when a GPU is available.
#2 Modifying a sliced tensor and expecting the original tensor to stay unchanged.
Wrong approach:
x = torch.tensor([1, 2, 3, 4])
y = x[1:3]
y[0] = 10  # changes x as well
Correct approach:
x = torch.tensor([1, 2, 3, 4])
y = x[1:3].clone()
y[0] = 10  # x remains unchanged
Root cause:Slices are views sharing memory; cloning creates an independent copy.
#3 Trying to store strings in tensors.
Wrong approach:
x = torch.tensor(['a', 'b', 'c'])  # raises an error
Correct approach:
Use Python lists or specialized libraries for text data.
Root cause:Tensors only support numeric data types.
Key Takeaways
Tensors are multi-dimensional numeric containers that power PyTorch's fast math and learning.
They support automatic differentiation by tracking operations, enabling model training.
Tensors can run on CPUs or GPUs, making computations efficient and scalable.
Understanding tensor memory sharing and device placement is key to writing efficient PyTorch code.
Tensors are designed specifically for machine learning, extending arrays with features NumPy lacks.