Overview - Log scale and symlog scale

What is it?

Log scale and symlog scale are ways to display data on a graph where the axis values grow exponentially or cover both very small and very large numbers. A log scale shows values based on powers of a base number, like 10, making it easier to see patterns in data that spans many orders of magnitude. Symlog scale is a mix of linear and log scales, allowing negative and zero values to be shown alongside positive values on a log scale. These scales help visualize data that changes rapidly or has both small and large values.

Why it matters

Without log or symlog scales, graphs with very large or very small numbers can be hard to read because small values get squished and large values dominate. This makes it difficult to spot trends or compare data points. Log and symlog scales solve this by spreading out the data more evenly, revealing patterns and relationships that would otherwise be hidden. This is important in fields like science, finance, and engineering where data often spans wide ranges.

Where it fits

Before learning log and symlog scales, you should understand basic plotting and linear scales in graphs. After mastering these scales, you can explore advanced data visualization techniques like custom scales, interactive plots, and handling special data types in matplotlib.

Mental Model

Core Idea

Log and symlog scales transform axis values to better show data that changes exponentially or includes both positive and negative values across wide ranges.

Think of it like...

Imagine a map where distances are shown normally for nearby places but get compressed for faraway places to fit everything on one page. Log scale compresses large numbers while expanding small ones, and symlog scale lets you see places both east and west of a center point, like negative and positive values.

Linear scale: 1 ── 2 ── 3 ── 4 ── 5
Log scale:    1 ── 10 ── 100 ── 1000 ── 10000
Symlog scale: -1000 ── -10 ── 0 ── 10 ── 1000

Build-Up - 7 Steps

1

Foundation: Understanding linear scales

Concept: Linear scale shows data points spaced evenly based on their actual values.

In a linear scale, the distance between 1 and 2 is the same as between 4 and 5. This is the default way graphs display data. For example, plotting y = x on a graph with linear scale shows a straight line.

Result

Data points are spaced evenly, making it easy to see differences when values are close.

Understanding linear scale is essential because it is the baseline for all other scales and helps you see why log scales are needed for wide-ranging data.
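As a minimal sketch of this baseline, assuming matplotlib and NumPy are installed, the following plots y = x on matplotlib's default axes. The variable names are illustrative only.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 6)          # 1, 2, 3, 4, 5
y = x                        # y = x

fig, ax = plt.subplots()
ax.plot(x, y, marker="o")    # axes stay linear unless a scale is set
ax.set_title("Linear scale (default)")

# On a linear axis, equal differences in value take equal distances.
gaps = np.diff(x)            # spacing between consecutive x values
# plt.show()                 # uncomment to display the figure
```

Because no scale was requested, both axes report the linear scale, and the points sit at equal intervals along a straight line.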

2

Foundation: Basics of logarithms

3

Intermediate: Applying log scale in matplotlib

4

Intermediate: Introducing symlog scale

5

Advanced: Customizing symlog parameters

6

Advanced: Handling tick formatting on log and symlog

7

Expert: Limitations and numerical issues in symlog

Under the Hood

A log scale places each axis value x at log_base(x), where the base is usually 10. Equal distances on the axis therefore represent equal ratios: one step between major ticks is a multiplication by the base. Symlog scale uses a piecewise function: a linear transformation for values between -linthresh and +linthresh, and a log transformation outside this range, offset so that the pieces join at ±linthresh. Because zero and small values fall inside the linear region, which acts as a buffer zone around zero, negative and zero values can be plotted.
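The "equal distances mean equal ratios" property can be checked with plain arithmetic. The sketch below uses only the standard library; `log_position` is a hypothetical helper name, not a matplotlib function.

```python
import math

def log_position(value, base=10):
    """Position of a positive value on a log axis, in units of one base step."""
    return math.log(value, base)

# Powers of the base land at evenly spaced positions.
decades = [log_position(v) for v in (1, 10, 100, 1000)]

# Equal ratios cover equal distances, wherever they sit on the axis.
step_2_to_4 = log_position(4) - log_position(2)
step_50_to_100 = log_position(100) - log_position(50)
```

Doubling from 2 to 4 takes the same axis length as doubling from 50 to 100, which is why multiplicative patterns are easy to read on a log axis.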

Why designed this way?

Log scale was designed to handle data spanning many orders of magnitude, common in science and engineering. However, it cannot handle zero or negative values, which are common in real data. Symlog was created to overcome this by combining linear and log scales, preserving the benefits of log scaling while allowing full range data visualization. Alternatives like logit or power scales exist but don't handle negative values as naturally.

Axis value x
  │
  ├─ if |x| ≤ linthresh: linear scale (proportional to x)
  └─ if |x| > linthresh: log scale (sign(x) * log_base(|x|), shifted to meet the linear part)

Visualization:
  ┌─────────────┬─────────────┬─────────────┐
  │  Negative   │   Linear    │  Positive   │
  │  Log Scale  │  Region     │  Log Scale  │
  └─────────────┴─────────────┴─────────────┘
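The piecewise rule above can be sketched in NumPy. This is an illustration under stated assumptions, not matplotlib's exact implementation: the function name `symlog_sketch` is hypothetical, and the constant 1.0 added in the log branch is chosen here only so that the two pieces meet at ±linthresh (matplotlib uses its own scaling, controlled by `linscale`).

```python
import numpy as np

def symlog_sketch(x, linthresh=1.0, base=10.0):
    """Illustrative symmetric-log transform; not matplotlib's exact code."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    # Clamp before taking the log so values inside the linear region
    # (including zero) never reach log(0).
    clamped = np.maximum(abs_x, linthresh)
    log_branch = np.sign(x) * linthresh * (
        1.0 + np.log(clamped / linthresh) / np.log(base)
    )
    # Linear inside [-linthresh, linthresh], logarithmic outside.
    return np.where(abs_x <= linthresh, x, log_branch)

samples = np.array([-100.0, -1.0, -0.5, 0.0, 0.5, 1.0, 10.0, 100.0])
positions = symlog_sketch(samples)
```

With linthresh = 1 and base 10, values inside the linear region keep their own value as position, while 10 and 100 land one and two steps beyond the threshold. Negative inputs mirror the positive ones.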

Myth Busters - 4 Common Misconceptions

Quick: Can log scale display zero or negative values? Commit to yes or no.

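Common Belief: Log scale can show zero and negative values just like linear scale.

Reality: The logarithm of zero or of a negative number is undefined, so a log axis cannot place those values. Use symlog or a linear scale when the data includes them.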

Quick: Does symlog scale treat all values logarithmically? Commit to yes or no.

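Common Belief: Symlog scale applies logarithmic scaling to all values on the axis.

Reality: Symlog is logarithmic only outside the range from -linthresh to +linthresh. Inside that range it is linear, which is what lets the axis pass through zero.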

Quick: Is changing the base in log scale just a cosmetic change? Commit to yes or no.

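Common Belief: Changing the logarithm base only changes the labels but not the data representation.

Reality: The base also moves the major ticks and gridlines, for example to powers of 2 instead of powers of 10. On a pure log axis the relative positions of the data points stay the same, because logarithms in different bases differ only by a constant factor.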

Quick: Does symlog scale perfectly preserve distances for all values? Commit to yes or no.

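Common Belief: Symlog scale preserves relative distances for all values like linear scale does.

Reality: Only the linear region keeps distances proportional to differences in value. Outside it, equal distances correspond to equal ratios, so large magnitudes are compressed just as on a log scale.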

Expert Zone

1

Symlog scale's linear region size (linthresh) must be chosen carefully to balance detail near zero and log compression outside.

2

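Tick placement on symlog axes can be tricky: a locator built for pure log axes has no notion of the linear region around zero, so symlog axes rely on a dedicated symmetrical-log locator, and ticks near zero often still need manual tuning or a custom locator.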

3

Floating-point precision limits can cause subtle artifacts in log and symlog plots, especially with very large or very small numbers.
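The first two points can be sketched as follows, assuming matplotlib 3.3 or later (earlier releases spelled the keyword `linthreshy` for the y axis). The data and the chosen `linthresh` and `linscale` values are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import SymmetricalLogLocator

x = np.linspace(-50, 50, 401)
y = x ** 3                       # crosses zero, reaches about ±125000

fig, ax = plt.subplots()
ax.plot(x, y)

# linthresh sets the half-width of the linear region around zero;
# linscale sets how much axis length that region is given.
ax.set_yscale("symlog", linthresh=10, linscale=0.5)

# Explicit locator for major ticks on a symmetrical log axis.
locator = SymmetricalLogLocator(base=10, linthresh=10)
ax.yaxis.set_major_locator(locator)
# plt.show()                     # uncomment to display the figure
```

Keeping the locator's `linthresh` equal to the scale's `linthresh` is a choice made in this sketch so that ticks and the linear region agree; other combinations are possible.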

When NOT to use

Avoid log scale when data includes zero or negative values; use symlog or linear scale instead. Avoid symlog when data is strictly positive and you want pure log behavior. For probabilities or proportions, consider logit scale. For data without wide range, linear scale is simpler and clearer.
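These rules of thumb can be written down as a small decision helper. The function below is hypothetical (it is not part of matplotlib), and the two-decade cutoff for "wide range" is an assumption picked for illustration. It does not cover the logit case for probabilities.

```python
import math

def suggest_scale(values, min_decades=2.0):
    """Hypothetical rule of thumb for picking an axis scale.

    Not part of matplotlib; the two-decade threshold is an assumption.
    """
    vals = [float(v) for v in values]
    magnitudes = [abs(v) for v in vals if v != 0]
    if not magnitudes:
        return "linear"                      # empty or all-zero data
    decades = math.log10(max(magnitudes) / min(magnitudes))
    if decades < min_decades:
        return "linear"                      # narrow range: keep it simple
    if all(v > 0 for v in vals):
        return "log"                         # strictly positive, wide range
    return "symlog"                          # wide range with zero or negatives
```

The returned strings match the scale names that matplotlib's `set_xscale` and `set_yscale` accept, so the result could be passed straight to either call.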

Production Patterns

In real-world data science, log scale is used for plotting distributions like income or population where values span many orders of magnitude. Symlog is common in scientific data with measurements around zero, such as sensor readings or financial returns. Professionals customize the linthresh and base parameters and combine these scales with interactive zooming for detailed analysis.

Connections

Exponential growth

The logarithm is the inverse of the exponential function, so a log scale undoes exponential growth.

Understanding log scale helps interpret exponential processes by turning multiplicative growth into additive linear trends.
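A short numerical check of this idea, assuming NumPy is available: taking the log of exponentially growing data and fitting a straight line recovers the growth rate. The rate 0.3 and starting value 5 are made-up numbers.

```python
import numpy as np

t = np.arange(0, 10)                 # time steps
y = 5.0 * np.exp(0.3 * t)            # exponential growth: rate 0.3, start 5

# Taking the log turns y = a * exp(r * t) into log(y) = log(a) + r * t,
# which is a straight line in t.
slope, intercept = np.polyfit(t, np.log(y), 1)
rate = slope                         # slope of the line is the growth rate
start = np.exp(intercept)            # intercept gives back the start value
```

This is the same reason exponential data looks like a straight line when the y axis is logarithmic.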

Signal processing

Symlog scale relates to how signals with positive and negative amplitudes are analyzed.

Knowing symlog scale aids in visualizing signals that cross zero, common in audio and sensor data.

Human perception of sound (decibels)

The decibel scale is logarithmic, like a log axis on a plot.

Seeing the decibel as a log scale helps explain why logarithmic axes suit quantities such as loudness, which people perceive roughly in ratios rather than in differences.

Common Pitfalls

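#1 Trying to plot zero or negative values on a log scale axis.

Wrong approach:
    import matplotlib.pyplot as plt
    plt.plot([10, 20, 30, 40], [1, 0, -1, 10])  # y contains 0 and -1
    plt.yscale('log')  # the non-positive points cannot be drawn

Correct approach:
    import matplotlib.pyplot as plt
    plt.plot([10, 20, 30, 40], [1, 0, -1, 10])
    plt.yscale('symlog')  # linear near zero, so 0 and -1 are shown

Root cause: A log axis has no position for zero or negative values. Matplotlib masks or clips such points rather than plotting them, so data goes missing from the figure, typically without raising an error.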
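The underlying arithmetic can be seen directly in NumPy. This sketch only shows what the logarithm does to such values; it does not reproduce how matplotlib handles them internally.

```python
import numpy as np

y = np.array([1.0, 0.0, -1.0, 10.0])

# Silence NumPy's warnings so the invalid results can be inspected.
with np.errstate(divide="ignore", invalid="ignore"):
    logged = np.log10(y)         # log10(0) is -inf, log10(-1) is nan

# Only finite results correspond to a drawable position on a log axis.
drawable = np.isfinite(logged)
```

Only the first and last values survive, which matches the gap a reader would see in the plotted line.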

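#2 Using the default linthresh in symlog without adjusting for the data range.

Wrong approach:
    import matplotlib.pyplot as plt
    plt.plot(data_with_small_values)  # placeholder for your own data
    plt.yscale('symlog')  # matplotlib's default linthresh is 2

Correct approach:
    import matplotlib.pyplot as plt
    plt.plot(data_with_small_values)  # placeholder for your own data
    plt.yscale('symlog', linthresh=0.01)  # log regions start at ±0.01

Root cause: Every value with magnitude below linthresh is drawn linearly. If linthresh is much larger than the small values of interest, they are packed together near zero and their detail is hidden.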
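One possible way to pick linthresh from the data is sketched below, assuming matplotlib 3.3 or later and NumPy. The readings are made up, and using the smallest non-zero magnitude as the threshold is a heuristic chosen for this example, not a matplotlib recommendation.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up readings that cross zero and include small magnitudes.
data = np.array([-50.0, -0.5, -0.02, 0.0, 0.03, 0.8, 120.0])

# Heuristic (an assumption): start the log regions at the smallest
# non-zero magnitude so small values are not all packed into the linear part.
linthresh = float(np.min(np.abs(data[data != 0])))

fig, ax = plt.subplots()
ax.plot(data, marker="o")
ax.set_yscale("symlog", linthresh=linthresh)
# plt.show()                     # uncomment to display the figure
```

A larger threshold would give more room to the noise floor around zero; a smaller one stretches the log regions further toward zero. Which is better depends on what the plot needs to show.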

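#3 Misjudging what changing the log base does.

Wrong approach:
    import matplotlib.pyplot as plt
    plt.xscale('log', base=2)  # then reading the ticks as decades, or expecting the curve to change shape

Correct approach:
    import matplotlib.pyplot as plt
    plt.xscale('log', base=2)  # major ticks now sit at powers of 2; read the axis in doublings

Root cause: Logarithms in different bases differ only by a constant factor. On a pure log axis the base therefore moves ticks and gridlines but not the relative positions of the data. On a symlog axis the base can also shift the balance between the linear and log regions.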
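One way to see what the base does and does not change on a pure log axis is to compare where the same values fall, as a fraction of the axis length, under two bases. The helper `normalized_positions` is a hypothetical name used only in this sketch.

```python
import math

values = [1, 8, 64, 1024]

def normalized_positions(vals, base):
    """Fraction of the axis length at which each value falls on a log axis."""
    logs = [math.log(v, base) for v in vals]
    lo, hi = logs[0], logs[-1]
    return [(entry - lo) / (hi - lo) for entry in logs]

pos_base2 = normalized_positions(values, 2)
pos_base10 = normalized_positions(values, 10)
```

The two lists agree up to rounding, so the data points sit in the same places either way. What differs is which values get a major tick: 8, 64 and 1024 are exact powers of 2 but not of 10.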

Key Takeaways

Log scale transforms data to show multiplicative relationships clearly but cannot handle zero or negative values.

Symlog scale combines linear and log scales to visualize data with negative, zero, and positive values smoothly.

Choosing parameters like linthresh and base in symlog scale is crucial for clear and accurate data visualization.

Proper tick formatting and understanding scale limitations prevent misinterpretation and errors in plots.

Log and symlog scales are essential tools for exploring data spanning wide ranges or crossing zero in many real-world fields.