
Simulated annealing (dual_annealing) in SciPy - Deep Dive

Overview - Simulated annealing (dual_annealing)
What is it?
Simulated annealing is a method for finding the best solution to a problem by trying many possibilities and slowly focusing on better ones. The dual_annealing algorithm in SciPy is a variant that combines two ways of searching for the lowest point in a landscape of possible answers. It explores widely at first, then narrows in on the best solution. This helps solve problems where the answer is hidden among many tricky options.
Why it matters
Without simulated annealing, finding the best solution in complex problems can take too long or get stuck in bad answers. This method helps computers explore many options smartly, like how metal cools slowly to become strong. It makes solving hard problems faster and more reliable, which is useful in science, engineering, and business decisions.
Where it fits
Before learning simulated annealing, you should understand basic optimization and how algorithms search for best answers. After this, you can explore other advanced optimization methods like genetic algorithms or machine learning tuning. It fits in the journey of learning how to solve complex problems with computers.
Mental Model
Core Idea
Simulated annealing finds the best solution by exploring many possibilities broadly at first, then gradually focusing on better options as it 'cools down'.
Think of it like...
Imagine trying to find the lowest point in a bumpy landscape while blindfolded. At first, you take big random steps to explore widely, sometimes going uphill to avoid getting stuck. As you keep searching, you take smaller steps and focus on going downhill to settle in the lowest valley.
Start: Wide exploration (big steps)
  ↓
Gradual cooling (smaller steps)
  ↓
Focused search (fine tuning)
  ↓
Best solution found

┌───────────────┐
│ Initial state │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Explore widely│
│ (random jumps)│
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Reduce step   │
│ size (cool)   │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Focus on best │
│ solutions     │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Final answer  │
└───────────────┘
Build-Up - 7 Steps
1
Foundation: What is optimization?
Concept: Optimization means finding the best answer from many possibilities.
Imagine you want to find the shortest route to visit several friends. Optimization is the process of checking different routes to find the shortest one. In math and science, optimization helps us pick the best choice, like the lowest cost or highest profit.
Result
You understand that optimization is about searching for the best solution among many options.
Understanding optimization is key because simulated annealing is a method designed to solve these best-choice problems.
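To make the route example concrete, here is optimization in miniature: a brute-force search that measures every possible visiting order and keeps the shortest. The distance table is hypothetical, invented only for illustration.

```python
from itertools import permutations

# Hypothetical distances (km) between home and three friends A, B, C.
dist = {
    ("home", "A"): 4, ("home", "B"): 7, ("home", "C"): 3,
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 5,
}

def d(a, b):
    # Symmetric lookup: d(a, b) == d(b, a)
    return dist[(a, b)] if (a, b) in dist else dist[(b, a)]

def route_length(order):
    # Total length of the round trip: home -> friends in `order` -> home
    stops = ("home", *order, "home")
    return sum(d(a, b) for a, b in zip(stops, stops[1:]))

# Optimization means: evaluate every candidate, pick the best one.
best = min(permutations(["A", "B", "C"]), key=route_length)
print(best, route_length(best))
```

Brute force works here because there are only six routes; with many more stops, checking every candidate becomes impossible, which is exactly where methods like simulated annealing come in.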
2
Foundation: Why random search alone is not enough
Concept: Randomly trying solutions can find good answers but is slow and unreliable.
If you randomly pick routes to visit friends, you might find a short one by chance, but it could take a very long time. Random search does not learn or focus on better answers, so it wastes effort.
Result
You see that random search is simple but inefficient for complex problems.
Knowing random search limits helps appreciate why smarter methods like simulated annealing are needed.
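A short sketch makes the limitation visible. The objective below is a made-up toy function; pure random search remembers the best sample seen but never focuses its effort on promising regions.

```python
import math
import random

random.seed(42)  # for reproducibility

def f(x):
    # Toy objective: global minimum 0 at x = 2 (made up for illustration)
    return (x - 2) ** 2

# Pure random search: sample uniformly, keep the best sample seen.
# Every draw is as "blind" as the first one - no learning happens.
best_x, best_val = None, math.inf
for _ in range(200):
    x = random.uniform(-100, 100)
    if f(x) < best_val:
        best_x, best_val = x, f(x)

print(best_x, best_val)
```

After 200 blind draws the result is decent only by luck; a search that concentrated its later samples near earlier good ones would do far better with the same budget.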
3
Intermediate: Basic simulated annealing idea
🤔 Before reading on: do you think simulated annealing only ever moves to better solutions? Commit to yes or no.
Concept: Simulated annealing sometimes accepts worse solutions to escape local traps.
Simulated annealing mimics cooling metal. It starts by accepting many changes, even worse ones, to explore widely. Over time, it becomes stricter and accepts only better or slightly worse solutions less often. This helps avoid getting stuck in bad spots.
Result
You learn that accepting worse solutions early helps find better global answers.
Understanding this acceptance of worse solutions explains why simulated annealing can escape local traps unlike simple hill climbing.
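The acceptance rule can be sketched in a few lines. This is the generic Metropolis rule, not SciPy's internal code: a worse move is accepted with probability exp(-delta/T), so a hot search wanders freely while a cold one is strict.

```python
import math
import random

random.seed(0)

def accept(current_cost, new_cost, temperature):
    # Metropolis rule: improvements are always taken; a worse move is
    # taken with probability exp(-delta / T), which shrinks as T cools.
    if new_cost <= current_cost:
        return True
    delta = new_cost - current_cost
    return random.random() < math.exp(-delta / temperature)

# The same worse move (cost 5 -> 6) proposed at two temperatures:
hot = sum(accept(5.0, 6.0, 10.0) for _ in range(1000)) / 1000
cold = sum(accept(5.0, 6.0, 0.01) for _ in range(1000)) / 1000
print(hot, cold)  # the hot search often says yes; the cold one almost never does
```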
4
Intermediate: Dual annealing algorithm specifics
🤔 Before reading on: do you think dual_annealing uses one or two search strategies? Commit to your answer.
Concept: Dual annealing combines two search methods for better exploration and exploitation.
Dual annealing uses a global search to explore broadly and a local search to fine-tune solutions. It switches between these to balance finding new areas and improving current solutions. This combination improves speed and accuracy.
Result
You understand dual annealing is more powerful than basic simulated annealing.
Knowing dual annealing blends global and local search clarifies why it performs well on complex problems.
5
Intermediate: Using SciPy's dual_annealing function
Concept: How to apply dual_annealing in Python with scipy for optimization.
You define a function to minimize, set bounds for each variable, and call scipy.optimize.dual_annealing. It returns the best solution found and its objective value. You can adjust parameters such as the iteration budget (maxiter) and the initial temperature (initial_temp).
Result
You can run dual_annealing to solve real optimization problems in code.
Knowing how to use the function bridges theory and practical application.
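A minimal usage sketch, assuming SciPy is installed. The Rastrigin function is a standard test problem (not from the text above) with many local minima and a global minimum of 0 at the origin; the seed argument is for reproducibility (newer SciPy releases are moving to an rng argument for this).

```python
import numpy as np
from scipy.optimize import dual_annealing

def rastrigin(x):
    # Standard benchmark: many local minima, global minimum 0 at the origin
    x = np.asarray(x)
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

bounds = [(-5.12, 5.12)] * 2        # one (low, high) pair per variable
result = dual_annealing(rastrigin, bounds, seed=1)

print(result.x)    # best point found, close to [0, 0]
print(result.fun)  # objective value there, close to 0
```

A gradient-based optimizer started from a random point would usually get stuck in one of Rastrigin's many local dips; dual_annealing's uphill moves let it escape them.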
6
Advanced: Parameter tuning and cooling schedule
🤔 Before reading on: do you think faster cooling always leads to better results? Commit to yes or no.
Concept: The cooling schedule controls how quickly the algorithm focuses search and affects solution quality.
Cooling too fast may trap the search in poor solutions; cooling too slow wastes time. Parameters like initial temperature and step size balance exploration and speed. Adjusting these helps tailor the algorithm to specific problems.
Result
You see how parameter choices impact performance and results.
Understanding cooling schedules helps avoid common pitfalls and improves optimization success.
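The knobs described above map to keyword arguments of dual_annealing. The specific values below are arbitrary, chosen only to contrast a generous budget with a hasty one on the standard Rastrigin benchmark.

```python
import numpy as np
from scipy.optimize import dual_annealing

def rastrigin(x):
    x = np.asarray(x)
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

bounds = [(-5.12, 5.12)] * 2

# A generous budget with the default-sized initial temperature...
slow = dual_annealing(rastrigin, bounds, initial_temp=5230.0,
                      maxiter=1000, seed=1)
# ...versus a hasty run: a much cooler start and far fewer iterations.
fast = dual_annealing(rastrigin, bounds, initial_temp=50.0,
                      maxiter=50, seed=1)
print(slow.fun, fast.fun)
```

A higher initial_temp means more uphill moves are accepted early (broader exploration); maxiter caps the annealing budget. The hasty configuration risks settling into whichever basin it cooled down in.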
7
Expert: Why dual annealing outperforms basic methods
🤔 Before reading on: do you think combining global and local search always improves optimization? Commit to yes or no.
Concept: Dual annealing's hybrid approach reduces the chance of missing the global best solution.
Basic simulated annealing may waste time exploring or get stuck locally. Dual annealing uses a global search to jump between regions and a local search to refine promising areas. This synergy speeds convergence and improves accuracy, especially in rugged landscapes.
Result
You appreciate the design advantage of dual annealing in complex optimization.
Knowing this hybrid mechanism explains why dual annealing is preferred in real-world tough problems.
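SciPy exposes a switch for exactly this comparison: no_local_search=True disables the local refinement phase, falling back to classical generalized annealing. Comparing the two on the same problem (the standard Rastrigin benchmark, not part of the text above) illustrates what the hybrid adds.

```python
import numpy as np
from scipy.optimize import dual_annealing

def rastrigin(x):
    x = np.asarray(x)
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

bounds = [(-5.12, 5.12)] * 2

# Default: hybrid of global annealing plus local refinement.
hybrid = dual_annealing(rastrigin, bounds, seed=3)
# Annealing only: disable the local-search phase.
anneal_only = dual_annealing(rastrigin, bounds, seed=3,
                             no_local_search=True)
print(hybrid.fun, anneal_only.fun)
```

The hybrid typically lands much closer to the true minimum: the global phase finds the right basin, and the local phase polishes the answer within it.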
Under the Hood
Dual annealing works by alternating between a stochastic global search and a deterministic local search. The global search uses a probabilistic acceptance rule that allows uphill moves to escape local minima, controlled by a temperature parameter that decreases over time. The local search refines solutions found by the global phase using a local minimizer that needs no user-supplied derivatives. Internally, the algorithm tracks the best solution found so far and adapts step sizes based on progress.
Why designed this way?
Simulated annealing was inspired by metallurgy where slow cooling leads to stable crystals. Early algorithms struggled with slow convergence or local traps. Dual annealing was designed to combine the strengths of global exploration and local refinement to improve speed and reliability. Alternatives like pure random search or gradient descent were either too slow or prone to local minima, so this hybrid approach balances exploration and exploitation.
┌───────────────────────────────┐
│          Start                │
└──────────────┬────────────────┘
               │
               ▼
┌───────────────────────────────┐
│ Global Search (stochastic)    │
│ - Random jumps                │
│ - Accept worse solutions      │
│ - Temperature controls moves │
└──────────────┬────────────────┘
               │
               ▼
┌───────────────────────────────┐
│ Local Search (deterministic)  │
│ - Refines current solution   │
│ - Gradient-free methods      │
└──────────────┬────────────────┘
               │
               ▼
┌───────────────────────────────┐
│ Update best solution          │
│ Adjust temperature and steps │
└──────────────┬────────────────┘
               │
               ▼
┌───────────────────────────────┐
│ Termination condition met?    │
│ (max iterations or tolerance)│
└───────┬───────────────┬───────┘
        │               │
       Yes             No
        │               │
        ▼               ▼
┌─────────────┐   ┌───────────────┐
│ Return best │   │ Continue loop │
│ solution    │   └───────────────┘
└─────────────┘
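The loop in the flowchart above can be sketched in plain Python. This toy annealer (geometric cooling, Gaussian jumps, a made-up bumpy 1-D objective) is a deliberate simplification for illustration, not SciPy's implementation.

```python
import math
import random

random.seed(0)

def f(x):
    # Made-up bumpy objective; global minimum of about -9.7 near x = -0.51
    return x * x + 10 * math.sin(3 * x)

def anneal(f, lo, hi, steps=5000, t0=10.0):
    x = random.uniform(lo, hi)          # initial state
    best_x, best_val = x, f(x)
    for k in range(steps):
        t = max(t0 * 0.999 ** k, 1e-3)  # geometric cooling schedule
        # Global move: a random jump whose size shrinks as we cool
        cand = min(max(x + random.gauss(0, 0.5 + 0.3 * t), lo), hi)
        delta = f(cand) - f(x)
        # Metropolis acceptance: downhill always, uphill sometimes
        if delta < 0 or random.random() < math.exp(-delta / t):
            x = cand
        if f(x) < best_val:             # update the best solution seen
            best_x, best_val = x, f(x)
    return best_x, best_val             # termination: budget exhausted

x_best, v_best = anneal(f, -10.0, 10.0)
print(x_best, v_best)
```

Each iteration walks the flowchart once: jump, maybe accept, update the best, cool, and check the termination condition.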
Myth Busters - 4 Common Misconceptions
Quick: Does simulated annealing always find the absolute best solution? Commit yes or no.
Common Belief: Simulated annealing guarantees finding the absolute best solution every time.
Reality: Simulated annealing is a heuristic that often finds very good solutions but does not guarantee the absolute best, especially with limited time.
Why it matters: Expecting guaranteed optimality can lead to disappointment or misuse in critical applications where exact solutions are required.
Quick: Do you think accepting worse solutions early is a bug or a feature? Commit your answer.
Common Belief: Accepting worse solutions during search is a mistake and should be avoided.
Reality: Accepting worse solutions early is intentional, to escape local minima and explore the solution space better.
Why it matters: Misunderstanding this leads to removing key parts of the algorithm, causing poor performance.
Quick: Is dual annealing just a faster version of basic simulated annealing? Commit yes or no.
Common Belief: Dual annealing is simply a faster implementation of simulated annealing.
Reality: Dual annealing combines two different search strategies, not just speed improvements, to improve solution quality and convergence.
Why it matters: Thinking it is only about speed misses the core design advantage and can lead to wrong parameter tuning.
Quick: Does cooling faster always improve results? Commit yes or no.
Common Belief: Faster cooling schedules always lead to better and quicker solutions.
Reality: Cooling too fast can trap the search in poor solutions; slower cooling often yields better results but takes longer.
Why it matters: Misconfiguring cooling schedules can cause inefficient or incorrect optimization outcomes.
Expert Zone
1
Dual annealing does not require user-supplied derivatives, making it suitable for problems where gradients are unavailable or noisy.
2
The algorithm's acceptance probability depends on a carefully designed temperature schedule that balances exploration and exploitation dynamically.
3
Dual annealing can be sensitive to bounds and initial temperature settings; experts often tune these based on problem knowledge for best results.
When NOT to use
Dual annealing is not ideal for very high-dimensional problems where the search space is huge; in such cases, methods like genetic algorithms or Bayesian optimization may perform better. Also, if gradient information is available and reliable, gradient-based optimizers are usually faster.
Production Patterns
In real-world systems, dual annealing is used for tuning hyperparameters in machine learning, optimizing engineering designs with complex constraints, and solving scheduling problems. It is often combined with domain-specific heuristics and run multiple times with different seeds to ensure robust solutions.
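The multi-seed pattern from the paragraph above, sketched with hypothetical settings on the standard Rastrigin benchmark: run the optimizer several times with different seeds and keep the best result as a cheap hedge against a single unlucky run.

```python
import numpy as np
from scipy.optimize import dual_annealing

def rastrigin(x):
    x = np.asarray(x)
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

bounds = [(-5.12, 5.12)] * 2

# Several independent runs with different seeds; keep the best outcome.
runs = [dual_annealing(rastrigin, bounds, seed=s, maxiter=200)
        for s in range(5)]
best = min(runs, key=lambda r: r.fun)
print(best.fun)
```

In production the runs are often launched in parallel, since they are fully independent of one another.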
Connections
Metropolis-Hastings algorithm
Dual annealing's acceptance of worse solutions is based on the Metropolis criterion from this algorithm.
Understanding Metropolis-Hastings helps grasp why simulated annealing probabilistically accepts worse solutions to escape local minima.
Thermodynamics
Simulated annealing mimics the physical process of cooling metals to reach low-energy states.
Knowing thermodynamics principles explains the cooling schedule and temperature analogy in simulated annealing.
Evolutionary algorithms
Both use stochastic search and population-based exploration but differ in mechanisms.
Comparing these helps understand different strategies for global optimization and when to choose each.
Common Pitfalls
#1 Stopping the algorithm too early, before it cools sufficiently.
Wrong approach: result = dual_annealing(func, bounds, maxiter=10)
Correct approach: result = dual_annealing(func, bounds, maxiter=1000)
Root cause: Misunderstanding that the algorithm needs enough iterations to explore and cool properly.
#2 Setting bounds too narrow, excluding good solutions.
Wrong approach: bounds = [(0, 1)]  # Problem needs a wider search space
Correct approach: bounds = [(-10, 10)]  # Wider bounds to include better solutions
Root cause: Not analyzing the problem domain to set appropriate variable ranges.
#3 Removing acceptance of worse solutions to speed up convergence.
Wrong approach:
# Custom code that always rejects worse solutions
if new_cost < current_cost:
    accept = True
else:
    accept = False
Correct approach:
# Acceptance probability based on temperature (Metropolis rule)
accept = (new_cost < current_cost) or (random() < exp(-(new_cost - current_cost) / temperature))
Root cause: Misunderstanding the role of probabilistic acceptance in escaping local minima.
Key Takeaways
Simulated annealing is a powerful optimization method that balances exploration and exploitation by accepting worse solutions early and focusing later.
Dual annealing improves basic simulated annealing by combining global and local search strategies for better performance.
Proper parameter tuning, especially cooling schedules and bounds, is critical to achieving good results.
Understanding the underlying mechanism helps avoid common mistakes like premature stopping or removing key algorithm parts.
Simulated annealing connects deeply to physics and probability, making it a rich concept bridging multiple fields.