Monte Carlo Simulation#
Note
Source: Adapted from scalaworkshop by George K. Thiruvathukal and Konstantin Läufer.
A simulation uses a computer to model a process that would be expensive, slow, or impossible to run in the real world. Monte Carlo simulation is a family of techniques that use random sampling to estimate numerical results. The name comes from the Monte Carlo casino in Monaco — randomness is the engine.
The Monte Carlo Method#
The core idea is simple: if you cannot compute an answer exactly, throw a lot of random samples at the problem and see what fraction satisfy your criterion. The fraction converges to the true probability (or area, or integral) as the number of samples grows.
A classic example is estimating the area of a unit circle — and therefore π — by throwing random “darts.”
Estimating π with Darts#
Imagine a 2×2 square centred at the origin, with a unit circle (radius 1) inscribed inside it. The area of the square is 4; the area of the circle is π. If you throw darts uniformly at random at the square, the fraction that land inside the circle is π/4. Multiply by 4 to get π.
P(dart inside circle) = area of circle / area of square = π / 4
Therefore: π ≈ 4 × (darts inside) / (total darts)
Generating Darts#
Each dart is a random point (x, y) with x and y drawn uniformly from [-1, 1]. A dart is inside the circle when x² + y² ≤ 1:
def generate_darts(n: int) -> pd.DataFrame:
"""Generate n random darts thrown at the 2×2 square [−1,1]×[−1,1]."""
x = [random.uniform(-1, 1) for _ in range(n)]
y = [random.uniform(-1, 1) for _ in range(n)]
inside = [xi * xi + yi * yi <= 1.0 for xi, yi in zip(x, y)]
return pd.DataFrame({'x': x, 'y': y, 'inside': inside})
The function returns a list of dictionaries, one per dart, with keys
x, y, and inside. Using a dictionary makes the data
self-describing and easy to save to CSV later.
Estimating π#
def estimate_pi(darts: pd.DataFrame) -> float:
"""Estimate pi from dart throws: 4 × (inside / total)."""
return 4.0 * darts["inside"].sum() / len(darts)
Counting the darts inside and applying the formula gives the estimate. With 100 000 darts a typical run produces:
import math
darts = generate_darts(100_000)
print(f"π ≈ {estimate_pi(darts):.6f} (true: {math.pi:.6f})")
Output:
π ≈ 3.141440 (true: 3.141593)
Saving Darts to CSV#
def save_darts(darts: pd.DataFrame, filename: str = 'darts.csv') -> None:
"""Save dart throws to a CSV file with columns x, y, inside."""
darts.to_csv(filename, index=False)
csv.DictWriter writes each dart dictionary as one row, with the keys
becoming column headers. The saved file can be reloaded for later
analysis or shared with other programs.
How Accuracy Improves with N#
The estimate improves as n grows, following the law of large numbers: each time you multiply n by 10 the error roughly shrinks by a factor of √10 ≈ 3.16. The function below prints the estimate and its absolute error at several scales:
def convergence_table(sizes: list[int]) -> None:
"""Print how the pi estimate improves as n grows."""
print(f"{'n':>10} {'π estimate':>12} {'error':>10}")
print("-" * 38)
for n in sizes:
darts = generate_darts(n)
estimate = estimate_pi(darts)
print(f"{n:>10,} {estimate:>12.6f} {abs(estimate - math.pi):>10.6f}")
convergence_table([100, 1_000, 10_000, 100_000, 1_000_000])
Output (results vary due to randomness — typical run):
n π estimate error
--------------------------------------
100 3.120000 0.021593
1,000 3.144000 0.002407
10,000 3.138800 0.002793
100,000 3.141440 0.000153
1,000,000 3.141612 0.000019
Each tenfold increase in darts roughly halves the error — the price of one more decimal place is 100× more work.
Visualizing the Simulation#
A scatter plot shows exactly where the darts land, with the boundary circle drawn over them. Blue dots are inside the circle; red are outside:
def plot_darts(darts: pd.DataFrame, filename: str = 'darts.png') -> None:
"""Scatter-plot dart throws, coloring inside/outside the circle differently."""
inside = darts[darts["inside"]]
outside = darts[~darts["inside"]]
fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(inside["x"], inside["y"], s=0.5, color='steelblue', alpha=0.4, label='inside')
ax.scatter(outside["x"], outside["y"], s=0.5, color='salmon', alpha=0.4, label='outside')
theta = [math.tau * i / 1000 for i in range(1001)]
ax.plot([math.cos(t) for t in theta], [math.sin(t) for t in theta],
'k-', linewidth=1.5)
pi_est = estimate_pi(darts)
ax.set_aspect('equal')
ax.set_title(f'Monte Carlo: π ≈ {pi_est:.5f} (n = {len(darts):,})')
ax.legend(markerscale=8, loc='lower right')
plt.tight_layout()
plt.savefig(filename, dpi=100)
plt.close()
print(f"Saved {filename}")
from monte_carlo import generate_darts
darts = generate_darts(20_000)
plot_darts(darts, 'darts.png')
Install matplotlib if needed:
pip install matplotlib
The resulting image shows the quarter-circles of blue darts filling the circle while the red darts occupy the corners of the square, making the geometry of the estimate immediately visible.
Where Monte Carlo Is Used#
Monte Carlo methods power calculations across science and engineering:
Physics — particle collisions, nuclear reactions, radiation transport
Finance — option pricing, risk analysis (Value at Risk)
Climate modelling — uncertainty quantification in weather forecasts
Computer graphics — path tracing for photorealistic rendering
Machine learning — dropout regularisation, Bayesian inference
The idea is always the same: replace an intractable exact computation with a large number of cheap random samples.
Exercises#
Run
convergence_table([100, 1_000, 10_000, 100_000])several times. Does the error always decrease monotonically? Why or why not?Modify
generate_dartsto accept aseedparameter and pass it torandom.seedbefore the loop. Verify that two calls with the same seed produce identical results.The unit circle has area π. Use the same dart-throwing technique to estimate the area of a unit square rotated 45° (a diamond with vertices at (±1, 0) and (0, ±1)). What should the answer be?
Write a function
estimate_pi_from_csv(filename)that reads a darts CSV saved bysave_dartsand returns the pi estimate without regenerating the darts.Plot how the absolute error changes as n increases from 100 to 1 000 000 (use a logarithmic x-axis). Does the slope of the error curve match the expected –½ on a log-log scale?