Monday, March 16, 2026

What is the Sharpe Ratio and Why It Matters

I spent my first few months in trading focused on returns. Positive month? Success. Negative month? Failure. Then I started tracking volatility, and everything changed.

The Sharpe ratio quantifies something intuitive: not all 10% returns are equal. A strategy that delivers 10% with minimal drawdown is fundamentally different from one that swings wildly between +30% and -20% to average the same result. Named after Nobel laureate William Sharpe, the ratio measures risk-adjusted returns. It answers a simple question: how much excess return am I getting per unit of risk taken?

For automated trading systems, this matters even more. An EA might show impressive backtest returns, but if those returns come with massive volatility, the strategy becomes difficult to trade live. Your position sizing breaks down. Psychological pressure mounts during drawdowns. Real execution differs from simulated results.

The Sharpe ratio strips away that noise. It lets you compare strategies on equal footing, regardless of their absolute return levels. A conservative strategy with a Sharpe of 1.5 often outperforms an aggressive one with a Sharpe of 0.8, even if the latter shows higher nominal returns.

When I evaluate trading logic, Sharpe ratio is one of my first checkpoints. It is not the only metric that matters, but it is a reliable filter for strategies worth developing further.

Sharpe Ratio Formula Breakdown

The formula itself is straightforward:
Sharpe Ratio = (Mean Portfolio Return - Risk-Free Rate) / Standard Deviation of Returns
Three components. Each requires careful handling.

Mean Portfolio Return: Your average return over the period measured. If you are analyzing daily returns, this is your daily mean. For monthly data, it is your monthly mean. Keep the frequency consistent throughout your calculation.

Risk-Free Rate: The theoretical return of a zero-risk investment. In practice, I use short-term government bond yields. For USD-based strategies, that is typically the 3-month US Treasury rate. For shorter-term trading strategies with daily calculations, I often set this to zero or use an annualized rate divided by the number of trading periods. The impact is usually minimal for high-frequency strategies.

Standard Deviation: Your volatility measure. This quantifies how much your returns bounce around their mean. Higher standard deviation means more risk. The denominator penalizes strategies that achieve returns through excessive volatility.

The result is unitless. A Sharpe ratio of 1.0 means you are earning one unit of return for each unit of risk. Above 1.0 is generally considered acceptable. Above 2.0 is strong. Above 3.0 is rare and worth scrutiny for overfitting.

One assumption to note: the formula assumes returns are normally distributed. In reality, market returns have fat tails. Extreme events occur more frequently than normal distributions predict. This is why Sharpe ratio alone is not sufficient for risk assessment, but it remains a useful benchmark.

Python handles these calculations efficiently. No need for specialized libraries. NumPy and pandas give you everything required.

Python Implementation from Scratch

Here is how I calculate Sharpe ratio with minimal dependencies. This example uses daily returns:

import numpy as np
import pandas as pd

# Sample return data - replace with your actual strategy returns
returns = pd.Series([
    0.012, -0.005, 0.008, 0.015, -0.003,
    0.007, -0.010, 0.020, 0.005, -0.008,
    0.013, 0.002, -0.007, 0.018, 0.001
])

# Risk-free rate (annualized, e.g., 2%)
annual_risk_free_rate = 0.02

# Assuming 252 trading days per year
trading_days = 252
daily_risk_free_rate = annual_risk_free_rate / trading_days

# Calculate excess returns
excess_returns = returns - daily_risk_free_rate

# Mean excess return
mean_excess_return = excess_returns.mean()

# Standard deviation of returns
std_returns = returns.std()

# Sharpe ratio (daily)
sharpe_ratio_daily = mean_excess_return / std_returns

print(f"Daily Sharpe Ratio: {sharpe_ratio_daily:.4f}")

# Annualized Sharpe ratio
sharpe_ratio_annual = sharpe_ratio_daily * np.sqrt(trading_days)

print(f"Annualized Sharpe Ratio: {sharpe_ratio_annual:.4f}")
The annualization step multiplies by the square root of periods because volatility scales with the square root of time. This is a standard convention in finance. For a complete function I can reuse:

def calculate_sharpe_ratio(returns, risk_free_rate=0.02, periods=252):
    """
    Calculate annualized Sharpe ratio from return series.
    
    Parameters:
    returns: pandas Series or numpy array of periodic returns
    risk_free_rate: annualized risk-free rate (default 2%)
    periods: number of periods per year (252 for daily, 12 for monthly)
    
    Returns:
    float: annualized Sharpe ratio
    """
    if len(returns) < 2:
        return np.nan
    
    # Convert to pandas Series if needed
    if isinstance(returns, np.ndarray):
        returns = pd.Series(returns)
    
    # Periodic risk-free rate
    periodic_rf = risk_free_rate / periods
    
    # Excess returns
    excess = returns - periodic_rf
    
    # Avoid division by zero
    if returns.std() == 0:
        return np.nan
    
    # Calculate and annualize
    sharpe = excess.mean() / returns.std()
    return sharpe * np.sqrt(periods)

# Usage
sharpe = calculate_sharpe_ratio(returns)
print(f"Sharpe Ratio: {sharpe:.4f}")
I have used this exact function across multiple backtesting projects. It handles edge cases and makes the calculation reproducible. When I need to compare multiple strategies, I wrap this in a loop or apply it across dataframe columns. Clean, simple, no black-box dependencies.
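A column-wise comparison can be sketched like this (the DataFrame of strategy returns and its column names are invented for illustration; the function body is repeated in compact form so the snippet runs standalone):

```python
import numpy as np
import pandas as pd

# Compact version of calculate_sharpe_ratio so this snippet is self-contained
def calculate_sharpe_ratio(returns, risk_free_rate=0.02, periods=252):
    if len(returns) < 2 or returns.std() == 0:
        return np.nan
    excess = returns - risk_free_rate / periods
    return excess.mean() / returns.std() * np.sqrt(periods)

# Hypothetical daily returns, one column per strategy
rng = np.random.default_rng(42)
strategy_returns = pd.DataFrame(
    rng.normal(0.0005, 0.01, size=(252, 3)),
    columns=["strat_a", "strat_b", "strat_c"],
)

# Apply the function to each column to get one Sharpe per strategy
sharpes = strategy_returns.apply(calculate_sharpe_ratio)
print(sharpes.sort_values(ascending=False))
```

One line of apply replaces the loop and keeps the ranking reproducible.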

Advanced Applications and Optimization

Once you have the basic calculation down, several extensions become useful.

Rolling Sharpe Ratio: Track how risk-adjusted performance evolves over time. This reveals whether a strategy's edge is deteriorating.

# Calculate 30-day rolling Sharpe ratio
window = 30
rolling_sharpe = returns.rolling(window).apply(
    lambda x: calculate_sharpe_ratio(x, periods=252),
    raw=False
)

# Plot to visualize stability
import matplotlib.pyplot as plt
rolling_sharpe.plot(title="30-Day Rolling Sharpe Ratio")
plt.axhline(y=1.0, color='r', linestyle='--', label='Sharpe = 1.0')
plt.legend()
plt.show()
Comparing Multiple Strategies: Build a comparison matrix to rank approaches side-by-side.

# Assume you have multiple strategy return series
strategies = {
    'Momentum': momentum_returns,
    'Mean_Reversion': mean_reversion_returns,
    'Breakout': breakout_returns
}

sharpe_comparison = pd.DataFrame({
    name: [calculate_sharpe_ratio(returns)]
    for name, returns in strategies.items()
}, index=['Sharpe_Ratio']).T

print(sharpe_comparison.sort_values('Sharpe_Ratio', ascending=False))
I often cross-reference my backtest metrics with live results on sys-tre.com ranking to see how theoretical Sharpe ratios hold up in forward testing. Discrepancies between backtest and live Sharpe are red flags for overfitting.

Dynamic Risk-Free Rate: If your strategy runs over years, the risk-free rate changes. You can pass a series instead of a scalar:

# risk_free_rates: pandas Series of per-period (daily) risk-free rates aligned with returns
excess_returns = returns - risk_free_rates
sharpe = excess_returns.mean() / returns.std() * np.sqrt(252)
This adds precision for long-horizon analysis but introduces complexity. For shorter-term strategies, a constant rate works fine.
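A self-contained sketch of that idea, using synthetic returns and a hypothetical, slowly rising T-bill yield series (both invented for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
dates = pd.date_range("2024-01-02", periods=252, freq="B")
returns = pd.Series(rng.normal(0.0005, 0.01, 252), index=dates)

# Hypothetical annualized risk-free yields, de-annualized to daily periods
annual_rf = pd.Series(np.linspace(0.02, 0.035, 252), index=dates)
daily_rf = annual_rf / 252

# Excess return uses the rate in effect on each date
excess = returns - daily_rf
sharpe = excess.mean() / returns.std() * np.sqrt(252)
print(f"Sharpe with time-varying risk-free rate: {sharpe:.4f}")
```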

Common Pitfalls and Best Practices

Annualization Assumptions: The square root of time scaling assumes returns are independent and identically distributed. If your strategy has autocorrelation (momentum or mean reversion effects), this assumption breaks. Your annualized Sharpe becomes an approximation, not gospel.

Data Frequency Mismatch: Mixing daily returns with monthly risk-free rates without proper conversion produces garbage. Always align your frequencies. I keep a simple lookup: daily = 252 periods, weekly = 52, monthly = 12.

Survivorship Bias: If you only calculate Sharpe on surviving strategies (those that made it through your filter), you inflate the metric. Include dead strategies in your analysis dataset.

Interpreting Negative Sharpe: A negative Sharpe ratio means your strategy underperformed the risk-free rate. This is a clear signal to stop trading that logic. No exceptions.

Zero Volatility Edge Case: If your return series has zero standard deviation (all returns identical), the denominator becomes zero. My function returns np.nan in this case. Handle it explicitly rather than letting calculations fail silently.

Sharpe Alone Is Not Enough: I always pair Sharpe with maximum drawdown, win rate, and profit factor. A strategy with a great Sharpe but a 50% drawdown might still be untradeable. Use Sharpe as one lens among several.

Sample Size Matters: Calculating Sharpe on 10 data points is statistically meaningless. I aim for at least 30 observations for daily data, preferably more. For monthly data, that means multiple years of history.

When I build out a new strategy, I run Sharpe calculations at multiple stages: on training data, validation data, and then monitor it on live execution. Consistency across these stages gives me confidence. Divergence tells me to dig deeper before committing capital. The Sharpe ratio is not a magic number. It is a standardized way to compare apples to apples. Implement it cleanly, interpret it carefully, and it becomes a reliable tool in your quantitative workflow.
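Before trusting the annualized number, a cheap diagnostic for the i.i.d. assumption behind the square-root-of-time scaling is the lag-1 autocorrelation of the return series (synthetic data here, purely for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.0005, 0.01, 500))

# Values far from zero suggest the sqrt(252) annualization is only approximate
lag1 = returns.autocorr(lag=1)
print(f"Lag-1 autocorrelation: {lag1:.4f}")
```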

Friday, March 13, 2026

Why NumPy Matters for Financial Computing

I have been calculating returns, volatility, and correlations for trading strategies using Python for a while now. Early on, I relied heavily on pandas DataFrames—intuitive, clean, perfect for labeled time series data. But when the datasets got larger or the calculations more iterative, I noticed slowdowns. That is when I started paying attention to NumPy.

NumPy is not just a backend for pandas. It is a high-performance library optimized for numerical operations on arrays. In finance, where you often work with thousands of price points, hundreds of assets, or millions of Monte Carlo paths, speed matters. A calculation that takes 5 seconds in Excel or 2 seconds in pandas might take 50 milliseconds in NumPy. That difference compounds when you are running backtests or recalculating risk metrics in real time.

The core advantage is vectorization. Instead of looping through rows like you might in a spreadsheet formula, NumPy operates on entire arrays at once using optimized C code under the hood. This means less Python overhead, better cache utilization, and fewer lines of code. Here is where I find NumPy indispensable:
  • Portfolio return calculations across multiple assets and rebalancing periods
  • Risk metrics like standard deviation, VaR, or drawdowns computed on rolling windows
  • Correlation matrices for large universes of instruments
  • Monte Carlo simulations generating thousands of price paths efficiently
If you are working with financial data at scale—especially in algorithmic trading or quantitative research—NumPy becomes a foundational layer. It is worth understanding how to use it directly, not just through pandas wrappers.

NumPy Fundamentals for Financial Data

Let me walk through the basics that come up most often in financial work.

Creating Arrays from Price Data

Suppose you have daily closing prices for a stock. In NumPy, that is just a one-dimensional array:

import numpy as np

prices = np.array([100.5, 102.3, 101.8, 103.5, 104.2])
For multiple assets, you would use a two-dimensional array where each row is a date and each column is an asset:

# 5 days, 3 assets
prices_multi = np.array([
    [100.5, 50.2, 75.8],
    [102.3, 51.0, 76.5],
    [101.8, 50.5, 75.2],
    [103.5, 52.1, 77.0],
    [104.2, 51.8, 76.8]
])
This structure maps cleanly to how you think about market data: rows are time, columns are instruments.

Data Types and Precision

By default, NumPy uses 64-bit floats (float64), which is fine for most financial calculations. If memory becomes an issue with very large datasets, you can use float32, but be mindful of precision loss in cumulative calculations like compounded returns.

prices_32 = np.array([100.5, 102.3], dtype=np.float32)
I stick with float64 unless I am dealing with datasets in the tens of millions of rows.
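The drift is easy to demonstrate: compounding the same small growth step many times in both precisions gives visibly different results (the step size and count are arbitrary):

```python
import numpy as np

# Multiply a 0.01% growth step 100,000 times in each precision
steps = np.full(100_000, 1.0001)
p64 = np.prod(steps)                     # float64 accumulation
p32 = np.prod(steps.astype(np.float32))  # float32 rounding error compounds

print(p64, p32, abs(p64 - float(p32)))
```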

Indexing and Slicing

Grabbing the first 3 days of data:

first_three = prices_multi[:3, :]
Or just the second asset across all days:

asset_two = prices_multi[:, 1]
This slicing syntax is fast and memory-efficient. You are creating views, not copies, in most cases.

Reshaping for Analysis

Sometimes you need to convert between 1D and 2D shapes. For example, turning a flat array of returns into a matrix for matrix multiplication:

returns_flat = np.array([0.018, -0.005, 0.017, 0.007])
returns_col = returns_flat.reshape(-1, 1)  # column vector
This comes up when calculating portfolio returns using weights and asset returns as vectors.

Essential Financial Calculations Using NumPy

Now the practical part. I will show how to compute the metrics you actually need in trading or portfolio analysis.

Daily Returns

Simple returns are price changes divided by the previous price. In NumPy, you can do this without loops:

prices = np.array([100.5, 102.3, 101.8, 103.5, 104.2])
returns = (prices[1:] - prices[:-1]) / prices[:-1]
# Output: [0.0179, -0.0049, 0.0167, 0.0068]
Or using np.diff and division:

returns = np.diff(prices) / prices[:-1]
For log returns (preferred in many quant models because they are additive):

log_returns = np.log(prices[1:] / prices[:-1])
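Additivity is easy to verify: summing the log returns and exponentiating recovers the total price relative.

```python
import numpy as np

prices = np.array([100.5, 102.3, 101.8, 103.5, 104.2])
log_returns = np.log(prices[1:] / prices[:-1])

# Sum of log returns equals the log of (last price / first price)
total_from_sum = np.exp(log_returns.sum())
total_direct = prices[-1] / prices[0]
print(total_from_sum, total_direct)
```

Simple returns do not compose this way; you have to compound them multiplicatively.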

Cumulative Returns

To see total performance over a period:

cumulative = np.cumprod(1 + returns) - 1
# Or starting from 100:
equity_curve = 100 * np.cumprod(1 + returns)
This gives you the equity curve you would plot in a backtest.
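Once you have the equity curve, maximum drawdown falls out in two more lines (the return series here is invented for illustration):

```python
import numpy as np

returns = np.array([0.0179, -0.0049, 0.0167, 0.0068, -0.02, 0.01])
equity = 100 * np.cumprod(1 + returns)

# Running peak so far, then drawdown of each point relative to that peak
running_peak = np.maximum.accumulate(equity)
drawdowns = equity / running_peak - 1
max_drawdown = drawdowns.min()
print(f"Max drawdown: {max_drawdown:.2%}")
```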

Volatility (Standard Deviation)

Annualized volatility from daily returns, assuming 252 trading days:

daily_vol = np.std(returns)
annual_vol = daily_vol * np.sqrt(252)
For multiple assets:

# Assuming returns_multi is a 2D array (days x assets)
vol_per_asset = np.std(returns_multi, axis=0) * np.sqrt(252)

Sharpe Ratio

Risk-adjusted return metric. Assuming a risk-free rate of 2% annually:

mean_return = np.mean(returns) * 252  # annualized
risk_free = 0.02
sharpe = (mean_return - risk_free) / annual_vol
Simple, fast, no external dependencies.

Value at Risk (VaR)

VaR at 95% confidence—what is the worst daily loss you can expect 95% of the time?

var_95 = np.percentile(returns, 5)
# For a portfolio with $100,000:
var_dollar = 100000 * var_95
This is the historical-simulation approach: the empirical 5th percentile of observed returns. A parametric alternative assumes a normal distribution and computes the mean minus 1.645 standard deviations.
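Both flavors can be computed side by side on synthetic returns (the 1.645 z-score marks the 5% tail of a normal distribution):

```python
import numpy as np

rng = np.random.default_rng(1)
returns = rng.normal(0.0005, 0.01, 1000)

# Empirical 5th percentile of observed returns
var_hist = np.percentile(returns, 5)

# Normal approximation: mean minus 1.645 standard deviations
var_param = returns.mean() - 1.645 * returns.std()
print(var_hist, var_param)
```

On well-behaved data the two agree closely; fat-tailed real returns pull the empirical number further into the loss region.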

Correlation Matrix

For a multi-asset portfolio:

corr_matrix = np.corrcoef(returns_multi, rowvar=False)
The rowvar=False tells NumPy that columns are variables (assets), not rows. This matrix feeds into portfolio optimization or risk decomposition.

Practical Examples: Portfolio and Risk Analysis

Let me show how these pieces fit together in a realistic workflow.

Multi-Asset Portfolio Returns

Suppose you hold three assets with weights 50%, 30%, 20%. You want the portfolio return each day.

weights = np.array([0.5, 0.3, 0.2])

# returns_multi is (n_days, 3)
portfolio_returns = returns_multi @ weights  # matrix multiplication
That @ operator (or np.dot) does a weighted sum across assets for each day. Clean, one line.

Rolling Volatility

You want to track volatility over a 20-day rolling window. NumPy does not have a built-in rolling function like pandas, but you can use np.lib.stride_tricks or write a simple loop. Here is a vectorized approach with views:

def rolling_std(arr, window):
    shape = (len(arr) - window + 1, window)
    strides = (arr.strides[0], arr.strides[0])
    rolled = np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
    return np.std(rolled, axis=1)

rolling_vol = rolling_std(returns, 20) * np.sqrt(252)
This is more memory-efficient than creating copies of subarrays.
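Since NumPy 1.20, sliding_window_view gives the same zero-copy windows without hand-built strides, which I find less error-prone:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(2)
returns = rng.normal(0, 0.01, 100)

# Each row is a 20-day window over the return series; still a view, no copies
windows = sliding_window_view(returns, 20)
rolling_vol = np.std(windows, axis=1) * np.sqrt(252)
print(rolling_vol.shape)
```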

Variance-Covariance Matrix

For portfolio risk calculation or optimization:

cov_matrix = np.cov(returns_multi, rowvar=False) * 252  # annualized
Then portfolio variance is:

portfolio_variance = weights @ cov_matrix @ weights
portfolio_vol = np.sqrt(portfolio_variance)
Two lines to get annualized portfolio volatility from individual asset returns.

Monte Carlo Price Simulation

Generate 10,000 price paths for a stock using geometric Brownian motion. This is where NumPy really shines.

S0 = 100  # initial price
mu = 0.10  # drift (annual return)
sigma = 0.20  # volatility
T = 1.0  # 1 year
dt = 1/252  # daily steps
n_steps = 252
n_sims = 10000

# Generate random shocks
Z = np.random.standard_normal((n_sims, n_steps))

# Price paths
drift = (mu - 0.5 * sigma**2) * dt
diffusion = sigma * np.sqrt(dt) * Z
log_returns = drift + diffusion
log_prices = np.cumsum(log_returns, axis=1)
prices = S0 * np.exp(log_prices)
This runs in milliseconds. Try doing 10,000 simulations in Excel. For live EA performance data, I often check sys-tre.com ranking—a solid dataset for comparing strategies—but for research like this, NumPy gives you the raw speed to iterate quickly.

Performance Optimization Tips

Here is what I have learned from pushing NumPy in production-like workflows.

Broadcasting Over Loops

Never loop through array elements if you can avoid it. Broadcasting lets you apply operations to arrays of different shapes without writing explicit loops.

# Bad: loop
result = np.zeros(len(returns))
for i in range(len(returns)):
    result[i] = returns[i] * 252

# Good: vectorized
result = returns * 252
The second version is 10-100x faster depending on array size.

Use Views, Not Copies

Slicing creates views by default, which is efficient. Avoid .copy() unless you need to modify data without affecting the original.

subset = prices[:100]  # view, fast
subset_copy = prices[:100].copy()  # copy, slower but independent
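One caveat worth seeing once: writing through a view mutates the original array, while a copy stays independent.

```python
import numpy as np

prices = np.array([100.0, 101.0, 102.0, 103.0])

view = prices[:2]
view[0] = 999.0        # writes through to the original array
print(prices[0])       # 999.0

independent = prices[2:].copy()
independent[0] = -1.0  # the original is untouched
print(prices[2])       # 102.0
```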

Memory Management for Large Datasets

If you are working with tick data or high-frequency datasets (millions of rows), use memory-mapped arrays:

data = np.memmap('prices.dat', dtype='float64', mode='r', shape=(10000000,))
This loads data on-demand rather than all at once, keeping memory usage low.

When to Use NumPy vs Other Libraries

NumPy is ideal for:
  • Purely numerical operations on homogeneous data
  • Linear algebra, statistics, simulations
  • Performance-critical inner loops
Switch to pandas when:
  • You need labeled time series (dates, tickers)
  • Handling missing data or irregular timestamps
  • Merging/joining datasets
And use specialized libraries (scipy, statsmodels) for advanced statistical tests or optimization routines that NumPy does not cover.

Final Thought

NumPy is not flashy. It does not give you pretty charts or handle datetime logic gracefully. But when you need to crunch numbers fast—whether for backtesting, risk analysis, or simulation—it is the most efficient tool in Python. The calculations I showed here are the building blocks of nearly every quantitative finance workflow I run. Master these, and you will write faster, cleaner analysis code.
