Day 4: Mathematical elements of Data Science

Data Science Lab, University of Bern, 2025

Prepared by Dr. Mykhailo Vladymyrov.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License

Summary

The main content is currently provided solely as an interactive live discussion. We look at some of the more commonly used mathematical concepts in Data Science, such as vector spaces, probability and distributions, the frequency domain, and differential calculus. Specifically, we look at them from the perspective of Data Science: how they can be used in practice and how we can think of them more intuitively.

This notebook is primarily for running the visualizations and the exercises.

To keep track of the topics suggested for discussion, this notebook also lists them:

  • frequency domain

    • sine, cosine

    • Fourier transform

    • trends

Interactive Demo

Frequency Domain

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import signal
from statsmodels.tsa.seasonal import seasonal_decompose

# Create a sample signal
t = np.linspace(0, 16, 500, endpoint=False)  # time vector
dt = t[1] - t[0]  # time step
print(f"sampling rate: {1/dt:.2f} Hz; component periods: {100*dt:.2f} s and {6*dt:.3f} s")
# Example: sine with period 100*dt (A = 1) + sine with period 6*dt (A = 0.05) + Gaussian noise (sigma = 0.1)

y1 = 1 * np.sin(2 * np.pi * t/(dt * 100))  # periodic signal
y2 = 0.05 * np.sin(2 * np.pi * t/(dt * 6))  # periodic signal
y3 = np.random.normal(0, 0.1, t.shape)  # noise
y = y1 + y2 + y3  # combined signal

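plt.plot(t, y)
plt.xlabel('Time [s]')
plt.show()
# Power spectral density (Welch); nperseg must not exceed the signal length
f, Pxx = signal.welch(y, fs=1/dt, nperseg=len(y))
plt.semilogy(f, Pxx)
plt.xlabel('Frequency [Hz]')
plt.ylabel('Power spectral density')
plt.show()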
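
To connect the spectrum plot with the Fourier transform topic, the following is a minimal sketch that recovers the dominant frequency of a similar synthetic signal directly from the discrete Fourier transform. It rebuilds the signal itself so that it runs on its own; the fixed random seed and the names `rng`, `f0`, `amplitude` and `dominant_freq` are assumptions made for this illustration and are not part of the notebook.

```python
import numpy as np

# Rebuild a signal like the one above (the seed is fixed only for reproducibility)
rng = np.random.default_rng(0)
n = 500
dt = 16 / n                      # time step, same as np.linspace(0, 16, 500, endpoint=False)
t = np.arange(n) * dt
f0 = 1 / (100 * dt)              # frequency of the strong component, in Hz

y = (1.0 * np.sin(2 * np.pi * f0 * t)
     + 0.05 * np.sin(2 * np.pi * t / (6 * dt))
     + rng.normal(0, 0.1, n))

# One-sided FFT of the real-valued signal and the matching frequency axis
spectrum = np.fft.rfft(y)
freqs = np.fft.rfftfreq(n, d=dt)

# Scale so that a pure sine of amplitude A gives a peak of about A
amplitude = 2 * np.abs(spectrum) / n

# Skip the zero-frequency (mean) bin when searching for the peak
peak = np.argmax(amplitude[1:]) + 1
dominant_freq = freqs[peak]

print(f"dominant frequency: {dominant_freq:.4f} Hz, amplitude: {amplitude[peak]:.2f}")
```

The strong component completes a whole number of cycles within the 16 s window, so its energy falls into a single frequency bin. The weak component does not, and its energy leaks into neighbouring bins; this is one reason why windowed estimates such as Welch's method are often preferred for noisy data.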

Detrending the signal

n = 120
t = np.arange(n)

sawtooth_wave = signal.sawtooth(2 * np.pi * t / 30)  # example of a sawtooth wave

trend = 10 + 0.05 * (t + sawtooth_wave*20)
seasonal = 2 * np.sin(2 * np.pi * t / 12)
noise = np.random.normal(0, 0.5, n)
y = trend + seasonal + noise


# Use a pandas Series with a datetime index (best practice)
date_index = pd.date_range(start='2020-01-01', periods=n, freq='MS')  # month start; the 'M' alias is deprecated in pandas >= 2.2
series = pd.Series(y, index=date_index)

# Seasonal decomposition
result = seasonal_decompose(series, model='additive', period=12)  # period=12 for yearly seasonality

# Detrended signal (remove trend, keep seasonality and noise)
detrended = series - result.trend

# Plot
plt.figure(figsize=(10,6))
plt.plot(series, label='Original')
plt.plot(result.trend, label='Estimated trend')
plt.plot(date_index, trend, '--', label='True trend')
plt.plot(detrended, label='Detrended')
plt.plot(date_index, seasonal + noise, '--', label='True seasonal + noise')
plt.legend()
plt.title('Detrending with statsmodels.tsa.seasonal_decompose')
plt.show()
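
To build some intuition for where a trend estimate like the one above can come from, here is a minimal sketch of a centered moving average, the classical way to estimate a trend in an additive decomposition. The helper `centered_moving_average` is a hypothetical name introduced for this illustration. It is not the statsmodels implementation, although for an even period it uses the same idea of half weights at both ends of the window.

```python
import numpy as np

def centered_moving_average(y, period):
    """Estimate a trend by averaging over one full seasonal period.

    Hypothetical helper for illustration only. For an even period the
    window has period + 1 points with half weights at both ends, so that
    it stays centered on each sample.
    """
    y = np.asarray(y, dtype=float)
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(y.shape, np.nan)  # edges stay NaN: no full window there
    trend[half:len(y) - half] = np.convolve(y, weights, mode="valid")
    return trend

# Noise-free example: linear trend plus a seasonal sine with period 12
t = np.arange(120)
true_trend = 10 + 0.05 * t
y_clean = true_trend + 2 * np.sin(2 * np.pi * t / 12)

trend_est = centered_moving_average(y_clean, period=12)
detrended_clean = y_clean - trend_est  # only the seasonal part remains
```

Averaging over exactly one period cancels the seasonal sine and leaves a linear trend unchanged, which is why the window length is tied to the period. The price is that the first and last `period // 2` values cannot be estimated and remain NaN, just as at the edges of the trend curve in the plot above.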