~piotr-machura/scientific-python

Tasks from the "Using Python in Science" class


#Scientific python exercises

Python exercises from the "Using Python in Science" class, taught by Maciej Mrowiński, PhD, in the winter semester of 2022 at the Faculty of Physics, Warsaw University of Technology.

Dependencies are managed with Poetry. Install them with poetry install.

Each exercise is a separate Python module in scientific_python/ and can be run with

poetry run <module name>

or used directly from inside poetry shell:

poetry shell
<module name>

#Exercise 1 - word histogram

Module: wordhist.py

usage: wordhist [-h] [-n NUMBER] [-r REGEX] [-m MIN_LEN] [-c] FILE [FILE ...]

Word histogram script.

Reads words from provided FILEs and plots an ASCII histogram of the most frequent
ones.

positional arguments:
  FILE                  File(s) to create the histogram with

options:
  -h, --help            show this help message and exit
  -n NUMBER, --number NUMBER
                        Number of highest count words to show (default 10)
  -r REGEX, --regex REGEX
                        Regular expression defining a word (default \w+)
  -m MIN_LEN, --min-len MIN_LEN
                        Minimum length of processed words (default 0)
  -c, --no-clear        If used the console is not cleared at invocation

Example:

Histogram output for Don Quixote in Polish and English
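The counting and plotting steps described in the help text can be sketched as follows. This is a minimal illustration, not the code in wordhist.py: the function name `word_histogram`, the `width` parameter, the `#` bar character and the lower-casing of words are all assumptions. Only the defaults (10 words, `\w+`, minimum length 0) come from the help output above.

```python
import re
from collections import Counter


def word_histogram(text, number=10, regex=r"\w+", min_len=0, width=40):
    """Return ASCII histogram lines for the most frequent words in text."""
    # Lower-casing merges "The" and "the"; this is an assumption of the sketch.
    words = [w.lower() for w in re.findall(regex, text) if len(w) >= min_len]
    top = Counter(words).most_common(number)
    if not top:
        return []
    longest = max(len(word) for word, _ in top)
    highest = top[0][1]
    lines = []
    for word, count in top:
        # Scale each bar relative to the most frequent word.
        bar = "#" * max(1, round(width * count / highest))
        lines.append(f"{word:>{longest}} | {bar} {count}")
    return lines


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog and the cat"
    print("\n".join(word_histogram(sample, number=3, min_len=3)))
```

Reading several FILEs would amount to concatenating their contents (or summing one `Counter` per file) before plotting.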

#Exercise 2 - Ising model Monte Carlo simulation

Module: ising.py

usage: ising [-h] [-J J] [-B B] [-beta BETA] [-s SEED] [-d DENSITY]
             [-m MAGNETIZATION_FILE] [-a ANIMATION_FILE] [-i IMAGE_PREFIX]
             STEPS SIZE

Monte Carlo simulation of the 2D Ising model.

The 2D lattice of spins with states s_i in {-1,+1} evolves over time according to
the nearest-neighbor Hamiltonian with the interaction strength J and external magnetic
field B.

The simulation by default does not save anything. Use the appropriate options to save
the images, animation or subsequent magnetization values.

For more information, see https://en.wikipedia.org/wiki/Ising_model.

positional arguments:
  STEPS                 Number of 'big' Metropolis steps to take
  SIZE                  Lattice size

options:
  -h, --help            show this help message and exit
  -J J                  Strength of spin interaction (default 1)
  -B B                  Strength of external magnetic field (default 0)
  -beta BETA            Temperature parameter (default 1e64 ~ absolute zero)
  -s SEED, --seed SEED  Number seed for the simulation RNG
  -d DENSITY, --density DENSITY
                        Initial density of 'Up' (+1) spins (default 0.5)
  -m MAGNETIZATION_FILE, --magnetization-file MAGNETIZATION_FILE
                        Name of text file to write magnetization to (newline separated)
  -a ANIMATION_FILE, --animation-file ANIMATION_FILE
                        Name of the animation file ('.gif' will be added if necessary)
  -i IMAGE_PREFIX, --image-prefix IMAGE_PREFIX
                        Prefix of the image files (without step number and '.jpg' extension)

Example 1: absolute zero

Invocation screenshot for absolute zero

Resulting ising_1.gif:

Ising model animation at absolute zero

Example 2: high temperature

Invocation screenshot at high temperature

Resulting ising_2.gif:

Ising model animation at high temperature
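For readers unfamiliar with the Metropolis algorithm, one 'big' step can be sketched roughly as below. This is an illustration under stated assumptions, not the implementation in ising.py: it assumes periodic boundaries, takes a 'big' step to mean SIZE*SIZE single-spin flip attempts, and the names `metropolis_sweep` and `random_lattice` are hypothetical. The parameters J, B, beta and density follow the help text above.

```python
import numpy as np


def metropolis_sweep(spins, J=1.0, B=0.0, beta=1.0, rng=None):
    """Attempt size*size single-spin flips in place (one 'big' step)."""
    rng = np.random.default_rng() if rng is None else rng
    size = spins.shape[0]
    for _ in range(size * size):
        i, j = rng.integers(0, size, size=2)
        # Sum of the four nearest neighbours (periodic boundaries assumed).
        neighbours = (
            spins[(i + 1) % size, j] + spins[(i - 1) % size, j]
            + spins[i, (j + 1) % size] + spins[i, (j - 1) % size]
        )
        # Energy change if spin (i, j) were flipped.
        delta_e = 2.0 * spins[i, j] * (J * neighbours + B)
        # Metropolis rule: always accept moves that do not raise the energy,
        # otherwise accept with probability exp(-beta * delta_e).
        if delta_e <= 0 or rng.random() < np.exp(-beta * delta_e):
            spins[i, j] = -spins[i, j]
    return spins


def random_lattice(size, density=0.5, rng=None):
    """Square lattice where each spin is +1 with probability `density`."""
    rng = np.random.default_rng() if rng is None else rng
    return np.where(rng.random((size, size)) < density, 1, -1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lattice = random_lattice(16, rng=rng)
    for _ in range(50):
        metropolis_sweep(lattice, beta=10.0, rng=rng)
    print("magnetization:", lattice.mean())
```

At very large beta the acceptance probability for energy-raising flips underflows to zero, so only flips that lower or preserve the energy are accepted, which matches the "absolute zero" example.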

#Exercise 3 - decorator example

Module: decorator.py

Example decorator, recording invocation stats of decorated function.

Example:

$ poetry run decorator
Starting n = 2...

<-- snip -->

Summary of stats for long_exec
Average execution time: 1.002s.
SD execution time: 0.000s.
Max execution time: 1.002s.
Min execution time: 1.001s.

long_exec was invoked 3 times.
First at 2022-10-27 10:59:16.884197 with positional arguments
(2,) and keyword arguments {'do_print': False}
Last at 2022-10-27 10:59:18.887366 with positional arguments
(4,) and keyword arguments {'do_print': False}
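A decorator that collects this kind of statistics could be sketched as follows. This is an assumption-laden illustration rather than the contents of decorator.py: the names `record_stats`, `calls` and `summary` are hypothetical, and the body of `long_exec` is a stand-in, since only its name and arguments appear in the output above.

```python
import functools
import statistics
import time
from datetime import datetime


def record_stats(func):
    """Record the start time, duration and arguments of every call to func."""
    calls = []

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started_at = datetime.now()
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Store the record even if the wrapped function raises.
            elapsed = time.perf_counter() - start
            calls.append((started_at, elapsed, args, kwargs))

    def summary():
        times = [elapsed for _, elapsed, _, _ in calls]
        return {
            "count": len(times),
            "mean": statistics.fmean(times) if times else 0.0,
            "sd": statistics.pstdev(times) if times else 0.0,
            "max": max(times, default=0.0),
            "min": min(times, default=0.0),
        }

    # Expose the raw records and the summary on the decorated function.
    wrapper.calls = calls
    wrapper.summary = summary
    return wrapper


@record_stats
def long_exec(n, do_print=True):
    # Hypothetical body; the real function is not shown in this README.
    time.sleep(0.01)
    return n * n


if __name__ == "__main__":
    for n in (2, 3, 4):
        long_exec(n, do_print=False)
    print(long_exec.summary())
    print("first call:", long_exec.calls[0][0], long_exec.calls[0][2:])
```

`functools.wraps` keeps the decorated function's name and docstring, which is what lets a summary refer to it as `long_exec`.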

#Exercise 4 - static scraper

Module: ddscraper.py

usage: ddscraper [-h] [-j JSON] [-n N] [-t TIMEOUT]

Static scraper for Drew DeVault's (https://drewdevault.com) blog.

Produces a JSON file with the blog stats - word count, code block languages, lines
of code etc.

options:
  -h, --help            show this help message and exit
  -j JSON, --json JSON  Name of JSON file to write scraped data to
  -n N, --limit N       If provided, scrape only the first N articles
  -t TIMEOUT, --timeout TIMEOUT
                        Timeout for the HTTP requests (default 60s)

Example:

Scraping example

Resulting scrape.json:

{
  "scraped_at": "2022-11-09T10:49:18.970770",
  "articles": [
    {
      "url": "https://drewdevault.com/2022/10/27/Kernel-hacking-with-Hare-part-3.html",
      "title": "Notes from kernel hacking in Hare, part 3: serial driver",
      "date": "October 27, 2022",
      "words": 3887,
      "paragraphs": 47,
      "code_blocks": 22,
      "languages": [
        "hare"
      ]
    },
    <-- snip -->
    {
      "url": "https://drewdevault.com/2021/04/15/Status-update-April-2021.html",
      "title": "Status update, April 2021",
      "date": "April 15, 2021",
      "words": 317,
      "paragraphs": 5,
      "code_blocks": 1,
      "languages": [
        "hare"
      ]
    }
  ]
}
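The per-article fields in the JSON above (words, paragraphs, code blocks, languages) could be extracted along these lines. This sketch uses only the standard library and is not the code in ddscraper.py. It assumes that code blocks are `<pre>` elements whose inner `<code>` tag carries a `language-NAME` class, which is a guess about the blog's markup; the class name `ArticleStats` is hypothetical.

```python
from html.parser import HTMLParser


class ArticleStats(HTMLParser):
    """Count paragraphs, code blocks, words and code languages in HTML."""

    def __init__(self):
        super().__init__()
        self.paragraphs = 0
        self.code_blocks = 0
        self.words = 0
        self.languages = set()
        self._in_pre = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.paragraphs += 1
        elif tag == "pre":
            self._in_pre = True
            self.code_blocks += 1
        elif tag == "code" and self._in_pre:
            # Assumed convention: <code class="language-NAME">.
            for cls in (dict(attrs).get("class") or "").split():
                if cls.startswith("language-"):
                    self.languages.add(cls[len("language-"):])

    def handle_endtag(self, tag):
        if tag == "pre":
            self._in_pre = False

    def handle_data(self, data):
        # Only text outside <pre> counts towards the word total.
        if not self._in_pre:
            self.words += len(data.split())

    def as_dict(self):
        return {
            "words": self.words,
            "paragraphs": self.paragraphs,
            "code_blocks": self.code_blocks,
            "languages": sorted(self.languages),
        }


if __name__ == "__main__":
    html = (
        "<p>Hello kernel world</p>"
        '<pre><code class="language-hare">fn main() void;</code></pre>'
        "<p>Bye</p>"
    )
    parser = ArticleStats()
    parser.feed(html)
    print(parser.as_dict())
```

Fetching the page itself (with the timeout from `-t`) and serializing the result with `json.dump` would wrap around this parsing step.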

#Exercise 5 - dynamic scraper

Module: webexperience.py

Note: this module uses the Firefox Selenium driver (geckodriver binary) by default.

usage: webexperience [-h] [-j JSON] [-n]

Automating the Modern Web Experience™.

Dynamically scrapes https://how-i-experience-web-today.com/ using Selenium.

options:
  -h, --help            show this help message and exit
  -j JSON, --json JSON  Name of JSON file to save stats to (default scrape.json).
  -n, --headless        Run headless (also disables delay between clicks)

Example 1: headless operation

Only the data is saved

Resulting scrape.json:

{
  "scraped_at": "2022-11-19T12:54:46.677862",
  "annoyances": {
    "ads": 12,
    "banners": 1,
    "panels": 19,
    "inputs": 3,
    "icons": 3
  }
}

Example 2: the Full Experience™

Navigating the popup-ridden web

#Exercise 6 - numba-accelerated Ising model

Module: ising-numba.py

This module reuses model structures from ising.py. The Hamiltonian is calculated using @njit-compiled Python loops instead of a scipy convolution, which gives a roughly 2x speedup.

Example:

Comparison of numba and scipy Ising model speed

The GIFs resulting from both regular and numba-compiled Ising simulations are identical (their SHA256 hashes match).
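A loop-based Hamiltonian of the kind described above might look like the sketch below. It is an illustration, not the code in ising-numba.py: periodic boundaries are assumed, the function names are hypothetical, and the vectorised cross-check uses `np.roll` rather than the scipy convolution the original module is compared against. The `try`/`except` fallback only exists so the sketch still runs without numba installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback: run as plain (slow) Python when numba is unavailable.
    def njit(func):
        return func


@njit
def energy_loops(spins, J, B):
    """Nearest-neighbour Ising energy with periodic boundaries."""
    size = spins.shape[0]
    energy = 0.0
    for i in range(size):
        for j in range(size):
            # Count each bond once: only the right and lower neighbours.
            right = spins[i, (j + 1) % size]
            down = spins[(i + 1) % size, j]
            energy -= J * spins[i, j] * (right + down)
            energy -= B * spins[i, j]
    return energy


def energy_rolled(spins, J, B):
    """Same quantity with vectorised array shifts, used as a cross-check."""
    bonds = spins * (np.roll(spins, -1, axis=0) + np.roll(spins, -1, axis=1))
    return float(-J * bonds.sum() - B * spins.sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lattice = np.where(rng.random((32, 32)) < 0.5, 1, -1)
    print(energy_loops(lattice, 1.0, 0.0), energy_rolled(lattice, 1.0, 0.0))
```

Explicit loops like these are slow in plain Python but are the kind of code numba compiles well, since no temporary arrays are allocated.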

#Exercise 7 - SymPy harmonic oscillator

Module: oscillator.py

Symbolically solves equations of the form

F(t) - kx - c dx/dt = d²x/dt²

where F(t) is a cosinusoidal driving force of the form A cos(ωt + φ).
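As a sanity check on any symbolic result, the long-time (steady-state) response of this equation has a well-known closed form: x(t) = X cos(ωt + φ - δ) with X = A / sqrt((k - ω²)² + (cω)²) and tan δ = cω / (k - ω²). The sketch below evaluates it numerically with the standard library only; it is not how oscillator.py works (that module uses SymPy), and the function names are hypothetical. It assumes unit mass, as in the equation above, and requires c > 0 or ω² ≠ k so the denominator is non-zero.

```python
import math


def steady_state(A, k, c, omega):
    """Amplitude X and phase lag delta of the steady-state response."""
    stiffness_term = k - omega**2
    damping_term = c * omega
    amplitude = A / math.hypot(stiffness_term, damping_term)
    lag = math.atan2(damping_term, stiffness_term)
    return amplitude, lag


def residual(t, A, k, c, omega, phi=0.0):
    """F(t) - k*x - c*dx/dt - d2x/dt2 for the steady-state x(t)."""
    X, delta = steady_state(A, k, c, omega)
    theta = omega * t + phi - delta
    x = X * math.cos(theta)
    dx = -X * omega * math.sin(theta)
    d2x = -X * omega**2 * math.cos(theta)
    force = A * math.cos(omega * t + phi)
    # Should vanish up to rounding if the closed form satisfies the equation.
    return force - k * x - c * dx - d2x


if __name__ == "__main__":
    print(steady_state(A=1.0, k=4.0, c=0.5, omega=1.5))
    print(residual(t=2.0, A=1.0, k=4.0, c=0.5, omega=1.5, phi=0.3))
```

The transient part of the full solution, which depends on the initial conditions, decays for c > 0 and is not covered here.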

Example:

Solution time series for different parameters

#Exercise 8 - SIR model numerical solutions

Module: sir_solutions.py

Numerically solves the SIR model equations for a fixed set of parameters and plots the results.
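For reference, the SIR equations in their common normalized form are dS/dt = -βSI, dI/dt = βSI - γI and dR/dt = γI, where S, I and R are population fractions. A minimal way to integrate them, assuming a fixed-step fourth-order Runge-Kutta scheme, is sketched below. This is not necessarily the solver used in sir_solutions.py, and the names `sir_rhs`, `rk4_step` and `solve_sir` as well as the parameter values are hypothetical.

```python
def sir_rhs(state, beta, gamma):
    """Right-hand side of the normalized SIR equations."""
    s, i, _ = state
    infection = beta * s * i
    recovery = gamma * i
    return (-infection, infection - recovery, recovery)


def rk4_step(state, dt, beta, gamma):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, factor):
        return tuple(b + factor * k for b, k in zip(base, slope))

    k1 = sir_rhs(state, beta, gamma)
    k2 = sir_rhs(shifted(state, k1, dt / 2), beta, gamma)
    k3 = sir_rhs(shifted(state, k2, dt / 2), beta, gamma)
    k4 = sir_rhs(shifted(state, k3, dt), beta, gamma)
    return tuple(
        y + dt / 6 * (a + 2 * b + 2 * c + d)
        for y, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def solve_sir(s0, i0, r0, beta, gamma, dt=0.1, steps=1000):
    """Return the list of (S, I, R) states, including the initial one."""
    trajectory = [(s0, i0, r0)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], dt, beta, gamma))
    return trajectory


if __name__ == "__main__":
    result = solve_sir(0.99, 0.01, 0.0, beta=0.5, gamma=0.1)
    print("final S, I, R:", result[-1])
```

Because the three derivatives sum to zero, S + I + R stays constant along the trajectory, which makes a convenient correctness check.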

Example:

Grid of SIR solution plots

#Exercise 9 - SIR model dashboard

Module: sir_dashboard.py

Simple Bokeh dashboard for interactive visualization of SIR model solutions from exercise 8.

Unlike other exercises, this one must be run with a special command to start a Bokeh server and open the dashboard in a browser:

poetry run bokeh serve --show scientific_python/sir_dashboard.py

Example:

Screenshot of interactive SIR dashboard

#Exercise 10 - parallel downloads

Module: pngdownload.py

usage: pngdownload [-h] [-t TIMEOUT] [-p PATH] [-g GAUSSIAN] [-b] URL

Parallel PNG download script.

Downloads all PNG images from provided URL and saves to provided directory.

positional arguments:
  URL                   URL to download PNG images from

options:
  -h, --help            show this help message and exit
  -t TIMEOUT, --timeout TIMEOUT
                        Timeout for the HTTP requests (default 60s)
  -p PATH, --path PATH  Path to save downloaded images to
  -g GAUSSIAN, --gaussian GAUSSIAN
                        Apply Gaussian blur with provided radius to each image
  -b, --black-white     Apply a black-white filter to each image

Example:

Screenshot of some PNGs being downloaded
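The two stages described above, finding PNG links on a page and fetching them in parallel, can be sketched with the standard library as follows. This is an illustration, not the code in pngdownload.py: it assumes images appear as `<img src="...">` tags, uses a thread pool (the original may use a different concurrency mechanism), and the names `PngLinkParser` and `download_all` are hypothetical. The `fetch` callable is injected so the sketch can be exercised without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PngLinkParser(HTMLParser):
    """Collect absolute URLs of <img> sources whose path ends in .png."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src and urlparse(src).path.lower().endswith(".png"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, src))


def download_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL in parallel.

    Returns a dict mapping each URL to its result, or to the exception
    raised while fetching it.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as error:  # keep going if one download fails
                results[url] = error
    return results
```

A real `fetch` could be as simple as `lambda url: urllib.request.urlopen(url, timeout=60).read()`. Threads suit this job because the work is dominated by waiting on the network; CPU-heavy filters such as the Gaussian blur may benefit from processes instead.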

#Exercise 11 - natural resources

Module: natural_resources.py

Simple data exploration in Pandas.

Example:

$ poetry run natural_resources
Mean:
               Gas production Gas consumption  ... Coal net imports per capita Oil net imports per capita
                         mean            mean  ...                        mean                       mean
Entity                                         ...
Afghanistan      7.890248e+08    2.725105e+08  ...                    0.000000                  -0.000783
Albania          1.266962e+08    1.266962e+08  ...                    0.031471                  -0.067123
Algeria          5.431857e+10    2.016324e+10  ...                    0.016863                  -0.792465
American Samoa   0.000000e+00    0.000000e+00  ...                    0.000000                   0.000000
Angola           7.812592e+08    4.663816e+08  ...                    0.000000                  -2.256367
...                       ...             ...  ...                         ...                        ...
Western Sahara   0.000000e+00    0.000000e+00  ...                    0.000000                   0.000000
World            2.139833e+12    2.146542e+12  ...                   -0.001034                   0.007340
Yemen            1.118596e+09    1.363676e+08  ...                    0.001507                  -0.324281
Zambia           0.000000e+00    0.000000e+00  ...                    0.001488                   0.063354
Zimbabwe         0.000000e+00    0.000000e+00  ...                   -0.003984                   0.000000

[231 rows x 43 columns]

              Coal consumption per capita  Coal imports per capita  Coal production per capita
Entity  Year
Germany 1991                        4.654                   0.2372                       4.432
        1992                        4.111                   0.2388                       3.926
        1993                        3.787                   0.1999                       3.556
        1994                        3.540                   0.2240                       3.275
        1995                        3.340                   0.2118                       3.101
        1996                        3.312                   0.2250                       2.956
        1997                        3.164                   0.2725                       2.806
        1998                        2.997                   0.2972                       2.597
        1999                        2.854                   0.3039                       2.520
        2000                        2.932                   0.3654                       2.519
        2001                        2.982                   0.4359                       2.529
        2002                        3.029                   0.4103                       2.588
        2003                        3.050                   0.4261                       2.547
        2004                        3.068                   0.4844                       2.585
        2005                        2.959                   0.4282                       2.524
        2006                        2.979                   0.4867                       2.456
        2007                        3.079                   0.5320                       2.517
        2008                        2.934                   0.5155                       2.399
        2009                        2.779                   0.4356                       2.285
        2010                        2.850                   0.4883                       2.270
        2011                        2.903                   0.5299                       2.343
        2012                        3.039                   0.5409                       2.433
        2013                        3.042                   0.6305                       2.352
        2014                        2.947                   0.7044                       2.290
        2015                        2.930                   0.6886                       2.259
        2016                        2.825                   0.6711                       2.137
        2017                        2.698                   0.5896                       2.119
        2018                        2.601                   0.5441                       2.033
        2019                        2.051                   0.4951                       1.572
        2020                        1.644                   0.3569                       1.282
        2021                        0.000                   0.0000                       0.000

Does Germany import coal when the consumption increases?
Pearson correlation: 0.04456, p-value 0.81185
Does not seem like it. Maybe they just produce more instead?
Pearson correlation: 0.97654, p-value 0.00000
That's it!
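The Pearson coefficient reported above can be computed by hand as in the sketch below, which uses only the standard library and the first five Germany rows (1991-1995) from the table. It is an illustration, not the code in natural_resources.py: the function name `pearson` is hypothetical, no p-value is computed, and because only a subset of rows is used the printed values are not expected to match the figures above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Per-capita values for Germany, 1991-1995, copied from the table above.
consumption = [4.654, 4.111, 3.787, 3.540, 3.340]
production = [4.432, 3.926, 3.556, 3.275, 3.101]
imports = [0.2372, 0.2388, 0.1999, 0.2240, 0.2118]

if __name__ == "__main__":
    print("consumption vs production:", pearson(consumption, production))
    print("consumption vs imports:", pearson(consumption, imports))
```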