Mastering large data processing with mpi4py in Python
Explore the power of mpi4py for simplifying data distribution in parallel computing with its efficient broadcast functionality, which seamlessly sends data f...
You write code. Maybe you build data pipelines, tune algorithms, or just enjoy a good hands-on tutorial. Everything on this track — Python, algorithms, DevOps, data engineering, web apps, machine learning — works on its own terms: no earth-science background needed. (When a post uses an earthquake as the example dataset, that's just a more interesting dataset.)
This post explains the concepts of object-oriented programming (OOP) and its key features, including encapsulation, abstraction, inheritance, and polymorphis...
A beginner-friendly walkthrough to build and run Whisper Journal with multilingual dictation, local Whisper transcription, AI-assisted title/tag generation, ...
A practical introduction to Docker for geophysics students, including images, containers, volumes, and a simple workflow for reproducible seismic data analys...
You will learn how to generate and set up an SSH key for GitHub so that you don’t need to type your username and password every time you access a GitHub repo.
A tour of how modern earthquake-monitoring systems turn continuous seismic waveforms into earthquake catalogs — the ML pipeline of picking, association, and ...
I built SeismoAlert to fetch USGS earthquake data, run statistical analysis, detect anomalies, and generate interactive maps from a single CLI.
A practical, beginner-friendly walkthrough of a complete FastAPI DevOps workflow: clean code, layered testing, CI with Jenkins and GitHub Actions, and runtim...
Turn your everyday computer into a home server you can access from anywhere using Dynamic DNS, a simple update script, and secure SSH access.
A Python-based solution for indexing and searching files on a macOS system using SQLite, FAISS, and semantic search.
Cloud computing is transforming geophysical and seismological research by enabling scalable processing, faster collaboration, and reproducible workflows for ...
Learn how to set up Databricks, create your first Spark cluster, upload data, and run PySpark notebooks for scalable big data analysis.
Discover how Genetic Algorithms can be applied to solve the earthquake location problem in seismology. This post walks through generating synthetic seismic d...
Learn how to seamlessly sync your Zotero files across devices using WebDAV with Koofr and Google Drive. This step-by-step guide ensures your research materia...
Learn how to use RabbitMQ with Python to publish and consume seismic waveform messages reliably in real time for distributed seismology workflows.
The Metropolis-Hastings algorithm is a cornerstone of Markov Chain Monte Carlo (MCMC) methods, enabling us to generate samples from complex probability distr...
Navigating the complexities of relocating your Anaconda or Miniconda installation on a Linux system can be daunting. This concise guide provides a step-by-st...
Explore the Multi-Taper Method’s unique ability to refine spectral estimates in seismology. Leveraging multiple orthogonal tapers, this approach minimizes va...
Pink noise, also known as 1/f noise or flicker noise, is a type of random signal that has equal energy per octave. It is called pink because it is analogous ...
A Python script to read a PDF document, perform question-answering on it, and generate an output based on the provided query
The article discusses the mpi4py module, which is a Python wrapper for MPI, used for parallel computing in supercomputing environments. It provides higher-le...
We will plot the boundaries of the states of the USA on a basemap figure
We will see how to load shared libraries in C/C++. We will write a library to convert km to degrees and vice-versa. Then we create a utility program to conve...
We will see how to read a YAML file in Bash, C/C++ and Python.
Using multiple threads in C for concurrent process flow
We take a quick look at the linked list data structure with some examples.
We will inspect the L-BFGS optimization method using one example and compare its performance with the gradient descent method
We will use a Python package to create a Beamer presentation and append existing figures to each slide
When analyzing time series data, we often come across data that are non-uniformly sampled, i.e., have non-equidistant time steps. In fact, most of the re...
We will learn how to use the Poetry package to create, manage, build, and publish a Python package on PyPI
Learn to encrypt and decrypt any files, data, or software with Python
We will introduce the concepts of distributed computing and then use the open-source Python library Ray to write scalable code that can work on distributed ...
We will see how to read a large data file containing an earthquake catalog in Python using the Dask library
We will compare the convolution functions in Python (NumPy) with the conv function in MATLAB. If you have tried them both, then you would know that it's no...
We will read and write text and numeric data to a file using modern Fortran.
We will see how to use Amazon Web Services, specifically Amazon Polly, to convert any text into speech
If you have recently bought an M1 Mac and blog with Jekyll, then you must have found that installing Jekyll on the M1 architecture ...
We learn how to write a Makefile to automate the compilation of our source code. We will use one example from Fortran.
We will learn the basics of the maximum likelihood method, and then apply it to a regression problem. We will also compare it with the least-squares estimati...
Boundary value problems require information at the present time and a future time. We will see how we can use the shooting method to solve problems where we ...
Librosa can efficiently compute the spectrogram for large time series data in seconds. We will use that to plot the spectrogram using matplotlib
We learn how to plot selected shapefile data using geopandas on top of PyGMT maps
Runge-Kutta methods are among the most popular methods for solving ordinary differential equations (ODEs), with a better approximation than the Euler method. We compare the...
The simplest algorithm to solve a system of differential equations is the Euler method. We understand the Euler method by looking into a simple heat transfer...
How to link Unsplash images directly in your blog without hosting them locally.
Uses pandas to read an HTML page and extract its data into a pandas DataFrame
The Newton–Raphson method (commonly known as Newton’s method) is developed for finding roots of a given function or polynomial iteratively. We show two examp...
Spin up a zero-dependency web server with Python’s built-in http.server to share files or preview a site on your local network.
We see how to download seismic waveforms, convert them from miniSEED to MAT format, and then denoise them using wavelet analysis. We first performed w...
Follow the instructions to make your Python script executable from anywhere on a Linux system.
The Paramiko module can be used in Python to securely send data from a local client to a remote server. It is analogous to SSH and SCP in Linux.
We will learn the basic concepts of the wavelet transform and multi-resolution analysis, starting from the Fourier Transform and the Gabor Transform.
Read the seismic traces from miniSEED files and compute the cross-correlation and spectrogram
Use Principal Component Analysis (via SVD) to decompose a space-time signal into a few dominant modes and reduce its dimensionality, with Python code.
We will learn the basics of Fourier analysis and implement it to remove noise from the synthetic and real signals
Transfer learning using the pre-trained deep learning networks from MATLAB can be easily implemented to achieve fast and impressive results
In this introduction to PyTorch data structures, we will learn how to create and reshape tensors using PyTorch and compare it with the ...
We learn how to read a huge CSV file containing time series data by breaking it into chunks and then visualizing it with matplotlib
A PyQt5 application for retrieving and visualizing sound waveforms in real time. Code included.
This tutorial gives a brief description of scientific computing using Pandas by introducing Series, DataFrame, Pandas common operations, methods, conditional...
This tutorial gives a brief description of scientific computing using NumPy by introducing arrays, methods, attributes, random numbers, indexing, broadcastin...
I used the sktime library to forecast the airline data using NaiveForecaster, KNeighborsRegressor, Statistical forecasters, and auto ARIMA model.
What is the fastest and most efficient way to loop in Python? We found that NumPy is the fastest and Python builtins are the most memory efficient.
An introduction to the basics of genetic algorithm along with a simple numerical example and solution of an earthquake location problem
How can we use MATLAB functions in Python? MATLAB implementations are usually reliable because they are developed by professionals. But the advantages of usin...
Common geophysical problems most often have multimodal objective functions with many possible minima. In this post, we will look into the Monte Carlo meth...
If you are ready to use Microsoft Word as your favourite tool for turning your scientific thoughts and ideas into a manuscript, then I would like...
I built a cross-platform Python desktop app to monitor CPU, RAM, disk, processes, and network usage in real time.
This post gives a quick introduction to building a web application with Flask and deploying it on a Heroku server. Then, I share my code for building advanced ...
Parallel computing is quickly becoming a necessity. Modern computers come with more than one processor, yet we most often use only a single process to do most of...
Code for plotting advanced 2D plots using the matplotlib library in Python. Includes simple 2D plots, error bars, bar graphs, histograms, multiple plots, etc.
We pose a null hypothesis and ask: given that the null hypothesis is true, how likely is the observed pattern of results? This likelihood is known as...
The NetCDF file format is designed for storing multidimensional scientific data such as temperature, rainfall, and humidity. In this post, we will see how...
In this post, we will see how to use Python to low-pass filter 10 years of daily fluctuations in a GPS time series. We need to use the “SciPy” package...
It is essential to insert equation numbers in your thesis and/or any scientific paper. In this post, I will show you some of the easiest ways to insert equat...
In Earth Sciences, we often deal with multidimensional data structures such as climate data and GPS data. It's hard to save such data in text files as it would...
Ulysses is a natural, freestyle way of writing. If you got any idea, just write it down, worry about the format and other things when you’re done. Don’t let ...
Short demonstration of how to plot the track or trajectory of a hurricane on a map. Code is included.
Some handy tweaks for Mac, like relocating the default screenshot location, renaming files in batches, etc.
A Mac can be easily automated with the help of several tools such as Automator, Quick Actions, and AppleScript
Shortcut code for quickly logging temperature in the Apple Health app
Quick action for Mac to easily love, dislike, or rate songs in the Apple Music app
We test for the correlation coefficients or the covariance between two sets of random numbers drawn from a normal distribution using the Monte Carlo simulat...
Visualize the statistics of the data using MATLAB: mean, median, std, interquartile range, skewness, kurtosis, t-statistic, degrees of freedom
Using randomization to test and disprove the null hypothesis
Two time series with predominant linear trends (very low DOF) can have a very high correlation coefficient, which can hardly be construed as evidence for ...
A basic to advanced guide to making interactive plots in Bokeh.
Generators don’t hold the entire result in memory. They yield one result at a time.
Tutorial on how to use Git and GitHub for team collaboration on a project. Content includes installing, setting up, creating a repository, making commits, un...
In this tutorial post, I give a quick demo of how to install Python (using Anaconda) and then get started with writing simple scripts.