Open science, open geostatistics! 40 years of GSLIB.




Geostatistics is a branch of statistics that deals with spatial data and spatial variation. Its development began in the early 1960s, inspired by three earlier lines of work: the seminal study by Mercer and Hall on the implications of spatial autocorrelation in agricultural randomised trials with adjacent plots (The experimental error of field trials, 1911, with commentary from Student); Student's own work on the elimination of spurious correlation due to position in time or space (1914); and the first theories on spatial stationarity, developed by Kolmogorov in 1941 (The local structure of turbulence in an incompressible fluid at very large Reynolds numbers).

It is true that, in its early days, geostatistics was seen as an esoteric discipline, owing to its seemingly complex theory and the lack of free or low-cost software containing geostatistical codes. In reality there is nothing complex in geostatistics: the variogram is a model of local variance, and is therefore based on the variance equation, while kriging is just a system of linear regressions, with or without linear constraints. It was therefore predictable that geostatistical tools would become common in many statistical software packages and widely applied in geographical analyses.
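To make the point concrete, here is a minimal sketch in Python with NumPy. It is an illustration only and is not taken from GSLIB: the function names, the exponential variogram model and its parameters are all assumptions chosen for the example. The first function is the classical empirical semivariogram (half the mean squared difference between pairs of values, grouped by separation distance), and the last one solves the ordinary kriging system, where the only linear constraint is that the weights sum to one.

```python
import numpy as np


def empirical_semivariogram(coords, values, bin_edges):
    """Half the mean squared difference between value pairs, per distance bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # every pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)  # NaN marks empty bins
    for b in range(len(bin_edges) - 1):
        in_bin = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = 0.5 * sqdiff[in_bin].mean()
    return gamma


def exponential_variogram(h, sill=1.0, range_=1.0):
    """Illustrative variogram model with no nugget (an assumption of this sketch)."""
    return sill * (1.0 - np.exp(-np.asarray(h, dtype=float) / range_))


def ordinary_kriging(coords, values, target, variogram):
    """Solve the ordinary kriging system at one target location."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)
    # Variogram between all data pairs, bordered by the unbiasedness constraint.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    # Right-hand side: variogram between each data point and the target.
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - target, axis=1))
    solution = np.linalg.solve(A, b)
    weights, mu = solution[:n], solution[n]  # mu is the Lagrange multiplier
    estimate = weights @ values
    variance = weights @ b[:n] + mu
    return estimate, variance, weights
```

With a handful of sample points, `ordinary_kriging(coords, values, target, lambda h: exponential_variogram(h, sill=1.0, range_=2.0))` returns the estimate, the kriging variance and the weights. The weights always sum to one, and with this nugget-free model kriging at a sampled location reproduces the observed value.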

At the time when Ross Ihaka and Robert Gentleman were developing R (certainly the most popular free statistical software today) by combining ideas from the Scheme programming language with the syntax of S (a statistical computing language developed in the seventies and later known commercially as S-PLUS), Clayton Deutsch and André Journel published “GSLIB: Geostatistical Software Library and User’s Guide” (Oxford University Press), known simply as GSLIB, containing 39 programs ranging from variogram estimation to sequential Gaussian simulation, kriging and plotting. It was the summer of 1992.

Deutsch was a PhD student of Prof Journel at Stanford University. Journel arrived at Stanford in 1978 from the Paris School of Mines, where the geostatistics group was directed in those years by Georges Matheron, the father of the variogram and the kriging estimator (1962) and of the theory of regionalised variables (1971). In 1980, Journel started developing a suite of Fortran codes for geostatistical analyses (Fortran was then the main programming language for numerical computation and statistics), and he was later joined by Deutsch. These codes, initially tested and implemented by students at Stanford University, would become the source code of GSLIB 12 years later.

GSLIB programs have been widely used in industry and academia thanks to their free availability through a Stanford-hosted anonymous FTP server. More recently, their popularity has been sustained by translations and adaptations of GSLIB for fast computing. For example, it has been translated into Python by Michael Pyrcz (https://github.com/GeostatsGuy/GeostatsPy) and into Python and C/C++ for faster, enhanced computing by Vladimir Savichev and colleagues (HPGL, http://hpgl.github.io/hpgl/). Finally, for those who still love Fortran but need high computing power, Oscar Peredo and collaborators optimised GSLIB for fast computation via OpenMP directives and MPI instructions (https://doi.org/10.1016/j.cageo.2015.09.016, with code available here: https://github.com/operedo/gslib-alges).

At present, more advanced geostatistical tools (generalised geostatistical linear models, model-based geostatistics and other Bayesian geostatistical frameworks) are freely available for R through CRAN, which hosts the largest open geostatistical code archive and its community. At that time, however, the publication of GSLIB was the first landmark for open-source geostatistical software: it made geostatistical calculation easy and improved communication between scientists involved in experimental geostatistics. As in any open discipline, free code and free knowledge result in more scientists working in that discipline, which is a legacy of open science.



Disclaimer

The opinions expressed by our bloggers and those providing comments are personal, and may not necessarily reflect the opinions of Lancaster University. Responsibility for the accuracy of any of the information contained within blog posts belongs to the blogger.

