Methodologically enhanced virtual labs: the next crucial step to support collaborative environmental data science



In DSNE, we have been actively promoting virtual labs as a means of supporting more open and transparent environmental data science as well as crucially enabling collaboration, including the necessary cross-disciplinary collaboration to tackle environmental grand challenges. There is mounting evidence that this really works and it is so encouraging to see new evangelists emerging particularly from the science domain when they see how virtual labs can indeed support and enhance their scientific endeavours.

So what next for virtual labs? The next big step for us is methodologically enhanced virtual labs: collaborative lab environments that build on our methodological work and offer data science methods as intrinsic, key components of the lab environment. Imagine a next generation of virtual labs with rich libraries of methods drawn from machine learning and AI, geostatistics, time series analysis and extreme value analysis. Imagine lab environments carefully designed to support ready experimentation with these techniques, in isolation or in combination, on a variety of data sets. Imagine these investigations being carried out collaboratively, with scientists and data scientists working in tandem to ask: “what if we try out these techniques on given data sets, and what insights emerge?”

We already have some interesting perspectives on this through our sister project, ‘Methodologically enhanced virtual labs for early warning of significant or catastrophic change in ecosystems: changepoints for a changing planet’. This project is funded as a feasibility study under the UKRI Strategic Priority Fund, Constructing a Digital Environment. The research is looking at enhancing virtual labs with data science methods, particularly changepoint methods, to look for signs of significant or catastrophic change in data collected as part of the Environmental Change Network (ECN) across a number of sites in the UK. As part of this research, we have developed a novel multivariate changepoint method that allows for potential misalignment in time steps across different time series, which makes such techniques considerably more applicable to real-world data.
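To give a flavour of what a changepoint method does, here is a minimal sketch of the simplest textbook case: finding a single change in the mean of one time series by choosing the split that minimises the within-segment squared error. This is not the project's multivariate method, and it does not handle misaligned time steps. The function name `find_mean_changepoint` and the example values are hypothetical and purely illustrative (the values are synthetic, not ECN data).

```python
def find_mean_changepoint(values):
    """Return (index, cost) for the best single split of `values`.

    `index` is the position of the first observation in the second
    segment; `cost` is the summed squared deviation of each segment
    from its own mean. Illustrative only: it always returns a split
    and does not test whether a change is statistically significant.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")

    # Prefix sums let us compute any segment's cost in constant time.
    prefix = [0.0]
    prefix_sq = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def segment_cost(start, end):
        # Sum of squared deviations from the mean of values[start:end].
        length = end - start
        total = prefix[end] - prefix[start]
        total_sq = prefix_sq[end] - prefix_sq[start]
        return total_sq - total * total / length

    best_index, best_cost = None, float("inf")
    for k in range(1, n):  # both segments must be non-empty
        cost = segment_cost(0, k) + segment_cost(k, n)
        if cost < best_cost:
            best_index, best_cost = k, cost
    return best_index, best_cost


# Synthetic series whose mean jumps after the fifth observation.
series = [1.0, 1.2, 0.9, 1.1, 1.0, 5.0, 5.2, 4.9, 5.1, 5.0]
index, cost = find_mean_changepoint(series)
print(f"estimated changepoint at index {index}")
```

In practice a method like this would be paired with a penalty or a significance test to decide whether any change is present at all, and extended to multiple changepoints and multiple series.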

We recently saw the potential of methodologically enhanced virtual labs in an internal, online workshop where we all collaborated in real time through notebooks, with ready access to the ECN data sets and different data science methods (including our multivariate changepoint method). We witnessed truly collaborative and transdisciplinary science happen right in front of us: ‘can we try this?’, ‘would this work?’, ‘ah, that’s REALLY interesting, can we try this as well?’, ‘what about now trying this across all the ECN sites: are the patterns the same?’ This enthralling experience has energised us all to push further towards our vision of making data science methods an intrinsic part of virtual labs. It also clarified what we need: i) a wide range of data science methods readily available from within the virtual labs infrastructure, ii) the ability to discover such methods in the same way that labs currently support discoverability of data sets, iii) the ability to readily apply such methods, in isolation or together with other methods, on available data sets without too much programming effort, iv) the ability to combine such methods with other services available through virtual labs, including exciting ways to visualise and communicate the outputs, and ultimately v) the ability to directly publish your outputs in the form of archival notebooks.
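As a rough illustration of requirements ii) and iii) above, the sketch below shows one way methods could be registered with searchable tags and then chained on a data set from a notebook. It is an assumption about how such a facility might look, not a description of the DSNE virtual labs infrastructure: `MethodRegistry`, its `register`, `search` and `apply` methods, the tags and the two toy methods are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class MethodEntry:
    """A registered method plus the tags used to discover it."""
    name: str
    tags: FrozenSet[str]
    func: Callable[[List[float]], List[float]]


class MethodRegistry:
    """Hypothetical in-memory catalogue of data science methods."""

    def __init__(self) -> None:
        self._entries: Dict[str, MethodEntry] = {}

    def register(self, name: str, tags: Iterable[str]):
        """Decorator that adds a function to the catalogue."""
        def decorator(func):
            self._entries[name] = MethodEntry(name, frozenset(tags), func)
            return func
        return decorator

    def search(self, tag: str) -> List[str]:
        """Return the names of all methods carrying `tag`, sorted."""
        return sorted(e.name for e in self._entries.values() if tag in e.tags)

    def apply(self, names: Iterable[str], data: List[float]) -> List[float]:
        """Run the named methods in order, feeding each output to the next."""
        result = data
        for name in names:
            result = self._entries[name].func(result)
        return result


registry = MethodRegistry()


@registry.register("difference", tags=["time-series", "preprocessing"])
def difference(data):
    # First differences: change between consecutive observations.
    return [b - a for a, b in zip(data, data[1:])]


@registry.register("rolling_mean", tags=["time-series", "smoothing"])
def rolling_mean(data, window=3):
    # Mean over each full window of consecutive observations.
    return [sum(data[i:i + window]) / window
            for i in range(len(data) - window + 1)]


print(registry.search("time-series"))
print(registry.apply(["difference", "rolling_mean"], [1.0, 3.0, 6.0, 10.0, 15.0]))
```

The design choice here is that discovery (`search`) and execution (`apply`) share one catalogue, so a method found by tag can be applied or combined with others in a single line.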

So what can you do to help? The first and most important step is to ensure everyone within DSNE is using virtual labs as part of their natural mode of working. This is particularly important for our researchers who are developing or applying interesting or novel data science methods, so that these methods are developed in a virtual lab context from the start. This will help enormously when we come to collect these methods together within a consistent framework to support method execution in virtual labs. We are currently offering support to ease the transition to working in virtual labs; if you need further help and have not already done so, talk to Maria Salama (email: m.salama@lancaster.ac.uk). This is a virtuous cycle: as more and more people use and enhance virtual labs, they will become more exciting environments, drawing others in. Through this, we can achieve our goal of making virtual labs central and indispensable in our quest for a data science of the natural environment.

The virtual labs team in DSNE consists of John Watkins, Gordon Blair, Maria Salama and Michael Hollaway. Feel free to contact any of them to discuss our research around methodologically enhanced virtual labs. The investigator team on the changepoints project is Gordon Blair, Idris Eckley and Rebecca Killick (Lancaster University) and John Watkins, Don Monteith and Peter Henrys (UK CEH). They are joined by Will Simm, Aaron Lowther, Lindsay Banin, Susannah Rennie, Michael Tso and Mike Hollaway in delivering this exciting research.



Disclaimer

The opinions expressed by our bloggers and those providing comments are personal, and may not necessarily reflect the opinions of Lancaster University. Responsibility for the accuracy of any of the information contained within blog posts belongs to the blogger.

