Data Science Institute

We aim to set the global standard for a truly interdisciplinary approach to contemporary data-driven research challenges. Established in 2015, the Data Science Institute (DSI) has over 300 members and has raised over £40 million in research grants.


About us

We are working to create a world-class Data Science Institute at Lancaster (DSI@Lancaster) that sets the global standard for a truly interdisciplinary approach to contemporary data-driven research challenges. DSI@Lancaster aims to have an internationally recognised and distinctive strength in being able to provide an end-to-end interdisciplinary research capability - from infrastructure and fundamentals through to globally relevant problem domains and the social, legal and ethical issues raised by the use of Data Science.

The Institute is initially focusing on the fundamentals of Data Science, including security and privacy, together with three cross-cutting theme areas: environment, resilience and sustainability; health and ageing; and data and society. It is building a world-leading institute with over 300 affiliated academics, researchers, and students.

Our data science, health data science and business analytics programmes have launched the careers of hundreds of data professionals over the last 10 years. Students from our programmes have progressed to data science roles at Amazon, PWC, Ernst & Young, Hawaiian Airlines, eBay, Zurich Insurance, the Co-operative Group, N Brown, the NHS and many others - please look at our Education pages for further details of the courses on offer.

Advancing Geospatial Data Science at Lancaster University

A Vision for the Data Science Institute Geospatial Group

This activity is led by Duncan Whyatt, Patricia Murrieta-Flores, Emanuele Giorgi and Amber Leeson.

The Geospatial Program at Lancaster University's Data Science Institute embodies a transformative vision for geospatial data science, interweaving research, education, and community engagement into its core mission. This program is dedicated to promoting geospatial research and teaching within the university. The program's mission transcends traditional academic boundaries, aiming to drive innovation and expand the frontiers of geospatial data science, from the Humanities to the Sciences. By identifying novel interdisciplinary research opportunities and leveraging the university’s strong academic disciplines, the program aspires to keep Lancaster University at the forefront of this evolving field. Recognising the importance of nurturing the next generation of geospatial experts, the program also aims to actively support postgraduate research, fostering an environment conducive to research and collaboration and empowering aspiring geospatial scientists. Furthermore, the program aims to enhance geospatial education across the institution, ensuring students and staff have access to the latest research and training. The program also aims to extend its influence beyond the university, engaging the broader community through future initiatives such as summer schools, events, dedicated interdisciplinary workshops and retreats. This outreach aims not only to disseminate knowledge but also to foster collaboration with external stakeholders, positioning Lancaster University as a hub for geospatial data science excellence.

See the leadership page for further details about the activity leads.

Please get in touch with Data Science if you would like to know more.

Latest News

Fully-funded EPSRC PhD studentship available with Lancaster Data Science Institute

Novel geospatial methods for combining data at multiple spatial and temporal scales in the context of historical disease mapping (Home status students only)

With the advent of advanced technology, a vast amount of data has become available, making the integration of information across spatial and temporal scales increasingly crucial for understanding complex systems and making informed decisions. For instance, in environmental sciences, researchers often seek to understand the impact of climate change on ecosystems and biodiversity. To achieve this, data must be integrated from various spatial scales, ranging from local habitats to regional landscapes, and temporal scales, spanning decades or even centuries. Similarly, in epidemiology, the spread of infectious diseases is influenced by factors operating at different spatial and temporal scales. For example, the transmission of vector-borne diseases like malaria or dengue fever depends not only on local environmental conditions but also on global climate patterns and human mobility.

However, existing geospatial methods often fail to account effectively for the misalignment in time and/or space of data from multiple sources, and there is an increasing need for novel methods tailored to the characteristics and complexities of the data being integrated.

This research project aims to develop a novel geospatial modelling framework to address these challenges and improve the integration of data across different spatial and temporal scales in the context of historical disease mapping. This framework will then be applied to mapping, analysing, and interpreting sixteenth-century primary sources on diseases and epidemics, utilising datasets and methods from the ESRC project Digging into Early Colonial Mexico. Epidemics, like the recent COVID-19 pandemic, have had profound and enduring consequences throughout history. The application area of this project is the introduction of diseases, such as smallpox, during the Conquest of America, which led to catastrophic declines in populations, with mortality rates reaching 97% in some regions. Traditional research methods struggle with the sheer volume, incompleteness, and uncertainty of historical data. In this project we will leverage advancements in machine learning, corpus linguistics, and geographic information sciences, which present opportunities to explore vast historical collections and better address data variability and uncertainty. Specifically, the student will:

  1. Develop geospatial methods to extract and analyse geographic information from ambiguous and uncertain descriptions of health and disease within textual records.
  2. Develop and apply spatio-temporal methods that enable the identification of clusters of disease using uncertain data from historical records.

The successful applicant will be supervised by an interdisciplinary team consisting of experts from history, linguistics, statistics and geography.

Prospective candidates should have a first or upper second-class honours degree, or a combination of qualifications or experience equivalent to that level in a relevant subject.

For informal enquiries about the project please contact Duncan Whyatt (d.whyatt@lancaster.ac.uk), Patricia Murrieta-Flores (p.murrieta@lancaster.ac.uk) or Emanuele Giorgi (e.giorgi@lancaster.ac.uk).

To apply, please send a CV and cover letter demonstrating your motivation for the post to dsi@lancaster.ac.uk. The closing date for applications is 15th July 2024 and we anticipate a start date of October 2024 for the successful candidate.

The technologies of everyday bordering, past, present and future

Keynote speaker: Professor Melissa Hamilton (University of Surrey)

Speakers: Dr Esmorie Miller (Lancaster), Dr Hannah Ishmael (King's College London), Tactical Tech, and Dr Kathryn Cassidy (Northumbria)

This afternoon workshop seeks to discuss the technologies and impacts of everyday bordering and deportations in the hostile environment. Investigating the ways in which data and technology are used to create everyday bordering and deportations requires collaboration between data scientists, historians, criminologists, sociologists and activists. The interdisciplinary expertise covers the following thematic areas: everyday bordering, whereby ordinary citizens are required to perform as border-guards; the use of technologies, including monitoring, tracking, personal profiling and linked data, to identify suspected illegitimates and exacerbate the power of everyday bordering; the use of predictive algorithms which embed the belief that race and social status are linked to illegality and illegitimate citizenship; how predictive algorithms obscure problems such as racial bias and infer skewed patterns in datasets that exacerbate social division; and exploring contemporary everyday bordering as part of a longer historic trajectory where race intersected with Empire and coloniality.


The workshop aims to bring together interdisciplinary social critique of the technologies of everyday bordering and deportations in Britain. In doing so, this collective has the potential to form ethical and decolonised collaborative models of data governance practice and policy for, first, studying the mobilisation of technology in everyday exclusions in historic and current contexts, and second, illustrating how data science technologies can be used to engineer positive change, reducing biases and the negative impacts of the hostile environment, while maximising accountability and transparency of institutions.

Organised by Dr Zoe Alker [History] and Dr Esmorie Miller [School of Law] and supported by the Interdisciplinary Network Fund, Data Science Institute, Lancaster University.

Venue – The Storey, Lancaster

Times – Friday 5th July - 1-4pm

Sign up for The technologies of everyday bordering

First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security (NLPAICS 2024)

On the 29th – 30th July 2024, Lancaster University is hosting the first international conference on Natural Language Processing and Artificial Intelligence for Cyber Security.

This is the very first event of its kind and it could not come at a more critical time. In today’s digital age, cyber security has emerged as a heightened priority for both individual users and organisations. With the surge in online information, innovative new solutions are required to address the challenge of traditional security measures falling short against evolving threats.

The conference will also have a special theme track with the goal of stimulating discussion on the future of cyber security in the era of Large Language Models (LLMs) and generative AI. Delegates will examine the challenges, risks and safety issues associated with employing these models in everyday tasks, focusing on aspects including fairness, ethics, and responsibility.

Lancaster staff and students who want to attend, please fill out this form https://forms.office.com/e/mc9C3kChxP or contact Julia Carradus if you have any questions.

For more information on the conference please see the NLPAICS website or email info@nlpaics.com.

NLPAICS 2024 will bring researchers, academics, and industry leaders together to hear the latest solutions for addressing risks in processing digital information. It will explore a range of themes around the employment of NLP and AI for cyber security, including:

  • Societal and human security and safety
  • Anomaly detection and threat intelligence
  • Systems and infrastructure security
  • Ethics, bias, and legislation in cyber security

The event is being organised by the Lancaster University UCREL NLP research group, the Data Science Institute and Security Lancaster.



Workshop on time-series analysis of noisy data at Lancaster University - 25th to 27th September 2024


The aim of the workshop is to review recent progress in discerning cyclic processes against a noisy background, focusing especially on the widespread case of oscillations with time-varying frequencies.

This topic will be discussed in the context of linear, stationary, non-stationary, nonlinear, chaotic, stochastic, autonomous and non-autonomous processes and systems. Practical problems in all areas of human endeavour where data are being collected will be addressed. These include, but are not limited to, living systems, medicine, neuroscience, chronobiology, ecology, climate, economics, space science, astrophysics, lasers, optics and photonics, semiconductors, battery lifecycles, classical and quantum turbulence, engineering and oceanography.

More details can be found on the conference website: https://wp.lancs.ac.uk/tsand24/

DSI Workshop Funding Announcement

We are delighted to announce our workshops for 2024. Further details will follow in the DSI newsletter.

Already taken place

Katie McDonough (History) & Daniel Wilson (Turing Institute) Advertising Machines - 21st & 22nd March 2024

Elisa Rubegni, PhD (SCC) & Kate Cain (Psychology) Empowering Tomorrow: Bridging Disciplines for Inclusive Child-Tech - 2nd & 3rd May 2024

Kate Cain (Psychology) and Anastasia Ushakova (CHICAS/FHM) T-READS Tracking Reading and Educational Attainment through Data Science - 3rd & 4th June

Jess Bridgen (Mathematical Sciences) and Jon Read (Lancaster Medical School) Real-time Modelling of Nosocomial Transmission: the unanswered Questions - 11th June 2024

David Parkes (LEC), Luke Rhodes-Leader (Management Science), Paul Cureton (LICA), Eduard Campillo-Funollet (Mathematical Sciences). Calibration and validation for complex system modelling - 11th & 12th June

Paul Rayson (SCC), Jo Knight (CHICAS/FHM), Daisy Harvey and Nick King HealTAC 2024 Healthcare Text Analysis Conference - 12th to 14th June 2024

Upcoming

Paul Smith (Mathematical Sciences), Alex Bush (LEC), Emma Eastoe (Mathematical Sciences) Environmental and Ecological Statistics Group - 1st-3rd July 2024 at Lancaster University

Zoe Alker (History) and Esmorie Miller (Sociology), The Technologies of Everyday Bordering, Past, Present and Future - 5th July at The Storey

Barbara Shih (Biomedical and Life Science) & Richard Mort (Biomedical & Life Science) Disentangling the genes contributing to Dalmatian spotting and associated disease - 10th July, Lancaster University

DSI ECR Showcase Talks

These talks showcase the incredible research our ECR community is involved in. They also give speakers the opportunity to practise for conference presentations and get feedback on unpublished research.

Get in touch with any questions - we can do these talks in the Lent and Summer term too - always happy to talk about your research and how to showcase your work.

If you are interested in giving a talk, please email David Parkes (d.parkes@lancaster.ac.uk) to discuss dates.


Research Themes

Data Science at Lancaster was founded in 2015 on Lancaster’s historic research strengths in Computer Science, Statistics and Operational Research. The environment is further enriched by a broad community of data-driven researchers in a variety of other disciplines including the environmental sciences, health and medicine, sociology and the creative arts.

  • Foundations

    Foundations research sits at the interface of methods and application, with the aim of developing novel methodology inspired by real-world challenges. These could be studies about the transportation of people, goods and services, energy consumption and the impact of changes to global weather patterns.

  • Health

    The Health theme has a wide scope. Current areas of strength include spatial and spatiotemporal methods in global public health, design and analysis of clinical trials, epidemic forecasting and demographic modelling, health informatics and genetics.

  • Society

    Data Science has brought new approaches to understanding long-standing social problems concerning energy use, climate change, crime, migration, the knowledge economy, ecologies of media, design and communication in everyday life, or the distribution of wealth in financialised economies.

  • Environment

    The focus of the environment theme has been to seek methodological innovations that can transform our understanding and management of the natural environment. Data Science will help us understand how the environment has evolved to its current state and how it might change in the future.

  • Data Engineering

    The Data Engineering theme aims to explore how we can utilise digital technologies to accelerate and enhance our research processes across the University.

Research Software Engineering

Within the Data Science Institute, our aim is to improve the reproducibility and replicability of research by improving the reusability, sustainability and quality of research software developed across the University. We are currently funded by the N8CIR, and work closely with our partner institutions across N8 Research.


Upcoming Events