How can your code be reusable for others?

Reusability – the vital driver for faster, quality research
One of the biggest bottlenecks in health data research is the time and effort it often takes to prepare data for use. Wider use of quality, reusable code could be transformative, avoiding duplication, accelerating research, freeing up time for other work and reducing costs.
Lars Murdock, a health data scientist with the BHF Data Science Centre, has created a series of videos offering guides to good practice that just about any team working in the field can readily adopt. Free to view in your own time, they are available on Futures right now.

Meet the Trainer
Lars has huge expertise in helping maximise health data research quality, speed and efficiency. At the BHF Data Science Centre he and his team provide data management and curation to support research within the Trusted Research Environments (TREs). He previously worked in data analytics for Cancer Research UK. This experience made him the ideal person to create our Futures series on reusability.
How did you get into health data research?
I fell into this industry. I worked in jobs I didn't particularly like but eventually did an internship with a health data team at Cancer Research UK. It spoke to the mathematical side of me, and the statistical element behind epidemiology and the study of diseases is fascinating. It's very technical, but it involves a huge amount of problem solving.
How did you become interested in reusability?
A few years ago my team wanted to research cancer treatment rates. It took years to access the data, and once it arrived the work was delayed even further because the data hadn’t been prepared properly. It occurred to me then that some of the biggest bottlenecks are at the data access and data preparation stages.
Did the pandemic bring changes?
COVID opened some doors to quicker data access, but the preparation time was still a problem. That’s what my team at the BHF Data Science Centre has been helping with.
What does your team do?
We are a small group that helps prepare assets and datasets and identifies where researchers are all doing the same thing. That means it can be done once and shared.
How much difference can reusability make?
It’s really important for the dozens of researchers we support. Around 80% of their time and effort using data was spent on cleaning steps. We've cut that from months to weeks. And there's much further we can go. It means researchers can spend longer doing research and gaining insights.
How much of a problem is the lack of reusability generally?
In health data science a lot of time and effort is spent taking vast amounts of data and transforming it into something that is better organised, offers insights, or can be used to build reporting tools. Very often you have to start from scratch, even though someone has used that dataset before and taken a lot of similar steps. There’s a lot of reinventing the wheel.
How can we solve this?
A lot of it is about adopting best practice.
If people were to start making some of their code a bit more accessible so others can borrow it, that would save a lot of time. It doesn't need to be a finished product, just a handy starting point, so that if another person wants to do a similar thing they are halfway there.
Who feels the benefit?
It's a community benefit. If you make sure your code is reusable, you benefit the next person down the line. And if the person before you does it, then you benefit. It’s a virtuous circle.
There are also selfish reasons. Sometimes I've gone back to a project after a few months, and I don't even recognise my own work. If I’d done it differently, it would have been much easier to look back on my own stuff. It also benefits your team, for example if someone goes off and a colleague has to pick up a project.
How do the videos help?
They point out some of the ways in which you can think about reusability. But we know that different teams and areas of the industry have certain styles, so the videos are not too prescriptive.
We're trying to keep it more top-line: here are the things you want to think about, and here's why, rather than "do it exactly like this", because not everything works for every use case.
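To make the idea of a "handy starting point" concrete, here is a minimal sketch of what a small, reusable cleaning step might look like. It is not taken from the videos or from the BHF Data Science Centre's code. The function names, the column names (`patient_id`, `event_date`) and the date format are all assumptions chosen for illustration. The point is only that documented functions with parameters are easier for the next person to borrow than a one-off script.

```python
from datetime import date, datetime
from typing import Iterable, List, Optional


def parse_date(value, fmt: str = "%Y-%m-%d") -> Optional[date]:
    """Return a date parsed from `value`, or None if it is blank or malformed."""
    try:
        return datetime.strptime(value.strip(), fmt).date()
    except (ValueError, AttributeError):
        return None


def clean_records(
    records: Iterable[dict],
    id_field: str = "patient_id",    # hypothetical column name
    date_field: str = "event_date",  # hypothetical column name
    date_format: str = "%Y-%m-%d",   # assumed date format
) -> List[dict]:
    """Drop rows with no identifier or an unparseable date, and remove repeats.

    Column names and the date format are parameters, so another team can
    reuse the same steps on a dataset with different naming conventions.
    """
    seen = set()
    cleaned = []
    for row in records:
        identifier = str(row.get(id_field) or "").strip()
        parsed = parse_date(row.get(date_field) or "", date_format)
        if not identifier or parsed is None:
            continue  # skip incomplete rows
        key = (identifier, parsed)
        if key in seen:
            continue  # skip a repeat of the same identifier and date
        seen.add(key)
        cleaned.append({**row, id_field: identifier, date_field: parsed})
    return cleaned


# Made-up rows, purely for demonstration.
raw = [
    {"patient_id": " A1 ", "event_date": "2021-03-04"},
    {"patient_id": "A1", "event_date": "2021-03-04"},  # repeat of the first row
    {"patient_id": "", "event_date": "2021-03-05"},    # missing identifier
    {"patient_id": "B2", "event_date": "not a date"},  # malformed date
]
result = clean_records(raw)
```

With the made-up rows above, only the first record survives, with its identifier trimmed and its date parsed. A team whose data uses different column names could pass them in as arguments rather than rewriting the steps.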
The BHF Monthly Webinar Series
You can also discover the BHF Monthly Webinar series free on Futures.
The BHF Data Science Centre, a partnership between HDR UK and the British Heart Foundation (BHF), helps partners including researchers and NHS organisations carry out research using health data into the causes, prevention and treatment of heart and circulatory diseases.
The webinars, created by subject matter experts including Lars, provide invaluable insights into the centre’s work. Topics include:
- Smartphone and Wearable Data for Cardiovascular Research
- Public and Patient Involvement
- Using Imaging Data to Help Understand Cardiovascular Disease
- Estimating Excess Death
- Using Medicines Data
- Healthcare Systems Data