A Case for Swarming Medical History
Over the last six and a half years, IPFS has come a long way, and successful implementations are increasingly common. That said, is Swarm a more suitable solution for decentralized application (“dApp”) developers tasked with preserving medical records?
This article builds a case for using Ethereum’s brand new Swarm cryptoeconomic technology for storing variable-length medical history in Solidity dApps instead of going the IPFS route. The argument is theoretical and only based on publicly available information on IPFS/Filecoin and Swarm. It also assumes that you don’t want to contribute/dedicate storage capacity to the decentralized storage network.
The age of decentralized storage is drawing closer. IPFS has been around since February 2015. Its complement, Filecoin, was launched on October 15th, 2020. A more recent alternative, Swarm, released its mainnet client on June 21st, 2021.
With each passing moment, dApp developers are more confident about plunging into new decentralized storage technologies. If any doubt exists in a developer’s mind, it’s probably more to do with deciding which decentralized storage pool to dive into.
That’s because there are so many decentralized storage networks (“DSNs”) to choose from, and their number keeps growing.
Chia, Storj, and Sia are other notable DSNs, but they are not under consideration because IPFS has already emerged as the standard for the Ethereum community. That said, Swarm was custom-built for the Ethereum community and tackles data privacy concerns right out of the box.
Given that the stated goal of the technology is to return the control of data to its providers, Swarm may be a more convenient and flexible alternative to the IPFS/Filecoin combination for Solidity developers when it comes to storing data that needs any degree of permanence.
There is a chance that medical records encrypted and stored in IPFS drop off the face of the earth. That is because IPFS nodes can choose what data they prefer to cache with their limited storage. IPFS nodes are computers (like a PC or Mac running the IPFS client software) that cache and share resources with other nodes on the peer-to-peer network.
If every node chooses not to cache the particular medical records uploaded to the peer-to-peer network, the data vanishes forever. Consequently, IPFS cannot guarantee the permanent availability of data and this lack of permanence in IPFS seriously challenges a dApp developer’s peace of mind.
Unless the dApp developer is running their own IPFS node and taking steps to ensure the availability of uploaded medical records, the lack of permanence in IPFS is a hazard.
To mitigate this risk and make data more persistent, the developer could retain a pinning service. With the persistence acquired from a pinning service, the uploaded data should be available 24/7 and could (potentially) be stored eternally.
The first issue with pinning services is that the uploader always needs to pay for them. Furthermore, if there is some black swan incident (or circumstances do not accommodate timely payment) and the provider is not paid, the data may be flushed from the pinning service node.
The second issue is that employing only one pinning service takes the dApp developer right back to the single-point-of-failure weakness of centralized storage systems.
Pinning isn’t the most elegant solution. Knowing this, Protocol Labs (the makers of IPFS) created an incentive and security layer for IPFS called Filecoin.
Filecoin is built on IPFS and it ensures permanence. However, there is a slip between the cup and the lip: Filecoin has data upload minimums of several gigabytes and will not guarantee the permanence of medical records of sizes below that threshold.
Medical records grow over time for most individuals, and that data should persist for decades. Storing medical records presently may entail ensuring persistence for:
- Medical Documents & Telehealth Diagnostics Data (Official Documents)
* Computed Tomography (CT) Scans
* Diagnoses, PCR Tests
* Vaccination Records
* DNA Test Results
* Blood Test Results
- Data From Tracking Devices (Device Verified)
* Apple/Samsung Watch Tracked Data (steps, geolocation, etc.)
- Manual Data Input (Unverified)
* Blood Type
* Water Consumption
* Fitness Information
* Percentage Body Fat
* Eye/Hair Color
* Sleep Patterns
* Caloric Intake
* Smoking/Alcohol Consumption Habits
Some of these data are large. CT scans are approximately 0.5 gigabytes each. However, the data storage requirements of each individual will vary greatly.
Furthermore, many individuals will likely (initially at least) need less storage space than the multi-gigabyte threshold limit of Filecoin. For developing nations, Filecoin is not a solution for data persistence needs.
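To make the threshold argument concrete, here is a minimal sketch in Python. The 0.5 GB size per CT scan comes from the figure above. The 4 GB minimum is only a placeholder for “several gigabytes” and is not a published Filecoin number, and the function names are hypothetical.

```python
import math

CT_SCAN_GB = 0.5            # approximate size of one CT scan (figure from this article)
ASSUMED_THRESHOLD_GB = 4.0  # ASSUMPTION: placeholder for "several gigabytes"


def meets_threshold(record_sizes_gb, threshold_gb=ASSUMED_THRESHOLD_GB):
    """Return True if the combined records reach the assumed upload minimum."""
    return sum(record_sizes_gb) >= threshold_gb


def scans_until_threshold(threshold_gb=ASSUMED_THRESHOLD_GB, scan_gb=CT_SCAN_GB):
    """Number of CT scans needed before the assumed minimum is reached."""
    return math.ceil(threshold_gb / scan_gb)


# One CT scan plus a few small documents stays well under the assumed minimum.
patient_records_gb = [0.5, 0.01, 0.002]
print(meets_threshold(patient_records_gb))
print(scans_until_threshold())
```

Under these assumed numbers, a patient would need several CT scans on file before the upload minimum is reached, which is the gap the argument above is pointing at.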
Swarm addresses the data persistence needs of storing medical records that Filecoin cannot. Furthermore, for developing nations like Mongolia, it is a more elegant solution than the IPFS pinning mechanism.
In IPFS, pinning requires periodic payments. There is a parallel concept in Swarm. The periodic payment made by content providers (the owners/creators of the data) is called a postage subscription.
As with pinning in IPFS, a postage subscription may not be appropriate for storing medical records. That’s because users (called “net providers” in Swarm nomenclature) uploading data will want to provide content and not manage it.
Fortunately, net providers may purchase storage in advance instead of periodically. Payment is accomplished through purchasing a batch of postage stamps.
Users attach one postage stamp to each chunk generated from their uploaded content to Swarm. This indicates the desire to have the data persist. Swarm storer nodes then use this information (i.e. the data’s market value) to prioritize persistence. This is called prioritization in Swarm parlance.
The larger the postage stamp, the longer the data persists. Swarm allows data to persist “forever”.
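As a rough illustration of the one-stamp-per-chunk rule, the Python sketch below estimates how many stamps a single CT scan would need. It assumes Swarm's 4 KB chunk size, counts data chunks only (the extra chunks Swarm creates to index the content are ignored), and treats a batch of depth d as able to stamp at most 2^d chunks, which is a theoretical upper bound rather than a guaranteed capacity. The function names are hypothetical.

```python
CHUNK_SIZE_BYTES = 4096  # ASSUMPTION: 4 KB Swarm chunks


def data_chunks(size_bytes):
    """Number of data chunks (and therefore stamps) needed for the content."""
    if size_bytes <= 0:
        return 0
    return -(-size_bytes // CHUNK_SIZE_BYTES)  # ceiling division on integers


def min_batch_depth(chunks):
    """Smallest depth d such that 2**d >= chunks (theoretical upper bound)."""
    if chunks <= 1:
        return 0
    return (chunks - 1).bit_length()


# A 0.5 GB CT scan, taken here as 2**29 bytes to keep the numbers round.
ct_scan_bytes = 2**29
stamps_needed = data_chunks(ct_scan_bytes)
print(stamps_needed)                   # 131072
print(min_batch_depth(stamps_needed))  # 17
```

A real batch would have to be somewhat larger than this estimate, since the ignored index chunks also need stamps.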
Further research into Swarm may indicate that it isn’t the perfect technology for storing medical information. However, initial research shows that it does address data permanence (i.e. the persistence/prioritization of data storage problem) quite elegantly.
There are strengths and weaknesses to Swarm.
For example, all uploaded data needs a postage stamp. This means that users need to get comfortable with the concept of having to pay for data storage. This could be perceived as a weakness by a net provider.
Fortunately, savvy dApp designers will see this scenario as an opportunity to empower net providers to commercialize aspects of their private medical information. For example, net providers could charge net users (data consumers like pharmaceutical companies, drug manufacturers, or non-governmental organizations) a fee for access to their COVID-19 status. Those receipts could then be used to offset the fees associated with data permanence on the Swarm network.
One clear strength of the protocol is its GDPR compliance. Under GDPR, data storage networks are required to delete all personal data at the end of the period specified in a service agreement. Since Swarm nodes possess limited storage space and have no economic incentive to make data persist beyond the period paid for, uploaded data will be removed through garbage collection whether or not it contains sufficient information to identify a natural person.
Another strength, if protocol maturity is of any consequence, is that Swarm is built for the Ethereum community and has been part of the initial (“trinity”) world computer design envisioned by Dr. Gavin Wood, Vitalik Buterin, et al. since 2014.
What do you think? Please applaud if you agree or leave comments if you don’t. Feedback is greatly appreciated.
- Swarm is superior for storing data (like medical records) that are of a size that cannot accurately be predetermined.
- This list is not exhaustive and is just for illustration purposes.
- Forever is defined as 10 years.
- Yes… A revolution is afoot.