Storage solution for Photographers so far in 2018
With the announcement of Nikon’s flagship D850 and Sony’s iconic A7R III, I became curious about the ever-growing need for storage space. The D850’s 45.7-megapixel sensor produces 14-bit lossless compressed raw files of about 51.6 MB, and the A7R III’s 42.4-megapixel sensor produces 14-bit uncompressed raw files of about 36 MB, which leads me to believe that the megapixel war is not over yet. Hasselblad’s H6D, with its 100 MP CMOS sensor, produces 16-bit lossless compressed images that average around 120 MB. If someone shoots an 8K time-lapse with these cameras, you can imagine how much storage space it takes to hold such huge files.
A day in the life of a photographer
I shot this Orion Nebula at the Nottley dam in the Chattahoochee National Forest. The image is a stack of 400 frames made with Nebulosity, my favorite star-stacking software. Each frame was shot in RAW on a Nikon D810, which produces 14-bit lossless compressed files of 36.3 MB. That takes up 14.52 GB of storage space for a single night shoot. Then I have a DJI Phantom 4 Pro that produces nearly 8 – 10 GB of 4K video at 30 fps over a flying time of around 15-20 minutes. That’s roughly 25 GB for one day. I have shot time-lapses that ran to more than 50 GB on a single shoot.
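The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration: the function and variable names are made up for the example, and it assumes decimal units (1 GB = 1000 MB), which is what the 14.52 GB figure implies.

```python
MB_PER_GB = 1000  # assumption: decimal units, as implied by the 14.52 GB figure


def shoot_size_gb(frames: int, mb_per_frame: float) -> float:
    """Total size in GB of a shoot made of equally sized frames."""
    return frames * mb_per_frame / MB_PER_GB


# Figures from the text: 400 D810 frames at 36.3 MB each.
night_stack_gb = shoot_size_gb(400, 36.3)

# Drone footage per flight, low and high estimates from the text.
drone_low_gb, drone_high_gb = 8.0, 10.0

day_low_gb = night_stack_gb + drone_low_gb
day_high_gb = night_stack_gb + drone_high_gb

print(f"Night stack: {night_stack_gb:.2f} GB")
print(f"Full day:    {day_low_gb:.1f} to {day_high_gb:.1f} GB")
```

Swapping in your own frame count and file size gives a quick estimate of what a shoot will cost you in disk space before you leave the house.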
What are the storage options we have in this era of technological advancement?
Here I discuss the three options that I believe are the best available today. Scalability, performance, and reliability are the three factors that matter most to me when deciding on a storage solution. I use a hybrid model myself, both on-premises and in the cloud, to keep my costs low.
- Network Attached Storage
- AWS Storage Gateway
- Google Cloud Storage FUSE
Needless to say, there are various other storage options you could build on-premises, but expanding them as your storage needs grow is the real challenge. Restoring all your files when one or all of your disks fail can be difficult and time consuming. It’s never a good idea to keep your data and its backup on the same file system. If disaster strikes and your NAS fails, you lose all your data in one event with no way of recovering your files.
Studio and wedding photographers may find it quite feasible to build a NAS disk station in-house. Pick QNAP, Synology or NetGear, or build your own, whichever suits your lifestyle. You can even have your web server fail over to the NAS in the event of an outage, or host your entire website on the NAS. I don’t recommend that, because it requires your router and NAS to be available 24/7, and it’s slow.
QNAP – Unlike Synology, QNAP doesn’t hide its technical details; they are readily available. QNAP’s CPU performance is said to be better than Synology’s. I have not used one, so I am unable to offer any comparison metrics.
Synology – I have a DS418j with four drive bays set up as a RAID0 array. RAID0 because I do not care if my disks fail: my files are backed up on Amazon Glacier. That way I am able to use the full 16 TB of drive capacity. Synology DSM is the best available operating system for a NAS, and it also offers SHR (Synology Hybrid RAID) for quick and easy expansion of disk drives. SHR is basically a combination of RAID1 and RAID5, depending on the drive sizes in the array.
Custom NAS – FreeNAS on VMware ESXi is much more powerful and flexible than a pre-built NAS. FreeNAS is an open source storage operating system built on the ZFS file system. It offers a lot of cool features such as deduplication and snapshots. Deduplication is quite memory intensive and is not recommended for home users, because performance degrades if you do not have enough RAM. Setting up FreeNAS requires dedicated hardware: a motherboard, RAID cards and hard drives for ZFS. I will not go into detail because I discarded this setup years ago. But if you need any help setting one up, give me a shout.
If you are like me, always traveling, on the road and anxious about having to squeeze your storage disks into your backpack, then this will save you from all the trouble. It is now possible to mount Amazon S3 object-based storage on your local machine as an NFS drive through AWS Storage Gateway, and all you need to save your images is a high-speed internet connection. At the time of writing, AWS Storage Gateway supports only VMware ESXi and Microsoft Hyper-V virtual machine images. There is no support for Xen or KVM as yet. The AWS File Gateway gives you a hybrid cloud setup in which frequently accessed files are kept in a local cache, wherever your gateway server runs (locally or on Amazon EC2), and are uploaded to S3 through the gateway interface. You can treat this as an extension of your storage, or you can keep everything on AWS. Once the files are on S3, you can set up lifecycle policies to archive them to Amazon Glacier.
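As a rough sketch of what such a lifecycle policy looks like, the Python below builds a rule that moves objects to Glacier after a number of days. The "raw/" prefix, the 30-day threshold and the bucket name in the comment are assumptions chosen for illustration, not values from my own setup. The dictionary follows the shape that the S3 lifecycle configuration API expects.

```python
import json


def glacier_archive_rule(prefix: str, days: int) -> dict:
    """Build one lifecycle rule that moves objects under `prefix` to Glacier."""
    return {
        "ID": f"archive-{prefix.strip('/')}-after-{days}-days",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }


# Hypothetical prefix and age threshold; adjust to your own bucket layout.
lifecycle_configuration = {"Rules": [glacier_archive_rule("raw/", 30)]}

print(json.dumps(lifecycle_configuration, indent=2))

# With boto3 installed and credentials configured, this dictionary could be
# applied with something like:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-photo-archive",  # hypothetical bucket name
#       LifecycleConfiguration=lifecycle_configuration,
#   )
```

The same rule can be created by hand in the S3 console; building it in code is mainly useful if you manage several buckets or prefixes.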
You do not need dedicated hardware to set up a gateway. You can very well set it up on your Mac using VirtualBox or, if you are on Windows, vSphere Hypervisor. VirtualBox can be set up on Linux as well. Amazon recommends that the VM have four virtual CPUs, 16 GiB of RAM and 80 GiB of disk space. I was able to set mine up in VirtualBox with 1 vCPU and less than 16 GiB of RAM. Again, it depends on your lifestyle. Once your VM is running, all you have to do is set up S3 as an NFS/SMB mount point, and you can access your S3 objects directly.
The cool thing about Amazon Web Services is that it integrates with other services such as CloudFront, Amazon Elastic Transcoder, Amazon Rekognition and others, which makes it easier to deploy photo and video projects. Amazon’s storage infrastructure also spreads your data across multiple data centers to achieve durability and high availability, so your data is online and always available for instant access.
Google Cloud Platform (GCP) offers Regional, Multi-Regional, Nearline and Coldline cloud storage options along the same lines as Amazon S3/Glacier, but they are relatively cheaper. With Cloud Storage FUSE, you can mount object-based Cloud Storage as a file-and-directory-based file system on Linux or macOS. It is not like an NFS or CIFS file system, and hence it cannot be used to expand your on-premises storage disks. Google Cloud Storage is said to have higher latency and higher throughput than Amazon S3 and Azure, which makes it well suited as a cloud backup service. I’ve not come across any GCP service with the ability to mount buckets via NFS on a virtual machine. But I could be wrong, or things could change.
AFP is faster and more stable, but it is deprecated on the Mac. SMB is the easy approach, but Adobe products do not work well with file systems connected over SMB. In fact, none of these protocols suits Adobe products well, because Adobe has never supported saving files over a network file share. What worked best for me with all the Adobe products that I use is NFSv4. With an NFSv4 mount, you can even set up ACLs to restrict access to files and directories; they are similar to POSIX permissions but offer much more specific options. If you can set up your NAS to serve block storage over iSCSI, its performance is even better than all the others.
While things seem to work great with NFS, keep in mind that Adobe does not provide support for network attached storage or removable media.
Adobe Technical Support only supports using Photoshop and Adobe Bridge on a local hard disk. It’s difficult to re-create or accurately identify network- and peripheral-configuration problems.
Now that modern routers and NAS disk stations come with advanced logging capabilities, why Adobe doesn’t look into this area is beyond me.
Deciding on the RAID configuration for your NAS really comes down to the trade-off you want to make between capacity and fault tolerance.
RAID0 – No Redundancy. Used when you need improved throughput/performance.
RAID1 – Redundant (mirrored). Usable capacity is half of the total. No performance benefits.
RAID5 – Redundant. Minimum of 3 disks, with speed benefits. Performance is slower than RAID10.
RAID6 – Same as RAID5 with an extra parity block. Tolerates two disk failures, as opposed to RAID5, which tolerates one.
RAID10 – Used in cases where you want redundancy and speed of RAID0. Offers best performance but not very cost efficient.
RAID Z2 – ZFS version of RAID6.
RAID10 is best for most use cases. RAID6 gives you additional capacity in exchange for reduced performance. RAID5 works fine on smaller disks, but I wouldn’t recommend it for a large array. Once again, RAID is not a backup solution. Every configuration other than RAID0 protects against one or more disk failures, and nothing more. You should still look at backup options on-site or in the cloud for quick recovery.
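To make the capacity side of that trade-off concrete, here is a minimal sketch that computes usable capacity for the RAID levels listed above, assuming every disk in the array has the same size. The function name and the minimum disk counts are my own illustrative choices, and RAID-Z2 is simply treated like RAID6 for this purpose.

```python
def usable_capacity_tb(level: str, disks: int, disk_tb: float) -> float:
    """Usable capacity in TB for an array of equally sized disks."""
    level = level.upper().replace("-", "").replace(" ", "")
    if level == "RAIDZ2":
        level = "RAID6"  # assumption: same capacity rule as RAID6

    minimum_disks = {"RAID0": 2, "RAID1": 2, "RAID5": 3, "RAID6": 4, "RAID10": 4}
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum_disks[level]:
        raise ValueError(f"{level} needs at least {minimum_disks[level]} disks")

    if level == "RAID0":
        return disks * disk_tb        # striping only, nothing lost to redundancy
    if level == "RAID1":
        return disk_tb                # every disk holds the same copy
    if level == "RAID5":
        return (disks - 1) * disk_tb  # one disk's worth of parity
    if level == "RAID6":
        return (disks - 2) * disk_tb  # two disks' worth of parity
    if disks % 2:
        raise ValueError("RAID10 needs an even number of disks")
    return disks / 2 * disk_tb        # striped mirrors, half the raw capacity


# Example: a four-bay unit with 4 TB disks (an assumed configuration).
for raid_level in ("RAID0", "RAID5", "RAID6", "RAID10"):
    print(raid_level, usable_capacity_tb(raid_level, 4, 4.0), "TB usable")
```

With four 4 TB disks this gives 16 TB for RAID0, 12 TB for RAID5 and 8 TB for both RAID6 and RAID10, which is the capacity you give up in exchange for fault tolerance.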
Amazon S3 is object-based storage that provides 99.999999999% durability and 99.99% availability for your data. It costs as little as $0.023 per GB per month, and the price varies by the region you choose to host your files in. It comes in several storage classes, such as standard, infrequent access and reduced redundancy. S3 is charged not just for storage but also for requests and for data transfer in and out. Glacier, on the other hand, offers affordable pricing, as low as $0.004 per GB per month, for backup and archiving. Glacier also charges for data retrieval, at $0.01 per GB for standard retrieval. Glacier is therefore better suited to long-term, infrequently accessed data.
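A quick back-of-the-envelope comparison using the rates quoted above might look like the sketch below. It deliberately ignores request and data transfer charges, the 1000 GB archive size is an assumed example, and the constant and function names are illustrative only.

```python
# Rates quoted in the text, in USD. Actual prices vary by region and over time.
S3_STANDARD_PER_GB_MONTH = 0.023
GLACIER_PER_GB_MONTH = 0.004
GLACIER_RETRIEVAL_PER_GB = 0.01  # standard retrieval


def monthly_storage_cost(gigabytes: float, rate_per_gb_month: float) -> float:
    """Storage-only cost per month; requests and transfer are not included."""
    return gigabytes * rate_per_gb_month


def retrieval_cost(gigabytes: float) -> float:
    """One-off cost of a standard Glacier retrieval of `gigabytes`."""
    return gigabytes * GLACIER_RETRIEVAL_PER_GB


archive_gb = 1000.0  # assumed archive size for the example

s3_monthly = monthly_storage_cost(archive_gb, S3_STANDARD_PER_GB_MONTH)
glacier_monthly = monthly_storage_cost(archive_gb, GLACIER_PER_GB_MONTH)
full_restore = retrieval_cost(archive_gb)

print(f"S3 standard: ${s3_monthly:.2f} per month")
print(f"Glacier:     ${glacier_monthly:.2f} per month")
print(f"Restoring everything from Glacier once: ${full_restore:.2f}")
```

At these rates, 1000 GB costs about $23 a month on S3 standard against about $4 a month on Glacier, with a one-off charge of about $10 if you ever need to pull the whole archive back.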
There are various ways to set up a home photo and video data warehouse to suit your lifestyle, but what matters is how you scale it up as your storage needs increase year after year. Cloud storage is typically more reliable, scalable, and secure than traditional NAS storage systems. Cloud service providers such as Amazon Web Services, Google Cloud Platform, Microsoft Azure and others offer affordable rates for storing data in the cloud. The advantage here is that you pay as you go and scale as you need. It also reduces the overhead of maintaining the storage stack in-house.