Object storage – Part 1

What is object storage?

Object storage is a storage model designed to handle large amounts of unstructured data. This is data with no predefined structure, and it largely consists of a mix of emails, images, audio files, text files, IoT data, videos, and so on.

This kind of data is continually generated in massive amounts on social platforms and from IoT devices. We need a storage system that can handle this data influx efficiently and economically.

Object storage is also often the preferred storage model for data archiving and backups because it scales dynamically more readily than block or file storage. It can handle petabyte- and exabyte-scale data on an ongoing basis.

Let’s discuss how object storage works.

How does object storage work?

As with blocks in block storage, every data object in an object storage system has a unique identifier through which it is accessed. Each object also has metadata attached to it.

Attaching metadata to objects helps with implementing data policies, protecting data, validating the authenticity of the content, running business analytics, and so on.

This metadata can also be customized based on business requirements. For instance, we can extend the metadata of an image with the device that captured it, the people or objects in it, the date and location where it was taken, its category, the filters applied to it, etc.

Once this meta-information is added to the images, they can be easily located and retrieved based on it, for example by fetching the images that belong to a certain category or those captured by a certain camera.
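The idea of objects that carry a unique identifier plus custom metadata, and that can be looked up by that metadata, can be sketched in a few lines of Python. This is a minimal in-memory illustration only, not how any real object store is implemented; the class names, method names, and metadata keys (`category`, `device`) are assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    object_id: str                       # unique identifier of the object
    data: bytes                          # the payload itself
    metadata: dict = field(default_factory=dict)  # custom key/value metadata


class InMemoryObjectStore:
    """Toy object store (hypothetical): keeps everything in a dict."""

    def __init__(self):
        self._objects = {}

    def put(self, data, metadata=None):
        # Generate a unique identifier and store data and metadata together.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(object_id, data, dict(metadata or {}))
        return object_id

    def get(self, object_id):
        # Objects are addressed directly by their identifier.
        return self._objects[object_id]

    def find(self, **criteria):
        # Return every object whose metadata matches all given key/value pairs.
        return [
            obj for obj in self._objects.values()
            if all(obj.metadata.get(key) == value for key, value in criteria.items())
        ]


store = InMemoryObjectStore()
beach_id = store.put(b"beach image bytes", {"category": "travel", "device": "PhoneCam X"})
city_id = store.put(b"city image bytes", {"category": "travel", "device": "DSLR 5"})
cat_id = store.put(b"cat image bytes", {"category": "pets", "device": "PhoneCam X"})

travel_images = store.find(category="travel")
phone_travel_images = store.find(category="travel", device="PhoneCam X")
```

Here `travel_images` holds the two images tagged with the travel category, while `phone_travel_images` narrows that down to the one captured by the hypothetical "PhoneCam X" device.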

Metadata this rich is not readily supported by other storage types, such as block and file storage. The data stored with block and file storage carries only very basic meta-information.

Storing data as objects helps performance considerably when dealing with petabyte- and exabyte-scale data. The stored objects are further aggregated into object pools, which are spread across clusters and regions for scalability, high availability, and disaster recovery. This is why object storage is widely used by businesses running their workloads on the cloud.

Illustration 1.70 - Distributed Object Storage In The Cloud
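The aggregation of objects into pools that are spread across regions can be sketched as follows. This is a simplified illustration under stated assumptions: the region names, the number of pools, the replica count, and the hash-based placement rule are all hypothetical, and production systems use more elaborate placement schemes.

```python
import hashlib

REGIONS = ["us-east", "eu-west", "ap-south"]  # hypothetical region names
NUM_POOLS = 8                                  # assumed number of object pools
REPLICAS = 2                                   # assumed copies kept per pool


def pool_for(object_id):
    # Hash the identifier so objects spread evenly and deterministically over pools.
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_POOLS


def regions_for(pool, replicas=REPLICAS):
    # Assign each pool to `replicas` distinct regions, rotating through the list.
    start = pool % len(REGIONS)
    return [REGIONS[(start + i) % len(REGIONS)] for i in range(replicas)]


def place(object_id):
    # An object lives in exactly one pool; the pool is replicated across regions.
    pool = pool_for(object_id)
    return pool, regions_for(pool)


pool, regions = place("photos/beach.jpg")
```

Because the placement depends only on the object's identifier, any node can recompute where an object lives without consulting a central lookup table, and keeping each pool in more than one region is what allows reads to continue if a single region becomes unavailable.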

Accessing object store data

Object store data is accessed over the web via REST APIs. Behind those APIs, the data typically resides on clusters of commodity servers operated by the cloud provider. Developers use the APIs the provider exposes to read and write data in the object store managed by the cloud.
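To make the REST access pattern concrete, here is a minimal sketch that builds (but does not send) the HTTP requests for writing and reading an object, using only the Python standard library. The endpoint `https://objects.example.com`, the `/bucket/key` URL layout, and the `X-Meta-` header prefix for custom metadata are assumptions made for illustration; each provider defines its own URL scheme, headers, and authentication.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://objects.example.com"  # hypothetical endpoint


def object_url(bucket, key):
    # Assumed layout: <endpoint>/<bucket>/<key>; slashes in the key are kept.
    return f"{BASE_URL}/{quote(bucket)}/{quote(key)}"


def build_put(bucket, key, data, metadata=None):
    # A write is an HTTP PUT whose body is the object's bytes.
    headers = {"Content-Type": "application/octet-stream"}
    for name, value in (metadata or {}).items():
        # Custom metadata travels as request headers (the prefix is an assumption).
        headers[f"X-Meta-{name}"] = value
    return Request(object_url(bucket, key), data=data, headers=headers, method="PUT")


def build_get(bucket, key):
    # A read is an HTTP GET on the same URL.
    return Request(object_url(bucket, key), method="GET")


put_request = build_put(
    "my-bucket", "photos/beach.jpg", b"beach image bytes", {"category": "travel"}
)
get_request = build_get("my-bucket", "photos/beach.jpg")
```

Passing either request to `urllib.request.urlopen` would send it over the network. Against a real provider, the request would also need that provider's authentication headers, which are omitted here.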

The provider is responsible for making the data redundant, setting up disaster recovery, etc. Fundamentally, cloud storage provides all the features that a cloud typically provides for a workload, such as high availability, scalability, elasticity, durability, security, a distributed environment facilitating storage for massive amounts of data, a pay-for-what-you-use pricing model, and so on.

We will continue this discussion in the next lesson.
