Sunday, January 13, 2019

OCI―Knowing Object Storage Basics


Video 1.  OCI Object Storage (YouTube link)


Object Storage in Oracle Cloud Infrastructure (OCI) handles data as objects, which makes it a natural fit for unstructured data. The main differences between object storage and traditional storage (also known as block storage) are as follows:
  • Stored data contains customized metadata
  • Data is indexed, allowing for much faster search results
  • Data can be located by using pointers instead of finding its location based on tracks and sectors on the hard disk (that is, the standard file system that we have used for many years).

Figure 1. Storage tier: Standard vs Archive
Figure 2.  Object Storage (Standard vs Archive)

Standard vs. Archive


Object Storage is a "public" OCI service and a regional service (i.e., not tied to any specific compute instance).

As shown in Figure 1, you can create a new bucket under a compartment (e.g., "Training") and in a region (e.g., "us-ashburn-1").  OCI offers two distinct storage tiers for storing unstructured data:

  • Standard (i.e., Object Storage)
    • Choose it for data to which you need fast, immediate, and frequent access
  • Archive (i.e., Archive Storage)
    • Choose it for data that you seldom access but that must be retained and preserved for long periods of time
    • More cost-effective than Object Storage


Figure 3.  Object Storage provides both public access and private access

Public vs Private Access


Object Storage supports:
  • Public Access from Internet
    • You can access data from anywhere, inside or outside OCI, as long as you have Internet connectivity and can reach one of the Object Storage endpoints.
  • Private Access from VCN using Service Gateway
    • Any traffic from the VCN that is destined for one of the supported OCI public services uses the instance's private IP address for routing, travels over the OCI network fabric, and never traverses the Internet.
    • A service gateway can be used, for example, to back up private DB systems in a VCN to Object Storage. It lets resources in the VCN reach the public Object Storage service without using an Internet gateway or NAT gateway.

Managing Object Storage


Object Storage offers multiple management interfaces that let you easily manage storage at scale. The size of the object determines the appropriate management interface to use to upload objects to OCI Object Storage:
  • You can use the Console to upload objects up to 2 GiB in size.
  • You can use the CLI or API to upload objects up to 10 TiB in size.
  • For objects larger than 100 MiB, using the multipart upload API is recommended.
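
The size limits above can be turned into a small lookup. The following is a minimal sketch in plain Python, not part of any OCI SDK; the function name `upload_options` and the returned keys are hypothetical and chosen only for illustration. It assumes the limits are inclusive ("up to").

```python
# Size units (binary prefixes, as used in the limits above).
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

CONSOLE_MAX = 2 * GIB                    # Console: objects up to 2 GiB
CLI_API_MAX = 10 * TIB                   # CLI or API: objects up to 10 TiB
MULTIPART_RECOMMENDED_ABOVE = 100 * MIB  # multipart recommended above 100 MiB


def upload_options(size_bytes):
    """Return which upload interfaces fit an object of the given size.

    Hypothetical helper: it only encodes the three limits listed in the text.
    """
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes > CLI_API_MAX:
        raise ValueError("object exceeds the 10 TiB limit")
    return {
        "console": size_bytes <= CONSOLE_MAX,
        "cli_or_api": True,
        "multipart_recommended": size_bytes > MULTIPART_RECOMMENDED_ABOVE,
    }


if __name__ == "__main__":
    for size in (50 * MIB, 500 * MIB, 3 * GIB):
        print(size, upload_options(size))
```

For example, a 3 GiB file is too large for the Console in this sketch, so the CLI or API (preferably with multipart upload) would be the remaining choice.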
Figure 4.  Object Storage Elements

Key Concepts


Before we cover object identifiers, you need to know the following key concepts:
  • Compartment
    • Is a collection of related resources that can be accessed only by certain groups
  • Bucket
    • Is a logical container for storing objects
    • A bucket is associated with a single compartment that has policies that determine what actions a user can perform on a bucket and on all objects in the bucket
  • Object
    • Is any type of data, regardless of content type
    • Consists of the object itself and metadata about the object
  • Namespace
    • Each OCI tenant is assigned one unique and uneditable Object Storage namespace that is global (spanning all regions and compartments)
      • Is a system-generated string assigned during account creation. 
      • Note that for some older tenancies, the namespace string may be the tenancy name in all lower-case letters
    • Serves as a top-level container for all buckets and objects and allows you to control bucket naming within your tenancy
      • While bucket names must be unique within your tenancy, your tenancy's bucket names can duplicate the bucket names chosen by other tenants. 





Figure 5.  Object Details

Object Identifier―URI


Data in Object Storage are managed as objects using a RESTful API built on standard HTTP verbs (e.g., GET, PUT, DELETE).

Unlike other resources, objects do not have Oracle Cloud Identifiers (OCIDs). Instead, users define an object name when they upload an object.  Then Object Storage prepends the Object Storage namespace string and bucket name to the object name:
/n/<object_storage_namespace>/b/<bucket>/o/
The object name is everything after the /o/.
For example: /n/ansh8lvru1zp/b/accessories/o/backpack_75.jpg
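
To make the layout of this identifier concrete, here is a minimal sketch that builds and splits such a path with plain string handling. The function names `object_path` and `parse_object_path` are hypothetical, and the sketch ignores URL encoding of special characters, which a real client would need to handle.

```python
def object_path(namespace, bucket, object_name):
    """Build the /n/<namespace>/b/<bucket>/o/<object name> path."""
    return f"/n/{namespace}/b/{bucket}/o/{object_name}"


def parse_object_path(path):
    """Split a path back into (namespace, bucket, object_name).

    Splitting at most 6 times keeps everything after /o/ together,
    so an object name that itself contains '/' stays in one piece.
    """
    parts = path.split("/", 6)
    if (
        len(parts) != 7
        or parts[0] != ""
        or parts[1] != "n"
        or parts[3] != "b"
        or parts[5] != "o"
    ):
        raise ValueError(f"not an object path: {path!r}")
    return parts[2], parts[4], parts[6]


if __name__ == "__main__":
    path = object_path("ansh8lvru1zp", "accessories", "backpack_75.jpg")
    print(path)
    print(parse_object_path(path))
```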
Within an Object Storage namespace, buckets and objects exist in a flat hierarchy, but you can simulate a directory structure using a prefix string that includes the forward slash (/) to add hierarchy to an object name. Doing so lets you list one directory at a time, which is helpful when navigating a large set of objects.

For example:
/n/ansh8tvru7zp/b/event_photos/o/marathon/finish_line.jpg
/n/ansh8tvru7zp/b/event_photos/o/marathon/participants/p_21.jpg
If you named your objects so that they exist in Object Storage as a hierarchy, you can use the CLI to perform bulk downloads and bulk deletes of all objects at a specified level of the hierarchy, without affecting objects in levels above or below. In the example above, you can use the CLI to download or delete all objects at the marathon/ level without downloading or deleting objects at the marathon/participants/ sublevel.

When naming objects, you can also use prefix strings without a delimiter so that certain bulk operations can be performed in the CLI by matching on the prefix portion of the object name. For example, in the object names below, the string gloves_27_ can serve as a prefix for matching purposes when performing bulk downloads or deletions:
/n/ansh8tvru7zp/b/apparel/o/gloves_27_dark_green.jpg
/n/ansh8tvru7zp/b/apparel/o/gloves_27_light_blue.jpg   
When you perform bulk uploads with the CLI, you can prepend a prefix string to the names of the files you are uploading.
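
The two selection rules described above (one level of a simulated hierarchy, and plain prefix matching) can be illustrated locally. This is a sketch of the matching logic only, run against in-memory lists of object names; it does not call the CLI or the service, and the function names are hypothetical.

```python
def objects_at_level(names, prefix, delimiter="/"):
    """Names directly under `prefix`: they start with the prefix and
    contain no further delimiter after it (sublevels are excluded)."""
    return [
        name
        for name in names
        if name.startswith(prefix) and delimiter not in name[len(prefix):]
    ]


def objects_with_prefix(names, prefix):
    """Names that start with `prefix`, with no delimiter involved."""
    return [name for name in names if name.startswith(prefix)]


if __name__ == "__main__":
    event_photos = [
        "marathon/finish_line.jpg",
        "marathon/participants/p_21.jpg",
    ]
    apparel = [
        "gloves_27_dark_green.jpg",
        "gloves_27_light_blue.jpg",
    ]
    # Only the object at the marathon/ level is selected.
    print(objects_at_level(event_photos, "marathon/"))
    # Both gloves_27_ objects match the prefix.
    print(objects_with_prefix(apparel, "gloves_27_"))
```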

Object Storage Features


Following are some salient features of OCI Object Storage:
  • Strong Consistency
    • When a read request is made, the Object Storage Service always serves the most recent copy of the data that was written to the system
  • Durability
    • Data is stored redundantly across multiple storage servers across multiple Availability Domains
    • Data integrity is actively monitored using checksums, and corrupt data is detected and automatically repaired.
    • Any loss of data redundancy is actively managed by recreating a copy of the data from the redundant copy. 
  • Performance
    • The Compute Service and the Object Storage Service are co-located on the same network. 
  • Custom Metadata
    • You can define your own extensive metadata as key-value pairs for any purpose. 
      • For example, you can create descriptive tags for objects, retrieve those tags, and sort through the data.
  • Hadoop Support
    • You can use the Object Storage Service as the primary data repository for big data. 
    • The HDFS connector provides connectivity to various big data analytic engines. 
    • This connectivity enables the analytics engines to work directly with data stored in the Object Storage Service
  • Encryption
    • The Object Storage Service employs 256-bit Advanced Encryption Standard (AES-256) to encrypt object data on the server. Each object is encrypted with its own key. Object keys are encrypted with a master encryption key that is frequently rotated. Encryption is enabled by default and cannot be turned off.
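
As an illustration of the Custom Metadata feature listed above, the sketch below models metadata as key-value pairs and then retrieves and sorts objects by a tag. It works on an in-memory dictionary only. The `opc-meta-` header prefix is an assumption about how user-defined metadata is passed over the REST API, so check it against the API reference before relying on it. The function names and the sample catalog are hypothetical.

```python
# Assumption: user-defined metadata travels as HTTP headers with this prefix.
OPC_META_PREFIX = "opc-meta-"


def metadata_to_headers(metadata):
    """Turn key-value metadata into prefixed header names and string values."""
    return {OPC_META_PREFIX + key: str(value) for key, value in metadata.items()}


def find_by_tag(objects, key, value):
    """Return the sorted names of objects whose metadata has key == value.

    `objects` maps an object name to its metadata dictionary.
    """
    return sorted(
        name for name, meta in objects.items() if meta.get(key) == value
    )


if __name__ == "__main__":
    catalog = {
        "gloves_27_light_blue.jpg": {"category": "gloves", "color": "blue"},
        "gloves_27_dark_green.jpg": {"category": "gloves", "color": "green"},
        "backpack_75.jpg": {"category": "bags", "color": "green"},
    }
    print(metadata_to_headers(catalog["backpack_75.jpg"]))
    print(find_by_tag(catalog, "category", "gloves"))
```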
OCI Object Storage also supports the following advanced operations:
