{"id":2692,"date":"2019-05-05T15:30:33","date_gmt":"2019-05-05T13:30:33","guid":{"rendered":"http:\/\/miro.borodziuk.eu\/?p=2692"},"modified":"2020-02-18T17:21:21","modified_gmt":"2020-02-18T16:21:21","slug":"simple-storage-service-s3","status":"publish","type":"post","link":"http:\/\/miro.borodziuk.eu\/index.php\/2019\/05\/05\/simple-storage-service-s3\/","title":{"rendered":"Simple Storage Service (S3)"},"content":{"rendered":"<p>Simple Storage Service (S3) is a global object storage platform that can be used to store objects in the form of text files, photos, audio, movies, large binaries, or other object types.<\/p>\n<p><!--more--><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-3119 aligncenter\" src=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/S3-1.jpg\" alt=\"\" width=\"625\" height=\"659\" srcset=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/S3-1.jpg 625w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/S3-1-285x300.jpg 285w\" sizes=\"(max-width: 625px) 100vw, 625px\" \/><\/p>\n<p><span style=\"color: #3366ff;\">S3 Fundamentals<\/span><\/p>\n<ul>\n<li>Bucket names have to be globally <strong>unique <\/strong><\/li>\n<li>Minimum of <strong>3<\/strong> and maximum of <strong>63<\/strong> characters \u2014 no uppercase or underscores<\/li>\n<li>Must start with a lowercase letter or number and can&#8217;t be formatted as an IP address (1.1.1.1)<\/li>\n<li>Default <strong>100<\/strong> buckets per account, and hard 1000-bucket limit via support request<\/li>\n<li>Unlimited objects in buckets<\/li>\n<li>Unlimited total capacity for a bucket<\/li>\n<li>An object&#8217;s key is its name<\/li>\n<li>An object&#8217;s value is its data<\/li>\n<li>An object&#8217;s size is from 0 to <strong>5 TB<\/strong><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p>Every object stored in S3 has an associated storage class \u2014 also called a storage tier. 
Storage classes can be adjusted either manually or automatically using <strong>lifecycle policies<\/strong>. All storage classes have 99.999999999% (11 nines) durability.<\/p>\n<p>Storage classes determine the <strong>cost<\/strong> of storage, the availability, durability, and latency for object retrieval.<\/p>\n<p>The current classes for S3 object storage are:<\/p>\n<p><span style=\"color: #999999;\">Standard <\/span><\/p>\n<ul>\n<li>Designed for general, all-purpose storage<\/li>\n<li>The default storage option<\/li>\n<li>Designed for 99.99% (four nines) availability<\/li>\n<li><strong>3+ AZ<\/strong> replication<\/li>\n<li>Most expensive storage class, but has <strong>no minimum object size<\/strong> and <strong>no retrieval fee<\/strong><\/li>\n<\/ul>\n<p><span style=\"color: #999999;\">Intelligent-Tiering <\/span><\/p>\n<ul>\n<li>Moves objects across access tiers based on usage patterns<\/li>\n<li>Same performance as Standard<\/li>\n<\/ul>\n<p><span style=\"color: #999999;\">Standard Infrequent Access (Standard-IA) <\/span><\/p>\n<ul>\n<li>Designed for important objects, where <strong>access is infrequent<\/strong>, but <strong>rapid retrieval<\/strong> is a requirement<\/li>\n<li>Designed for 99.9% (three nines) availability<\/li>\n<li><strong>3+ AZ<\/strong> replication<\/li>\n<li><strong>Cheaper<\/strong> than the Standard storage class<\/li>\n<li><strong>30-day<\/strong> minimum storage charge per object, <strong>128 KB<\/strong> minimum storage charge, object retrieval fee<\/li>\n<\/ul>\n<p><span style=\"color: #999999;\">One Zone-IA <\/span><\/p>\n<ul>\n<li>Designed for non-critical, reproducible objects<\/li>\n<li>Designed for 99.5% availability<\/li>\n<li>Stored in <strong>1 AZ<\/strong> only (less resilient)<\/li>\n<li><strong>Cheaper<\/strong> than the Standard or Standard-IA storage classes<\/li>\n<li><strong>30-day<\/strong> minimum storage charge per object, <strong>128 KB<\/strong> minimum storage charge, object retrieval fee<\/li>\n<\/ul>\n<p><span 
style=\"color: #999999;\">Glacier<\/span><\/p>\n<ul>\n<li>Designed for long-term archival storage (not to be used for hot backups)<\/li>\n<li>May take several minutes or hours for objects to be retrieved (several options available)<\/li>\n<li>Designed for 99.99% (four nines) availability<\/li>\n<li><strong>3+ AZ<\/strong> replication<\/li>\n<li><strong>90-day<\/strong> minimum charge per object<\/li>\n<li><strong>40 KB<\/strong> minimum storage charge, object retrieval fee<\/li>\n<\/ul>\n<p><span style=\"color: #999999;\">Glacier Deep Archive<\/span><\/p>\n<ul>\n<li>Designed for long-term archival storage<\/li>\n<li>Ideal alternative to tape backups<\/li>\n<li><strong>Cheaper<\/strong> than normal Glacier, but retrievals take longer<\/li>\n<li>Designed for 99.99% (four nines) availability<\/li>\n<li><strong>3+ AZ<\/strong> replication<\/li>\n<li>May take several hours for objects to be retrieved<\/li>\n<li><strong>180-day<\/strong> minimum charge per object<\/li>\n<li><strong>40 KB<\/strong> minimum storage charge, object retrieval fee<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><strong>Glacier Terminology<\/strong><\/p>\n<p><em>Archive<\/em><\/p>\n<ul>\n<li>A durably stored block of information<\/li>\n<li>TAR and ZIP are common formats used to aggregate files<\/li>\n<li>Total volume of data and number of archives are unlimited<\/li>\n<li>Each archive can be up to <strong>40 TB<\/strong><\/li>\n<li>Largest single upload is <strong>4 GB<\/strong> (use multipart upload for archives larger than 100 MB)<\/li>\n<li>Archives can be uploaded and deleted, but not edited or overwritten<\/li>\n<\/ul>\n<p><em>Vault<\/em><\/p>\n<ul>\n<li>A way to group archives together<\/li>\n<li>Control access with vault-level access policies and IAM<\/li>\n<li>SNS notifications are available for when retrieval requests are ready for download<\/li>\n<\/ul>\n<p><em>Vault Lock <\/em><\/p>\n<ul>\n<li>Lockable policy to enforce compliance controls on vaults<\/li>\n<li>Vault lock policies are immutable<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><span 
style=\"color: #3366ff;\">Lifecycle rules<br \/>\n<\/span><\/p>\n<p>Storage classes can be controlled via lifecycle rules, which allow for the automated transition of objects between storage classes, or in certain cases allow for the expiration of objects that are no longer required. Rules are added at a bucket level and can be enabled or disabled based on business requirements.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-2766 aligncenter\" src=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/InteligentTiering.jpg\" alt=\"\" width=\"613\" height=\"288\" srcset=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/InteligentTiering.jpg 613w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/InteligentTiering-300x141.jpg 300w\" sizes=\"(max-width: 613px) 100vw, 613px\" \/><br \/>\nObjects smaller than 128 KB cannot be transitioned into Intelligent-Tiering. Objects must remain in their original storage class for a minimum of <strong>30<\/strong> days before they can be transitioned to either of the IA storage tiers. Instead of transitioning between tiers, objects can be configured to <strong>expire<\/strong> after certain time periods. At the point of expiry, they are <strong>deleted<\/strong> from the bucket.<br \/>\nObjects can be archived into Glacier using lifecycle configurations. The objects remain inside S3, managed from S3, but Glacier is used for storage. Objects can be restored into S3 for temporary periods of time, after which the restored copies are deleted. 
If objects are encrypted, they<strong> remain encrypted<\/strong> during their transition to Glacier or temporary restoration into S3.<\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Bucket authorization<\/span> within S3 is controlled using <strong>identity policies<\/strong> on AWS identities, as well as bucket policies in the form of resource policies on the bucket and bucket or object ACLs.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2768\" src=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/BucketPolicy.jpg\" alt=\"\" width=\"628\" height=\"545\" srcset=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/BucketPolicy.jpg 628w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/BucketPolicy-300x260.jpg 300w\" sizes=\"(max-width: 628px) 100vw, 628px\" \/><br \/>\nFinal authorization is a combination of all applicable policies. Priority order is (1) Explicit Deny, (2) Explicit Allow, (3) Implicit Deny.<\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Uploads<\/span> to S3 are generally done using the S3 console, the CLI, or directly using the APIs. 
Uploads either use a single operation (known as a single PUT upload) or multipart upload.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-2769 aligncenter\" src=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/SinglePutUpload.jpg\" alt=\"\" width=\"618\" height=\"198\" srcset=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/SinglePutUpload.jpg 618w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/SinglePutUpload-300x96.jpg 300w\" sizes=\"(max-width: 618px) 100vw, 618px\" \/><br \/>\nA single PUT upload is limited to <strong>5 GB<\/strong>, can cause performance issues, and if any part of the transfer fails, the whole upload fails.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2770\" src=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/MultipartUpload.jpg\" alt=\"\" width=\"613\" height=\"177\" srcset=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/MultipartUpload.jpg 613w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/MultipartUpload-300x87.jpg 300w\" sizes=\"(max-width: 613px) 100vw, 613px\" \/><br \/>\nWith multipart upload, an object is broken up into parts (up to 10,000). Each part is 5 MB to 5 GB, except the last part, which can be smaller (the remaining data).<br \/>\nMultipart upload is faster (parts upload in parallel), and failed parts can be retried individually. AWS recommends multipart for anything over <strong>100 MB<\/strong>, and it is required for anything beyond 5 GB.<\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Versioning<\/span><\/p>\n<p>Versioning can be enabled on an S3 bucket. Once enabled, any operations that would otherwise <strong>modify<\/strong> objects generate <strong>new versions<\/strong> of that original object. Once a bucket is version-enabled, it can <strong>never<\/strong> be fully<strong> switched off<\/strong> \u2014 only <strong>suspended<\/strong>.<br \/>\nWith versioning enabled, an AWS account is billed for <strong>all versions<\/strong> of <strong>all objects<\/strong>. 
Object deletions by default <strong>don&#8217;t<\/strong> delete an object \u2014 instead, a delete <strong>marker<\/strong> is added to indicate the object is deleted (this can be undone). Older versions of an object can be <strong>accessed<\/strong> using the object name and a <strong>version ID<\/strong>. <strong>Specific<\/strong> versions can be <strong>deleted<\/strong>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-2772 aligncenter\" src=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/Versioning.jpg\" alt=\"\" width=\"611\" height=\"301\" srcset=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/Versioning.jpg 611w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/Versioning-300x148.jpg 300w\" sizes=\"(max-width: 611px) 100vw, 611px\" \/><\/p>\n<p><strong>MFA Delete<\/strong> is a feature designed to prevent accidental deletion of objects. Once enabled, a one-time password is required to delete an object version or when changing the versioning state of a bucket.<\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">S3 cross-region replication<\/span> (S3 CRR) is a feature that can be enabled on S3 buckets allowing one-way replication of data from a source bucket to a destination bucket in another region.<\/p>\n<p>By default, replicated objects keep their:<\/p>\n<ul>\n<li>Storage class<\/li>\n<li>Object name (key)<\/li>\n<li>Owner<\/li>\n<li>Object permissions<\/li>\n<\/ul>\n<p>Replication configuration is applied to the source bucket, and to do so requires <strong>versioning<\/strong> to be<strong> enabled<\/strong> on<strong> both buckets<\/strong>. Replication requires an<strong> IAM role<\/strong> with <strong>permissions<\/strong> to replicate objects. 
With the replication configuration, it is possible to override the storage class and object permissions as they are written to the destination.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-2767 aligncenter\" src=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/CRR.jpg\" alt=\"\" width=\"601\" height=\"208\" srcset=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/CRR.jpg 601w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/CRR-300x104.jpg 300w\" sizes=\"(max-width: 601px) 100vw, 601px\" \/><\/p>\n<p>Excluded from Replication<\/p>\n<ul>\n<li>System actions (lifecycle events)<\/li>\n<li>Any existing objects from <strong>before<\/strong> replication is enabled<\/li>\n<li>SSE-C encrypted objects \u2014 only SSE-S3 and (if enabled) KMS encrypted objects are supported<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Hosting websites<\/span><\/p>\n<p>Amazon S3 buckets can be configured to<span style=\"color: #000000;\"> host websites<\/span>. Content can be uploaded to the bucket and when enabled, <strong>static web hosting will<\/strong> provide a unique endpoint URL that can be accessed by any web browser. S3 buckets can host many types of content, including:<\/p>\n<ul>\n<li>HTML, CSS, JavaScript<\/li>\n<li>Media (audio, movies, images)<\/li>\n<\/ul>\n<p>S3 can be used to host <strong>front-end<\/strong> code for <strong>serverless<\/strong> applications or an offload location for static content. 
CloudFront can also be added to improve the speed and efficiency of content delivery for global users or to add SSL for custom domains.<\/p>\n<p>Route 53 and alias records can also be used to add human-friendly names to buckets.<\/p>\n<p><span style=\"color: #3366ff;\">Cross-Origin Resource Sharing (CORS)<\/span><br \/>\nCORS is a security measure allowing a web application running in one domain to reference resources in another.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2771\" src=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/CORS.jpg\" alt=\"\" width=\"483\" height=\"236\" srcset=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/CORS.jpg 483w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/CORS-300x147.jpg 300w\" sizes=\"(max-width: 483px) 100vw, 483px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">AWS Storage Gateway<\/span><\/p>\n<p>Connects local data center software appliances to cloud-based storage, such as Amazon S3. 
We can use it for hybrid cloud backup, archiving and disaster recovery, tiered storage, application file storage, and data processing workflows.<\/p>\n<p><span style=\"color: #999999;\">File Gateway<\/span><\/p>\n<ul>\n<li>Comprises the S3 service and a virtual appliance<\/li>\n<li>Allows for storage and retrieval of files in S3 using standard file protocols (NFS and SMB)<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-2899 aligncenter\" src=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/FileGateway.jpg\" alt=\"\" width=\"630\" height=\"636\" srcset=\"http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/FileGateway.jpg 630w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/FileGateway-150x150.jpg 150w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/FileGateway-297x300.jpg 297w, http:\/\/miro.borodziuk.eu\/wp-content\/uploads\/FileGateway-100x100.jpg 100w\" sizes=\"(max-width: 630px) 100vw, 630px\" \/><\/p>\n<p><span style=\"color: #999999;\">Volume Gateway<\/span><\/p>\n<p><em>&#8211; Gateway-Cached Volumes<\/em><\/p>\n<ul>\n<li>Create storage volumes and mount them as iSCSI devices on the on-premises servers<\/li>\n<li>The gateway will store the data written to this volume in Amazon S3 and will cache frequently accessed data on-premises in the storage device<\/li>\n<\/ul>\n<p><em>&#8211; Gateway-Stored Volumes<\/em><\/p>\n<ul>\n<li>Store all the data locally (on-premises) in storage volumes<\/li>\n<li>Gateway will periodically take snapshots of the data as incremental backups and store them on Amazon S3<\/li>\n<\/ul>\n<p><span style=\"color: #999999;\">Tape Gateway<\/span><\/p>\n<ul>\n<li>A cloud virtual tape library that writes to Glacier<\/li>\n<li>Used for archiving data<\/li>\n<li>Can run as a VM on-premises or on an EC2 instance<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Encryption<\/span><\/p>\n<p>Data between a client and S3 is encrypted<strong> in transit<\/strong>. 
Encryption <strong>at rest<\/strong> can be configured on a <strong>per-object<\/strong> basis.<\/p>\n<ul>\n<li><span style=\"color: #999999;\">Client-Side Encryption<\/span>: The client\/application is responsible for managing both the encryption\/decryption process and its keys. This method is generally only used when strict security compliance is required \u2014 it has significant admin and processing overhead.<\/li>\n<li><span style=\"color: #999999;\">Server-Side Encryption<\/span> with <span style=\"color: #999999;\">Customer-Provided Keys<\/span> (SSE-C): S3 handles the encryption and decryption process. The customer is still responsible for key management, and keys must be supplied with each PUT or GET request.<\/li>\n<li><span style=\"color: #999999;\">Server-Side Encryption<\/span> with <span style=\"color: #999999;\">S3-Managed Keys<\/span> (SSE-S3): Objects are encrypted using AES-256 by S3. The keys are generated by S3 (using KMS on your behalf). Keys are stored with objects in an encrypted form. If you have permissions on the object (e.g., S3 Read or S3 Admin), you can decrypt and access it.<\/li>\n<li><span style=\"color: #999999;\">Server-Side Encryption<\/span> with <span style=\"color: #999999;\">AWS KMS-Managed Keys<\/span> (SSE-KMS): Objects are encrypted using individual keys generated by KMS. Encrypted keys are stored with the encrypted objects. Decryption of an object needs both S3 and KMS key permissions (role separation).<\/li>\n<\/ul>\n<p><span style=\"color: #999999;\">Bucket Default Encryption<\/span><br \/>\nObjects are encrypted in S3, not buckets. Each PUT operation needs to specify encryption (and type) or not. A bucket default captures any PUT operations where no encryption method\/directive is specified. It doesn&#8217;t enforce what type can and can&#8217;t be used. 
Bucket policies can enforce a specific encryption type.<\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">A presigned URL<\/span> can be created by an identity in AWS, providing access to an object using the creator&#8217;s access permissions. When the presigned URL is used, AWS verifies the <strong>creator&#8217;s<\/strong> access to the object \u2014 <strong>not yours<\/strong>. The URL is encoded with authentication built in and has an <strong>expiry time<\/strong>.<\/p>\n<p>Presigned URLs can be used to <strong>download<\/strong> or <strong>upload<\/strong> objects.<\/p>\n<p>Any identity can create a presigned URL \u2014 even if that identity <strong>doesn&#8217;t<\/strong> have access to the object.<\/p>\n<p>Example presigned URL scenarios:<\/p>\n<ul>\n<li>Stock images website \u2014 media stored privately on S3, presigned URL generated when an image is purchased<\/li>\n<li>Client access to upload an image to an S3 bucket for processing<\/li>\n<\/ul>\n<p>When using presigned URLs, you may get an error. Some common situations include:<\/p>\n<ul>\n<li>The presigned URL has <strong>expired<\/strong> \u2014 <strong>seven-day<\/strong> maximum<\/li>\n<li>The permissions of the creator of the URL have changed<\/li>\n<li>The URL was created using a role (<strong>36-hour<\/strong> max) and the role&#8217;s temporary credentials have expired (aim to <strong>never<\/strong> create presigned URLs using <strong>roles<\/strong>)<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Simple Storage Service (S3) is a global object storage platform that can be used to store objects in the form of text files, photos, audio, movies, large binaries, or other object 
types.<\/p>\n","protected":false},"author":1,"featured_media":2695,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[77],"tags":[],"_links":{"self":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts\/2692"}],"collection":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/comments?post=2692"}],"version-history":[{"count":15,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts\/2692\/revisions"}],"predecessor-version":[{"id":3335,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts\/2692\/revisions\/3335"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/media\/2695"}],"wp:attachment":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/media?parent=2692"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/categories?post=2692"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/tags?post=2692"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}