AWS Storage Gateway

AWS Storage Gateway is a great way for organizations to get started with AWS. When I think of “hybrid cloud”, I think of Storage Gateway. It is essentially an S3 bucket attached to your network in the form of a Windows (SMB) or Linux (NFS) file share!

So, what is the big deal here? Can’t users already get to S3 via an SDK or the AWS console? Yes, but with Storage Gateway, the S3 bucket appears to the user as a regular file share. This is much more intuitive and fits existing applications and workflows, because the cloud is abstracted away entirely and all the user sees is a traditional file share.

The share is synced with the S3 bucket: whatever is written to the bucket becomes visible in the share (once the gateway refreshes its cache), and whatever changes are made in the share are uploaded to S3. Pretty powerful.

To get it working in Terraform, AWS provides a Terraform module that helps quite a bit, although it covers the full gamut of options and file share types, much of which may not apply to your specific situation:

GitHub – aws-ia/terraform-aws-storagegateway

AWS provides dedicated AMIs that contain the Storage Gateway software, which is what syncs the data in S3 with the Storage Gateway file server. In the case of a Windows file share, once you open the appropriate SMB ports, the gateway can communicate with clients in your domain. For any of this to work, there needs to be network connectivity between your on-prem systems and AWS.
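As a rough illustration of "opening the SMB ports", here is a minimal sketch of a security group for the gateway instance. It assumes SMB clients connect over TCP 445 and that activation happens over HTTP on port 80. The variable names (vpc_id, client_cidr_blocks) and the resource name are hypothetical and not taken from my actual setup, so adjust them to your environment.

```hcl
# Hypothetical inputs: neither variable comes from the original setup.
variable "vpc_id" {
  type        = string
  description = "VPC that hosts the gateway instance"
}

variable "client_cidr_blocks" {
  type        = list(string)
  description = "On-prem or VPC ranges that are allowed to reach the gateway"
}

resource "aws_security_group" "sgw" {
  name        = "storage-gateway-smb"
  description = "SMB and activation access to the file gateway"
  vpc_id      = var.vpc_id

  # SMB clients mount the share over TCP 445.
  ingress {
    description = "SMB"
    from_port   = 445
    to_port     = 445
    protocol    = "tcp"
    cidr_blocks = var.client_cidr_blocks
  }

  # Port 80 is used to request the activation key from the appliance.
  ingress {
    description = "Gateway activation"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = var.client_cidr_blocks
  }

  # Outbound access so the gateway can reach AWS endpoints and the domain.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Restricting the ingress rules to the client ranges, rather than 0.0.0.0/0, keeps the share reachable only from the networks that actually need it.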

One of the pain points of setting this up, in my case, was the type of machine needed to run the gateway. To save costs, I started with a t2.micro instance. The result was a Storage Gateway appliance that would just hang when I attempted to activate it via the HTTP activation link.
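For reference, here is a sketch of what a more generously sized gateway instance could look like. Everything specific in it is an assumption rather than what I ended up deploying: the m5.xlarge instance type and the 150 GiB cache disk reflect my reading of the AWS sizing guidance for file gateways, and the public SSM parameter path for the AMI should be verified against the current Storage Gateway documentation. The subnet_id variable and the resource names are hypothetical.

```hcl
# Assumed public SSM parameter that holds the latest S3 File Gateway AMI ID.
data "aws_ssm_parameter" "sgw_ami" {
  name = "/aws/service/storagegateway/ami/FILE_S3/latest"
}

# Hypothetical input: the subnet the gateway instance lives in.
variable "subnet_id" {
  type = string
}

resource "aws_instance" "sgw" {
  ami           = data.aws_ssm_parameter.sgw_ami.value
  instance_type = "m5.xlarge" # assumption: far larger than a t2.micro
  subnet_id     = var.subnet_id

  tags = {
    Name = "storage-gateway"
  }
}

# Separate EBS volume for the gateway to use as its local cache.
resource "aws_ebs_volume" "cache" {
  availability_zone = aws_instance.sgw.availability_zone
  size              = 150 # GiB, assumed minimum cache size
  type              = "gp3"
}

resource "aws_volume_attachment" "cache" {
  device_name = "/dev/sdb"
  volume_id   = aws_ebs_volume.cache.id
  instance_id = aws_instance.sgw.id
}
```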

In my solution, the S3 buckets and file shares had to be created outside of the Terraform that created the storage gateway. However, the storage gateway cannot be looked up from another Terraform module via a Terraform “data” source. A data source is the usual way in Terraform to reference resources that were not created within the module you are currently working in, so I was surprised that the storage gateway could not be referenced this way.

To get around this issue, the repository that creates the storage gateway stores the gateway’s ARN in AWS SSM Parameter Store. The Terraform repository that creates the SMB shares then reads that SSM parameter.
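The hand-off through Parameter Store can be sketched roughly as follows. The parameter name /storagegateway/gateway_arn is hypothetical, and the referenced resources (aws_storagegateway_gateway.this, aws_s3_bucket.share, aws_iam_role.sgw) are assumed to be defined elsewhere in their respective repositories. The authentication setting is likewise an assumption for a domain-joined Windows share.

```hcl
# --- Repository A: creates the gateway and publishes its ARN ---

resource "aws_ssm_parameter" "sgw_arn" {
  name  = "/storagegateway/gateway_arn" # hypothetical parameter name
  type  = "String"
  value = aws_storagegateway_gateway.this.arn # gateway defined in this repo
}

# --- Repository B: creates the S3 bucket and the SMB file share ---

# Read the ARN that repository A published.
data "aws_ssm_parameter" "sgw_arn" {
  name = "/storagegateway/gateway_arn"
}

resource "aws_storagegateway_smb_file_share" "share" {
  authentication = "ActiveDirectory" # assumption: domain-joined gateway
  gateway_arn    = data.aws_ssm_parameter.sgw_arn.value
  location_arn   = aws_s3_bucket.share.arn # bucket defined in this repo
  role_arn       = aws_iam_role.sgw.arn    # role the gateway uses to access S3
}
```

Terraform treats the value of an aws_ssm_parameter data source as sensitive, so the gateway ARN will be masked in plan output. Since an ARN is not a secret, wrapping the value in nonsensitive() is an option if the masking gets in the way.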