One of my earliest and most popular posts is Retrieving Files From S3 Using Chef on OpsWorks. That posts uses the Opscode AWS cookbook which in turn uses the right_aws gem. While this method is fine - particularly if you’re not using OpsWorks - there are some situations where it’s not ideal.
Recently I’ve started using the aws-sdk gem directly which is bundled with the OpsWorks agent. The version at the time of writing is 1.53.0.
The advantages of this are:
- Support for IAM instance roles, meaning you don’t have to pass AWS credentials via your custom JSON.
- No dependencies on external cookbooks.
- Will ordinarily be run at the compile stage, therefore you could download a JSON file, parse it, then use it to generate resources if you wanted.
The disadvantages are:
- It’s not entirely clear, but my feeling is that the gems included with the OpsWorks agent aren’t necessarily part of the API “contract” provided by OpsWorks for cookbook developers. Therefore there is no guarantee that AWS won’t change the version or even remove it entirely without notice. I think it’s unlikely that they’ll remove the aws-sdk gem or move to a version with compatibility breaking changes any time soon, but it’s possible.
- Less “chef-like” solution, although you could write your own chef resource to wrap it
- If you’re not using OpsWorks then the aws-sdk will create another dependency
This blog post provides an example of how to use the bundled aws-sdk gem to download a file from S3 using IAM instance roles on OpsWorks.
The IAM role
Before starting you’ll need to grant permissions to access your S3 bucket to the IAM role used by your instances. You can find
and edit EC2 instance roles under Identity & Access Management > Roles. The default role will probably be called something
like aws-opsworks-ec2-role
.
If you need to find the role it unfortunately isn’t listed in OpsWorks on the instance view page, but you can either find it in the instance properties within the EC2 console:
Or in the security tab of the layer settings:
Add a new policy to this role granting permissions to access your desired S3 bucket, for example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
|
Don’t forget to replace my-bucket
with your bucket name. This will grant access to all the Get and List prefixed operations
for all objects in the bucket, but you can use a more specific policy if desired.
The recipe
I won’t go into the details of creating a custom cookbook, S3 bucket or adding recipes to lifecycle events as these steps were covered in my earlier post. Given that we won’t be using the OpsCode AWS cookbook you can skip adding that to your Berskfile.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
|
Now if you run mycookbook::myrecipe
you should find /tmp/myobject.txt
is populated with the content of my/object.txt
in the my-bucket
bucket.
Note that as I mentioned earlier the aws-sdk version at the time of writing is 1.53.0, therefore remember to refer to the
SDK version 1 docs rather than the version 2 docs which will
use the Aws.
namespace in examples (note the capitalization).
Reading and parsing a JSON file to use in a recipe
Here’s an example which downloads a JSON file, parses it then uses it to create resources.
1 2 3 4 5 6 7 |
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
|
In your OpsWorks run logs you should see something like:
1 2 3 4 5 6 |
|
Compile vs. Converge time
Chef runs in two stages, the compile stage then the converge stage. In the first recipe the S3 code is run at compile time because it is not part of a resource, but actually writing it to a file using the file resource takes place at converge time.
Therefore in the following scenario:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
|
The log will report File exists: false
, as the file resource hasn’t been executed by the time File.file?
is called.
The consequences of this should generally be insignificant given that usually you’d want the S3 file to be downloaded first, however if you need to delay code to converge time for any reason you can simply wrap it in a ruby_block resource.