Software engineers working with AWS have every cloud service imaginable at their fingertips, and developer velocity could hardly be higher. But even the shiniest coin has two sides.
Developers can freely spin up compute instances and databases, in addition to less tangible things like Lambda functions or virtual identities. At some point, though, someone will ask, "What is all of this?"
And as that person hacks away in the CLI trying to get an overview of resources spanning multiple AWS accounts, they will inevitably get frustrated.
While Amazon has been a pioneer in cloud computing and offers the largest array of services, some things are less than ideal. Namely, API consistency.
In this post, I describe a few of the challenges and quirks with the AWS API and why we're building Resoto. (Spoiler alert: It is so that you don't have to!)
If you're curious about the implementation of new resources or new services, please take a look at Contributing to the AWS Collector.
PRs are always welcome—Resoto is an open-source product, after all! While there is a recipe to follow for each AWS service, there are many pitfalls and the devil is in the details. Keep reading below for a few examples…
Retrieving Cloud Resource Information
Generally speaking, the API for every service covers the typical resource lifecycle: creation, tagging, description, and deletion. However, this is where the similarities end. In Resoto, everything is a resource.
Therefore, we will be on the lookout for any calls in the realm of describe-.... This is where the divergence begins.
Can you get complete metadata for all resources of the same kind in a single call, or do you have to make nested calls for every item? For EC2, we are lucky and get everything in one go. Other services, however, force you to first retrieve a list of resource identifiers (more on that below) and then query for the attributes or descriptions of each result.
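The two-step pattern can be sketched roughly as follows, using SQS as one example of a service that works this way. This is a minimal illustration and not Resoto's collector code: `collect_queues` is a hypothetical helper, and it assumes a client object shaped like boto3's SQS client (`list_queues` returning `QueueUrls`, `get_queue_attributes` returning `Attributes`). Pagination and error handling are left out.

```python
from typing import Any, Dict, List


def collect_queues(sqs_client: Any) -> List[Dict[str, Any]]:
    """List queue URLs first, then fetch the attributes of each queue.

    `sqs_client` is assumed to behave like boto3.client("sqs").
    Pagination (NextToken) is ignored to keep the sketch short.
    """
    queues = []
    # Step 1: the list call only returns identifiers (queue URLs).
    listing = sqs_client.list_queues()
    for url in listing.get("QueueUrls", []):
        # Step 2: one extra call per queue to get the actual metadata.
        details = sqs_client.get_queue_attributes(
            QueueUrl=url, AttributeNames=["All"]
        )
        queues.append({"QueueUrl": url, **details.get("Attributes", {})})
    return queues
```

With boto3 installed and credentials configured, this could be called as `collect_queues(boto3.client("sqs"))`. Note that the number of API calls grows with the number of queues, which is exactly what makes manual exploration so tedious.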
Obtaining metadata by means of manual exploration is extremely difficult. Programmatically, however, it is not half as bad because Resoto gathers the complete metadata for each and every resource in each collect run.
Cloud Resource Identification
In an ideal world, every question concerning the distinct identification of any resource should be answered by its ARN. (That's the whole idea behind ARNs!) However, the reality is not so simple.
Let's say you wish to tag an ApiGateway. Looking at the CLI documentation, you can find a tag-resource command that takes the ARN of the resource and the tags you wish to add.
To get that ARN, you need to either assemble it manually or try to find it somewhere in the management console.
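Assembling the ARN by hand could look something like this. It is only a sketch: `rest_api_arn` is a hypothetical helper, the region and API ID are made-up sample values, and it assumes the resource in question is an API Gateway REST API.

```python
def rest_api_arn(region: str, rest_api_id: str, partition: str = "aws") -> str:
    """Assemble the ARN of an API Gateway REST API by hand.

    API Gateway ARNs carry no account ID, hence the empty field ("::").
    """
    return f"arn:{partition}:apigateway:{region}::/restapis/{rest_api_id}"


# Sample values, not real resources.
print(rest_api_arn("us-west-2", "a1b2c3d4e5"))
# prints: arn:aws:apigateway:us-west-2::/restapis/a1b2c3d4e5
```

Other services use different layouts for the resource part of the ARN, so a helper like this would be needed per resource kind.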
ARNs in Resoto
This is why we have taken great care to provide ARNs for every resource in Resoto. You no longer have to care about these idiosyncrasies!
Aside from ARNs, resources can have a name and/or ID. (Sometimes they're the same, though!) The output of describe-volumes contains a string KmsKeyId. Surely this contains the ID of a KMS key?
Wrong! Many references to KMS keys are named Id but actually contain the ARN. While this is noted in the documentation, it can still be a source of confusion. It's not uncommon to reference resources by ID within and across AWS services, so this behaviour is unexpected to say the least.
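If you need the bare key ID, you end up writing a small normalization step like the one below. This is a sketch under assumptions: `kms_key_id` is a hypothetical helper, the ARN is a made-up sample, and only key ARNs of the form `arn:aws:kms:<region>:<account>:key/<key-id>` are considered.

```python
def kms_key_id(reference: str) -> str:
    """Return the bare key ID whether given a key ARN or a plain ID."""
    if reference.startswith("arn:"):
        # Key ARN layout: arn:aws:kms:<region>:<account>:key/<key-id>
        resource = reference.split(":", 5)[5]
        return resource.split("/", 1)[1]
    # Already a plain ID: nothing to strip.
    return reference


sample = "arn:aws:kms:us-west-2:123456789012:key/1234abcd-12ab-34cd-56ef-1234567890ab"
print(kms_key_id(sample))
# prints: 1234abcd-12ab-34cd-56ef-1234567890ab
```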
Cloud Resource Data Type Inconsistencies
Let's step back into retrieving metadata for resources. The information varies in complexity and volume. Some properties do have reasonable data types. The size (in bytes) of a Glacier Vault is a long integer. Whether a Glacier Job has completed is a boolean. Don't let this lull you into a false sense of security, though.
Would you like to know how many subscriptions of an SNS topic have been confirmed? Be sure to convert this string to an integer if you perform any arithmetic operations or comparisons! Do you need to know whether or not the confirmation for a subscription is still pending? This binary piece of information is a string, too!
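Here is a small sketch of what that conversion looks like in practice and why skipping it bites. The helper names `parse_count` and `parse_flag` are hypothetical, and the dictionaries are made-up sample data in the shape of SNS attributes, where every value is a string.

```python
def parse_count(value: str) -> int:
    """Convert a numeric string such as "3" into an integer."""
    return int(value)


def parse_flag(value: str) -> bool:
    """Convert the strings "true"/"false" into a real boolean."""
    normalized = value.strip().lower()
    if normalized not in ("true", "false"):
        raise ValueError(f"not a boolean string: {value!r}")
    return normalized == "true"


# Hypothetical sample data: all values arrive as strings.
topic_attributes = {"SubscriptionsConfirmed": "10", "SubscriptionsPending": "9"}
subscription_attributes = {"PendingConfirmation": "false"}

confirmed = topic_attributes["SubscriptionsConfirmed"]
pending = topic_attributes["SubscriptionsPending"]

# Comparing the raw strings compares them character by character.
naive = confirmed < pending
correct = parse_count(confirmed) < parse_count(pending)
print(naive, correct)  # prints: True False

# bool("false") would be True, since any non-empty string is truthy.
print(parse_flag(subscription_attributes["PendingConfirmation"]))  # prints: False
```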
There are even more inconsistencies to be found in representations of date and time data. Are you interested in the age of an EC2 instance or a Lambda function? You get an ISO-format datetime string. But what if you want to know when an SQS queue was created? Then you have to make do with a timestamp in epoch time.
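To compare the two, you would need to normalize both representations into the same type first. The following is a minimal sketch, not how Resoto does it: `parse_aws_time` is a hypothetical helper, the sample values are made up, and it assumes that epoch values are whole seconds and that strings without a timezone are meant as UTC.

```python
from datetime import datetime, timezone


def parse_aws_time(value: str) -> datetime:
    """Parse epoch seconds or an ISO 8601 string into an aware UTC datetime."""
    if value.isdigit():
        # Epoch seconds delivered as a string.
        return datetime.fromtimestamp(int(value), tz=timezone.utc)
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: values without a timezone are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


# Both sample values describe the same moment in time.
from_epoch = parse_aws_time("1656676800")
from_iso = parse_aws_time("2022-07-01T12:00:00Z")
print(from_epoch == from_iso)  # prints: True
```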
While it is certainly possible to work with this data, you need specific, contextual knowledge to do so… unless you're using Resoto. Resoto provides consistent data types across all resources.
Cloud Resource Tagging
update-tags-for-resource, etc.… the options are almost as numerous as the services. Do you have to provide the tags as an array or as a hashmap? Can you do multiple tags at once or is it one-by-one only?
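To illustrate the array-versus-hashmap question: some calls expect a list of Key/Value pairs while others expect a plain mapping, so you end up converting between the two. The helpers below are hypothetical names for a sketch of that conversion and are not part of any AWS SDK.

```python
from typing import Dict, List


def tags_to_list(tags: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn {"owner": "jenkins"} into [{"Key": "owner", "Value": "jenkins"}]."""
    return [{"Key": key, "Value": value} for key, value in tags.items()]


def tags_to_map(tags: List[Dict[str, str]]) -> Dict[str, str]:
    """Turn a list of Key/Value pairs back into a plain mapping."""
    return {tag["Key"]: tag["Value"] for tag in tags}


as_list = tags_to_list({"owner": "jenkins"})
print(as_list)               # prints: [{'Key': 'owner', 'Value': 'jenkins'}]
print(tags_to_map(as_list))  # prints: {'owner': 'jenkins'}
```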
Applying Cloud Resource Tagging Strategies with Resoto
The workload of tagging existing resources can blow out of proportion really quickly because all the services do their own thing.
At this point I'm sure you're guessing it already: you don't have to worry about any of this when you use Resoto to apply tags.
> aws ec2 create-tags --resources jenkins-master --tags Key=owner,Value=jenkins
> search is(aws_ec2_instance) and name = jenkins-master | tag update owner jenkins
> aws sqs tag-queue --queue-url https://sqs.us-west-2.amazonaws.com/123456789012/MyQueue --tags owner=jenkins
> search is(aws_sqs_queue) and sqs_queue_url = https://sqs.us-west-2.amazonaws.com/123456789012/MyQueue | tag update owner jenkins
All of these examples are taken from our experiences implementing Resoto's AWS collector. Wading through the AWS CLI documentation and attempting to obtain the desired data has resulted in reactions ranging from 🤨 all the way to 🤬.
We dove deep into the details of the responses of each API call and CLI command. But should this be necessary to get meaningful data? Should all of this niche knowledge be required?
At Some Engineering, we think not! Exploring and maintaining cloud infrastructure should be straightforward and you should be able to focus on solving actual problems.