Visualize your (micro)services with AWS Cloud Map and Neo4j

Sep 21, 2020

Static configuration and diagrams …

Developing with the microservice pattern in mind brings a lot of advantages, but also some challenges. For instance, for clients to discover a service endpoint dynamically (instead of relying on static configuration), we need Service Discovery.
HashiCorp Consul is a great tool for Service Discovery, and AWS now has its own Service Discovery service, called Cloud Map. I like Cloud Map because it’s cloud-native, it integrates nicely with ECS and EKS, and it provides a non-DNS Service Discovery method that can be used to register arbitrary attributes of services, for instance the ARN of a Lambda function.

As application platforms grow larger over time, with more (micro)services and deployment pipelines, keeping a clear view of the services and the relations between them becomes harder. For instance, a front-end depends on a back-end, and an application service depends on a pipeline service.

I used to describe all services and resources in a static diagram (with Visio or Draw.io), which became hard to read and grew outdated because it wasn’t updated as the platform changed.
So instead of updating diagrams by hand, my idea was to automate the work by visualizing services and their relations dynamically.

Neo4j

Neo4j is a native graph database that allows you to leverage (visualize and analyze) not only data but also data relationships. Unlike traditional databases, Neo4j has a flexible structure defined by stored relationships between data records.

It’s heavily used in the data-science world (for instance in politics or medical science) but can also be used in IT infrastructure, for instance, network management.

As an example, if we want to define some servers and their relationship in Neo4j, then this can be achieved with a few lines of the Cypher query language:

// query #1: create two servers and a dependency between them
MERGE (n:Server { name: "LDAP", status: "up"})
MERGE (m:Server { name: "Mailserver", status: "up"})
MERGE (n)-[:DEPENDS_ON]->(m)

// query #2: return all nodes
MATCH (n) RETURN n

Cloud Map visualization with Neo4j

I came up with the following requirements for Cloud Map visualization:

Visualize Services and related Instances
Create implicit relationships between instances and the related service
Explicitly define a relationship between a service and another service
Reflecting the state of Cloud Map in Neo4j should not be complex

An important design decision is how the graph database is updated. Processing each mutation in Cloud Map (Create/Update/Delete) and managing relations can be complex, so instead we take a snapshot of the ‘new’ state of Cloud Map, merge (create or update) all nodes and relationships, and delete any node that doesn’t exist in the ‘new’ Cloud Map state. To keep things simple, this process should be an atomic transaction.
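
The snapshot approach can be illustrated without a database at all. The sketch below is not the project's code: it uses a plain dictionary as a stand-in for the graph, and the function name `reconcile` and the data layout are assumptions made for illustration. It merges the snapshot, stamps every touched node with the current update ID, and drops whatever was not stamped. Working on a copy mimics the all-or-nothing behaviour of a transaction.

```python
def reconcile(graph, snapshot, update_id):
    """Merge a snapshot into a graph and drop nodes missing from the snapshot.

    `graph` and `snapshot` map a node name to a dict of properties.
    This in-memory dict is a hypothetical stand-in for the graph database.
    """
    # Work on a copy so the caller's graph is untouched if a step fails.
    new_graph = {name: dict(props) for name, props in graph.items()}

    # Merge: create missing nodes, update existing ones, stamp each one.
    for name, props in snapshot.items():
        node = new_graph.setdefault(name, {})
        node.update(props)
        node["update_id"] = update_id

    # Clean: nodes not stamped in this run are absent from the new state.
    stale = [n for n, p in new_graph.items() if p.get("update_id") != update_id]
    for name in stale:
        del new_graph[name]

    return new_graph, sorted(stale)
```

Because the whole state is replayed on every run, there is no need to track individual create, update, or delete events.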

The implementation

(You can find the code here)

To make sure only one update runs against the Neo4j database at a time, we’re using an SQS queue combined with a locking mechanism.
The process starts with the Producer being triggered, either by a change in Cloud Map or by a manual invocation. It will create a snapshot of the ‘new’ state of Cloud Map, and put the payload as a message on the SQS queue.
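
As a rough idea of what the Producer step could look like, here is a minimal sketch. It assumes boto3 clients for `servicediscovery` and `sqs` are passed in; the function names `build_snapshot` and `publish_snapshot` and the shape of the payload are illustrative assumptions, not the layout used in the linked code. For brevity it reads only the first page of instances per service.

```python
import json


def build_snapshot(sd_client):
    """Collect all Cloud Map services and their instance IDs.

    `sd_client` is expected to behave like boto3.client('servicediscovery').
    """
    services = []
    kwargs = {}
    while True:
        page = sd_client.list_services(**kwargs)
        for svc in page["Services"]:
            # Assumption: one page of instances is enough for this sketch.
            instances = sd_client.list_instances(ServiceId=svc["Id"])
            services.append({
                "Name": svc["Name"],
                "Id": svc["Id"],
                "Instances": [i["Id"] for i in instances["Instances"]],
            })
        token = page.get("NextToken")
        if not token:
            break
        kwargs = {"NextToken": token}
    return services


def publish_snapshot(sqs_client, queue_url, services):
    """Send the snapshot to the queue as a single JSON message."""
    return sqs_client.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps({"Services": services}),
    )
```

Passing the clients in as arguments keeps the functions easy to exercise with stand-ins instead of real AWS calls.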

When the Consumer receives the message, it will 1) acquire a lock, 2) process the Cloud Map state, and 3) update the Neo4j database. If there is an active lock, the message will remain in the queue, to be picked up at a later time.

consumer.py

from boto3 import client

ddb_client = client('dynamodb')
acquire_lock(ddb_client, table_name, lock_item_id)
try:
    # Merge the new Cloud Map state, then remove nodes that no longer exist
    update_neo4j(services_list, curr_update_id)
    clean_neo4j(curr_update_id)
except Exception as err:
    raise Exception(f"Exception during update neo4j: {err}") from err
finally:
    # Release the lock whether the update succeeded or failed
    release_lock(ddb_client, table_name, lock_item_id)
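
The helpers `acquire_lock` and `release_lock` are not shown above. One common way to build such a lock on DynamoDB is a conditional write, sketched below. This is an assumption about how it could work, not necessarily how the linked code does it; in particular the key attribute name `LockId` is hypothetical.

```python
def acquire_lock(ddb_client, table_name, lock_item_id):
    """Create the lock item, failing if another consumer already holds it."""
    try:
        ddb_client.put_item(
            TableName=table_name,
            Item={"LockId": {"S": lock_item_id}},
            # The write only succeeds when no item with this key exists yet.
            ConditionExpression="attribute_not_exists(LockId)",
        )
    except ddb_client.exceptions.ConditionalCheckFailedException as err:
        raise RuntimeError(f"Lock {lock_item_id} is already held") from err


def release_lock(ddb_client, table_name, lock_item_id):
    """Delete the lock item so the next consumer can proceed."""
    ddb_client.delete_item(
        TableName=table_name,
        Key={"LockId": {"S": lock_item_id}},
    )
```

With this design, a consumer that fails to get the lock raises an error and does not delete the message, so SQS makes it visible again after the visibility timeout.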

Neomodel: interacting with Neo4j

Interacting with Neo4j from Python is easy with Neomodel, which is an Object Graph Mapper for the Neo4j graph database.

The first thing we’ll need to do is define classes that inherit from StructuredNode and represent the different types of Neo4j nodes:

from neomodel import RelationshipTo, StringProperty, StructuredNode


class Service(StructuredNode):
    '''
    Service, a neo4j structurednode
    '''
    name = StringProperty(unique_index=True, required=True)
    update_id = StringProperty(index=False, required=True)
    # A Service can depend on another Service
    service = RelationshipTo('Service', 'DEP_ON')


class Instance(StructuredNode):
    '''
    Instance, a neo4j structurednode
    '''
    name = StringProperty(unique_index=True, required=True)
    update_id = StringProperty(index=False, required=True)
    # An Instance is registered to a Service
    service = RelationshipTo(Service, 'REGISTERED')

For our use-case, we require two types of nodes: Services and Instances. Both have a name property that should be unique. The relationships are also defined in the StructuredNodes: an Instance can have a RelationshipTo a Service and a Service can have a RelationshipTo another Service.

Retrieving a Service based on the name is simple:

service_node = Service.nodes.get(name=service['Name'])

Creating a new Service:

service_node = Service(
    name=service['Name'], update_id=update_id).save()

Updating a property of a Service:

service_node.update_id = update_id
service_node.save()
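
The three operations above combine naturally into a single "create or update" step. The helper below is a sketch of that idea; the name `upsert_node` is hypothetical. It relies on Neomodel’s `nodes.get_or_none()`, which returns `None` instead of raising when no node matches, and it works for any node class that has `name` and `update_id` properties.

```python
def upsert_node(node_cls, name, update_id):
    """Create the node if it is missing, otherwise refresh its update_id.

    `node_cls` is expected to be a Neomodel StructuredNode subclass such as
    Service or Instance.
    """
    node = node_cls.nodes.get_or_none(name=name)
    if node is None:
        # No node with this name yet: create and persist it.
        return node_cls(name=name, update_id=update_id).save()
    # Existing node: mark it as seen in the current run.
    node.update_id = update_id
    node.save()
    return node
```

A call such as `upsert_node(Service, service['Name'], update_id)` would then cover both the first run and every later one.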

Explicit Service to Service Relationship

To define an explicit Service to Service Relationship, we’ve defined a special tag on the services on Cloud Map: ‘NEO4J_RELATIONSHIP_TO_SERVICE’. Use the name of the other service as the Tag Value, and the code will attempt to create a relationship.
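
To make the tag handling concrete, the sketch below extracts the (source, target) pairs from a snapshot. It assumes each service in the snapshot carries a `Tags` list in the `Key`/`Value` shape that AWS tagging APIs return; that layout and the function name `explicit_relationships` are assumptions for illustration. Targets that are not present in the snapshot are skipped, which is one possible reading of "attempt to create a relationship".

```python
RELATIONSHIP_TAG = "NEO4J_RELATIONSHIP_TO_SERVICE"


def explicit_relationships(services):
    """Return (source_name, target_name) pairs declared through the tag."""
    known = {svc["Name"] for svc in services}
    pairs = []
    for svc in services:
        for tag in svc.get("Tags", []):
            # Only follow the tag when the target service actually exists.
            if tag["Key"] == RELATIONSHIP_TAG and tag["Value"] in known:
                pairs.append((svc["Name"], tag["Value"]))
    return pairs
```

Each pair could then be turned into a graph relationship with Neomodel’s `connect()`, for example `source_node.service.connect(target_node)` using the `service` relationship defined on the Service class.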

Cleanup obsolete nodes

To identify which nodes are obsolete and should be removed, all nodes have an update_id property. Each new Cloud Map state is identified by an SQS message ID, which is used as the ‘new update_id’. When nodes are updated, their update_id property is updated as well. After all updates, we match the nodes that don’t have the new update_id and delete them:

function clean_neo4j:

# Delete every Instance that was not touched by the current update
all_instances = Instance.nodes.all()
for instance in all_instances:
    if instance.update_id != update_id:
        logging.info(f"Instance {instance.name} is obsolete, deleting...")
        instance.delete()

In closing

Now that we have Cloud Map data in Neo4j, we can use Neo4j to analyse the data. I’ll never have to write up diagrams again, which was the end goal :)

The possibilities with Neo4j and Neomodel are endless, so I’d say give it a try.