AWS Elastic File System

Yesterday, in a knowledge session between Solution Architects, the topic of AWS Elastic File System came up, and after a short discussion we decided to take a closer look and set something up. To quote Top Gear, how hard could it be?

What is EFS?

AWS Elastic File System, or EFS, is Amazon Web Services’ latest storage solution: a fully managed, simple and scalable file storage service for use with EC2 instances. As the name suggests, it grows and shrinks automatically with your storage needs, and EC2 instances can access EFS using NFS (v4.1) across multiple availability zones at low latency and with high throughput (50 MB/s per TB, with bursts to 100 MB/s). AWS lists the use cases of EFS as big data and analytics, media processing workflows, content management, web serving and home directories. Content management, you say? Hmmm :)
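To put the throughput figure into numbers, here is a minimal sketch of the arithmetic. It assumes the 50 MB/s per TB baseline scales linearly with the amount of data stored and treats 1 TB as 1024 GB; the function name `efs_baseline_mb_s` is my own and not part of any AWS tooling.

```shell
# Baseline throughput in MB/s for a given amount of stored data in GB,
# using the 50 MB/s per TB figure (assumed here to scale linearly).
efs_baseline_mb_s() {
  stored_gb=$1
  # Integer arithmetic: 50 MB/s per 1024 GB, rounded down.
  echo $(( stored_gb * 50 / 1024 ))
}

# Example: 2 TB (2048 GB) of content
efs_baseline_mb_s 2048
```

For a small content repository the baseline will be modest, so the burst allowance is probably what a typical website would notice most.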

In my experience, scalable single sources of file-system-based content were expensive and difficult to deploy. So much so that product and implementation strategy meant that putting all content in a database was by far the most logical route to take. So could EFS now resolve that headache? I will give it a test to find out.

What do I have to set up?

So I will simulate a website setup with an application server tier that would host my Tomcat (or similar) application servers, and a back-end file system that is mounted on those application servers so that the files can be used. Onto my file system I will deploy my content. I won’t install or configure Tomcat; this is simple to do and covered very well in other places.

The simple architecture

So, I will need

  1. An auto-scaling group covering two availability zones (eu-west-1a, eu-west-1b) with two instances of Amazon Linux (no Tomcat, no auto-scaling rules for now)
  2. A security group to allow my auto-scaling instances to talk NFS to my EFS
  3. An EFS created and mounted to my instances

For my auto-scaling group, I have gone and created a simple one and it is up and running across my two availability zones. I have gone and terminated an instance or two just for fun. That’s not related to this post; it is just fun to terminate something and watch it auto-magically reappear.

My security group allows instances that are members of my auto-scaling security group to access the EFS volumes via the NFS protocol (TCP port 2049).

My Security Group
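For those who prefer the command line over the console, the same rule could be sketched as below. The function only prints the AWS CLI call rather than running it, and both security group IDs are hypothetical placeholders; the function name is my own.

```shell
# Print the AWS CLI call that would open NFS (TCP 2049) on the EFS security
# group to members of the auto-scaling security group. Both group IDs are
# hypothetical placeholders; nothing is executed against AWS here.
nfs_ingress_command() {
  efs_sg=$1   # security group attached to the EFS mount targets
  asg_sg=$2   # security group of the auto-scaling instances
  echo "aws ec2 authorize-security-group-ingress" \
       "--group-id $efs_sg --protocol tcp --port 2049 --source-group $asg_sg"
}

nfs_ingress_command sg-11111111 sg-22222222
```

Using the auto-scaling security group as the source, instead of IP ranges, means new instances are allowed in as soon as they join the group.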

I can now create my EFS for my website content.

I first need to configure the file system access, which consists of my VPC, my mount targets (availability zones) and the security group that defines the source of access requests (the one I created earlier):

Configuration of EFS

Then I configure the optional settings. I have chosen to give it a friendly name and stuck to the default “Performance Mode” of general purpose.

Configure the EFS options

The final review step and then I am done. That was it. No configuring disk sizes, no difficult calculations of how much content I have. It’s done.

Review what I did

After a short while my volumes are ready, and I can keep track of the creation status in the main EFS dashboard under “life cycle state”.

After a short while they will be ready

Next we are going to test drive mounting my volume on my instance. The EFS dashboard provides instructions for doing this. Run the following in an SSH session (from the root directory):

Step 1: If needed, install the NFS client on your EC2 instance

sudo yum install -y nfs-utils

Step 2: Create a new directory on your EC2 instance, such as “efs”

sudo mkdir efs

Step 3: Mount your file system using the DNS name.

sudo mount -t nfs4 -o nfsvers=4.1 $(curl -s efs

Once that is done I can switch to the directory and create myself a simple index.html file for my eventual Tomcat server to see. If I then log on to my other instance, I can see that my file has been replicated from the first availability zone to the next. This means that if I write my content to disk as I have done, it is available instantly in the other availability zones and all my sites are updated.

As I did this manually, I would need to repeat it every time my auto-scaling group scales. This defeats the purpose of auto-scaling. However, if I mount this directory at instance initialization time (e.g. with Chef), it will be mounted when my new instance starts. To test this I made a very simple launch script and updated my launch configuration (I made a new one, as edits are not possible) to add the following to the user data portion of the configuration.

cd /
sudo mkdir efs
sudo mount -t nfs4 -o nfsvers=4.1 $(curl -s efs

Warning: I would not use this code in production. No really, please don’t.
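In the same spirit as that warning, here is a slightly more defensive sketch of the mount step, not a production script. It creates the directory only if needed and skips the mount when the directory is already a mount point. The function name, the `MOUNT_CMD` override and the DNS name in the example are hypothetical placeholders of my own; substitute the DNS name shown in your own EFS dashboard.

```shell
# Mount an EFS file system once, and only if it is not already mounted.
# $1 is the mount target DNS name (hypothetical placeholder below),
# $2 is the local directory. MOUNT_CMD can be overridden for a dry run.
mount_efs() {
  target=$1
  dir=$2
  mkdir -p "$dir" || return 1
  # mountpoint -q exits 0 when the directory is already a mount point
  if mountpoint -q "$dir"; then
    echo "already mounted: $dir"
    return 0
  fi
  ${MOUNT_CMD:-mount} -t nfs4 -o nfsvers=4.1 "$target:/" "$dir"
}

# Dry run: print the mount arguments instead of mounting anything.
MOUNT_CMD=echo mount_efs fs-example.efs.eu-west-1.amazonaws.com /tmp/efs-demo
```

Because the function is safe to run more than once, it should not matter if the launch script is re-run on an instance that already has the file system mounted.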


The most complicated part of all this is mounting the drives, because creating the fully managed and scalable storage itself is incredibly easy. For content management systems like SDL Web (Tridion), this is a real help in deploying content in a scalable and reliable way.

10 things to do when deploying SDL Web to the cloud

Now that cloud adoption is on the rise at almost every organization, it’s likely that during the next upgrade or new implementation of SDL Web you will be asked to move it to a cloud provider like AWS. What you should not do during this move is fall into the trap of just porting some virtual machines to the cloud and running them like you did before. The cloud is better than that (and it is 2016, not 2005), so you should investigate a little deeper how you should deploy. To help, here are 10 things to consider:

1. Use SDL Web 8

It should go without saying that you should use the latest version of a product, but some organizations hold back until a product version reaches SP1. With the new product release, however, you get better support for the cloud from SDL. You can read my earlier post on the new infrastructure features that help deployment on the cloud, but the main one to shoot for is the support for database-as-a-service from AWS or Azure.

2. Elastically scale delivery

In this nice article, the AWS Startup folks at Medium explain that the minimum viable product must scale in order to be a success; they are spot on. As your product is the website you are building, not implementing automatic scaling of delivery (or using something like AWS Elastic Beanstalk) should now be counted as a crime against humanity.

3. Automate the deployment of your environment

Automating the deployment of an environment is more than saving a VM template of your build servers; it means automating the configuration management, topology, software installations and so on. This is essential for auto-scaling and deeply important when you are planning things like disaster recovery. The old-school methods can go if you can automatically rebuild an environment inside 30 minutes.

4. Implement Continuous Delivery and then Continuous Deployment

Rome was not built in a day, so you need to take this slowly, but most cloud providers have a CD pipeline available that integrates with the deployment options at that provider (see Visual Studio Online or AWS CodePipeline). Get to the point where you can deploy a new version of the site (or part of it) multiple times per day in a robust way.

5. Scale Publishers when needed

The number one complaint about publishing is that it is slow when deploying lots of content. Well, following point four, you should now be able to spin up more publishers when needed. This does not, per se, need to be automatic in response to load (but it could be), but it should at least be an automated process that leverages a graceful shutdown and destroy.

6. Process log files with a tool like Splunk

Now that you have servers spinning up and down automatically, you probably have little or no access to the running application. It is therefore important to put automatic log file analysis in place to ensure that the application is running error-free, that you can spot failure trends and that you can keep the overall health of the environment high. Applications will fail and that is OK, but you need data to proactively reduce the errors, feed into your continuous delivery pipeline and improve performance.

7. Write custom monitors for SDL Web functionality

The cloud provides you with metrics for things like CPU and memory, but there are no monitors for specific and relevant SDL Web functionality. Is the publishing queue a little too long? Fire a warning to your integrated monitoring solution that something may need to change, such as spinning up a new publisher to help with the load.
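The core of such a monitor can be very small. The sketch below only classifies a queue length against two thresholds; how you obtain the queue length from SDL Web and how you forward the result to your monitoring solution are left out, and the function name and default thresholds (100 and 500) are arbitrary assumptions of mine.

```shell
# Classify a publishing queue length against two thresholds.
# $1 is the queue length, $2 the warning threshold, $3 the critical one.
# The default thresholds are arbitrary assumptions, not product values.
queue_status() {
  length=$1
  warn=${2:-100}
  crit=${3:-500}
  if [ "$length" -ge "$crit" ]; then
    echo CRITICAL
  elif [ "$length" -ge "$warn" ]; then
    echo WARNING
  else
    echo OK
  fi
}

# Example: 250 items waiting in the queue
queue_status 250
```

A WARNING could simply raise an alert, while a CRITICAL could be the trigger for spinning up an extra publisher.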

8. Deploy new CD environments for temporary sites

You can easily spin up a new delivery environment should you need to deploy a new site that will only last for a short period of time. This keeps complexity low on each site and means the new site cannot impact any other site.

9. Adopt a microservice architecture

Architecting your application in the microservice model means that CDaaS can be utilized and sites can just feed content from that. Further splitting your application into smaller functional components, which can all be scaled separately, will reduce software costs, reduce deployment complexity, improve resilience and improve scaling.

10. Test performance and scaling actively

Too often this ends up at the bottom of the pile of things to do. Automatic scaling does not remove the need to test performance; in fact, it makes it more important. In years gone by, if the site did not perform it just got slow and then probably crashed. Sadly, website visitors were used to that, but it does little for your business reputation. Now that we can scale automatically, all this essentially means is that we keep adding new instances/servers until we meet demand. What follows is a small heart attack when you read the monthly invoice.

Instead, it is now more important to keep your application performing well. Performance should be tested in the CD pipeline as well as on a frequent basis in production. Plenty of tools exist to support this and can help with testing from different parts of the world if your site needs to respond to a global customer base.

New Infrastructure Features of SDL Web 8

SDL recently released the latest version of their web content management product, and this release has some interesting changes from an infrastructure perspective that I would like to highlight.

Product Name Change

Firstly, whilst not an infrastructure change, the product has changed its name from SDL Tridion to SDL Web. The new name, Web, says a little bit more about what it does (at a very basic level) but does somewhat reject the kudos and history that comes with the name Tridion. The name “Tridion” is distinct and easily recognizable; Web is a little more generic and bland in my opinion.

The version number for the new release is 8, which is a throwback to the “R” releases of Tridion before SDL took over the Tridion company. The last product named in the “R” series was R5.3, and since then there have been the releases 2009, 2011, 2013 and now 8. It is not immediately obvious why Web 8 is not called Web 9, but 2009 and 2011 are really R5.4 and R6 respectively, which leaves 2013 as the seventh release.

Improved Cloud Support

The new release has some heavy focus on changes to make it easier to deploy and manage Web. First up is the improved cloud support. SDL always did support “the cloud” through the proxy of supporting specific operating systems and provided they ran as normal it did not really matter where they ran. This meant that an IaaS based deployment of Tridion was always possible.

What SDL now means to say is that SDL now “supports specific features of some cloud providers”. Those are only AWS and Azure and nothing is mentioned about Google or Oracle as a platform. SDL Web 8 adds support for Azure SQL and Amazon RDS. The documentation states “Azure and Amazon RDS” but this is an oversight as it means “Azure SQL” as Azure is the Microsoft cloud platform rather than a specific piece of technology.

All this means is that you can now take advantage of the database-as-a-service offerings from these two providers, provided you are not using the SDL Web legacy pack (e.g. for VBScript templates), transactional core service code (you can write this out) or implementations with certain extensions. This is because Tridion traditionally made use of distributed transactions, which are not supported on AWS or Azure, and legacy-style code still needs MSDTC.

In all cases, you can only use the SQL Server engines, which have a cost impact over and above the MySQL engine options on AWS RDS but are cheaper than the Oracle engines.

If you use a version of SDL Web prior to Web 8, you should note that you can use Azure SQL or AWS RDS for the Content Delivery database but it is simply not supported by SDL.

Topology Management

Topology Management is a new product feature that replaces the existing (now deprecated) publishing management (e.g. Publish Targets) with a more advanced approach. It clearly de-couples the configuration of publishing from the management of content and ties the configuration of a delivery environment to the environment (e.g. production) rather than to the content management database. Managed through PowerShell, the Topology Manager manages the relationships between publications and delivery environments. There is a .NET API, which means automation from other applications that are not PowerShell compliant is possible, and an example of using that API can be found here.

Some key terminology to grasp with topology management itself:

  • Content Delivery Environment: in essence just the same as before and is communicated with through a Discovery Endpoint
  • Topology Type: defines the purposes like “Staging” and “Live” and can define a series of purposes to help support a publishing workflow (e.g. Staging Editors -> Staging Executive Approval -> Live)
  • Topologies: combines one or more content delivery environments which have a particular Topology Type (including Purposes).

And on the Content Management side:

  • Target Type is the same as before in that it is what the user selects when publishing, and it has “a” Purpose, e.g. “Live”
  • Business Process Type defines how content flows through the organization (published or not) and, in this context, which Topology Type and Target Types apply. It also defines the Minimal Approval Status of a Target Type and the priority, both of which used to be in the Publication Target. The Business Process Type is set in a publication and can be inherited by its child publications.

The features increase the complexity of publishing management and it is a little frustrating there is no user interface as this was one of the nice features of the current publishing approach. Whilst I support the move, it has yet to be seen how much this would be an advantage in an automation / NoOps approach over what you could already do through existing APIs.

For now, there is no need to change to the new approach, so the advice to customers would be to sufficiently test with the new approach before rolling into a production situation.

More Graceful Publishing Management

SDL has improved how you can manage publishing services. Prior to Web 8, when a publishing service was stopped, it simply forgot everything it was doing (much like a “kill -9”). With the advent of technologies like auto-scaling, this makes simply stopping a service a royal pain because a service may be busy with something that you simply just do not want to forget about. These new features are only available in the new publishing approach described above:

  • Pausing the Publisher service: Pausing means that the Publisher does not pick up any new transactions, but does keep processing deployment feedback on items already sent for deployment. Assuming you can test whether it has completed all feedback items it had open (?), a graceful destroy of a server could then take place.
  • Graceful shutdown of Publisher service: Shutting down the Publisher service allows the following to take place before the shutdown is completed: all transactions that have not yet been transported are placed back in the publish queue (set to Waiting For Publish), and all transactions that have the transaction state “Scheduled for Deployment” have sent their commit packages to transport.
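The pause-then-destroy idea above could be sketched as a simple polling loop. The command that reports the number of open feedback items is entirely hypothetical, a stand-in for whatever check turns out to be possible, and the function name and defaults are my own.

```shell
# Poll until a paused publisher reports no open items, then report success.
# $1 is a command that prints the number of open items (hypothetical
# stand-in), $2 the maximum number of polls, $3 the pause in seconds.
wait_for_drain() {
  count_cmd=$1
  max_polls=${2:-30}
  pause=${3:-10}
  poll=0
  while [ "$poll" -lt "$max_polls" ]; do
    open=$($count_cmd)
    if [ "$open" -eq 0 ]; then
      echo "drained after $poll retries"
      return 0
    fi
    poll=$(( poll + 1 ))
    sleep "$pause"
  done
  echo "still busy after $max_polls polls" >&2
  return 1
}

# Example with a stand-in command that always reports an empty queue.
wait_for_drain "echo 0" 1 0
```

Only when the function returns successfully would the automation go on to terminate the instance; on a timeout it is probably safer to leave the server running and raise an alert.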

With delivery environments being started and stopped a little more dynamically, there are some additional options to help manage them:

  • Graceful deactivation and reactivation of a Content Delivery environment: halting a Content Delivery environment for maintenance is now possible from the Content Manager server. The documentation is unclear about what happens to publishing transactions being pushed to other sites, as well as what happens to records (e.g. audience manager) that are written to redundant databases in other deployment stacks.
  • Decommissioning of a Content Delivery environment: You can decommission an entire Content Delivery environment without having to unpublish content first.

Content Delivery as a Service (CDaaS)

The major change on the architecture side for delivery is the introduction of Content Delivery as a Service, or CDaaS for short. This new feature means that web applications can feed content from SDL Web using a microservice approach. This approach allows non-Tridion (Web) skilled teams to talk to Web and minimizes any impact the libraries for Web would have on other applications.

Development and maintenance of the CDaaS and connected web applications can all happen separately on their own development tracks, and upgrades to CDaaS will not affect the applications using the content (assuming the interfaces are backward compatible). You can scale the CDaaS microservice separately from your other application services (a concept drawn from the microservices architectural approach), which means that applications that are not content rich need not have large content delivery farms.

Discovery Service

The Discovery Service is now the know-it-all of the delivery farms, with the centralized web service being the go-to point for understanding which content delivery endpoints are deployed. The Topology Manager (see above) only needs to talk to the Discovery Service to get everything it needs to know.

[ Update 9th Feb 2016 – Thanks to Nuno Linhares for some corrections on the version numbers and the information regarding the .NET client for the Topology Manager]

SDL Tridion in the Cloud – Tridion Developer Summit 2014

Earlier this year I gave a lightning presentation at the Tridion Developer Summit in Amsterdam. The event was awesome and really well attended. Here are the slides that I used:

SDL CXC and Tridion

Today I posted some slides to SlideShare on SDL CXC and Tridion. Just a simple overview of what we have done for Tridion on SDL’s CXC platform.

The SDL Tridion Reference Implementation

Recently, SDL released the Tridion Reference Implementation, or TRI, to the community. The first edition of the TRI is based on .NET and aims to provide a reference for customers and partners on a preferred approach to an implementation.

Traditionally, Tridion has had an open architecture where a delivery implementation (e.g. an MVC framework) was not supplied with the product. This allowed customers to choose the delivery architecture which suited their needs. As a result we see variations from a simple ASP.NET/ JSP page based site to a fully dynamic MVC based implementation. With the release of the TRI, nothing changes in that regard but should you wish, there is an implementation that you can start from.

The TRI is an MVC-based application that shares common elements and approaches with the community MVC application, DD4T. This means that customers using DD4T are compatible with the TRI from a Content Management point of view, and so long as the TRI is getting content from something in the right format, it can be independent of the content provider through support for semantic web (but by default, Tridion is the content provider). The templates are managed in the application and not in content management, and the supplied templates follow a RESS approach and are re-brandable using Bootstrap / LESS. This approach means that you can rapidly rebrand the application to another design and support a wide variety of devices using one site rather than separate websites for different devices.

The current release, version 1, will be replaced by the upcoming version 2, which will add new features (implementing more SDL product features), will have a Java version, and will become part of the SDL product stack, available with the installation of Tridion. Right now, you can contribute to the release by going ahead and using it.

You can download it from SDL Tridion World and look at the sources on GitHub.

SDL Tridion Translation Manager

This is the last video of 2013, and I am now starting to really enjoy making them. If only I had more time. This one is about Tridion Translation Manager (TTM), the plugin that enables sending content to and from translation with SDL TMS or WorldServer.

SDL Tridion Whiteboarding Experience Manager

In this edition, I look at the basic architecture of Experience Manager and how it interacts with Content Manager.

SDL Tridion Whiteboarding: Basic Publishing Flow

In the next of my video series I am going to cover the basic publishing flow from Content Management to Content Delivery. Last time we looked at connecting the server types and this time we are diving into what flows within.


SDL Tridion Whiteboarding: Connecting Server Types

Last time I covered SDL Tridion Server Types and this time I have linked them together to make one Content Management and Content Delivery chain. I will expand on this further in the next video where I’ll look at the publishing chain in more detail.


