A while back I wrote about how to fall in love with publishing in SDL Tridion. It’s true; you will fall in love with it when it is working well. In the article I describe the more end user aspect of publishing but typically there are allot of aspect to publishing that the end user cannot control but that are controlled by the IT or hosting organization.
So there I am, its 5 PM on a Friday and I am trying to get out that article that I have written in the afternoon. I am waiting for it to publish and it is taking longer than I want because I have better things to do. It’s Friday after all! I have to wait in that queue for my job to be processed after all the other jobs that were submitted before mine. But it is a queue, right? That’s what happens, I join the back and I wait until I get to the front for my turn. (At least that’s how we do it in the UK. Also, we mumble.)
So what determines how long the queue takes?
Certainly not the user! The user determines how many jobs are in the queue but not how long it takes to complete the jobs. The duration is mostly determined by a) the task, b) the templates, c) the servers and d) the configuration.
I am sure many of you have queued in the bank.( I am sure that is also why many of you turned to online banking.) There are 5 people in front of you but how long are you waiting? 5 minutes or 10 minutes? The answer is, you have no idea… it depends on what those people want from the bank. Some might want information, some cash and others might want a loan. Each person represents a complexity that will take x amount of minutes of handling by the servers (and yes and I expressly used the word “server” rather than “clerk”). Publishing Jobs submitted by the users are the same, in that the jobs vary in complexity and that cannot be determined by looking superficially at the job (or person) in question.
The complexity is determined in two ways. Firstly, what the user tried to publish and secondly, how the data model is constructed. Now I know I said the user does not determine the queue duration and now I contradicted myself. But, I don’t believe that it is the user’s job to determine whether or not something should be published they need to publish what they need to publish. However, it is important to note that some Tridion items, when published, can take more items along for the ride. A Structure Group, for example, has pages and nested Structure Groups which need to be published also.
The data model determines our relationships between items. So when I publish an item, additional items will be taken because they complete an item. This typically is a small number of items, but the data model could need attention if publishing a single item leads to excessive numbers of additional items.
When I get to the front of the queue at the bank, I am most likely going to be presented with some forms to fill in. Those legal documents that lets me get the money to buy a car or get a new credit card. How long it takes me to fill out those forms will determine how long it will be before I am finished. The smaller and simpler the form, the better! My publishing job will execute templates to create some sort of output (e.g. HTML, XML, Java or ASP.NET code). The templates take time to execute and many templates may have to be executed for one job. The larger and more complex these templates, the longer it will be before I see my publish job completed.
The speed the servers work at and the amount of simultaneous activities they can complete affects the overall speed of publishing a job. The servers must therefore be scaled to meet the load requirements of the environment. Much like the bank, the overall throughput effects the waiting time of any job, with a single server I will wait the longest, with multiple parallel servers the waiting time will be reduced. In most scaled environments the tendency will be to separate out publishing from other server functions (database, management interface), this dedication of a task means that the server can concentrate on the same repetitive task without being interrupted with other business and therefore improve the overall publishing throughput of this server.
The configuration options with SDL Tridion, allow you to manipulate how the queue is managed; it in essence all publishing jobs are equal but with Publishing Priorities and Filters some jobs are more equal than others. Priorities can be set by the end user at the time of publish or as a rule on a given Publishing Target. The priority (high, normal or low) allows the most important tasks to go first (or the least important tasks last) and works like any priority system would do.
Filtering adds an extra dimension to this and the overall way items are removed from the queue for publishing. Many banks have separate desks for different tasks. If you go to deposit some money you use a different desk to the desk where you get a loan. Filtering does the same thing, in that it allows us to specify certain servers to complete certain jobs depend upon its configuration. Filtering is possible by Publication (e.g. German Website), Publication Target (e.g. Live) or Priority (e.g. High) – or a combination of multiple filters with multiple values. So for certain areas of your business you could, for example, dedicate servers to complete just those tasks; so in times of lots of house buying, we have more servers on the loans desk and we divide our throughput unevenly across our customers.
4 actions to improve to improve publishing
I have not encountered a single organization yet who could not do with faster publishing. Even when you think it is as fast as it can be, there will still be room for improvement somewhere. In summary, I have four points you can act upon that can help you love publishing that little bit more…
- Understand what you publish! You can’t manage what you can’t measure, so start measuring! The information you need in available in the API, so a simple report will be enough to help you collect the data needed:
- What is your total job load per day?
- How many items do these jobs consist of?
- How long does it take to render the items in the job?
- Investigate to see if adjusting the configuration can ensure that important publications get the time they deserve and those publications lower down the content management food chain get less time.
- Have IT analyze the performance of your servers under stress, can the performance be improved? If so, then you will gain in throughput.
- Undertake and analysis of the templates and the data model to see where performance can be improved. Having the data from the measurements can really help zoom into what is performing badly.