Cloud computing

Britain is a nation of weather watchers; the country’s location and, in the modern era, its pre-eminence in world trade make that inevitable. There’s still nothing on earth so unpredictable as the weather, but it was when the Met Office was set up in 1854 that meteorology started to be established as a science.

The organisation started off in the Board of Trade, was moved to the Air Ministry during the Great War as its strategic role came to the fore, and became a ‘trading fund’ in 1996, a move that allows it to act as a commercial organisation and generate some of the £209 million or so a year it costs to run. It has been part of the Department for Business, Innovation and Skills since 2011, moving there from the Ministry of Defence.

About half that sum comes direct from the UK taxpayer to pay for the Public Weather Service, delivering services such as the National Severe Weather Warning service (established in 1987). It also pays for the underpinning infrastructure the Met Office needs, including the UK’s participation in EUMETSAT, the European organisation for weather satellites.

In 1984 the Met Office was designated one of only two World Area Forecast Centres (WAFCs) for civil aviation (the other is Washington) to provide global forecasts for flying at high altitude. Without this, global air traffic would simply grind to a halt.

The Met Office has been a prime adopter of technology since the days when its first head of department, Captain (later Rear-Admiral) Robert FitzRoy, introduced synoptic charts, on which weather observations taken at the same time were drawn on a map to aid forecasting. The technique is still used today, albeit now as an automated process.

In 1909 wireless telegraphy made weather readings from ships at sea accessible. Today the Met Office receives observations from aircraft, ships, buoys, satellites and land stations all over the world at a rate of around 100 million observations a day.

Crunching big numbers

To collate all of this, the Met Office needs a huge amount of processing power. Its first computer, in 1959, was a Ferranti Mercury, capable of doing 30,000 calculations a second, which made numerically based predictions possible for the first time. But the amount of data and the possible refinements have since increased exponentially. Charles Ewen explains it in terms of Moore’s Law, whereby computing capacity tends to double roughly every two years, driven by ever higher densities of transistors on computer chips.

“Initially, we are trying to describe the state of the Earth’s atmosphere at a given point in time,” he says. “That has to be done on a global scale, as the UK’s weather is influenced by other air masses. We build that picture using millions of observations that come in from land-based, space-based, air-based and marine-based platforms, 24 hours a day, 365 days a year, in collaboration with the World Meteorological Organisation (WMO), a UN body with 191 member states.” These observations have been shared freely and openly under United Nations convention across geographical and political borders for many years, which reinforces the notion that open data is not new.

These observations are combined on a three-dimensional grid, or lattice, so that the computers can use them to initiate a process called Numerical Weather Prediction, or NWP.

With close to half of its nearly 2,000 employees describing themselves as scientists or meteorologists, the Met Office is a world leader in weather and climate research. Led by the Chief Scientist of the Met Office, Professor Dame Julia Slingo FRS, its research seeks to understand the factors that drive weather and climate systems so that it can predict them with ever increasing skill. The science that the Met Office undertakes is more and more collaborative, both nationally and internationally, working with universities and research organisations like the Natural Environment Research Council (NERC) and America’s National Oceanic and Atmospheric Administration (NOAA).

The Met Office Academic Partnership is a collaboration with the universities of Exeter, Leeds, Oxford and Reading. “It’s not a typical research model,” says Ewen, “but it’s very effective. They put some of their blue sky principles on hold to work on coordinated themes of research which deliver more than the sum of the parts in terms of scientific understanding. It also helps us to foster the next generation of scientists,” he says.

About 1,400 Met Office staff work at its Exeter campus, and amongst them are some of the world’s leading experts and the authors and co-authors of pivotal bodies of research such as those of the Intergovernmental Panel on Climate Change (IPCC). Underneath them, both literally and figuratively, sit the huge computers that run the Met Office Unified Model (MetUM), which is managed at the Met Office and adopted in a number of other institutions around the world.

The knowledge generated by the science programme ends up being encoded in the MetUM, a series of programs running into millions of lines of code that take the state of the world’s atmosphere ‘now’, and project it forward into the future.

Bring a few million lines of code into collision with a snapshot of the atmosphere all the way out to the edges of space, and the computer’s job of prediction begins. “The initialisation process is one of the most computationally ‘expensive’ things we do,” says Ewen. “It involves some very tricky maths and there are a lot of feedbacks required in the process.”

A programme of data assimilation calculates a forecast model trajectory that best fits the available observations, but even with a billion points of observation – and there are that many – chaos theory dictates that tiny variations can disrupt the model considerably with the passage of time.

“A number of fundamental equations dominate the way weather works; these were described by an ex-Met Office employee called Lewis Fry Richardson in the 1920s. Today’s weather forecasts are more complex and also have to deal with the challenges of chaos, which were described in meteorology by Edward Lorenz, who used the phrase ‘A flap of a seagull’s wings may forever change the course of the weather’ to explain the impact of chaos on weather. Applying the complex maths at a scale made possible by modern Moore’s Law-driven computers generates data sets that are almost unimaginably massive.”

The word ‘trajectory’ is significant – it is no longer the goal of the Met Office to provide deterministic forecasts – forecasts that say definitively what will happen and when; chaos and Lorenz show this is not possible, especially in longer timeframes. More useful in the real world are probabilistic forecasts that map the likelihood of events, but these require even more sophisticated calculations.

The best way to tackle this is to run the NWP a number of times, with a series of ‘perturbations’ built in around known uncertainties. The ‘snapshot’ of the atmosphere will not be completely accurate, and running the model a number of times, each with slightly different starting conditions, produces simulations that result in different possible outcomes. Clever statistics allow the degree of variance in these outcomes to provide information about the probabilities associated with the weather forecast.
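The idea can be illustrated with a toy sketch. The code below is not the MetUM or any Met Office method: it stands in Lorenz’s well-known three-variable chaotic system for the atmosphere, and the function names, the ensemble size, the noise level and the ‘event’ being forecast are all assumptions made purely for illustration. It perturbs a single starting ‘snapshot’ many times, runs each copy forward, and reports the fraction of runs in which the event occurs.

```python
import random

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz-63 toy system by one 4th-order Runge-Kutta step."""
    def deriv(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def run_model(initial_state, steps):
    """Project one starting state forward in time (the toy 'forecast model')."""
    state = initial_state
    for _ in range(steps):
        state = lorenz_step(state)
    return state

def ensemble_forecast(analysis, n_members=50, noise=0.01, steps=2000, seed=0):
    """Run the model once per member, each from a slightly perturbed start."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    outcomes = []
    for _ in range(n_members):
        perturbed = tuple(v + rng.gauss(0.0, noise) for v in analysis)
        outcomes.append(run_model(perturbed, steps))
    return outcomes

def event_probability(outcomes, event):
    """Fraction of ensemble members in which the event occurs."""
    return sum(1 for o in outcomes if event(o)) / len(outcomes)

# Hypothetical starting 'snapshot' and a hypothetical event (x > 0 at forecast time).
analysis = (1.0, 1.0, 20.0)
members = ensemble_forecast(analysis)
p = event_probability(members, lambda s: s[0] > 0)
print(f"Fraction of members with x > 0 at forecast time: {p:.2f}")
```

Even though every member starts within a tiny distance of the same snapshot, the runs end up scattered widely, which is why the result is best read as a probability rather than a single answer.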

Thus while processing power is important, the defining long-term challenge for the Met Office’s IT is input and output, or I/O, says Ewen. “One of the chief limitations on how big we can make a machine and consequently how reliable a forecast can be, is how much data we can get in and out of these model nodes.”  Just handling the data generated by NWP and using it to generate weather forecasts presents some major challenges.


The supercomputing challenge

It’s not surprising that Ewen says that Dr. Moore keeps him awake at night. “When you are starting out with a very large supercomputer, the IBM POWER 7 that we put in six years ago, this inexorable Moore’s Law-driven change has huge implications.” The current mass storage system in Exeter holds around 60 petabytes of data: the contract recently signed with Oracle and SGI for its replacement will take the capacity up to an exabyte, or 1,000 petabytes by 2018, and Ewen is confident the Met Office will become one of the UK’s first exascale organisations. In Gartner’s 3V definition of big data – variety, volume and velocity – it already qualifies, in spades, on every count.

The business case for the capital grant of £97 million awarded in October 2014 was based on the calculation that the new machine will deliver an estimated £2 billion of socio-economic benefit to the UK over its lifetime through benefits such as improved forecast accuracy. Previous purchasing cycles have been paid for out of the trading fund’s revenues: this grant emphasises the strategic national importance of having the new supercomputer and mass storage system.

After looking at the performance and cost effectiveness of the very few alternatives, a contract was signed with Cray for its XC40 machine, Cray’s biggest ever contract outside the USA. It is already partially installed in the Met Office and the old machine was decommissioned in parallel during September, five weeks ahead of schedule. To go from signing to implementation in five months is unprecedented, and Ewen can’t find words to convey the achievement, in which Met Office Science & Technology teams and specialists at Cray delivered a major IT project ahead of expectations.

Of course there’s much more to it than construction: the MetUM needs to be modified to run on the new machine and countless other adjustments have to be made, with never the luxury of even a day’s downtime. “We broke the programme down into three phases,” he explains. “Phase 1a gives us broadly the same computational capacity as the machine it is replacing. It is that part of the new machine that allows us to turn off and decommission the old machine.

“Phase 1b gives us some more capacity, and 1c is the final tranche of capacity, involving a new data centre being built a mile away on a science park. That will be commissioned in spring 2017 and will give us the final 70 percent of our new computational capacity.” By then its processing power will be 23 petaflops, meaning it can perform 23 quadrillion (23,000 trillion) calculations every second, or more calculations per second than there are grains of sand on every beach in the world!

Even then, the new computer may not feature in the world top ten of high speed computers, a list dominated by machines that are built primarily for speed rather than utility. You wouldn’t expect a delivery truck to outpace a Formula 1 car, but this machine is a delivery truck that will, for a while, leave the race cars in its wake, while still delivering heavy loads with reliability.

Technology pairs with science

Ewen is a man with many hats. As the only technology director on the Executive Board of the Met Office, he also fills the role of Chief Information Officer, with a Chief Technical Officer and Chief Digital Officer reporting to him. His division has a headcount of around 350, though in an organisation like this it is not always helpful to draw a distinction between ‘science’ and ‘technology’. Accordingly he works closely with the Chief Scientist and is wary of the firewall mentality, which reduces IT to a repair and maintenance function to which people come with their problems.

As well as having responsibility for the technology, he also wears the title of Chief Information Risk Owner, a public sector role that involves overseeing information and security matters in a department that, after all, has only recently spun off from under the defence umbrella. We have already mentioned how critical the Met Office’s role is to aviation. It faces all the normal threats of a large IT-based organisation, and in addition has a strategic role to play in the national economy. As it works more with governments and businesses around the world, security will become its USP, and it has to be world class there as in everything else.

Ewen is not exactly a cloud evangelist. At the scale of activity we have been describing, the Met Office clearly needs its own data centre, and the cloud, as he points out, is really just someone else’s data centre. What it does provide, though, is what he calls instant provisioning.

“People are more interested in weather when there is interesting weather about! We get 20 or 30 times more traffic on our website when there is dramatic weather going on than at normal times. You can cope with that in two ways. You can either have an infrastructure that is scaled at your peak anticipated load in which case it will be idle most of the time, or you can come up with an architecture that can scale dynamically to meet current needs.”
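The trade-off Ewen describes can be made concrete with a little arithmetic. The sketch below is purely illustrative: the traffic figures, the per-server capacity and the function names are invented assumptions, not Met Office numbers, apart from the 30-fold surge taken from the upper end of his estimate. It compares the server-hours consumed in a week by an infrastructure sized for the peak with one that scales hour by hour.

```python
import math

def servers_needed(requests_per_sec, capacity_per_server):
    """Smallest number of servers that can carry the load (at least one)."""
    return max(1, math.ceil(requests_per_sec / capacity_per_server))

def server_hours_fixed(hourly_load, capacity_per_server):
    """Provision for the peak hour and keep that many servers running all week."""
    peak = servers_needed(max(hourly_load), capacity_per_server)
    return peak * len(hourly_load)

def server_hours_dynamic(hourly_load, capacity_per_server):
    """Scale the server count each hour to match that hour's load."""
    return sum(servers_needed(load, capacity_per_server) for load in hourly_load)

# Hypothetical week: a steady baseline with a six-hour storm at 30x normal traffic.
baseline = 1_000                      # requests per second (assumed)
week = [baseline] * 168               # 168 hours in a week
for hour in range(60, 66):
    week[hour] = baseline * 30
capacity = 500                        # requests per second per server (assumed)

fixed = server_hours_fixed(week, capacity)
dynamic = server_hours_dynamic(week, capacity)
print(f"Sized for peak: {fixed} server-hours; scaled dynamically: {dynamic}")
```

With these made-up numbers the peak-sized set-up uses 10,080 server-hours against 684 for the dynamic one, which is the idle capacity Ewen is referring to.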

In such a case, cloud provisioning is attractive, and the Met Office was one of the earliest adopters of cloud scaling technology to meet the needs of its public websites. It uses a content distribution network (CDN) from specialist provider Akamai. However ‘moving to the cloud’ is a meaningless concept in the context of an organisation like the Met Office, and one that he finds a bit frustrating. “We see it very much as tools in a toolbox, part of a mixed economy where we make the right choices rather than have a goal to move to the cloud by a certain date.”

Matching methodology to meteorology

This approach to project delivery is a corollary to not tolerating firewalls between departments. “Closer to the application space we use agile techniques, quite highly structured, and Scrum-based methodologies.” For those unfamiliar with the idea, Scrum is a model that involves multiple small but interdisciplinary teams working in an intensive and interdependent way to produce an outcome.

“That is not necessarily bounded by functional requirements. Functional requirements clearly get defined, but they don’t get defined at the first phase; they get defined at a later phase when we know what is realistic, affordable and achievable before we commit to delivery of a set of functional or non-functional requirements.”

‘Waterfall’ methodologies, he explains, would start by creating a well-formed requirements set, the kind of attitude that typifies the failed public sector IT projects in times gone by. “Where that happens it’s often a sign of setting out to build a castle but finding you only had enough bricks for a semi and the customer actually wanted a caravan! I am a believer in iterative development, recognising that you almost certainly do not know what the customer wants because they often don’t know what is possible or what they need!”

In a recent government assessment of 40 departments for ‘green IT’ the Met Office came out as the only one in a leading position for effectiveness and efficiency and has recently won a major UK IT Industry Award. “We work hard,” asserts Ewen, “to make sure IT is not seen as a cost but as a core facilitator for the business. I am constantly looking at how to invest to save and to grow.”

It surprises him to see that there are still organisations that don’t understand that. “It is wrong to see IT as something that should be managed down. At times, there are good strategic reasons for managing it up! My strategy for the last two or three years has been moving IT from being an enabler to being a leader, that is, a position of appropriate shared leadership.”

IT is undoubtedly moving to the top floor in every organisation, having been locked in the basement for too long. We are used to science being part of the leadership process, so it is a short hop to affording IT the same level of participation. Technology is moving too fast to be called on at the end of a decision process: it needs to play a dynamic role from the outset.


Met Office
Charles Ewen