
Update from the Real Time Cloud Working Group dataset – release 1.1 


As we’ve mentioned before, we are members of the industry body the Green Software Foundation, and as part of our technical outreach work we contribute to a number of their working groups: from leading standardisation projects with SCI-Web, to shaping briefings in the policy working group, to working on datasets in the Real Time Cloud working group. This last project is focused on creating an open dataset tracking how “clean” the energy used by the largest hyperscale providers, like Microsoft, Google and Amazon, is in their various cloud regions all around the world. There’s a new release out, and in this post, Director of Technology and Policy Chris Adams explains what’s in the latest dataset, where it can be used, and what’s next for the project.

What’s in the update?

The biggest change is that this dataset now collates data up to 2024, instead of 2023. This is the latest full year of data published by the big cloud providers in their own sustainability reports, data releases and so on, and it compiles figures for every cloud computing region they make available to customers.

For the purposes of this dataset, you can think of a cloud computing region as a cluster of physical datacentres in a given part of the world that shows up as a single logical ‘place’ you might choose to deploy software into when you use cloud services. You can see a concrete example of this in our recent post, where we show what Microsoft’s NorthEurope region really looks like on a map, to the west of Dublin, Ireland.

Where data has not yet been published by a cloud provider directly (and some are more proactive than others about doing so), it is supplemented by data from other Green Software Foundation members, Electricity Maps and WattTime, to show how ‘clean’ the electricity grid they run on is. This means you don’t need to wait for a disclosure from the big cloud providers themselves to get an idea of the likely impact of using each region.

Electricity Maps and WattTime have contributed average carbon intensity and marginal carbon intensity values, respectively, for the electricity grids in every part of the world containing a cloud computing region.

Wait, what is this marginal and average stuff? Why are there two ways to measure how green the energy is?

It turns out there are more than two ways to measure how ‘green’ energy is at a given moment in time or place, and each has its own specific uses.

The Real Time Cloud project’s dataset guidance has more detail, but briefly: if you want to follow a standardised way of measuring the carbon intensity of software, using the approaches accepted by the ISO Software Carbon Intensity (SCI) standard, you need to use one of these two methods to convert electricity use into carbon emissions in your calculations.
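To make that conversion concrete, here is a minimal sketch of the SCI formula in Python. The carbon intensity figure is exactly where the dataset’s average (Electricity Maps) or marginal (WattTime) values would plug in; all the numbers below are illustrative, not real measurements.

```python
# Sketch of the ISO SCI formula: SCI = (E * I + M) per R, where
# E = energy consumed (kWh), I = grid carbon intensity (gCO2e/kWh),
# M = embodied emissions (gCO2e), R = the functional unit.

def sci(energy_kwh: float, intensity_g_per_kwh: float,
        embodied_g: float, functional_units: float) -> float:
    """Software Carbon Intensity in gCO2e per functional unit."""
    return (energy_kwh * intensity_g_per_kwh + embodied_g) / functional_units

# Illustrative numbers: 120 kWh of energy, a grid intensity of
# 350 gCO2e/kWh, 5,000 g of embodied emissions, amortised over
# 10,000 API calls as the functional unit.
print(sci(120, 350, 5_000, 10_000))  # gCO2e per API call
```

Swapping an average intensity value for a marginal one changes only the `intensity_g_per_kwh` input, which is why the choice between the two matters for the final number.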

Wait, it’s 2026! Why is there only data up to 2024?

This dataset is created only from data that has been published into the public domain by the companies themselves.

Generally speaking, most companies publish their sustainability reports about the last full year, and Amazon, Google and Microsoft have an established pattern of publishing each summer. This means that when they published their sustainability reports in the summer of 2025, those reports referred to data about the preceding year, 2024. We’ll likely need to wait until summer 2026 for information about 2025.

Some information can only come from the big cloud providers, as only they have access to information about the energy use in each of their facilities. However, not all of it needs to come from them.

If you don’t want to wait a full year for data, our latest release of CO2.js now contains the same annual carbon intensity data for every grid region in the world for 2025, thanks to a collaboration with Electricity Maps that we recently wrote about in our announcement post.
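CO2.js itself is a JavaScript library, but the idea of per-grid-zone annual intensity data is easy to sketch in a few lines of Python. The zone codes and figures below are made-up placeholders to show the shape of such a lookup, not the real values shipped in CO2.js.

```python
# Illustrative only: annual average grid intensity in gCO2e/kWh,
# keyed by grid zone code. These values are hypothetical placeholders,
# not the real figures published by Electricity Maps or CO2.js.
ANNUAL_INTENSITY_2025 = {
    "IE": 290.0,  # Ireland (hypothetical value)
    "SE": 25.0,   # Sweden (hypothetical value)
}

def estimate_emissions(zone: str, energy_kwh: float) -> float:
    """Rough gCO2e estimate for a workload, given its zone and energy use."""
    return energy_kwh * ANNUAL_INTENSITY_2025[zone]

# The same 40 kWh workload lands very differently on different grids:
print(estimate_emissions("IE", 40.0))
print(estimate_emissions("SE", 40.0))
```

This is the kind of quick, grid-level estimate you can make today, without waiting for a provider’s own per-region disclosures.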

What is missing?

This dataset currently contains information about the three largest providers: Microsoft with their Azure cloud, Amazon with AWS (Amazon Web Services) and Google with GCP (Google Cloud Platform).

At the time of publishing, the new 1.1 dataset was released without the latest region data from Microsoft, who had contributed to the project before but had not yet contributed data for 2024.

The Real Time Cloud working group works in the open on GitHub, and a meeting takes place every two weeks, with minutes published shortly after. As the minutes suggest, there are some promising signs that Microsoft is re-engaging with the project, and that the data that would have made it into this 1.1 release will be published as an extra add-on “1.1.1” release.

If this data really does make it into the public domain, we’ll post an update, and make it accessible on datasets.greenweb.org.

Where next for the project?

This project has been running for two years, and over the next year there are a few potential areas for further development. We’ll list a few that have caught our eye:

More recent data for existing providers in the dataset

The next set of sustainability reports and data from the big three providers are expected in the summer of 2026, for the year of 2025. Assuming data is disclosed as required, this should contain the information necessary to add official per-region data for 2025.

Representing changes in how datacentres are powered

One of the big challenges in compiling a dataset that tracks how clean the power used by large providers like these is, is representing the industry trend of using much more on-site generation than before.

In an earlier post, we showed how Microsoft’s Irish northeurope region now has 170MW of gas-fired reciprocating engine generators installed, and a permit to run them daily. Last week, we saw a new announcement that Nscale, one of the new ‘neocloud’ AI datacentre providers, is building a massive new datacentre campus powered by 1.3GW of on-site gas generation, for use by Microsoft, its main customer. For context, this one facility would be about the same size as half of the entire UK’s current datacentre capacity.

Elsewhere, we are now seeing announcements of projects like a recent one from Google to run a datacentre in Minnesota on 300MW of iron-air batteries capable of running for more than four days, but also a project to finance a new gas power plant with carbon capture and storage to power a new datacentre in Illinois.

Expanding support to cover new providers

Faced with this, it makes sense to expand coverage as well, to include other large providers, like Oracle, but also the new set of “neoclouds”, like CoreWeave, Nebius, Nscale and so on.

There is now a standardised schema to publish against, based on two years of tracking what data is normally disclosed by large cloud providers, and what disclosures are now required by law in places with relevant legislation. If you work for a cloud provider and you’re interested in publishing data following the schema, or if you’re a purchaser looking for a standardised set of data to request in your tenders, drop us a line – we’d love to work with you.
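As a rough illustration of what publishing against a schema like this could look like, here is a minimal sketch in Python. The field names and values below are hypothetical guesses for illustration, not the working group’s actual schema.

```python
# Illustrative only: a minimal per-region, per-year disclosure record.
# Field names are hypothetical, not the actual GSF Real Time Cloud schema.
from dataclasses import dataclass

@dataclass
class RegionDisclosure:
    provider: str          # e.g. "example-cloud" (hypothetical)
    region: str            # the provider's region identifier
    year: int              # reporting year
    cfe_percent: float     # carbon-free energy share, 0-100
    grid_intensity: float  # average gCO2e/kWh for the local grid

def validate(d: RegionDisclosure) -> bool:
    """Basic sanity checks a provider could run before publishing a row."""
    return 0.0 <= d.cfe_percent <= 100.0 and d.grid_intensity >= 0.0

row = RegionDisclosure("example-cloud", "eu-example-1", 2024, 62.5, 310.0)
print(validate(row))
```

A shared record shape like this is what makes it possible for purchasers to ask every provider, large or small, for the same set of numbers in a tender.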