AWS Public Sector Blog

34 new or updated datasets on the Registry of Open Data: New data for land use, Alzheimer’s Disease, and more


The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on Amazon Web Services (AWS). AWS works with data providers to democratize access to data by making it available to the public for analysis on AWS; develop new cloud-native techniques, formats, and tools that lower the cost of working with data; and encourage the development of communities that benefit from access to shared datasets. Through this program, customers are making over 100PB of high-value, cloud-optimized data available for public use.

The full list of publicly available datasets is on the Registry of Open Data on AWS and is now also discoverable on AWS Data Exchange. This quarter, AWS released 34 new or updated datasets from Impact Observatory, The Allen Institute for Brain Science, Common Screens, and others, all of which are available now on the Registry of Open Data.

10m Annual Land Use Land Cover (9-class)

10m Annual Land Use Land Cover (9-class) is a global map of land use/land cover (LULC) derived from the European Space Agency’s (ESA) Sentinel-2 satellite imagery at 10m resolution for the years 2017-2021. Each map provides an annual classification of built area, crops, trees, water, rangeland, flooded vegetation, snow/ice, and bare ground, produced by applying a deep learning artificial intelligence (AI) land classification model to over 400,000 Sentinel-2 satellite images of Earth per year. LULC datasets like this let users measure changes over time and inform decision makers in governments, non-governmental organizations (NGOs), finance, and industry who need trustworthy, actionable information about the changing world.
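To illustrate what "measuring changes over time" can look like in practice, here is a minimal sketch that compares two annual classification grids and counts how many pixels moved from one class to another. The tiny grids, the use of class names as plain strings, and the helper names `class_transitions` and `changed_fraction` are all assumptions made for illustration; the real dataset is distributed as raster files with its own class encoding, which this sketch does not attempt to read.

```python
from collections import Counter


def class_transitions(before, after):
    """Count pixels by (class in earlier year, class in later year).

    Both arguments are hypothetical 2D grids (lists of rows) of class labels.
    """
    if len(before) != len(after) or any(
        len(row_b) != len(row_a) for row_b, row_a in zip(before, after)
    ):
        raise ValueError("grids must have the same shape")
    counts = Counter()
    for row_before, row_after in zip(before, after):
        counts.update(zip(row_before, row_after))
    return counts


def changed_fraction(transitions):
    """Return the share of pixels whose class differs between the two years."""
    total = sum(transitions.values())
    changed = sum(n for (old, new), n in transitions.items() if old != new)
    return changed / total if total else 0.0


# Toy 2x3 grids standing in for the same area in two different years.
grid_2017 = [
    ["trees", "trees", "crops"],
    ["trees", "water", "crops"],
]
grid_2021 = [
    ["trees", "built area", "crops"],
    ["crops", "water", "built area"],
]

transitions = class_transitions(grid_2017, grid_2021)
for (old, new), n in sorted(transitions.items()):
    if old != new:
        print(f"{old} -> {new}: {n} pixel(s)")
print(f"changed fraction: {changed_fraction(transitions):.2f}")
```

At 10m resolution each pixel covers roughly 100 square meters, so a transition count like this can be converted to an approximate area of change by multiplying by the pixel area.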

The Seattle Alzheimer’s Disease (SEA-AD) Study

Understanding neurodegenerative disease requires data that describes patients along several dimensions: clinical assessments, imaging, and molecular attributes all contribute. The Seattle Alzheimer’s Disease Study is a rich multimodal dataset for 85 Alzheimer’s Disease patients that provides unprecedented insights into the disease process. Available data includes digital neuropathology, single-cell transcriptomic data, and single-cell chromatin accessibility data, as well as basic clinical and demographic data.

Common Screens 

The Common Screens project is an expanding corpus of over 55 million screenshots from over 70 million websites on the Internet. Website screenshots support machine learning (ML) applications such as classification to identify malicious websites, parked domains, specific kinds of content, and design themes, among others. Along with screenshots, the project includes English-language optical character recognition (OCR) text and a collection of ML models.

Here is the full list of datasets released or significantly updated this quarter, joining over 350 datasets already available:

Agriculture:

Astronomy:

Climate and weather:

Internet and networking:

Geospatial:

Life sciences:

Machine learning:

Statistical and regulatory:

Learn more about AWS for open data

Looking to make your data available? The AWS Open Data Sponsorship Program covers the cost of storage for publicly available high-value, cloud-optimized datasets. We work with data providers who seek to democratize access to data by making it available for analysis on AWS; to develop new cloud-native techniques, formats, and tools that lower the cost of working with data; and to encourage the development of communities that benefit from access to shared datasets. Learn how to propose your dataset to the AWS Open Data Sponsorship Program.

Learn more about open data on AWS.

