
May 13, 2023

Machine Learning public datasets

Many public datasets are available for machine learning research, experimentation, and model development. Here are some popular sources:

UCI Machine Learning Repository: This is a collection of datasets that cover a wide range of topics, including classification, regression, and clustering. The datasets are available in various formats, including CSV, ARFF, and others.

Kaggle Datasets: Kaggle is a platform for data science competitions and also provides a collection of public datasets. The datasets cover various domains, including computer vision, natural language processing, and tabular data.

Google Dataset Search: Google Dataset Search is a search engine for datasets that allows users to find datasets from a variety of sources, including government agencies, universities, and research institutions.

Open Data on AWS (previously known as AWS Public Datasets): AWS hosts a collection of public datasets that can be used for machine learning and other applications. The datasets cover a range of domains, including genomics, astronomy, finance, healthcare, and transportation.

Data.gov: This is the US government's open data portal, which provides access to thousands of datasets from various government agencies.

Microsoft Research Open Data: This is a collection of datasets from Microsoft Research that cover various domains, including healthcare, education, and social media.
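
Many of the files from these sources come as plain CSV, sometimes without a header row. Here is a minimal sketch of how such a file might be read into features and labels using only the Python standard library. The column names, the `load_dataset` helper, and the inline sample rows are all assumptions made up for illustration; they are not taken from any of the sources above.

```python
import csv
import io
from collections import Counter

# Hypothetical column names: header-less CSV files need them supplied by hand.
COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "label"]

# Toy rows standing in for a downloaded file (illustration only, not real data).
SAMPLE_CSV = """\
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""


def load_dataset(handle, columns):
    """Parse header-less CSV text into a list of feature rows and a list of labels.

    The last column is treated as the label; all others are parsed as floats.
    """
    features, labels = [], []
    for row in csv.reader(handle):
        if not row:
            # Skip blank lines, which often appear at the end of such files.
            continue
        if len(row) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(row)}: {row}")
        features.append([float(value) for value in row[:-1]])
        labels.append(row[-1])
    return features, labels


features, labels = load_dataset(io.StringIO(SAMPLE_CSV), COLUMNS)
class_counts = Counter(labels)

print(len(features), "rows,", len(features[0]), "features per row")
print(dict(class_counts))
```

To try this on a real downloaded file, pass `open(path, newline="")` in place of the `io.StringIO` object and adjust `COLUMNS` to match the dataset's documentation.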