Tuesday, July 7, 2020

Find Datasets for your Machine Learning Projects

Source and Credit: Rodolfo Mendes https://wordpress.com/read/blogs/157318172/posts/450

The first step in any Machine Learning project is to state clearly the problem you want to solve. Usually, the problem definition comes from a business or a domain such as marketing, finance, medicine, or engineering. Thus, we must state the problem in terms of the business or the domain and then map it to a typical Machine Learning task.

Once we are clear about which problem we are trying to solve, we need to get the data necessary to solve it. Data is the feedstock for Machine Learning models. It matters more than the algorithms you plan to use.

But building a dataset is not a trivial task. In a business environment, we need to reach out to other departments to get access to databases and documents, and then unify the different data sources. For personal projects, this is even more difficult because sometimes we don’t have access to the data we need at all.

So if you are working on a project for your portfolio, it is a good idea to use real, public datasets. Besides saving you the time of collecting and processing data, your work on open datasets can provide useful insights on matters of public interest.

In this post, we list some public repositories from which you can download datasets to use on your projects.

Collaborative Repositories

Universities and companies maintain these repositories. Researchers and practitioners from different areas and domains make their data available on these repositories:

Government and political organizations

Governments and organizations keep portals of public datasets about the economy, education, health, agriculture, and other areas of public interest. Below are data repositories from some international organizations:

Also, many countries maintain open data repositories:

Dataset search

Finally, you can search the Internet for the data you need. The Google Dataset Search tool lets you search the Web for datasets:


Good Machine Learning projects start with quality data. Use the lists and tools above to find the data you need for your project. Also, you can navigate and explore the repositories to find ideas for new projects.
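
Once you have downloaded a dataset from one of these sources, a quick first look can tell you whether it is usable for your project. The snippet below is only a minimal sketch, not a recommended workflow: it assumes the dataset is a CSV file with a header row, uses only the Python standard library, and the function name summarize_csv and the sample rows are hypothetical, made up for illustration.

```python
import csv
import io
from collections import Counter


def summarize_csv(text_stream):
    """Return the row count, column names, and number of empty cells per column."""
    reader = csv.DictReader(text_stream)
    columns = list(reader.fieldnames or [])
    missing = Counter()
    rows = 0
    for row in reader:
        rows += 1
        for col in columns:
            value = row.get(col)
            # Count cells that are absent or contain only whitespace.
            if value is None or value.strip() == "":
                missing[col] += 1
    return {"rows": rows, "columns": columns, "missing": dict(missing)}


# Hypothetical in-memory sample standing in for a downloaded file.
sample = io.StringIO(
    "age,income,city\n"
    "34,52000,Lisbon\n"
    ",61000,Porto\n"
    "29,,\n"
)
summary = summarize_csv(sample)
print(summary)
```

For a real file, you could pass an open file handle instead of the sample, for example open("dataset.csv", newline="", encoding="utf-8"), where dataset.csv is a placeholder for whatever file you downloaded.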
