Find Datasets for your Machine Learning Projects

Source and Credit: Rodolfo Mendes https://wordpress.com/read/blogs/157318172/posts/450

The first step in any Machine Learning project is to state clearly the problem you want to solve. Usually, the problem definition comes from a business or a domain such as marketing, finance, medicine, or engineering. Thus, we must state the problem in terms of the business or the domain and then map it to a typical Machine Learning task.

Once we are clear about which problem we are trying to solve, we need to get the data necessary to solve it. Data is the feedstock for Machine Learning models. It matters more than the algorithms you plan to use.

But building a dataset is not a trivial task. In a business environment, we need to reach out to other departments to get access to databases and documents, and then unify the different data sources. For personal projects, this is even more difficult because sometimes we don’t have access to the data we need at all.

So if you are working on a project for your portfolio, it is a good idea to use real, public datasets. Besides saving you the time of collecting and processing data, work on open datasets can provide useful insights into matters of public interest.

In this post, we list some public repositories from which you can download datasets to use in your projects.

Collaborative Repositories

Universities and companies maintain these repositories, and researchers and practitioners from different areas and domains make their data available on them.

Government and political organizations

Governments and organizations keep portals of public datasets about the economy, education, health, agriculture, and other areas of public interest. Several international organizations publish data repositories of this kind.

Many countries also maintain their own open data repositories.

Dataset search

Finally, you can search the Internet for the data you need. The Google Dataset Search tool searches the Web for datasets that match your query.

Conclusion

Good Machine Learning projects start with quality data. Use the lists and tools above to find the data you need for your project. You can also browse the repositories to find ideas for new projects.
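Since quality matters, a first look at a downloaded file is usually worthwhile before any modeling. The following is a minimal sketch of such a check, assuming the dataset comes as a CSV file with a header row. The function name `summarize_csv` and the sample rows are made up for illustration and do not come from any real repository.

```python
import csv
import io


def summarize_csv(text_stream):
    """Return row count, column names, and missing-value counts per column."""
    reader = csv.DictReader(text_stream)
    columns = list(reader.fieldnames or [])
    missing = {name: 0 for name in columns}
    rows = 0
    for row in reader:
        rows += 1
        for name in columns:
            value = row.get(name)
            # Empty strings and absent fields (short rows) count as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return {"rows": rows, "columns": columns, "missing": missing}


# Hypothetical sample standing in for a downloaded file (made-up values).
sample = io.StringIO(
    "city,year,population\n"
    "Alpha,2020,1000\n"
    "Beta,,2500\n"
    "Gamma,2021,\n"
)

summary = summarize_csv(sample)
print(summary)
```

For a real file, you could pass an open file object instead of the in-memory sample, for example one opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever you saved the download. Columns with many missing values are a hint that the dataset may need cleaning, or that another source might serve the project better.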
