52 datasets for training projects

  1. Mall Customers Dataset - information about shopping mall visitors: id, gender, age, income, spending score. (Use case: Customer Segmentation Project with Machine Learning)
  2. Iris Dataset - a dataset for beginners containing the sepal and petal sizes of different flowers.
  3. MNIST Dataset - a set of handwritten digits: 60,000 training images and 10,000 test images.
  4. Boston Housing Dataset - a well-known dataset. It contains information about housing in Boston: number of homes, rental prices, crime index.
  5. Fake News Detection Dataset - contains 7,796 records of news items labeled as real or fake. (Use case with Python source code: Fake News Detection Python Project)
  6. Wine Quality Dataset - contains information about wine: 4,898 records with 14 parameters.
  7. SOCR Data - Heights and Weights Dataset - a good option to start with. It contains 25,000 records of the height and weight of 18-year-olds.

  8. Parkinson Dataset - 195 records of patients with Parkinson's disease, with 25 analysis parameters. It can be used for a preliminary assessment of the differences between patients and healthy people. (Use case with Python source code: Machine Learning Project on Detecting Parkinson's Disease)
  9. Titanic Dataset - contains information about the passengers (age, gender, relatives on board, and so on): 891 in the training set and 418 in the test set.
  10. Uber Pickups Dataset - information about 4.5 million Uber trips in 2014 and 14 million in 2015. (Use case with R source code: Uber Data Analysis Project in R)
  11. Chars74K Dataset - contains images of English and Kannada characters in 64 classes: 0-9, A-Z, a-z. It has 7.7k characters from natural images, 3.4k hand-drawn characters, and 62k characters synthesized from computer fonts.
  12. Credit Card Fraud Detection Dataset - contains information about credit card transactions that were flagged as fraudulent. (Use case with source code: Credit Card Machine Learning Project)
  13. Chatbot Intents Dataset - a JSON file containing various tags: greetings, goodbye, hospital_search, pharmacy_search, and so on. It contains response templates for questions. (Use case with Python source code: Chatbot Project in Python)
  14. Enron Email Dataset - contains half a million emails from 150 Enron managers.
  15. Yelp Dataset - contains 1.2 million tips from 1.6 million users covering 1.2 million businesses.
  16. Jeopardy Dataset - an archive of more than 200,000 questions and answers from the popular television quiz show.
  17. Recommender Systems Dataset - a portal with a repository of datasets from UCSD. It contains review histories from popular sites (Goodreads, Amazon). Well suited to building recommender systems. (Use case with R source code: Movie Recommendation System Project in R)
  18. UCI Spambase Dataset - a training set for spam detection. It contains 4,601 emails with 57 metadata attributes.
  19. Flickr 30k Dataset - more than 30,000 images with captions. (Flickr 8k Dataset - 8,000 images. Use case with Python source code: Image Caption Generator Python Project)
  20. IMDB Reviews - 25,000 movie reviews in the training set and 25,000 in the test set. (Use case with R source code: Sentiment Analysis Data Science Project)
  21. MS COCO Dataset - contains about 1.5 million annotated object instances.
  22. CIFAR-10 and CIFAR-100 Datasets - CIFAR-10 contains 60,000 small 32×32-pixel images in 10 classes (labels 0-9). CIFAR-100 correspondingly has 100 classes.
  23. GTSRB (German Traffic Sign Recognition Benchmark) Dataset - about 50,000 images of 43 traffic signs. (Use case with Python source code: Traffic Signs Recognition Python Project)
  24. ImageNet Dataset - contains more than 100,000 phrases and about 1,000 images per phrase.
  25. Breast Histopathology Images Dataset - contains images of breast cancer specimens. (Use case with source code: Breast Cancer Python Project)
  26. Cityscapes Dataset - contains high-quality annotations of street video sequences from different cities.
  27. Kinetics Dataset - contains URL links to about 6.5 million high-quality videos.
  28. MPII Human Pose Dataset - contains 25,000 images of human poses with joint annotations.
  29. 20BN-Something-Something Dataset v2 - high-quality videos showing how a person performs certain actions.
  30. Object 365 Dataset - a set of high-quality images with bounding boxes for objects.
  31. Photo Sketching Dataset - contains more than 1,000 images and their sketches.
  32. CQ500 Dataset - contains 491 head CT scans with 193,317 slices.
  33. IMDB-Wiki Dataset - contains more than 5 million images labeled with gender and age. (Use case with source code: Gender & Age Detection Python Project)
  34. Youtube 8M Dataset - a labeled video dataset containing 6.1 million YouTube video IDs.
  35. Urban Sound 8K Dataset - a set of urban sounds (8,732 urban sounds from 10 classes).
  36. LSUN Dataset - a set of millions of images of scenes and objects (about 59 million images, 10 scene categories, and 20 object categories).
  37. RAVDESS Dataset - an audiovisual database of emotional speech. (Use case with source code: Speech Emotion Recognition Python Project)
  38. Librispeech Dataset - contains 1,000 hours of English speech with various accents.
  39. Baidu Apolloscape Dataset - a dataset for developing self-driving technologies.
  40. Quandl Data Portal - a repository of economic and financial data (some free, some paid).
  41. World Bank Open Data Portal - data on loans issued by the World Bank to developing countries.
  42. IMF Data Portal - the International Monetary Fund's portal, which publishes data on international finance, debt rates, investment, foreign exchange reserves, and commodities.
  43. American Economic Association (AEA) Data Portal - a resource for finding US macroeconomic data.
  44. Google Trends Data Portal - Google trends data that can be used for exploration and analysis.
  45. Financial Times Market Data Portal - a resource for up-to-date information on financial markets around the world.
  46. Data.gov Portal - the US government's open data portal (agriculture, health, climate, education, energy, finance, science and research, and so on).
  47. Data Portal: Open Government Data (India) - the Indian government's open data platform.
  48. Food Environment Atlas Data Portal - contains research data on nutrition in the United States.
  49. Health Data Portal - the portal of the US Department of Health and Human Services.
  50. Centers for Disease Control and Prevention Data Portal - contains health-related data.
  51. London Datastore Portal - data about the lives of people in London.
  52. Canada Government Open Data Portal - open data about Canadians (agriculture, art, music, education, government, health, and so on).
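
To give a feel for how one of these datasets might be used, here is a minimal sketch built around item 13, the Chatbot Intents Dataset. The list only says that the JSON file holds tags and response templates; the exact schema used below (`intents`, `tag`, `patterns`, `responses`), the sample entries, and the keyword-matching rule are all assumptions made for illustration, not the layout of the real file or the method of the linked project.

```python
import json
import random
import re

# Hypothetical sample in the spirit of an intents file. The key names
# ("intents", "tag", "patterns", "responses") are assumed, not taken
# from the actual dataset.
INTENTS_JSON = """
{
  "intents": [
    {"tag": "greetings", "patterns": ["hi", "hello"],
     "responses": ["Hello!", "Hi there!"]},
    {"tag": "goodbye", "patterns": ["bye", "goodbye"],
     "responses": ["See you later!", "Goodbye!"]},
    {"tag": "hospital_search", "patterns": ["hospital"],
     "responses": ["Please share your location to find a hospital."]},
    {"tag": "pharmacy_search", "patterns": ["pharmacy"],
     "responses": ["Please share your location to find a pharmacy."]}
  ]
}
"""


def load_intents(raw):
    """Parse the JSON text and index the intents by their tag."""
    data = json.loads(raw)
    return {intent["tag"]: intent for intent in data["intents"]}


def match_tag(message, intents):
    """Return the first tag whose keyword appears as a whole word, else None."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    for tag, intent in intents.items():
        if any(pattern in words for pattern in intent["patterns"]):
            return tag
    return None


def reply(message, intents):
    """Pick a random response template for the matched tag."""
    tag = match_tag(message, intents)
    if tag is None:
        return "Sorry, I did not understand that."
    return random.choice(intents[tag]["responses"])


if __name__ == "__main__":
    intents = load_intents(INTENTS_JSON)
    print(reply("Where is the nearest hospital?", intents))
```

A real chatbot project would normally replace the keyword lookup with a trained text classifier; the lookup is used here only to keep the sketch short and dependent on the standard library alone.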


Source: www.habr.com
