Resources: Datasets

 

Index
Resources
* Bibliography
* Courses
* Datasets
* Links
* Showcases
* Software

List of datasets

To update your dataset's information, use the 'Update' button on the right-hand side of the table. Click a table header to sort by that column. Click the white arrows on the left-hand side to jump back to the top of the page.

Add your datasets to the database.

Thanks to Dimitar Kazakov (University of York, UK), Lubos Popelinsky and Olga Stepankova (Faculty of Electrical Engineering, CTU Prague, Czech Republic) for annotating and contributing most of the ILP datasets!

 

 

Category | Type | Name | Application domain | WWW | Complexity | Format | Contact person | Groups | Updated
Miscellaneous | ILP | Geographic data: ilpgeo | Geographic data | http://www.fi.muni.cz/kd/... | ~100 facts, positive examples only + background knowledge: predicates for the relevant domains as well as basic arithmetic predicates | Prolog | - | - | Apr 10th, 2003
Modelling, Diagnosis, Control | ILP | BLEARN: Learning Relational Concepts from Sensor Data of Mobile Robots | Learning Relational Concepts from Sensor Data of a Mobile Robot | http://www-ai.cs.uni-dort... | - | first order logic | data available | data available | Jan 28th, 2003
- | - | A L: Ailin Liu | nerve | http://www.imm.ac.cn | - | - | - | data available | Mar 4th, 2003
text classification | - | generic zocor: generic zocor | Anti-spam email filtering | http://www.iit.demokritos... | 4314 KBytes | "encrypted" ASCII text | - | - | Jun 9th, 2006
- | - | ala: ala eldin misbah mhammed | - | - | - | - | - | - | Aug 14th, 2003
information retrieval | - | ala: ala elden misbah | - | - | - | - | - | - | Oct 18th, 2003
- | - | shiying: shiying | it | - | - | - | data available | data available | Jan 9th, 2004
text classification | - | PU1: The PU1 anti-spam filtering corpus | Anti-spam email filtering | http://www.iit.demokritos..., http://www.aueb.gr/users/... | 4314 KBytes | "encrypted" ASCII text | data available | data available | May 4th, 2004
Web Mining | Multi-Instance Learning | MilWeb: Data for Multi-Instance Learning Based Web Index Recommendation | web application | http://lamda.nju.edu.cn/d... | 30.2 Mb | ZIP | data available | data available | Jun 27th, 2004
Modelling, Diagnosis, Control | ILP | BLEARN: Learning Relational Concepts from Sensor Data of Mobile Robots | Learning Relational Concepts from Sensor Data of a Mobile Robot | http://www-ai.cs.uni-dort... | - | first order logic | data available | data available | Dec 8th, 2004
Chess | ILP | KRK: King-Rook-King (exact data) | Learning Rules for King+Rook vs. King Endgame | http://www.comlab.ox.ac.u... | 184 KB (tar, gzip) | Golem | - | - | Apr 21st, 2005
Information Extraction | Doc. collection | RISE: Repository of online Information Sources used in information Extraction tasks | RISE is a distributed repository of online information sources that are used for the empirical analysis of [machine] learning algorithms that generate extraction patterns | http://www.isi.edu/~musle... | 10 datasets with several hundred html pages each | RISE format (text + extraction | - | - | Aug 12th, 1999
Modelling, Diagnosis, Control | ILP | BLEARN: Learning Relational Concepts from Sensor Data of Mobile Robots | Learning Relational Concepts from Sensor Data of a Mobile Robot | http://www-ai.cs.uni-dort... | - | first order logic | data available | data available | Aug 12th, 1999
Modelling, Diagnosis, Control | ILP | UTUBE | Learning qualitative models from example behaviors | ftp://ftp.mlnet.org/ml-ar... | 4 positive and 543 negative examples | Prolog (mFOIL format) | data available | data available | Sep 7th, 1999
Computer | ILP | Fossil: Fossil.xiang | Learning Rules for Predicting Mutagenesis | http://www.comlab.ox.ac.u... | 131 KB compressed file with positive and negative examples for the subsets of 188 and 42 compounds | Progol | data available | data available | Jan 30th, 2004
Molecular Biology | ILP | Proteins | Learning Rules for Predicting Protein Secondary Structure | http://www.comlab.ox.ac.u... | 46 KB (tar, gzip) | Golem | data available | data available | Jun 25th, 2002
Molecular Biology | ILP | Drugs | Learning Drug Structure-Activity Rules for Alzheimer's disease | http://www.comlab.ox.ac.u... | 37 KB | Prolog | - | - | Mar 13th, 2006
Molecular Biology | ILP | drugs | Learning Drug Structure-Activity Rules for Suramin Analogues | http://www.comlab.ox.ac.u... | 23 KB | Progol | - | - | Aug 12th, 1999
Mesh Design | ILP | Mesh ggg: Finite Element Mesh Design (Partial Dataset) | Finite Element Mesh Design | http://www.comlab.ox.ac.u... | 57 KB (tar, gzip) | Golem | - | - | Apr 21st, 2006
Mesh Design | ILP | Mesh: Finite Element Mesh Design (Complete Dataset) | Finite Element Mesh Design | ftp://ftp.mlnet.org/ml-ar... | 642 positive + 3804 background examples | Prolog facts | - | - | Mar 25th, 2005
Language | ILP | Geoquery | Geoquery | ftp://ftp.cs.utexas.edu/p... | 250 facts | Prolog | - | - | Aug 12th, 1999
Language | ILP | English Past Tense | English past tense | ftp://ftp.cs.utexas.edu/p... | 1392 facts (in the largest alph. past data) | Prolog | - | - | Aug 12th, 1999
Language | ILP | Document understanding | Document understanding | ftp://ftp.mlnet.org/ml-ar... | 250 training and 120 test instances approximately | FOCL format | - | - | Oct 19th, 2000
Modelling, Diagnosis, Control | ILP + prop. data | Stagedata | Reverse Engineering for Flight Simulator | ftp://www.mlnet.org/ml-ar... | 160K/real world data | Prolog/Attribute-value pairs | - | - | Sep 7th, 1999
Modelling, Diagnosis, Control | ILP | Satellite | Learning Rules for Qualitative Models of Satellite Power Supplies | http://www.comlab.ox.ac.... | 78 KB (tar, gzip) | Golem | - | - | Aug 12th, 1999
Chess | ILP | KRK: King-Rook-King (exact + noisy data) | Learning Rules for King+Rook vs. King Endgame | http://www.comlab.ox.ac.u... | 5 sets of 1000 examples each and 5 sets of 100 examples each. Variants of the latter with three different types and six different levels of noise | Golem | - | - | Mar 8th, 2002
Samples from ILP systems | ILP | LINUS | LINUS | ftp://ftp.mlnet.org/ml-ar..., ftp://ftp.mlnet.org/ml-ar... | small examples | Linus | data available | data available | May 15th, 2000
Samples from ILP systems | ILP | FORTE: First Order Revision of Theories from Examples | Forte (First Order Revision of Theories from Examples) | http://www.cs.utexas.edu/..., ftp://ftp.cs.utexas.edu/p... | simple sample datasets | Forte format | - | - | Aug 12th, 1999
Samples from ILP systems | ILP | SkilIt | SkilIt (Recursive Theories) | http://www.ncc.up.pt/~amj... | small examples | Prolog | - | data available | Aug 12th, 1999
Samples from ILP systems | ILP | HAIKU | HAIKU | - | scholar examples (3 concepts) | Prolog | - | data available | Aug 12th, 1999
Samples from ILP systems | ILP | WiM | WiM | - | the minimal sets of the worst possible examples such that WiM can learn the target predicate | Prolog | - | - | Aug 12th, 1999
Miscellaneous | ILP | Spatial Layout Data | Spatial Layout Data for GKS System | ftp://mizo01.ia.noda.sut.... | 174 positive and 214 negative examples, 1500 clauses of background knowledge | Prolog | - | - | Aug 12th, 1999
Miscellaneous | ILP | East-West Challenge | East-West/Challenge | ftp://ftp.mlnet.org/ml-ar... | 20/24/100 trains data | Prolog | - | data available | Oct 19th, 2000
Miscellaneous | ILP | Holiday Planning | Holiday Planning | ftp://ftp.informatik.hu-b... | 1470 examples | Prolog | - | - | Aug 12th, 1999
Miscellaneous | ILP | Geographic data | Geographic data | - | ~100 facts, positive examples only + background knowledge: predicates for the relevant domains as well as basic arithmetic predicates | Prolog | - | - | Aug 12th, 1999
Learning from WWW | ILP + Doc. collection | Four Universities: WebKB - Four universities data set | - | http://www.cs.cmu.edu/afs... | 8,282 web pages (28MB), ILP data (26.5MB) | raw html data, FOIL | - | data available | Sep 7th, 1999
miscellaneous | propositional | Regression DataSets: Repository of Regression DataSets for Propositional Learning Algorithms | various | http://www.ncc.up.pt/~lto... | - | C4.5-like | data available | data available | Sep 29th, 1999
Information Retrieval | Doc. collection | OHSUMED: OHSUMED test collection | Information retrieval on clinically-oriented MEDLINE subset (1987-1991) | ftp://medir.ohsu.edu/pub/... | 348,566 references, approx. 400 MByte | - | - | - | Oct 20th, 1999
Information Retrieval | Doc. collections | TREC: Text REtrieval Conference test collections | - | http://trec.nist.gov/data... | 5 disks, approx. 250,000 documents / 800MBytes each | SGML | - | - | Oct 20th, 1999
Information Retrieval | Doc. collections | CFD: Cystic Fibrosis Database | Information retrieval on articles about cystic fibrosis | ftp://ils.unc.edu/pub/res... | 1239 documents, approx. 5MByte | Marked-up text | - | data available | Oct 20th, 1999
Information Retrieval | Doc. collections | SMART: SMART test collections | - | ftp://ftp.cs.cornell.edu/... | - | Marked-up text | - | - | Oct 20th, 1999
Text Categorization | Doc. collection | Reuters: Reuters-21578 Text Categorization Test Collection | - | http://www.research.att.c... | 21578 documents, 28 MByte | SGML | - | - | Dec 14th, 1999
- | - | Web Advertisements: Feature vectors characterizing images from actual web pages and their classification (advertisement or non-advertisement) | Trainable Web browsing assistants | http://www.cs.ucd.ie/staf... | 10MB file, 3279 instances; 1558 attributes | "zip" file format | data available | data available | Mar 24th, 2003
Data Warehouse | - | kdd-sisyphus | Life-Insurance Data Warehouse extract. NOT preprocessed data for KDD, several relations | http://research.swisslife... | - | Prolog Facts | data available | - | Oct 19th, 2000
KDD | DB tables | PKDD99: PKDD99 Discovery Challenge | Financial (clients of a bank, their accounts, loans etc.); Medical (patients with thrombosis) | http://lisp.vse.cz/pkdd99... | Financial data: 8 tables, 18M zipped, 67M unzipped; Medical data: 3 tables, 2M zipped, 6M unzipped | text files | data available | data available | Aug 19th, 2002
text classification | - | generic zocor: generic zocor | Anti-spam email filtering | http://www.iit.demokritos..., http://www.aueb.gr/users/... | 13994 KBytes | tar-gzipped ASCII text | - | - | Jun 9th, 2006
text classification | - | PU1: The PU1 anti-spam filtering corpus | Anti-spam email filtering | http://www.iit.demokritos..., http://www.aueb.gr/users/... | 4314 KBytes | "encrypted" ASCII text | data available | data available | Nov 13th, 2003
KDD | - | KDD-Cup 2000 | web clickstreams and purchase transactions | http://www.ecn.purdue.edu... | - | - | - | data available | Oct 19th, 2000
Miscellaneous | - | dmef: dmef-data set | consumer behaviour | http://www.the-dma.org/dm... | - | - | - | - | May 14th, 2002
Learning from WWW | - | Web Ad: Web Advertisements | Trainable Web browsing assistants | http://www.cs.ucd.ie/staf... | 10MB file, 3279 instances; 1558 attributes | "zip" file format | data available | data available | Jun 13th, 2001
Language | ILP | M & K data: ANdrea | Natural Language Parsing | ftp://ftp.cs.utexas.edu/p... | 1450 facts | Prolog | - | - | May 30th, 2005
Molecular Biology | ILP | QSARs: Learning Qualitative Structure Activity Relationships (QSARs) for Pyrimidine and Triazine Compounds | Learning Qualitative Structure Activity Relationships (QSARs) for Inhibition of E. Coli Dihydrofolate Reductase | ftp://ftp.mlnet.org/ml-ar..., http://www.comlab.ox.ac.u... | 3.5 MB | Golem | - | data available | Oct 19th, 2000
