Category | Type | Name | Application domain | WWW | Complexity | Format | Updated | |||||
Miscellaneous | ILP | Geographic data: ilpgeo | Geographic data | http://www.fi.muni.cz/kd/..., http://, http:// | ~100 facts, positive examples only + background knowledge: predicates for the relevant domains as well as basic arithmetic predicates | Prolog | Apr 10th, 2003 | |||||
Modelling, Diagnosis, Control | ILP | BLEARN: Learning Relational Concepts from Sensor Data of Mobile Robots | Learning Relational Concepts from Sensor Data of a Mobile Robot | http://www-ai.cs.uni-dort..., http://, http:// | first order logic | Jan 28th, 2003 | ||||||
text classification | PU1: The PU1 anti-spam filtering corpus | Anti-spam email filtering | http://www.iit.demokritos..., http://www.aueb.gr/users/..., http:// | 4314 KBytes | "encrypted" ASCII text | May 4th, 2004 | ||||||
Web Mining | Multi-Instance Learning | MilWeb: Data for Multi-Instance Learning Based Web Index Recommendation | web application | http://lamda.nju.edu.cn/d..., http://, http:// | 30.2 Mb | ZIP | Jun 27th, 2004 | |||||
Modelling, Diagnosis, Control | ILP | BLEARN: Learning Relational Concepts from Sensor Data of Mobile Robots | Learning Relational Concepts from Sensor Data of a Mobile Robot | http://www-ai.cs.uni-dort..., http://, http:// | first order logic | Dec 8th, 2004 | ||||||
Chess | ILP | KRK: King-Rook-King (exact data) | Learning Rules for King+Rook vs. King Endgame | http://www.comlab.ox.ac.u..., http://, http:// | 184 KB (tar, gzip) | Golem | Apr 21st, 2005 | |||||
Information Extraction | Doc. collection | RISE: Repository of online Information Sources used in information Extraction tasks | RISE is a distributed repository of online information sources that are used for the empirical analysis of [machine] learning algorithms that generate extraction patterns | http://www.isi.edu/~musle... | 10 datasets with several hundred html pages each | RISE format (text + extraction | Aug 12th, 1999 | |||||
Modelling, Diagnosis, Control | ILP | BLEARN: Learning Relational Concepts from Sensor Data of Mobile Robots | Learning Relational Concepts from Sensor Data of a Mobile Robot | http://www-ai.cs.uni-dort... | first order logic | Aug 12th, 1999 | ||||||
Modelling, Diagnosis, Control | ILP | UTUBE | Learning qualitative models from example behaviors | ftp://ftp.mlnet.org/ml-ar... | 4 positive and 543 negative examples | Prolog (mFOIL format) | Sep 7th, 1999 | |||||
Computer | ILP | Fossil: Fossil.xiang | Learning Rules for Predicting Mutagenesis | http://www.comlab.ox.ac.u..., http://, http:// | 131 KB compressed file with positive and negative examples for the subsets of 188 and 42 compounds | Progol | Jan 30th, 2004 | |||||
Molecular Biology | ILP | Proteins | Learning Rules for Predicting Protein Secondary Structure | http://www.comlab.ox.ac.u..., http://, http:// | 46 KB (tar, gzip) | Golem | Jun 25th, 2002 | |||||
Molecular Biology | ILP | Drugs | Learning Drug Structure-Activity Rules for Alzheimer's disease | http://www.comlab.ox.ac.u..., http://, http:// | 37 KB | Prolog | Mar 13th, 2006 | |||||
Molecular Biology | ILP | drugs | Learning Drug Structure-Activity Rules for Suramin Analogues | http://www.comlab.ox.ac.u... | 23KB | Progol | Aug 12th, 1999 | |||||
Mesh Design | ILP | Mesh: Finite Element Mesh Design (Partial Dataset) | Finite Element Mesh Design | http://www.comlab.ox.ac.u..., http://, http:// | 57 KB (tar, gzip) | Golem | Apr 21st, 2006 | |||||
Mesh Design | ILP | Mesh: Finite Element Mesh Design (Complete Dataset) | Finite Element Mesh Design | ftp://ftp.mlnet.org/ml-ar..., http://, http:// | 642 positive + 3804 background examples | Prolog facts | Mar 25th, 2005 | |||||
Language | ILP | Geoquery | Geoquery | ftp://ftp.cs.utexas.edu/p... | 250 facts | Prolog | Aug 12th, 1999 | |||||
Language | ILP | English Past Tense | English past tense | ftp://ftp.cs.utexas.edu/p... | 1392 facts (in the largest alph. past data) | Prolog | Aug 12th, 1999 | |||||
Language | ILP | Document understanding | Document understanding | ftp://ftp.mlnet.org/ml-ar... | 250 training and 120 test instances approximately | FOCL format | Oct 19th, 2000 | |||||
Modelling, Diagnosis, Control | ILP + prop. data | Stagedata | Reverse Engineering for Flight Simulator | ftp://www.mlnet.org/ml-ar... | 160K/real world data | Prolog/Attribute-value pairs | Sep 7th, 1999 | |||||
Modelling, Diagnosis, Control | ILP | Satellite | Learning Rules for Qualitative Models of Satellite Power Supplies | http://www.comlab.ox.ac.... | 78 KB (tar, gzip) | Golem | Aug 12th, 1999 | |||||
Chess | ILP | KRK: King-Rook-King (exact + noisy data) | Learning Rules for King+Rook vs. King Endgame | http://www.comlab.ox.ac.u..., http://, http:// | 5 sets of 1000 examples each and 5 sets of 100 examples each. Variants of the latter with three different types and six different levels of noise | Golem | Mar 8th, 2002 | |||||
Samples from ILP systems | ILP | LINUS | LINUS | ftp://ftp.mlnet.org/ml-ar..., ftp://ftp.mlnet.org/ml-ar... | small examples | Linus | May 15th, 2000 | |||||
Samples from ILP systems | ILP | FORTE: First Order Revision of Theories from Examples | Forte (First Order Revision of Theories from Examples) | http://www.cs.utexas.edu/..., ftp://ftp.cs.utexas.edu/p... | simple sample datasets | Forte format | Aug 12th, 1999 | |||||
Samples from ILP systems | ILP | SkilIt | SkilIt (Recursive Theories) | http://www.ncc.up.pt/~amj... | small examples | Prolog | Aug 12th, 1999 | |||||
Samples from ILP systems | ILP | HAIKU | HAIKU | scholar examples (3 concepts) | Prolog | Aug 12th, 1999 | ||||||
Samples from ILP systems | ILP | WiM | WiM | the minimal sets of the worst possible examples such that WiM can learn the target predicate | Prolog | Aug 12th, 1999 | ||||||
Miscellaneous | ILP | Spatial Layout Data | Spatial Layout Data for GKS System | ftp://mizo01.ia.noda.sut.... | 174 positive and 214 negative examples, 1500 clauses of background knowledge | Prolog | Aug 12th, 1999 | |||||
Miscellaneous | ILP | East-West Challenge | East-West/Challenge | ftp://ftp.mlnet.org/ml-ar... | 20/24/100 trains data | Prolog | Oct 19th, 2000 | |||||
Miscellaneous | ILP | Holiday Planning | Holiday Planning | ftp://ftp.informatik.hu-b... | 1470 examples | Prolog | Aug 12th, 1999 | |||||
Miscellaneous | ILP | Geographic data | Geographic data | ~100 facts, positive examples only + background knowledge: predicates for the relevant domains as well as basic arithmetic predicates | Prolog | Aug 12th, 1999 | ||||||
Learning from WWW | ILP + Doc. collection | Four Universities: WebKB - Four universities data set | http://www.cs.cmu.edu/afs... | 8,282 web pages (28MB), ILP data (26.5MB) | raw html data, FOIL | Sep 7th, 1999 | ||||||
miscellaneous | propositional | Regression DataSets: Repository of Regression DataSets for Propositional Learning Algorithms | various | http://www.ncc.up.pt/~lto... | C4.5-like | Sep 29th, 1999 | ||||||
Information Retrieval | Doc. collection | OHSUMED: OHSUMED test collection | Information retrieval on clinically-oriented MEDLINE subset (1987-1991) | ftp://medir.ohsu.edu/pub/... | 348,566 references, approx. 400 MByte | Oct 20th, 1999 | ||||||
Information Retrieval | Doc. collections | TREC: Text REtrieval Conference test collections | http://trec.nist.gov/data... | 5 disks, approx. 250,000 documents / 800MBytes each | SGML | Oct 20th, 1999 | ||||||
Information Retrieval | Doc. collections | CFD: Cystic Fibrosis Database | Information retrieval on articles about cystic fibrosis | ftp://ils.unc.edu/pub/res... | 1239 documents, approx. 5MByte | Marked-up text | Oct 20th, 1999 | |||||
Information Retrieval | Doc. collections | SMART: SMART test collections | ftp://ftp.cs.cornell.edu/... | Marked-up text | Oct 20th, 1999 | |||||||
Text Categorization | Doc. collection | Reuters: Reuters-21578 Text Categorization Test Collection | http://www.research.att.c... | 21578 documents, 28 MByte | SGML | Dec 14th, 1999 | ||||||
Web Advertisements: Feature vectors characterizing images from actual web pages and their classification (advertisement or non-advertisement) | Trainable Web browsing assistants | http://www.cs.ucd.ie/staf..., http://, http:// | 10MB file, 3279 instances; 1558 attributes | "zip" file format | Mar 24th, 2003 | |||||||
Data Warehouse | kdd-sisyphus | Life-Insurance Data Warehouse extract. NOT preprocessed data for KDD, several relations | http://research.swisslife... | Prolog Facts | Oct 19th, 2000 | |||||||
KDD | DB tables | PKDD99: PKDD99 Discovery Challenge | Financial (clients of a bank, their accounts, loans etc.); Medical (patients with thrombosis) | http://lisp.vse.cz/pkdd99..., http://, http:// | Financial data: 8 tables, 18M zipped, 67M unzipped; Medical data: 3 tables, 2M zipped, 6M unzipped | text files | Aug 19th, 2002 | |||||
text classification | generic zocor: generic zocor | Anti-spam email filtering | http://www.iit.demokritos..., http://www.aueb.gr/users/..., http:// | 13994 KBytes tar-gzipped | ASCII text | Jun 9th, 2006 | ||||||
text classification | PU1: The PU1 anti-spam filtering corpus | Anti-spam email filtering | http://www.iit.demokritos..., http://www.aueb.gr/users/..., http:// | 4314 KBytes | "encrypted" ASCII text | Nov 13th, 2003 | ||||||
KDD | KDD-Cup 2000 | web clickstreams and purchase transactions | http://www.ecn.purdue.edu... | Oct 19th, 2000 | ||||||||
Miscellaneous | dmef: dmef-data set | consumer behaviour | http://www.the-dma.org/dm..., http://, http:// | May 14th, 2002 | ||||||||
Learning from WWW | Web Ad: Web Advertisements | Trainable Web browsing assistants | http://www.cs.ucd.ie/staf... | 10MB file, 3279 instances; 1558 attributes | "zip" file format | Jun 13th, 2001 | ||||||
Language | ILP | M & K data: ANdrea | Natural Language Parsing | ftp://ftp.cs.utexas.edu/p..., http://, http:// | 1450 facts | Prolog | May 30th, 2005 | |||||
Molecular Biology | ILP | QSARs: Learning Qualitative Structure Activity Relationships (QSARs) for Pyrimidine and Triazine Compounds | Learning Qualitative Structure Activity Relationships (QSARs) for Inhibition of E. Coli Dihydrofolate Reductase | ftp://ftp.mlnet.org/ml-ar..., http://www.comlab.ox.ac.u... | 3.5 MB | Golem | Oct 19th, 2000 |