Schema matching bibliography
- [Agr90]{agresti90} A. Agresti. {\em Categorical Data Analysis}. Wiley, New York, NY, 1990.
- [AK97]{ashish97} N. Ashish and C. Knoblock. Wrapper generation for semi-structured internet sources. {\em SIGMOD Record}, 26(4):8--15, 1997.
- [BC86]{biskup86}
J. Biskup and B. Convent. A formal view integration method. In {\em Proceedings of the ACM Conf. on Management of Data (SIGMOD)},
1986.
- [BCVB01]{momis}
S. Bergamaschi, S. Castano, M. Vincini, and D. Beneventano. Semantic integration of heterogeneous information sources. {\em Data and Knowledge Engineering}, 36(3):215--249, 2001.
- [Ber03]{bernstein-cidr03}
P. Bernstein. Applying model management to classical meta data problems. In {\em Proceedings of the Conf. on Innovative Database Research
(CIDR)}, 2003.
- [BG00]{rdf}
D. Brickley and R. Guha. Resource description framework schema specification 1.0, 2000.
- [BHP00]{phil-vision-paper}
P. Bernstein, A. Halevy, and R. Pottinger. A vision for management of complex models. {\em ACM SIGMOD Record}, 29(4):55--63, 2000.
- [BKD{\etalchar{+}}01]{fensel01}
J. Broekstra, M. Klein, S. Decker, D. Fensel, F. van Harmelen, and I. Horrocks. Enabling knowledge representation on the {W}eb by extending {RDF}
schema.
In {\em Proceedings of the Tenth Int. World Wide Web Conference},
2001.
- [BLHL01]{berners-lee}
T. Berners-Lee, J. Hendler, and O. Lassila. The {S}emantic {W}eb. {\em Scientific American}, 279, 2001.
- [BLN86]{bln86}
C. Batini, M. Lenzerini, and S. B. Navathe. A comparative analysis of methodologies for database schema
integration.
{\em ACM Computing Surveys}, 18(4):323--364, 1986.
- [BM01]{autoplex}
J. Berlin and A. Motro. Autoplex: {A}utomated discovery of content for virtual databases. In {\em Proceedings of the Conf. on Cooperative Information Systems
(CoopIS)}, 2001.
- [BM02]{automatch}
J. Berlin and A. Motro. Database schema matching using machine learning with feature
selection.
In {\em Proceedings of the Conf. on Advanced Information Systems
Engineering (CAiSE)}, 2002.
- [CA99]{artemis}
S. Castano and V. De Antonellis. A schema analysis and reconciliation tool environment. In {\em Proceedings of the Int. Database Engineering and Applications
Symposium (IDEAS)}, 1999.
- [CDI98]{hypertext}
S. Chakrabarti, B. Dom, and P. Indyk. Enhanced hypertext categorization using hyperlinks. In {\em Proceedings of the ACM SIGMOD Conference}, 1998.
- [CGL01]{calvanese}
D. Calvanese, G. De Giacomo, and M. Lenzerini. Ontology of integration and integration of ontologies. In {\em Proceedings of the 2001 Description Logic Workshop (DL
2001)}, 2001.
- [CH98]{cohen-kdd98}
W. Cohen and H. Hirsh. Joins that generalize: Text classification using {WHIRL}. In {\em Proc. of the Fourth Int. Conf. on Knowledge Discovery and
Data Mining (KDD)}, 1998.
- [Cha00]{ontomorph}
H. Chalupsky. Ontomorph: A translation system for symbolic knowledge. In {\em Principles of Knowledge Representation and Reasoning}, 2000.
- [CHR97]{clifton97}
C. Clifton, E. Housman, and A. Rosenthal. Experience with a combined approach to attribute-matching across
heterogeneous databases.
In {\em Proc. of the IFIP Working Conference on Data Semantics
(DS-7)}, 1997.
- [CRF00]{quilt}
Donald D. Chamberlin, Jonathan Robie, and Daniela Florescu. Quilt: An {XML} query language for heterogeneous data sources. In {\em WebDB (Informal Proceedings) 2000}, pages 53--62, 2000.
- [CT91]{coverthomas}
T. M. Cover and J. A. Thomas. {\em Elements of Information Theory}. Wiley, New York, NY, 1991.
- [dam]{daml}
www.daml.org.
- [DDH01]{lsd}
A. Doan, P. Domingos, and A. Halevy. Reconciling schemas of disparate data sources: A machine learning
approach.
In {\em Proceedings of the ACM SIGMOD Conference}, 2001.
- [DDH03]{lsd-mlj}
A. Doan, P. Domingos, and A. Halevy. Learning to match the database schemas: A multistrategy approach. {\em Machine Learning}, 2003. Special Issue on Multistrategy Learning. To Appear.
- [DFF{\etalchar{+}}99]{xmlql}
A. Deutsch, M. Fernandez, D. Florescu, A. Levy, and D. Suciu. A query language for {XML}. In {\em Proceedings of the International World Wide Web Conference,
Toronto, Canada}, 1999.
- [DH74]{dudahart}
R. O. Duda and P. E. Hart. {\em Pattern Classification and Scene Analysis}. John Wiley and Sons, New York, 1974.
- [DJMS02]{bellman-system}
T. Dasu, T. Johnson, S. Muthukrishnan, and V. Shkapenyuk. Mining database structure; or, how to build a data quality browser. In {\em Proceedings of the ACM Conf. on Management of Data (SIGMOD)},
2002.
- [DMDH02]{glue}
A. Doan, J. Madhavan, P. Domingos, and A. Halevy. Learning to map ontologies on the {S}emantic {W}eb. In {\em Proceedings of the World-Wide Web Conference (WWW-02)}, 2002.
- [DMR02]{erhard-eval}
H. Do, S. Melnik, and E. Rahm. Comparison of schema matching evaluations. In {\em Proceedings of the 2nd Int. Workshop on Web Databases (German
Informatics Society)}, 2002.
- [DP97]{domingos&pazzani97}
P. Domingos and M. Pazzani. On the optimality of the simple {B}ayesian classifier under zero-one
loss.
{\em Machine Learning}, 29:103--130, 1997.
- [DR96]{donoho96}
S. Donoho and L. Rendell. Constructive induction using fragmentary knowledge. In {\em Proc. of the 13th Int. Conf. on Machine Learning}, pages
113--121, 1996.
- [DR02]{coma}
H. Do and E. Rahm. Coma: A system for flexible combination of schema matching
approaches.
In {\em Proceedings of the 28th Conf. on Very Large Databases
(VLDB)}, 2002.
- [EJX01]{embley01}
D. Embley, D. Jackman, and L. Xu. Multifaceted exploitation of metadata for attribute match discovery
in information integration.
In {\em Proceedings of the WIIW Workshop}, 2001.
- [EP90]{ep90}
A. K. Elmagarmid and C. Pu. Guest editors' introduction to the special issue on heterogeneous
databases.
{\em ACM Computing Surveys}, 22(3):175--178, 1990.
- [Fen01]{fensel-book01}
D. Fensel. {\em Ontologies: {S}ilver {B}ullet for {K}nowledge {M}anagement and
{E}lectronic {C}ommerce}.
Springer-Verlag, 2001.
- [Fre98]{freitag-thesis}
Dayne Freitag. Machine learning for information extraction in informal domains. {\em Ph.D. Thesis}, 1998. Dept. of Computer Science, Carnegie Mellon University.
- [FW97]{friedman-ijcai97}
M. Friedman and D. Weld. Efficiently executing information-gathering plans. In {\em Proc. of the Int. Joint Conf. of AI (IJCAI)}, 1997.
- [GMPQ{\etalchar{+}}97]{tsimmis97}
H. Garcia-Molina, Y. Papakonstantinou, D. Quass, A. Rajaraman, Y. Sagiv,
J. Ullman, and J. Widom.
The {TSIMMIS} project: Integration of heterogeneous information
sources.
{\em Journal of Intelligent Inf. Systems}, 8(2), 1997.
- [HGMN{\etalchar{+}}98]{hammer97}
J. Hammer, H. Garcia-Molina, S. Nestorov, R. Yerneni, M. Breunig, and
V. Vassalos.
Template-based wrappers in the {TSIMMIS} system (system
demonstration).
In {\em ACM SIGMOD Record}, Tucson, Arizona, 1998.
- [HH01]{shoe}
J. Heflin and J. Hendler. A portrait of the {S}emantic {W}eb in action. {\em IEEE Intelligent Systems}, 16(2), 2001.
- [HNR72]{hart72}
P. Hart, N. Nilsson, and B. Raphael. Correction to ``a formal basis for the heuristic determination of
minimum cost paths''.
{\em SIGART Newsletter}, 37:28--29, 1972.
- [HZ83]{rlvision}
R.A. Hummel and S.W. Zucker. On the foundations of relaxation labeling processes. {\em PAMI}, 5(3):267--287, May 1983.
- [iee01]{ieee}
{\em IEEE Intelligent Systems}, 16(2), 2001.
- [IFF{\etalchar{+}}99]{tukwila}
Z. Ives, D. Florescu, M. Friedman, A. Levy, and D. Weld. An adaptive query execution system for data integration. In {\em Proc. of SIGMOD}, 1999.
- [ILM{\etalchar{+}}00]{sagres}
Z. Ives, A. Levy, J. Madhavan, R. Pottinger, S. Saroiu, I. Tatarinov,
S. Betzler, Q. Chen, E. Jaslikowska, J. Su, and W. Yeung.
Self-organizing data sharing communities with sagres. In {\em Proceedings of the 2000 ACM SIGMOD International Conference
on Management of Data}, page 582, 2000.
- [KMA{\etalchar{+}}98]{ariadne}
C. Knoblock, S. Minton, J. Ambite, N. Ashish, P. Modi, I. Muslea, A. Philpot,
and S. Tejada.
Modeling web sources for information integration. In {\em Proc. of the National Conference on Artificial Intelligence
(AAAI)}, 1998.
- [KSL{\etalchar{+}}99]{proverb}
G. Keim, N. Shazeer, M. Littman, S. Agarwal, C. Cheves, J. Fitzgerald,
J. Grosland, F. Jiang, S. Pollard, and K. Weinmeister.
{PROVERB}: The probabilistic cruciverbalist. In {\em Proc. of the 6th National Conf. on Artificial Intelligence
({AAAI}-99)}, pages 710--717, 1999.
- [Kus00a]{kushmerick2000}
N. Kushmerick. Wrapper induction: Efficiency and expressiveness. {\em Artificial Intelligence}, 118(1--2):15--68, 2000.
- [Kus00b]{kushmerickwrapper}
N. Kushmerick. Wrapper verification. {\em World Wide Web Journal}, 3(2):79--94, 2000.
- [LC94]{semint-vldb}
W. Li and C. Clifton. Semantic integration in heterogeneous databases using neural
networks.
In {\em Proceedings of the Conf. on Very Large Databases (VLDB)},
1994.
- [LC00]{semint00}
W. Li and C. Clifton. {SEMINT}: A tool for identifying attribute correspondence in
heterogeneous databases using neural networks.
{\em Data and Knowledge Engineering}, 33:49--84, 2000.
- [LCL00]{semint-journal}
W. Li, C. Clifton, and S. Liu. Database integration using neural network: implementation and
experience.
{\em Knowledge and Information Systems}, 2(1):73--96, 2000.
- [LG01]{lacher}
M. Lacher and G. Groh. Facilitating the exchange of explicit knowledge through ontology
mappings.
In {\em Proceedings of the 14th Int. FLAIRS conference}, 2001.
- [Lin98]{infosim}
D. Lin. An information-theoretic definition of similarity. In {\em Proceedings of the International Conference on Machine
Learning (ICML)}, 1998.
- [LKG99]{emerac}
E. Lambrecht, S. Kambhampati, and S. Gnanaprakasam. Optimizing recursive information gathering plans. In {\em Proc. of the Int. Joint Conf. on AI (IJCAI)}, 1999.
- [Llo83]{lloyd83}
S. Lloyd. An optimization approach to relaxation labeling algorithms. {\em Image and Vision Computing}, 1(2), 1983.
- [LRO96]{levy-im2-96}
A. Y. Levy, A. Rajaraman, and J. Ordille. Querying heterogeneous information sources using source descriptions. In {\em Proc. of {VLDB}}, 1996.
- [MBR01]{cupid}
J. Madhavan, P.A. Bernstein, and E. Rahm. Generic schema matching with {C}upid. In {\em Proceedings of the International Conference on Very Large
Databases (VLDB)}, 2001.
- [MFRW00]{chimaera}
D. McGuinness, R. Fikes, J. Rice, and S. Wilder. The {C}himaera ontology environment. In {\em Proceedings of the 17th National Conference on Artificial
Intelligence}, 2000.
- [MHDB02]{madhavan-aaai02}
J. Madhavan, A. Halevy, P. Domingos, and P. Bernstein. Representing and reasoning about mappings between domain models. In {\em Proceedings of the National AI Conference (AAAI-02)}, 2002.
- [MHH00]{miller00}
R. Miller, L. Haas, and M. Hernandez. Schema mapping as query discovery. In {\em Proc. of {VLDB}}, 2000.
- [MHTH01]{peter-mork-paper}
P. Mork, A. Halevy, and P. Tarczy-Hornoch. A model of data integration system of biomedical data applied to
online genetic databases.
In {\em Proceedings of the Symposium of the American Medical
Informatics Association}, 2001.
- [MMGR02]{simflood}
S. Melnik, H. Garcia-Molina, and E. Rahm. Similarity flooding: a versatile graph matching algorithm. In {\em Proceedings of the International Conference on Data
Engineering (ICDE)}, 2002.
- [MN98]{mccallum-twoevents}
A. McCallum and K. Nigam. A comparison of event models for {N}aive {B}ayes text classification. In {\em Proceedings of the AAAI-98 Workshop on Learning for Text
Categorization}, 1998.
- [MS99]{manning99}
C. Manning and H. Sch{\"{u}}tze. {\em Foundations of Statistical Natural Language Processing}, pages
575--608.
The MIT Press, Cambridge, US, 1999.
- [MS01]{onto-learn}
A. Maedche and S. Staab. Ontology learning for the {S}emantic {W}eb. {\em IEEE Intelligent Systems}, 16(2), 2001.
- [MT94]{michalski&tecuci94}
R. Michalski and G. Tecuci, editors. {\em Machine Learning: A Multistrategy Approach}. Morgan Kaufmann, 1994.
- [MWJ]{skat}
P. Mitra, G. Wiederhold, and J. Jannink. Semi-automatic integration of knowledge sources. In {\em Proceedings of Fusion'99}.
- [MZ98]{transcm}
T. Milo and S. Zohar. Using schema matching to simplify heterogeneous data translation. In {\em Proceedings of the International Conference on Very Large
Databases (VLDB)}, 1998.
- [NHT{\etalchar{+}}02]{howard}
F. Naumann, C.-T. Ho, X. Tian, L. Haas, and N. Megiddo. Attribute classification using feature analysis. In {\em Proceedings of the Int. Conf. on Data Engineering (ICDE)},
2002.
- [NM00]{prompt-noy}
N.F. Noy and M.A. Musen. {PROMPT}: Algorithm and tool for automated ontology merging and
alignment.
In {\em Proceedings of the National Conference on Artificial
Intelligence (AAAI)}, 2000.
- [NM01]{anchor-prompt}
N.F. Noy and M.A. Musen. Anchor-{PROMPT}: Using non-local context for semantic {M}atching. In {\em Proceedings of the Workshop on Ontologies and Information
Sharing at the International Joint Conference on Artificial Intelligence (IJCAI)}, 2001.
- [NM02]{noy-aaai02}
N.F. Noy and M.A. Musen. Prompt{D}iff: A fixed-point algorithm for comparing ontology
versions.
In {\em Proceedings of the Nat. Conf. on Artificial Intelligence
(AAAI)}, 2002.
- [Ome01]{borys}
B. Omelayenko. Learning of ontologies for the {W}eb: the analysis of existent
approaches.
In {\em Proceedings of the International Workshop on Web Dynamics},
2001.
- [ont]{ontobroker}
http://ontobroker.semanticweb.org.
- [Pad98]{padro-hybrid}
L. Padro. A hybrid environment for syntax-semantic tagging, 1998.
- [PB02]{pottinger02}
R. Pottinger and P. Bernstein. Creating a mediated schema based on initial correspondences. {\em IEEE Data Engineering Bulletin}, 25(3), 2002.
- [PE95]{perkowitz&etzioni95}
M. Perkowitz and O. Etzioni. Category translation: Learning to understand information on the
{I}nternet.
In {\em Proc. of Int. Joint Conf. on AI (IJCAI)}, 1995.
- [PR00]{roth-nips00}
V. Punyakanok and D. Roth. The use of classifiers in sequential inference. In {\em Proceedings of the Conference on Neural Information
Processing Systems (NIPS-00)}, 2000.
- [PS98]{ps98}
C. Parent and S. Spaccapietra. Issues and approaches of database integration. {\em Communications of the ACM}, 41(5):166--178, 1998.
- [PSTU99]{pstu99}
L. Palopoli, D. Sacca, G. Terracina, and D. Ursino. A unified graph-based framework for deriving nominal interscheme
properties, type conflicts, and object cluster similarities.
In {\em Proceedings of the Conf. on Cooperative Information Systems
(CoopIS)}, 1999.
- [PSU98]{palopoli98}
L. Palopoli, D. Sacca, and D. Ursino. Semi-automatic, semantic discovery of properties from database
schemes.
In {\em Proc. of the Int. Database Engineering and Applications
Symposium (IDEAS-98)}, pages 244--253, 1998.
- [PTU00]{ptu00}
L. Palopoli, G. Terracina, and D. Ursino. The system {DIKE}: towards the semi-automatic synthesis of
cooperative information systems and data warehouses.
In {\em Proceedings of the {ADBIS}-{DASFAA} Conf.}, 2000.
- [PVH{\etalchar{+}}02]{popa02}
L. Popa, Y. Velegrakis, M. Hernandez, R. J. Miller, and R. Fagin. Translating web data. In {\em Proceedings of the Int. Conf. on Very Large Databases
(VLDB)}, 2002.
- [RB01]{survey}
E. Rahm and P.A. Bernstein. On matching schemas automatically. {\em VLDB Journal}, 10(4), 2001.
- [RD00]{rahm-do-cleaning}
E. Rahm and H. Do. Data cleaning: Problems and current approaches. {\em IEEE Data Engineering Bulletin}, 2000.
- [RHS01]{hical}
I. Ryutaro, T. Hideaki, and H. Shinichi. Rule induction for concept hierarchy alignment. In {\em Proceedings of the 2nd Workshop on Ontology Learning at the
17th Int. Joint Conf. on AI (IJCAI)}, 2001.
- [RMR00]{arnie-get-data}
A. Rosenthal, F. Manola, and S. Renner. Getting data to applications: Why we fail, and how we can do better. In {\em Proceedings of the AFCEA Federal Database Conference}, 2000.
- [RRSM01]{arnie-revolution}
A. Rosenthal, S. Renner, L. Seligman, and F. Manola. Data integration needs an industrial revolution. In {\em Proceedings of the Workshop on Foundations of Data
Integration}, 2001.
- [RS01]{arnie-scalability}
A. Rosenthal and L. Seligman. Scalability issues in data integration. In {\em Proceedings of the AFCEA Federal Database Conference}, 2001.
- [SL90]{sl90}
A. P. Sheth and J. A. Larson. Federated database systems for managing distributed, heterogeneous,
and autonomous databases.
{\em ACM Computing Surveys}, 22(3):183--236, 1990.
- [SR01]{arnie-impact}
L. Seligman and A. Rosenthal. The impact of {XML} in databases and data sharing. {\em IEEE Computer}, 2001.
- [SRLS02]{arnie-time}
L. Seligman, A. Rosenthal, P. Lehner, and A. Smith. Data integration: Where does the time go? {\em IEEE Data Engineering Bulletin}, 2002.
- [TD97]{lagrame}
L. Todorovski and S. Dzeroski. Declarative bias in equation discovery. In {\em Proceedings of the Int. Conf. on Machine Learning (ICML)},
1997.
- [TW99]{ting&witten99}
K. M. Ting and I. H. Witten. Issues in stacked generalization. {\em Journal of Artificial Intelligence Research}, 10:271--289, 1999.
- [{UDB}]{bio:udb}
{UDB}: {T}he unified database for human genome computing. http://bioinformatics.weizmann.ac.il/udb.
- [vR79]{ir-book}
C. J. van Rijsbergen. {\em Information {R}etrieval}. London:Butterworths, 1979. Second Edition.
- [Wol92]{wolpert92}
D. Wolpert. Stacked generalization. {\em Neural Networks}, 5:241--259, 1992.
- [Wor]{wordnet}
Wordnet: {A} lexical database for the {E}nglish language. http://www.cogsci.princeton.edu/~wn.
- [XML98]{xml}
Extensible markup language ({XML}) 1.0. www.w3.org/TR/1998/REC-xml-19980210, 1998. W3C Recommendation.
- [Xqu]{xquery}
X{Q}uery: {A}n {XML} query language. http://www.w3.org/TR/xquery.
- [XSL99]{xslt}
{XSL} {T}ransformations ({XSLT}), version 1.0. http://www.w3.org/TR/xslt, 13 August 1999. W3C Working Draft.
- [YMHF01]{clio}
L.L. Yan, R.J. Miller, L.M. Haas, and R. Fagin. Data driven understanding and refinement of schema mappings. In {\em Proceedings of the ACM SIGMOD}, 2001.
- [YS00]{yi00}
J. Yi and N. Sundaresan. A classifier for semi-structured documents. In {\em Proc. of the 6th Int. Conf. on Knowledge Discovery and Data
Mining ({KDD}-2000)}, 2000.
