Technical Report on Data Integration and Preparation

Type: Preprint

Publication Date: 2021-01-01

Citations: 3

DOI: https://doi.org/10.48550/arxiv.2103.01986

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • What About the Data? A Mapping Study on Data Engineering for AI Systems (2024). Petra Heck
  • MetaPix: A Data-Centric AI Development Platform for Efficient Management and Utilization of Unstructured Computer Vision Data (2024). Santosh S. Venkatesh, Atra Akandeh, Madhu Lokanath
  • Data-centric Artificial Intelligence: A Survey (2025). Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, Zhimeng Jiang, Shaochen Zhong, Xia Hu
  • Data-centric Artificial Intelligence: A Survey (2023). Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, Zhimeng Jiang, Shaochen Zhong, Xia Hu
  • Data-centric AI: Perspectives and Challenges (2023). Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, Xia Hu
  • How can AI Automate End-to-End Data Science? (2019). Charu C. Aggarwal, Djallel Bouneffouf, Horst Samulowitz, Beat Buesser, Thanh Hoang, Udayan Khurana, Sijia Liu, Tejaswini Pedapati, Parikshit Ram, Ambrish Rawat
  • OpenDataLab: Empowering General Artificial Intelligence with Open Datasets (2024). Conghui He, Wei Li, Zhenjiang Jin, Chao Xu, Bin Wang, Dahua Lin
  • Data Readiness for AI: A 360-Degree Survey (2024). Kaveen Hiniduma, Suren Byna, Jean Luca Bez
  • AI Data Readiness Inspector (AIDRIN) for Quantitative Assessment of Data Readiness for AI (2024). Kaveen Hiniduma, Suren Byna, Jean Luca Bez, Ravi Madduri
  • Data Readiness Report (2020). Shazia Afzal, C Rajmohan, Manish Kesarwani, Sameep Mehta, Hima Patel
  • Data Acquisition: A New Frontier in Data-centric AI (2023). Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia
  • Data Quality Awareness: A Journey from Traditional Data Management to Data Science Systems (2024). Sijie Dong, Soror Sahri, Themis Palpanas
  • From Data Fusion to Knowledge Fusion (2015). Xin Luna Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Kevin Murphy, Shaohua Sun, Wei Zhang
  • Linked Data Science Powered by Knowledge Graphs (2023). Mossad Helali, Shubham Vashisth, Philippe Carrier, Katja Hose, Essam Mansour
  • From data fusion to knowledge fusion (2014). Xin Luna Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Kevin Murphy, Shaohua Sun, Wei Zhang
  • A data science axiology: the nature, value, and risks of data science (2023). Michael L. Brodie

Citing (18)

  • Big data dimensional analysis (2014). Vijay Gadepally, Jeremy Kepner
  • Using a Power Law distribution to describe big data (2015). Vijay Gadepally, Jeremy Kepner
  • Automatic optimization for MapReduce programs (2011). Eaman Jahani, Michael Cafarella, Christopher Ré
  • DeepDive: Declarative Knowledge Base Construction (2016). Christopher De Sa, Alex Ratner, Christopher Ré, Jaeho Shin, Feiran Wang, Sen Wu, Ce Zhang
  • The BigDAWG polystore system and architecture (2016). Vijay Gadepally, Peinan Chen, Jennie Duggan, Aaron J. Elmore, Brandon Haynes, Jeremy Kepner, Samuel Madden, Tim Mattson, Michael Stonebraker
  • Data Readiness Levels (2017). Neil D. Lawrence
  • Snorkel (2017). Alexander Ratner, Stephen H. Bach, Henry R. Ehrenberg, Jason Fries, Sen Wu, Christopher Ré
  • Hyperscaling Internet Graph Analysis with D4M on the MIT SuperCloud (2018). Vijay Gadepally, Jeremy Kepner, Lauren Milechin, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Matthew Hubbell, Michael Houle, Michael Jones
  • HoloClean: Holistic Data Repairs with Probabilistic Inference (2017). Theodoros Rekatsinas, Xu Chu, Ihab F. Ilyas, Christopher Ré
  • Bidirectional Attention Flow for Machine Comprehension (2016). Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi
  • Moments in Time Dataset: One Million Videos for Event Understanding (2019). Mathew Monfort, Carl Vondrick, Aude Oliva, Alex Andonian, Bolei Zhou, Kandan Ramakrishnan, Sarah Adel Bargal, Tom Yan, Lisa M. Brown, Quanfu Fan
  • Unsupervised String Transformation Learning for Entity Consolidation (2019). Dong Deng, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Guoliang Li, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang
  • Demonstration of a Multiresolution Schema Mapping System (2018). Zhongjun Jin, Christopher Baik, Michael Cafarella, H. V. Jagadish, Yuze Lou
  • On Evaluating Embedding Models for Knowledge Base Completion (2019). Yanjie Wang, Daniel Ruffinelli, Rainer Gemulla, Samuel Broscheit, Christian Meilicke
  • Unsupervised Methods for Determining Object and Relation Synonyms on the Web (2009). Andrew Yates, Oren Etzioni
  • Computing on Masked Data to improve the security of big data (2015). Vijay Gadepally, Braden Hancock, Benjamin Kaiser, Jeremy Kepner, Pete Michaleas, Mayank Varia, Arkady Yerukhimovich
  • Helix (2018). Doris Xin, Litian Ma, Jialin Liu, Stephen Macke, Shuchen Song, Aditya Parameswaran
  • Computing on masked data: a high performance method for improving big data veracity (2014). Jeremy Kepner, Vijay Gadepally, Pete Michaleas, Nabíl Schear, Mayank Varia, Arkady Yerukhimovich, Robert K. Cunningham