Synthesis Lectures on Data Management - Data Cleaning
Authors:
Venkatesh Ganti
Anish Das Sarma
- English
- Paperback
- 9783031007699
- 1 October 2013
- 69 pages
Summary
Data warehouses consolidate various activities of a business and often form the backbone for generating reports that support important business decisions. Errors in data tend to creep in for a variety of reasons. Some of these reasons include errors during input data collection and errors while merging data collected independently across different databases. These errors in data warehouses often result in erroneous upstream reports, and could impact business decisions negatively. Therefore, one of the critical challenges while maintaining large data warehouses is that of ensuring the quality of data in the data warehouse remains high. The process of maintaining high data quality is commonly referred to as data cleaning. In this book, we first discuss the goals of data cleaning. Often, the goals of data cleaning are not well defined and could mean different solutions in different scenarios. Toward clarifying these goals, we abstract out a common set of data cleaning tasks that often need to be addressed. This abstraction allows us to develop solutions for these common data cleaning tasks. We then discuss a few popular approaches for developing such solutions. In particular, we focus on an operator-centric approach for developing a data cleaning platform. The operator-centric approach involves the development of customizable operators that could be used as building blocks for developing common solutions. This is similar to the approach of relational algebra for query processing. The basic set of operators can be put together to build complex queries. Finally, we discuss the development of custom scripts which leverage the basic data cleaning operators along with relational operators to implement effective solutions for data cleaning tasks.
Product specifications
Contents
- Language
- English
- Binding
- Paperback
- Original release date
- 1 October 2013
- Number of pages
- 69
- Illustrations
- With illustrations
Contributors
- Lead author
- Venkatesh Ganti
- Second author
- Anish Das Sarma
- Main publisher
- Springer International Publishing AG
Other characteristics
- Product width
- 191 mm
- Product length
- 235 mm
- Textbook
- Yes
- Package width
- 191 mm
- Package height
- 235 mm
- Package length
- 235 mm
- Package weight
- 185 g
EAN
- EAN
- 9783031007699
You can find this item in
- Categories
- Language
- English
- Availability
- Available
- Book, ebook or audiobook?
- Book
- Textbook or general
- Textbooks
Choose the desired edition
Binding: Paperback
Price information and ordering
The price of this product is 29 euros and 99 cents.
Delivery time: 4 - 6 weeks
Sold by bol
- Price includes shipping costs, shipped by bol
- Pickup at a bol pickup point possible
- 30-day reflection period and free returns
- Customer service available day and night