Database Answers

A ROAD MAP FOR ENTERPRISE DATA MANAGEMENT

Barry Williams

Principal Consultant

Database Answers Ltd.

London, England

info@barryw.org


CHAPTER 1. INTRODUCTION

CHAPTER 2. ASSESSMENT

CHAPTER 3. A VISION OF THE FUTURE

CHAPTER 4. THE ROAD MAP

CHAPTER 5. A CASE STUDY FOR THE ROAD MAP


CHAPTER 1. INTRODUCTION

1.1 Purpose of this Document

This document describes a Road Map for Enterprise Data Management. It covers the important phases, from the Integration of Data Sources to the production of Integrated Performance Reports with Business Intelligence.

1.2 Benefits of this Document

This document lays out a Road Map that can help anybody with questions about Enterprise Data Management to get useful answers.

1.3 What is in the Road Map ?

The Road Map contains five separate Stages, which can be used to plan and control any activity related to Enterprise Data Management.

These Stages are : -

1)        Database Design

2)        Data Integration

3)        Performance Reporting

4)        Internet Mashups

5)        Data Governance

Separate documents discuss how the Road Map could be implemented by Microsoft, Informatica and Salesforce.com.

The documentation for each Stage generally has the same structure : -

  • Definition - usually a Wikipedia entry.
  • Best Practice
  • Templates
  • Tools
  • Tutorials

The Approach has been to formalise Best Practice in Enterprise Data Management and to make this Best Practice accessible through a series of Questions.

1.4 How to use this Document

To use this Document, you should answer the Questions in the Self-Assessment in Section 2.1 to determine which Stage you are at.

If you have a question that is not covered, please tell us about it and we will be happy to respond.

You can email us at dba_requests@barryw.org .

Chapter 3 presents a Vision of the Future for the role of Databases.

Chapter 5 presents a Case Study showing how the Road Map would be used in practice.


CHAPTER 2. ASSESSMENT

2.1 Self-Assessment

The first table contains a summary of the Questions that help any individual or organisation to determine, by Self-Assessment, where they are along the Road Map.

This is a sample of Questions, which will be added to regularly.

Nr.  QUESTION                                                                           STAGE
1    Do you need to design a Database ?                                                 1
2    Do you need to handle multiple languages ?                                         1
3    Do you use multiple types of Database, such as SQL Server and Oracle ?             2
4    Is Data Quality an Enterprise Issue ?                                              2
5    Do you have a Single View of the Things of Importance, such as Customers ?         2
6    Do you have Master Data Management (MDM) in place ?                                2
7    Can you verify the derivation of all data (the 'Data Lineage') in your Reports ?  3
8    Do you want to combine Excel data in your Reports ?                                3
9    Does your Chief Exec have Report requirements that you cannot meet ?               3
10   Is anyone using Mashups in your organisation ?                                     4
11   Do you have a top-down view of Data Management in your organisation ?              5
12   Does your organisation have a Data Governance function ?                           5

2.2 Assessment Snapshot

This table provides a snapshot to help in the Assessment process.

Data Architecture and Data Models for Sources and Targets.

For each STAGE, the table describes the BASIC, AVERAGE and IDEAL levels.

1) Data Sources
     BASIC   : Knowledge in the heads of individuals. No Data Models and poor documentation of links between code and databases.
     AVERAGE : Top 20 Applications known, with list of Data Sources and Owners. Basic Data Dictionary in place.
     IDEAL   : Agile development with refactoring techniques. Data Models and sign-off by DBA on all changes. User access and sign-off for Data Dictionary.

2) Data Integration
     BASIC   : Ad-hoc integration using bespoke SQL Scripts.
     AVERAGE : Some Templates established and commercial Tools in use.
     IDEAL   : MDM approved, data owner sign-off, Data Quality is an Enterprise issue. Software Tools linked to the Data Dictionary. Clear and reconciled top-down and bottom-up views of data.

3) Performance Reports
     BASIC   : One-off, often independent Dept. Spreadsheets.
     AVERAGE : Independent Maps, KPIs and drill-down to detailed Reports.
     IDEAL   : Integrated Maps, KPIs and drill-downs for Chief Exec.

4) Internet Mashups
     BASIC   : None.
     AVERAGE : Isolated development.
     IDEAL   : Users aware.

5) Data Governance
     BASIC   : None.
     AVERAGE : No end-to-end agreement.
     IDEAL   : Procedures published, Roles and Responsibilities and Sign-off all in place. Data lineage known and auditable.


CHAPTER 3. A VISION OF THE FUTURE

3.1 Universal Information Architecture

In the future, the current trend towards Widgets and end-user data integration will continue, and users will increasingly demand easy access to all data at any time and using any device.

The functionality offered by cell phones or mobiles will continue, with Apple's iPhone expected to maintain its position of leadership.

This situation is shown in the following diagram, with four very different perspectives seen by the following groups : -

1)        Suppliers

2)        Users & Organisations

3)        University Research Departments

4)        Students

[Diagram: Universal Information Architecture]


3.2 Evolution of Databases

In the future, Databases will be accessible at any time and from any location using any Device.

This diagram shows how real-time links can be provided to all Databases.

In addition, there will be more data types, built-in compatible Data Models in the Clouds, mix-and-match selection of required Tables, Platforms for Vertical Applications, creation of Data Marts, generation of Data, and built-in resolution of the impedance mismatch between the Relational and Object approaches. Conceptually, there will be an Integrated Data Platform, with a range of superimposed Data Service Layers.

Databases will come equipped with self-correction, self-monitoring and self-tuning.


3.3 Data Architecture for the Future

This Architecture features three Levels.


3.4 Data Quality in the Clouds

Gartner predicts that within a few years, 80% of all Enterprises will have at least some involvement in Cloud Computing.

Our thinking should therefore encompass Scenarios where some of our data will be located in the Clouds.

Data Integration and Data Quality must provide for integration with Cloud data.

This diagram shows that Data Sources and Data Quality On Demand Services can be in the Clouds.

Data Quality On Demand is provided by Informatica - http://www.informaticaondemand.com/

 

3.5 Data Dictionary in the Clouds

The Data Dictionary will be located in the Clouds so that it will be readily available to anybody at any time and from any location.

Here is an extract from a typical Dictionary : -

TYPE

DETAILS

COMMENTS

Salesforce

Will meet face-to-face

Frankie Beverley

3.6 Populating a Data Dictionary

The Data Dictionary will be populated by reading data from the System Catalogues for Data Sources.
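
As a minimal sketch of this idea, the following Python example reads the catalogue of a SQLite database and produces one Dictionary entry per column. SQLite is used only because it ships with Python; the table `customers` and the function `build_data_dictionary` are hypothetical names chosen for illustration. SQL Server and Oracle expose their own catalogue views, which would need different queries.

```python
import sqlite3


def build_data_dictionary(conn):
    """Return one dictionary entry per column, read from SQLite's catalogue."""
    entries = []
    # sqlite_master is SQLite's system catalogue of tables, indexes and views.
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        # PRAGMA table_info returns: cid, name, type, notnull, dflt_value, pk
        columns = conn.execute(f'PRAGMA table_info("{table_name}")').fetchall()
        for _cid, col_name, col_type, notnull, _default, pk in columns:
            entries.append({
                "table": table_name,
                "column": col_name,
                "type": col_type,
                "nullable": not notnull,
                "primary_key": bool(pk),
            })
    return entries


# Hypothetical Data Source with a single table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, customer_name TEXT NOT NULL)"
)
dictionary = build_data_dictionary(conn)
for entry in dictionary:
    print(entry)
```

In practice, the entries would be extended with business information such as the Data Owner, which the catalogue cannot supply.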

Data Sources

SQL Server, Oracle, etc.

SOURCES

DATA OWNER

CRM

Objects

Bobby is happy

Bobby Caldwell

HR SQL Server

Custom Objects

Finance

Spreadsheets

Ray sees things clearly

Ray Charles

Source Object Explorer

- Informatica On Demand

- MS Integration Services

Data Dictionary


CHAPTER 4. THE ROAD MAP

This Chapter describes in detail the major Stages in the Road Map.

It is presented in a step-by-step sequence, from Data Sources to Data Governance.

The Steps are : -

  • Data Sources
      o Identify the Data Sources
      o Create Data Models
  • Data Integration
      o Design Target ERD Data Model for combined Data Sources
  • Mapping
      o Map Entities
      o Map Attributes
      o Define Rules for Relationships and Field validation
  • Data Quality (DQ)
      o Produce DQ Profiles
      o Agree required DQ Standards
      o Repeat Data Validation and Clean-Up as necessary
  • Design the Data Mart
  • Performance Reports
      o Agree KPIs with Users
      o Agree Top-Level Summary Reports
      o Agree Detailed Reports
  • Internet Mashups
      o Determine the requirements for Mashups
      o Design and Build Mashups as appropriate
  • Data Governance
      o Ensure Compliance with Policies and Procedures
      o Modify as appropriate


4.1 Stage 1 – Database Design

4.1.1 State-of-the-Art

* Wikipedia on Database Design - http://en.wikipedia.org/wiki/Database_design

* Wikipedia on Data Modeling - http://en.wikipedia.org/wiki/Data_modeling

4.1.2 Best Practice

Here is a series of Steps in designing a Database : -

Step 1. Establish the Scope of the Database.

Step 2. Identify the 'Things of Interest'.

Step 3. Define the Business Rules that determine how these 'Things of Interest' are related.

Step 4. Choose the Data Modelling Tool.

Step 5. Produce a first draft Data Model and review it with the Users.

Step 6. Ask the Users to provide sample data.

Step 7. Load the data into the Database and confirm the Design.

4.1.3 Templates

A set of over 600 Kick-Start Data Models is available on the Database Answers Web Site : -

                                - http://www.databaseanswers.org/data_models/index.htm

You will probably find something there to give you an excellent start in designing a new Database.

If not, contact us by email at barryw@databaseanswers.org and we will help you to get started.


4.1.3.1 A Database for Local Authority Parking

Here is an example of an Entity-Relationship Diagram for a Database designed for Parking Tickets in a Local Authority in the UK : -

4.1.4 Tools

There is a wide choice of Data Modeling Tools and here is a sample of the most popular Tools available : -

4.1.5 Tutorials

* Data Modelling - http://www.databaseanswers.org/tutorial4_data_modelling/index.htm

* Database Design - http://www.databaseanswers.org/tutorial4_getting_started_with_db_design/index.htm

* Understanding a Database Schema - http://www.databaseanswers.org/tutorial4_db_schema/index.htm

4.1.6 How do I ?

4.1.6.1 Get Certified as a DBA ?

Certification can be described as 'Necessary but not sufficient'. In other words, some employers consider it evidence that you have the necessary technical knowledge and skills to be a Database Administrator but, without any experience, it will not guarantee you a job.

If you take your profession seriously and are committed to self-improvement, then you should certainly consider getting certified in the DBMS of your choice.

Here are some very useful Microsoft Web Links : -

* Overview of Certification - http://www.microsoft.com/learning/mcp/default.mspx

* Database Administrator - http://www.microsoft.com/learning/mcp/mcitp/dbadmin/default.mspx

* Microsoft Certified Master - http://www.microsoft.com/learning/mcp/master/sql/default.mspx

* Certified Database Architect - http://www.microsoft.com/learning/mcp/architect/database/default.mspx

4.1.6.2 Tune the Performance of a Database

Examine the Query Execution Plan to make sure that the appropriate Indexes have been created and are being used properly.
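
The check can be sketched in a few lines of Python. The example below uses SQLite's `EXPLAIN QUERY PLAN` statement purely as an illustration; the table `parking_tickets`, the index `idx_tickets_vehicle` and the helper `plan_details` are hypothetical. SQL Server and Oracle provide their own tools for viewing execution plans, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parking_tickets ("
    "ticket_id INTEGER PRIMARY KEY, vehicle_reg TEXT, issued_on TEXT)"
)
conn.execute("CREATE INDEX idx_tickets_vehicle ON parking_tickets (vehicle_reg)")


def plan_details(conn, sql, params=()):
    """Return the human-readable detail text of each step in the query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail text


# This query filters on the indexed column, so the plan should name the index.
indexed = plan_details(
    conn, "SELECT * FROM parking_tickets WHERE vehicle_reg = ?", ("AB12CDE",)
)
# This query filters on a column with no index, so the plan is a full scan.
unindexed = plan_details(
    conn, "SELECT * FROM parking_tickets WHERE issued_on = ?", ("2010-01-15",)
)

print(indexed)
print(unindexed)
```

A plan step that scans the whole table for a frequently-run query is a candidate for a new Index.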

 

4.1.7 Qualities for Success

Skills include T-SQL for SQL Server and PL/SQL for Oracle.

A good Database Administrator (DBA) likes to have responsibility for a clearly defined area, namely a production Database. He (or she) is happy to make decisions and defend them against questions from Developers, Managers and End-Users.

It is useful for a Database Designer to have a DBA background, but a Designer is also likely to welcome the challenge of interacting with Users, creating a design for a new Database and working with Users to get agreement on the new design.

 


4.2 Stage 2 – Data Integration

4.2.1 State-of-the-Art

* Wikipedia on Data Integration - http://en.wikipedia.org/wiki/Data_integration

* Wikipedia on Data Quality - http://en.wikipedia.org/wiki/Data_quality

* Wikipedia on Microsoft's Integration Services - http://en.wikipedia.org/wiki/SQL_Server_Integration_Services

Case Study

Here's a Case Study on the Database Answers Web Site about Data Integration in the Clouds : -

                http://www.databaseanswers.org/data_integration_case_study.htm

Connecting Databases

One of the requirements might be to connect separate physical Databases.

In order to achieve this, the requirements can be defined and then appropriate products can be selected from chosen vendors. For example, a simple technique is to prefix a Table name with the Database name in an SQL statement.
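
The prefixing technique can be illustrated with a short Python sketch. It uses SQLite's `ATTACH DATABASE` statement to make a second database visible under the name `crm`; the database name `crm` and the tables `voters` and `customers` are assumptions made for the example. The exact syntax for cross-database names differs between products, so this should be read as an illustration rather than a recipe for SQL Server or Oracle.

```python
import sqlite3

# The first database opened on a SQLite connection is always named "main".
conn = sqlite3.connect(":memory:")
# Attach a second, physically separate database under the name "crm".
conn.execute("ATTACH DATABASE ':memory:' AS crm")

conn.execute("CREATE TABLE main.voters (voter_name TEXT)")
conn.execute("CREATE TABLE crm.customers (customer_name TEXT)")
conn.execute("INSERT INTO main.voters VALUES ('Joe Bloggs')")
conn.execute("INSERT INTO crm.customers VALUES ('Joe Bloggs'), ('Jane Doe')")

# Each Table name is prefixed with its Database name, so one SQL statement
# can join data held in the two Databases.
matches = conn.execute(
    "SELECT c.customer_name "
    "FROM crm.customers AS c "
    "JOIN main.voters AS v ON v.voter_name = c.customer_name"
).fetchall()
print(matches)
```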

4.2.2 Best Practice

Architectures are vitally important to an understanding of Data Integration.

After the appropriate Architecture has been correctly designed, the choice of Products can be made.

Sometimes these Products might be developed in-house, especially if an organization or individual has experience and a Library of Software Utilities has been established.

A number of different Architectures are included in this Section to provide a starting-point for specific Projects.

For planning the Steps for a Project, here is a general Approach : -

Step 1. Establish the Scope of both Sources and Targets.

Step 2. Identify the key Data Owners within the Scope.

Step 3. Define the Mappings between Source and Target Data Items.

Step 4. Agree the minimum acceptable Data Quality standards.

                For example, every Address will be validated.

This page lists some useful Web Links for Customer Data Integration : -

http://www.databaseanswers.org/customer_data_integration.htm

4.2.2.1 Mapping Data from Source to Target

Mapping is defined at the field level between all Sources and Targets.

For example, for Local Government, a Voter from the Electoral Register can be mapped to a Customer in the Customer Master Index.

A Parking Ticket Vehicle Owner can also be mapped to the same Customer.
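
A field-level Mapping of this kind can be sketched as a simple lookup table. In the Python example below, every table and field name (such as `electoral_register`, `voter_name` and `vehicle_owner_name`) is hypothetical and chosen only to mirror the Local Government example above.

```python
# Hypothetical field-level Mappings:
# (source table, source field) -> (target table, target field)
FIELD_MAPPINGS = {
    ("electoral_register", "voter_name"): ("customer", "customer_name"),
    ("electoral_register", "voter_address"): ("customer", "address"),
    ("parking_ticket", "vehicle_owner_name"): ("customer", "customer_name"),
    ("parking_ticket", "vehicle_owner_address"): ("customer", "address"),
}


def map_record(source_table, record):
    """Translate one Source record into {target_table: {target_field: value}}."""
    target = {}
    for field, value in record.items():
        mapping = FIELD_MAPPINGS.get((source_table, field))
        if mapping is None:
            continue  # fields without a Mapping are not loaded into the Target
        target_table, target_field = mapping
        target.setdefault(target_table, {})[target_field] = value
    return target


voter = map_record(
    "electoral_register",
    {"voter_name": "Joe Bloggs", "voter_address": "1 High Street", "polling_district": "AB"},
)
owner = map_record(
    "parking_ticket",
    {"vehicle_owner_name": "Joe Bloggs", "vehicle_owner_address": "1 High Street"},
)
print(voter)
print(owner)
```

Both Source records arrive in the same Target shape, which is what allows them to be matched to a single Customer.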

4.2.2.2 Duplicate Records

When there are many sources of similar data, such as Customers, there are frequently duplicate records.

For example, in the US, John Doe could also be called Jon Doe, Johnny Doe, Mr. J. Doe and so on.

In the UK, Joe Bloggs could also be called Joseph Bloggs, Joey Bloggs, Mr. J. Bloggs and so on.

The need for rules to recognize and resolve this kind of problem has led to the development of software for Deduplicating records. This process is informally referred to as 'de-duping', especially by people who do a great deal of it.

Best Practice is to look for a commercial product rather than to write your own bespoke software, because writing your own usually takes longer than expected and commercial products can be quite cheap.

This page on the Database Answers Web Site is an excellent starting-point : -

http://www.databaseanswers.org/deduping.htm
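
To give a feel for the kind of rules such software applies, here is a deliberately small Python sketch. It is not how any particular commercial product works. The rules (identical surname, then either a matching initial or a similar forename), the list of titles and the similarity threshold of 0.6 are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

TITLES = {"mr", "mrs", "ms", "miss", "dr"}


def normalise(name):
    """Lower-case the name, replace punctuation with spaces and drop titles."""
    words = re.sub(r"[^a-z ]", " ", name.lower()).split()
    return [word for word in words if word not in TITLES]


def likely_same_person(name_a, name_b, threshold=0.6):
    """Apply two simple rules to decide whether two names may be duplicates."""
    a, b = normalise(name_a), normalise(name_b)
    if not a or not b or a[-1] != b[-1]:
        return False  # rule 1: the surnames must be identical
    first_a, first_b = " ".join(a[:-1]), " ".join(b[:-1])
    if not first_a or not first_b:
        return False  # no forename to compare
    if first_a[0] == first_b[0] and 1 in (len(first_a), len(first_b)):
        return True  # rule 2a: an initial matches a full forename
    # rule 2b: the forenames are sufficiently similar
    return SequenceMatcher(None, first_a, first_b).ratio() >= threshold


for candidate in ["Jon Doe", "Johnny Doe", "Mr. J. Doe", "Jane Doe"]:
    print(candidate, likely_same_person("John Doe", candidate))
```

Real data quickly exposes the limits of rules like these, for example two different people who share a surname and initial, which is one reason the text recommends a commercial product.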

 

4.2.2.3 Architectures
4.2.2.3.1 Major Components

This diagram shows a top-down view of the major Components in the Architecture.

4.2.2.3.2 Architecture for Data Integration

This diagram shows details of the Data Integration Component in the Architecture shown above.


4.2.2.3.3 Service-Oriented Architecture (SOA)

[Diagram: SOA Architecture]


4.2.2.3.4 Architecture of Web-Services for Data Quality

The use of Web Services allows some Components in this Architecture to be distributed in the Clouds.


4.2.2.4 Data Models
4.2.2.4.1 Father of Data Models

MDM requires a Common Data Model (CDM) as the Target into which data from multiple Sources can be loaded.

This Data Model can be used to provide a generic, flexible foundation for a Data Services Layer.

This diagram shows a very high-level Data Model which is one candidate for this kind of CDM.

In practice, this is never used because it is too cumbersome and makes it difficult to obtain agreement with the interested Stakeholders. It can also postpone difficult decisions and can therefore encourage bad practice.

[Diagram: Father of all Data Models]

4.2.2.4.2 Data Model for Salesforce ERD

If one of the Data Sources is Salesforce.com, then knowledge of the Salesforce Database design is vital.

The ERD is shown in a Chapter at the end of this document.

The most important Entities are Account (i.e. Customer), Case, Contact, Contract and Partner.


4.2.2.5 Customer Master Index

A Customer Master Index (CMI) is very important in establishing a Single View of a Customer. The CMI consists basically of cross-references between each Source System and the single Target System.

[Diagram: Customer Master Index]
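
The cross-reference idea can be sketched as a small in-memory structure. The Python class below is a hypothetical illustration only: the class name `CustomerMasterIndex`, the Source System names and the identifiers are invented for the example, and a real CMI would be held in Database Tables with matching logic in front of it.

```python
from collections import defaultdict


class CustomerMasterIndex:
    """Cross-references between Source System keys and a single master Customer id."""

    def __init__(self):
        self._master_id_by_source_key = {}  # (source_system, source_id) -> master_id
        self._source_keys_by_master_id = defaultdict(set)

    def link(self, master_id, source_system, source_id):
        """Record that a Source record refers to the given master Customer.

        Re-linking a key to a different Customer is not handled in this sketch.
        """
        key = (source_system, source_id)
        self._master_id_by_source_key[key] = master_id
        self._source_keys_by_master_id[master_id].add(key)

    def master_id_for(self, source_system, source_id):
        """Return the master Customer id for a Source record, or None if unknown."""
        return self._master_id_by_source_key.get((source_system, source_id))

    def single_view(self, master_id):
        """Return every Source record known for one Customer, in sorted order."""
        return sorted(self._source_keys_by_master_id.get(master_id, set()))


cmi = CustomerMasterIndex()
cmi.link(1001, "electoral_register", "V-17")
cmi.link(1001, "parking_tickets", "PK-204")
print(cmi.single_view(1001))
```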


4.2.2.6 Master Data Management

One of the major components in Master Data Management ('MDM') is Customers.

A Customer Master Index ('CMI') supports a Single View of a Customer.

Master Data Management applies the same principles to all the 'Things of Interest' in an organisation.

This can typically include Employees, Products and Suppliers.

MDM involves the same kind of operations as a CMI. That is, identification and removal of duplicates, and putting procedures in place to eliminate duplicates in any new data loaded into the Databases.

There is a wide choice of software vendors offering MDM products.

De-duplication and Address validation are a niche market in this area.

On my Database Answers Web Site, I have a Tutorial on Getting Started in MDM.


4.2.2.7 Data Platform

These building-blocks represent successive levels that can be put in place in a controlled manner.

Each building-block builds on the previous one.

This can be used in the planning and control of Data Management.

Data Governance provides a thread of continuity through the process and can ensure the integrity and consistency of the data.

[Diagram: Steps in the Data Platform]


4.2.3 Templates

Here's a page on the Database Answers Web Site discussing Performance Reports : -

http://www.databaseanswers.org/tutorial4_integrated_performance_reporting/index.htm

4.2.3.1 Information Catalogue

The Information Catalogue records a range of critical data related to a Data Migration activity.

For example, a list of Entities, Tables, Field Mappings and Rules for Relationships and Validation.

4.2.3.1.1 Mapping Entities

This Template is used to define the mapping of Entities or Tables from a specific Source to a specific Target.

For example, from an Electoral Register to a Generic Customer Services Data Model (GCDM).

This Transformation is supported by Mapping Specifications and the appropriate software.

This software can be manually-coded SQL, a specialized solution, such as Salesforce's Excel Connector, or a general-purpose commercial product, such as Informatica.

Source Table                    Target Table                Comment
Example : Electoral Register    Example : Customer
Example : Elections             Example : Customer_Event

4.2.3.1.2 Mapping Attributes

This Template defines the correspondence between Fields in Data Sources and Targets.

An example of this Template in use is included in Section 6.4.

SOURCE TABLE    DATA ITEM    TYPE    VALIDATION    TARGET TABLE    TARGET ATTRIBUTE    COMMENT

4.2.3.1.3 Rules for Relationships

These Business Rules define the conditions that Relationships between Entities must support.

They can be translated into SQL, which can be applied as Test Conditions for the Data Warehouse.

Two samples are provided as examples.

  1. Example : An ADDRESS can be associated with zero, one or many CUSTOMER ADDRESSes. For example, many people can live at the same Address.

  2. Example : A CUSTOMER can be associated with zero, one or many CUSTOMER ADDRESSes. For example, Home, Work, Billing, Delivery and so on.
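
As a sketch of how such Rules might become SQL Test Conditions, the Python example below builds three small tables in SQLite and runs a query that returns every CUSTOMER ADDRESS row pointing at a missing CUSTOMER or ADDRESS. The table and column names are assumptions for illustration. An empty result means the Relationship Rules hold.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY);
    CREATE TABLE customer_address (
        customer_id INTEGER, address_id INTEGER, address_type TEXT);
    INSERT INTO customer VALUES (1), (2);
    INSERT INTO address VALUES (10), (11);
    -- Customer 1 has Home and Work addresses; customer 2 shares address 10.
    INSERT INTO customer_address VALUES
        (1, 10, 'Home'), (1, 11, 'Work'), (2, 10, 'Home');
""")

# Test Condition: no CUSTOMER ADDRESS may refer to a missing CUSTOMER or ADDRESS.
ORPHAN_CHECK = """
    SELECT ca.customer_id, ca.address_id
    FROM customer_address AS ca
    LEFT JOIN customer AS c ON c.customer_id = ca.customer_id
    LEFT JOIN address AS a ON a.address_id = ca.address_id
    WHERE c.customer_id IS NULL OR a.address_id IS NULL
"""


def relationship_violations(conn):
    """Return the rows that break the Relationship Rules (empty when all pass)."""
    return conn.execute(ORPHAN_CHECK).fetchall()


clean = relationship_violations(conn)

# Introduce a deliberate error: customer 3 does not exist.
conn.execute("INSERT INTO customer_address VALUES (3, 10, 'Billing')")
broken = relationship_violations(conn)

print(clean)
print(broken)
```

The 'zero, one or many' wording means no minimum or maximum count has to be tested here; only the existence of both ends of each link is checked.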


4.2.3.1.4 Rules for Validation

These are the Rules for validation of the data in a Table.

Two example Rules are provided for guidance.

DATA ITEM               TYPE       VALIDATION                              COMMENT
Example : address_id    Integer    >0 and unique                           Unique Identifier for each Address.
Example : easting       Integer    A six-digit number, less than 660000    The Easting coordinate for a BLPU
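
The two example Rules can be expressed as a short validation routine. The Python sketch below is one possible reading of them: 'six-digit' is taken to mean a value from 100000 upwards, and the function name `validate_addresses` and the sample rows are hypothetical.

```python
def validate_addresses(rows):
    """Return (row_number, message) for each row that breaks an example Rule."""
    errors = []
    seen_ids = set()
    for row_number, row in enumerate(rows, start=1):
        address_id = row.get("address_id")
        easting = row.get("easting")

        # Rule for address_id: Integer, >0 and unique.
        if not isinstance(address_id, int) or address_id <= 0:
            errors.append((row_number, "address_id must be an integer > 0"))
        elif address_id in seen_ids:
            errors.append((row_number, "address_id must be unique"))
        else:
            seen_ids.add(address_id)

        # Rule for easting: Integer, a six-digit number, less than 660000.
        if not isinstance(easting, int) or not 100000 <= easting < 660000:
            errors.append((row_number, "easting must be a six-digit integer below 660000"))
    return errors


sample_rows = [
    {"address_id": 1, "easting": 530000},  # passes both Rules
    {"address_id": 1, "easting": 530100},  # duplicate address_id
    {"address_id": 2, "easting": 700000},  # easting too large
    {"address_id": 0, "easting": 99999},   # breaks both Rules
]
errors = validate_addresses(sample_rows)
for row_number, message in errors:
    print(row_number, message)
```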

 

4.2.4 Tools

It is quite common to develop bespoke software for smaller internal projects with limited scope.

Organisations frequently build up a Library of Data Integration software.

Major vendors for Integration Tools include Informatica and Microsoft.

Details are shown in a separate document.

4.2.5 Tutorials

There are three Tutorials on the Database Answers Web Site that are helpful : -

i) Data Quality - http://www.databaseanswers.org/presentations/Strategy_for_Data_Quality.ppt

ii) Master Data Mgmt - http://www.databaseanswers.org/tutorial4_bp_in_mdm/index.htm

iii) MDM and Ref Data - http://www.databaseanswers.org/presentations/MDM_and_Ref_Data.ppt

4.2.6 How do I ?

4.2.6.1 Plan the Data Integration process ?
  • Identify the Data Stewards.
  • Obtain buy-in from key Stakeholders within the organisation.
  • Determine, with the Users, the quality KPIs for key data items.
  • Define the Field Mappings from Sources to Targets.

4.2.7 Qualities for Success

Informatica offers Certification in Data Integration : -

http://www.informatica.com/products_services/education_services/certification/Pages/index.aspx

To be competent in this area, it is important to have a clear understanding of the end-to-end process of transforming Source data into Target data. It also helps to derive satisfaction from the end-result of seeing good-quality data loaded and available for subsequent analysis and reporting.

Someone who works in this area is happy to work with Developers, Managers and End-Users.

 


4.3 Stage 3 – Performance Reports

4.3.1 State-of-the-Art

Two articles on Wikipedia summarise the State-of-the-Art.

Performance Reports and Business Intelligence are very similar in their interpretation.

4.3.2 Best Practice

There are three areas involved : -

i)                      Determine the Data Sources from the Data Marts

ii)                    Choose the commercial Report-Writer

iii)                   Create Data Validation and Transformation procedures


4.3.3 Templates

Report Templates are available showing Content and Layout for standard Ad-Hoc and Off-the-Shelf Reports.

4.3.3.1 Data Mart in a Word document

This diagram shows a Data Model for a Data Mart to hold data about Parking Tickets issued by a Local Authority in the UK.

It was produced in a Word document from early discussion with the End-User and was very helpful in establishing communication and a collaborative method of working.

End-Users find it easier to understand and agree to this kind of Data Model than a formal ERD.

This approach is therefore recommended.

Each Fact is associated with a number of Dimensions.

The 'FACTS' Table contains the list of data items that are available.

The other Tables are called 'Dimensions' and define how the Facts can be analysed.
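
The relationship between Facts and Dimensions can be illustrated with a very small star schema. In the Python sketch below, built on SQLite, the tables `fact_parking_ticket`, `dim_ward` and `dim_payment_method` and their contents are hypothetical; the Data Model in the original diagram may use different Tables and data items.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensions define how the Facts can be analysed.
    CREATE TABLE dim_ward (ward_id INTEGER PRIMARY KEY, ward_name TEXT);
    CREATE TABLE dim_payment_method (
        payment_method_id INTEGER PRIMARY KEY, method_name TEXT);

    -- The Facts Table holds the data items, with one key per Dimension.
    CREATE TABLE fact_parking_ticket (
        ticket_id INTEGER PRIMARY KEY,
        ward_id INTEGER REFERENCES dim_ward (ward_id),
        payment_method_id INTEGER REFERENCES dim_payment_method (payment_method_id),
        amount_paid REAL);

    INSERT INTO dim_ward VALUES (1, 'North'), (2, 'South');
    INSERT INTO dim_payment_method VALUES (1, 'Card'), (2, 'Cheque');
    INSERT INTO fact_parking_ticket VALUES
        (1, 1, 1, 40.0), (2, 1, 2, 60.0), (3, 2, 1, 40.0);
""")

# Analyse the Facts by one Dimension: tickets and amount paid for each Ward.
totals = conn.execute("""
    SELECT w.ward_name, COUNT(*), SUM(f.amount_paid)
    FROM fact_parking_ticket AS f
    JOIN dim_ward AS w ON w.ward_id = f.ward_id
    GROUP BY w.ward_name
    ORDER BY w.ward_name
""").fetchall()
print(totals)
```

Swapping `dim_ward` for `dim_payment_method` in the query analyses the same Facts by a different Dimension without changing the Facts Table.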


4.3.3.3 Map showing KPIs

This Map shows Key Performance Indicators (KPIs) for the Wards in a Local Authority.

Each Ward is displayed in Red, Amber or Green, depending on whether the KPI Threshold values are reached or exceeded.

Red indicates a situation that requires urgent management attention, Amber is a warning and Green is within acceptable limits.
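
The colour-coding logic can be sketched as a small function. The Python example below assumes that a higher KPI value is worse and that each KPI has two Thresholds, one for Amber and one for Red; the Ward names, KPI values and Threshold values are invented for illustration.

```python
def rag_status(kpi_value, amber_threshold, red_threshold):
    """Return 'Red', 'Amber' or 'Green' for a KPI where higher values are worse."""
    if amber_threshold > red_threshold:
        raise ValueError("amber_threshold must not exceed red_threshold")
    if kpi_value >= red_threshold:
        return "Red"    # requires urgent management attention
    if kpi_value >= amber_threshold:
        return "Amber"  # a warning
    return "Green"      # within acceptable limits


# Hypothetical KPI values for each Ward, for example unpaid tickets per 1,000 issued.
ward_kpis = {"Ward A": 12, "Ward B": 35, "Ward C": 60}
statuses = {
    ward: rag_status(value, amber_threshold=30, red_threshold=50)
    for ward, value in ward_kpis.items()
}
print(statuses)
```

For a KPI where lower values are worse, the comparisons would be reversed.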

4.3.3.4 Reports at the Regional Level

This Report shows the total count of Customers gained and lost in an imaginary South-East Region.

Rpt.1 Total Customers Gained and Lost by Week

Date selected: Month of January, 2010

Week Ending      Location     Total Gained    Total Lost
March 6th 09     SE Region    10              10
March 13th 09    SE Region    20              20
March 20th 09    SE Region    30              30
March 27th 09    SE Region    40              40
April 3rd 09     SE Region    50              50
April 10th 09    SE Region    30              30
April 17th 09    SE Region    20              20
April 24th 09    SE Region    10              10


4.3.3.5 Reports at the City Level

This Report shows the total count of Customers gained and lost for London in the South-East Region.

Rpt.1 Total Customers Gained and Lost by Week

Date selected: Month of January, 2010

Week Ending      Location    Total Gained    Total Lost
March 6th 09     London      1               1
March 13th 09    London      2               2
March 20th 09    London      3               3
March 27th 09    London      4               4
April 3rd 09     London      5               5
April 10th 09    London      3               3
April 17th 09    London      2               2
April 24th 09    London      1               1

4.3.3.6 Reports for Parking Tickets

This table shows a sample Template of unrealistic data for Parking Ticket Reports.

The Template is available on this page of the Database Answers Web Site : -

http://www.databaseanswers.org/Parking_Rpts/PK06_TotalPaidPCNs_withPaymentMethod_demo_rpt.xls

PK.6 - Report on Total PCNs Paid with Payment Methods

Date selected: Month of January, 2010

PCN Type

Source

Payment Method

PCNs Paid

Amount Paid

PCN - BLE

H