
FOR
Barry
Williams
Principal
Consultant
Database
Answers Ltd.
CHAPTER
3. A VISION OF THE FUTURE
CHAPTER
5. A CASE STUDY FOR THE ROAD MAP
This document describes a
Road Map for Enterprise Data Management which covers the important phases from
Integration of Data Sources to the production of Integrated Performance
Reports, with Business Intelligence.
The benefits of this document
are that it lays out a Road Map which can help anybody with questions about
Enterprise Data Management to get useful answers.
The Road Map contains five separate
Stages which can be used to plan and control any activity related to Enterprise
Data Management.
These Stages are :-
1)
Database
Design
2)
Data
Integration
3)
Performance
Reporting
4)
Internet
Mashups
5)
Data
Governance
Separate
documents discusses how the Road
Map could be implemented by Microsoft, Informatica
and Salesforce.com.
The documentation for each
Stage has generally the same structure :-
The Approach has been to formalise
Best Practice in Enterprise Data Management and to make this Best Practice
accessible by a series of Questions.
To use this Document, you
should answer the Questions in the Self-Assessment in Section 2.1 to determine
which Stage you are at.
If you have a question that
is not covered please tell us about it and we will be happy to respond.
You can email us at dba_requests@barryw.org.
Chapter 3 presents a Vision
of the Future for the role of Databases.
Chapter 5 presents a Case
Study showing how the Road Map would be used in practice.
The first table contains a
summary of the Questions which help in the Self-Assessment for any individual
or organisation to determine where they are along the
Road Map.
This is sample of Questions which
will be added to regularly.
|
Nr. |
QUESTION |
STAGE |
|
1 |
Do you need to design a Database ? |
1 |
|
2 |
Do you need to handle
multiple languages ? |
1 |
|
3 |
Do you use multiple types
of Database, such as SQL Server and Oracle ? |
2 |
|
4 |
Is Data Quality an |
2 |
|
5 |
Do you have a Single View
of the Things of Importance, such as Customers ? |
2 |
|
6 |
Do you have Master Data
Management (MDM) in place ? |
2 |
|
7 |
Can you verify the
derivation of all data (the Data Lineage’) in your Reports
? |
3 |
|
8 |
Do you want to combine
Excel data in your Reports ? |
3 |
|
9 |
Does your Chief Exec have
Report requirements that you cannot meet ? |
3 |
|
10 |
Is anyone using Mashups in your organisation ? |
4 |
|
11 |
Do you have a top-down view
of Data Management in your organisation ? |
5 |
|
12 |
Does your organisation have a Data Governance function
? |
5 |
This table provides a
snapshot to help in the Assessment process.
|
STAGE |
BASIC |
AVERAGE |
IDEAL |
|
1) Data Sources |
Knowledge in the heads of individuals. |
Top 20 Applications known
with list of Data Sources and Owners |
Agile development with refactoring techniques. |
|
|
No Data Models and poor
documentation of links between code and databases. |
Basic Data Dictionary in
place. |
Data Models and sign-off by
DBA on all changes. |
|
|
|
|
User access and sign-off for
Data Dictionary |
|
2) Data Integration |
Ad-hoc integration using
bespoke SQL Scripts |
Some Templates established
and commercial Tools in use. |
MDM approved, data owner
sign-off, Data Quality is an |
|
|
|
Software Tools linked to the
Data Dictionary |
Clear and reconciled
top-down and bottom-up views of data. |
|
|
|
|
Data Architecture and Data
Models for Sources and Targets. |
|
3) Performance Rpts |
One-off, often independent
Dept. Spreadsheets |
Independent Maps, KPIs and drill-down to detailed Reports |
Integrated Maps, KPIs and drill-downs for Chief Exec |
|
4) Internet Mashups |
None |
Isolated development |
Users aware |
|
5) Data Governance |
None |
No end-to-end agreement. |
Procedures published, Roles
and Responsibilities and Sign-off all in place. |
|
|
|
|
Data lineage known and
auditable. |
In the future, the current
trend to Widgets and end-user data integration will continue and user demands
will increasingly call for easy access to all data at any time and using any
device.
The functionality offered by cell
phones or mobiles will continue, with Apple’s iPhone
expected to maintain its position of leadership.
This situation is shown in
the following diagram, with four very different perspectives being seen by groups :-
1)
Suppliers
2)
Users
& Organisations
3)
University
Research Departments
4)
Students

In the future, Databases will
be accessible at any time and from any location using any Device.
This diagram shows how
real-time links can be provided to all Databases.

In addition, more data types,
built-in compatible Data Models in the Clouds, mix-and-match selection of
required Tables, with Platforms for Vertical Applications and creation of Data
Marts. generation of Data and built-in resolution of
the impedance mismatch between the Relational and Object approaches.
Conceptually, there will be an Integrated Data Platform, with a range of
superimposed Data Service Layers.
Databases will come equipped
with self-correction, self-monitoring and self-tuning.
This Architecture features
three Levels.

Gartner predicts that within
a few years, 80% of all Enterprises will have at least some involvement in
Cloud Computing.
Our thinking should therefore
encompass Scenarios where some of our data will be located in the Clouds.
Data Integration and Data
Quality must provide for integration with Cloud data.
This diagram shows that Data
Sources and Data Quality On Demand Services can be in
the Clouds.
DataQuality-on-Demand is provided by Informatica
– http://www.informaticaondemand.com/

The Data
Dictionary will be located in the Clouds so that it will be readily available
to anybody at any time and from any location.
Here is an
extract from a typical Dictionary :-
|
SOURCES |
TYPE |
DETAILS |
COMMENTS |
DATA OWNER |
|
CRM |
Salesforce |
Objects |
Bobby is
happy |
Bobby
Caldwell |
|
HR |
SQL Server |
Custom
Objects |
Will meet
face-to-face |
Frankie
Beverley |
|
Finance |
Spreadsheets |
|
Ray sees
things clearly |
Ray Charles |
|
|
|
|
|
|
The Data
Dictionary will be populated by reading data from the System Catalogues for
Data Sources.

This Section describes the
details the major Stages in the Road Map.
It is presented in a
step-by-step sequence, from Data Sources to Data Governance.
The Steps are
:-
·
Data
Sources -
o
Identify
the Data Sources
o
Create
Data Models
·
Data
Integration –
o
Design
Target ERD Data Model for combined Data Sources
·
Mapping
–
o
Map
Entities
o
Map
Attributes
o
Define
Rules for Relationships and Field validation
·
Data
Quality (DQ) –
o
Produce
DQ Profiles
o
Agree
required DQ Standards
o
Repeat
Data Validation and Clean-Up as necessary
·
Design
the Data Mart
·
Performance
Reports -
o
Agree
KPIs with Users
o
Agree
Top-Level Summary Reports
o
Agree
Detailed Reports
·
Internet
Mashups -
o
Determine
the requirements for Mashups
o
Design
and Build Mashups as appropriate
·
Data
Governance -
o
Ensure
Compliance with Policies and Procedures.
o
Modify
as appropriate
* Wikipedia
on Database Design - http://en.wikipedia.org/wiki/Database_design
* Wikipedia
on Data Modeling - http://en.wikipedia.org/wiki/Data_modeling
Here is a series of Steps in
designing a Database :-
Step 1. Establish the Scope of the Database.
Step 2. Identify the ‘Things of Interest’
Step 3. Define the Business Rules that determine how these
‘Things of Interest’ are related
Step 4. Choose the Data Modelling
Tool.
Step 5. Produce first draft Data Model and review with the Users.
Step 6. Ask the Users to provide sample data.
Step 7. Load data into Database and confirm the Design.
A very valuable set of over
600 Kick-Start Data Models are available on the Database Answers Web Site :-
-
http://www.databaseanswers.org/data_models/index.htm
You will probably find
something to give you an excellent start to designing a new Database.
If not, contact us by email
at barryw@databaseanswers.org
and we will help you to get started.
Here is an example of an Entity-Relationship
Diagram for a Database designed for Parking Tickets in a Local Authority in the
UK :-

There is a wide choice of
Data Modeling Tools and here is a sample of the most popular Tools available :-
* Data Modelling - http://www.databaseanswers.org/tutorial4_data_modelling/index.htm
* Database Design - http://www.databaseanswers.org/tutorial4_getting_started_with_db_design/index.htm
* Understanding a Database
Schema - http://www.databaseanswers.org/tutorial4_db_schema/index.htm
Certification can be
described as ‘Necessary but not sufficient’. In other words, some employers
consider it as evidence that you have the necessary technical knowledge and
skills to be a Database Administrator, but without any experience, it will not
guarantee you a job.
If you take your profession
seriously and are committed to self-improvement, then you should certainly
consider getting certified in the DBMS of your choice.
Here are some very useful
Microsoft Web Links :-
* Overview of Certification - http://www.microsoft.com/learning/mcp/default.mspx
* Database Administrator - http://www.microsoft.com/learning/mcp/mcitp/dbadmin/default.mspx
* Microsoft Certified Master - http://www.microsoft.com/learning/mcp/master/sql/default.mspx
* Certified Database Architect - http://www.microsoft.com/learning/mcp/architect/database/default.mspx
Examine the Query Execution
Plan to make sure that the appropriate Indexes have been created and are being
used properly.
Skills include T-SQL for SQL
Server and PL/SQL for Oracle.
A good Database Administrator
(DBA) likes to have responsibility for a clearly defined area, namely a
production Database. He (or she) is
happy to make decisions and defend them against questions from Developers,
Managers and End-Users.
It is useful for a Database
Designer to have a DBA background, but is likely to welcome the challenge of
interacting with Users, creating a design for a new Database and working with
Users to get agreement on the new design.
* Wikipedia
on Data Integration - http://en.wikipedia.org/wiki/Data_integration
* Wikipedia
on Data Quality - http://en.wikipedia.org/wiki/Data_quality
* Wikipedia
on Microsoft’s Integration Services -
http://en.wikipedia.org/wiki/SQL_Server_Integration_Services
Case Study
Here’s a Case Study on the
Database Answers Web Site about Data Integration in the Clouds
http://www.databaseanswers.org/data_integration_case_study.htm
Connecting Databases
One of the requirements might
be to connect separate physical Databases.
In order to achieve this, the
requirements can be defined and then appropriate products can be selected from
chosen vendors. For example,
a simple techniques is to prefix a Table name with the Database
name in an SQL statement.
Architectures are vitally
important to an understanding of Data Integration.
After the appropriate
Architecture has been correctly designed, the choice of Products can be made.
It is possible that sometimes
these Products might be developed in-house, especially if an organization or
individual has experience and a Library of Software Utilities has been
established.
A number of different
Architectures are included in this Section to provide a starting-point for
specific Projects.
For planning the Steps for a
Project, here is a general Approach :-
Step 1. Establish the Scope of both Sources and Targets.
Step 2. Identify the key Data Owners within the Scope.
Step 3. Define the Mappings between Source and Target Data
Items
Step 4. Agree the minimum acceptable Data Quality standards.
For
example, every Address will be validated.
This page lists some useful
Web Links for Customer Data Integration :–
http://www.databaseanswers.org/customer_data_integration.htm
Mapping is defined at the
field level between all Sources and Targets.
For example, for Local
Government, a Voter from the Electoral Register can be mapped to a Customer in
the Customer Master Index.
A Parking Ticket Vehicle
Owner can also be mapped to the same Customer.
When there are many sources
of similar data, such as Customers, there are frequently duplicate records.
For example, in the
In the
The rules for recognizing and
resolving this kind of problem has led to the development of software for Deduplicating records. This
process is informally referred to as ‘de-duping’, especially by people who do a
great deal of it.
Best Practice is to look for
a commercial product, rather than to write your own bespoke software because it
usually takes longer than expected and commercial products can be quite cheap.
This page on the Database
Answers Web Site is an excellent starting-point :-
http://www.databaseanswers.org/deduping.htm
This diagram shows a top-down
view of the major Components in the Architecture.

This diagram shows details of
the Data Integration Component in the Architecture shown above.


The use of Web Services
allows some Components in this Architecture to be distributed in the Clouds.

MDM requires a Common Data
Model as the Target to which data from multiple Sources can be loaded.
This Data Model can used to
provide a generic, flexible foundation for a Data Services Layer.
This diagram shows a very
high-level Data Model which is one candidate for this kind of CDM.
In practice, this is never
used because it is too cumbersome and makes it difficult to obtain with the
interested Stakeholders. It can also postpone difficult decisions and therefore
can encourage bad practice.

If one of the Data Sources is Salesforce.com, then
knowledge of the Salesforce Database design is vital.
The ERD is shown in a Chapter at the end of this
document.
The most important Entities are Account (ie Customer), Case, Contact, Contract, Partner
A Customer Master Index (CMI)
is very important in establishing a Single View of a Customer. The CMI consists
basically of cross-references between each Source System and the single Target
System.

One of the major components
in Master Data Management (‘MDM’) is Customers.
A Customer Master Index, (‘CMI’)
supports a Single View of a Customer.
Master Data Management
applies the same principles to all the ‘Things of Interest’ in an organisation.
This can typically include
Employees, Products and Suppliers.
MDM involves the same kind of
operations as a CMI. That is, identification and removal of duplicates, and
putting in place to eliminate duplicates in any new data loaded into the
Databases.
There is a wide choice of
software vendors offering MDM products.
De-duplication and Address
validation is a niche market in this area.

On my Database Answers Web
Site, I have a Tutorial on Getting Started in MDM :-
These building-blocks
represent successive levels that can be put in place in a controlled manner.
Each building-block builds on
the previous manner.
This can be used in the
planning and control of the Data Management.
Data Governance provides a
thread of continuity through the process and can ensure the integrity and
consistency of the data.

Here’s a page on the Database
Answers Web Site discussing Performance Reports :-
http://www.databaseanswers.org/tutorial4_integrated_performance_reporting/index.htm
The Information Catalogue
records a range of critical data related to a Data Migration activity.
For example, a list of
Entities, Tables, Fields Mappings and Rules for Relationships and
Validation.
This Templates
is used to define the mapping of Entities or Tables from a specific Source to a
specific Target.
For example, from an Electoral
Register to a Generic Customer Services Data Model
(GCDM).
This Transformation is
supported by Mapping Specifications and the appropriate software.
This software can be either
manually-coded SQL, a specialized solutions, such as Salesforce’s Excel Connector,
or a general-purpose commercial product, such as Informatica.
|
Source Table |
Target Table |
Comment |
|
Example : Electoral Register |
Example : Customer |
|
|
Example : Elections |
Example : Customer_Event |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This Template defines the
correspondence between Fields in Data Sources and Targets.
An example of this Template
in use is included in Section 6.4.
|
SOURCE TABLE |
DATA ITEM |
TYPE |
VALIDATION |
TARGET TABLE |
TARGET ATTRIBUTE |
COMMENT |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
These
Business Rules define the conditions that Relationships between Entities must
support.
They
can be translated into SQL which can be applied as Test Conditions for the Data
Warehouse.
A
sample is provided for as an example.
For example, many people can live at the same
Address.
These are the Rules for validation
of the data in a Table.
Two
example Rules are provided for guidance.
|
DATA ITEM |
TYPE |
VALIDATION |
COMMENT |
|
Example : address_id |
Integer |
>0 and unique |
Unique Identifier for each Address. |
|
Example : easting |
Integer |
A six-digit number, less than 660000 |
The Easting coordinate for a BLPU |
|
|
|
|
|
|
|
|
|
|
It is quite common to develop
bespoke software for smaller internal projects with limited scope.
Organisations frequently build up a Library
of Data Integration software
Major vendors for Integration
Tools include Informatica and Microsoft.
Details are shown in a
separate document.
There are three Tutorials on
the Database Answers Web Site that are helpful :-
i) Data Quality -
http://www.databaseanswers.org/presentations/Strategy_for_Data_Quality.ppt
ii) Master Data Mgmt - http://www.databaseanswers.org/tutorial4_bp_in_mdm/index.htm
iii) MDM and Ref Data - http://www.databaseanswers.org/presentations/MDM_and_Ref_Data.ppt
Informatica offers Certification in Data Integration
:-
http://www.informatica.com/products_services/education_services/certification/Pages/index.aspx
To be competent in this area
it is important to have a clear understanding of the end-to-end process of
transforming Source data into Target data and to derive satisfaction from
achieving the end-result of seeing good-quality data loaded and available for
subsequent analysis and reporting.
Someone who works in this
area is happy to work with Developers, Managers and End-Users.
Two articles on Wikipedia summarise the
State-of-the-Art on
Performance Reports and
Business Intelligence are very similar in their interpretation.
There are three areas involved :-
i)
Determine
the Data Sources from the Data Marts
ii)
Choose
the commercial Report-Writer
iii)
Create
Data Validation and Transformation procedures
Report Templates are
available showing Content and Layout for standard Ad-Hoc and Off-the-Shelf
Reports.
This
diagram shows a Data Model for a Data
Mart to hold data about Parking Tickets issued by a Local Authority in the
It was produced in a Word
document from early discussion with the End-User and was very helpful in
establishing communication and a collaborative method of working.
End-users find to easier to
understand and agree to this kind of Data Model than a formal ERD.
This approach is therefore
recommended.
Each Fact is associated with
a number of Dimensions.
The ‘FACTS’ Table contains the list of data items which is
available.
The other Tables are called
‘Dimensions’ and define how the Facts can be analysed.

This diagram produced by a
Data Modelling Tool and is the complete analysis of
all the data required.

This Map shows Key
Performance Indicators (KPIs) for the Wards in a
Local Authority
Each Ward is displayed in either Red, Amber or Green, depending in whether the KPIs Threshold values are reached or exceeded.
Red indicates a situation that
requires urgent management attention, amber is a warning and green is within
acceptable limits.

This Report shows the total
count of Customers gained and lost in an imaginary South-East Region
|
RPt.1 Total
Customers Gained and Lost by Week |
|||||||
|
Date selected:
Month of January, 2010 |
|||||||
|
Week Ending |
|
Location |
|
Total Gained |
|
Total Lost |
|
|
March 6th
09 |
|
SE Region |
|
10 |
|
10 |
|
|
March 13th
09 |
|
SE Region |
|
20 |
|
20 |
|
|
March 20th.
09 |
|
SE Region |
|
30 |
|
30 |
|
|
March 27th.
09 |
|
SE Region |
|
40 |
|
40 |
|
|
April 3rd/
09 |
|
SE Region |
|
50 |
|
50 |
|
|
April 10th.
09 |
|
SE Region |
|
30 |
|
30 |
|
|
April 17th.
09 |
|
SE Region |
|
20 |
|
20 |
|
|
April 24th.
09 |
|
SE Region |
|
10 |
|
10 |
|
.
This Report shows the total
count of Customers gained and lost for
|
RPt.1 Total
Customers Gained and Lost by Week |
||||||
|
Date selected:
Month of January, 2010 |
||||||
|
Week Ending |
|
Location |
|
Total Gained |
|
Total Lost |
|
March 6th
09 |
|
|
|
1 |
|
1 |
|
March 13th
09 |
|
|
|
2 |
|
2 |
|
March 20th.
09 |
|
|
|
3 |
|
3 |
|
March 27th.
09 |
|
|
|
4 |
|
4 |
|
April 3rd/
09 |
|
|
|
5 |
|
5 |
|
April 10th.
09 |
|
|
|
3 |
|
3 |
|
April 17th.
09 |
|
|
|
2 |
|
2 |
|
April 24th.
09 |
|
|
|
1 |
|
1 |
This table shows a sample
Template of unrealistic data for Parking Ticket Reports.
The Template is available on
this page of the Database Answers Web Site :-
http://www.databaseanswers.org/Parking_Rpts/PK06_TotalPaidPCNs_withPaymentMethod_demo_rpt.xls
|
PK.6 -
Report on Total PCNs Paid with Payment Methods |
||||||||||
|
Date selected:
Month of January, 2010 |
||||||||||
|
PCN Type |
|
Source |
|
Payment
Method |
|
PCNs Paid |
|
Amount Paid |
||
|
PCN - BLE |
|
H |
|
Credit Card |
|
5 |
|
£300.00 |
||
|
PCN - BLE |
|
O |
|
Cheque |
|
186 |
|
£11,160.00 |
||
|
PCN - BLE |
|
O |
|
Credit Card |
|
1 |
|
£60.00 |
||
|
PCN - BLE |
|
O |
|
Postal Order |
|
4 |
|
£240.00 |
||
|
PCN - BLE |
|
U |
|
Auto Phone
Payment |
|
594 |
|
£35,700.00 |
||
|
PCN - CCTV |
|
H |
|
Credit Card |
|
3 |
|
£150.00 |
||
|
PCN - CCTV |
|
H |
|
Debit Card |
|
5 |
|
£250.00 |
||
|
PCN - CCTV |
|
O |
|
Cheque |
|
171 |
|
£8,700.00 |
||
|
PCN - CCTV |
|
O |
|
Postal Order |
|
2 |
|
£100.00 |
||
|
PCN - CCTV |
|
U |
|
Cash |
|
50 |
|
£2,500.00 |
||
|
PCN - CCTV |
|
U |
|
Cheque |
|
5 |
|
£250.00 |
||
|
PCN - DTE |
|
H |
|
Credit Card |
|
28 |
|
£1,680.00 |
||
|
TOTAL |
|
|
10,000 |
|
£500,000 |
|||||
A wide range of vendors have
realized that many User requirements for Performance Reports have a similar
style for layout and content and that it is therefore possible to define
Templates and general-purpose reporting software. Products available from Microsoft
are described in a separate document.
Here’s a Tutorial from
Database Answers on Integrated Performance Reporting –
http://www.databaseanswers.org/tutorial4_integrated_performance_reporting/index.htm
Certification is discussed in
Section 1.6.1 in these terms :-
Certification can be
described as ‘Necessary but not sufficient’. In other words, some employers
consider it as evidence that you have the necessary technical knowledge and
skills to be a Database Administrator, but without any experience, it will not
guarantee you a job.
If you take your profession
seriously and are committed to self-improvement, then you should certainly
consider getting certified in the DBMS of your choice.
Here is a Web Link discussing
the role of Microsoft Certified Technology Specialist in SQL Server Business Intelligence :-
http://www.microsoft.com/learning/mcp/mcts/bi/default.mspx
When you are planning to
produce Reports, it is vital to plan for changes to avoid disappointment.
The most common response when
Users get their much-anticipated Reports for the first time,
is for them to say – “Oh dear, that isn’t really what I wanted’.
Even when the Reports meet
their Requirements, which will have been well-documented, and probably
signed-off by the Users, they still want changes made.
There are some technical
things you can do, including specifications for Report Templates which capture
the features in families of similar Reports.
From a procedural point of
view, you can discuss with the Users, how they see the patterns of future
changes, and try to understand the operational environment. This will help you
see how the Reports fit into their management style and
You can identify a
progression from KPIs (Key Performance Indicators),
Traffic Light Reports (using Red, Amber and Green to indicate the seriousness
of situations being reported on), Dashboards, Scorecards
This will help you to arrange
for the appropriate management education so that you and your Users are always
in step, with your planning for what is just around the corner.
To be competent in this area
of Performance Reporting it is useful to be able to see things from the User’s
perspective and formulate the layout and content of the Reports accordingly
People who are successful
working in this area are happy to work with End-Users and formulate Report
requirements in a style that can be easily understood and implemented by the
developers who might be the Report specialist.
They are subsequently able to
implement the inevitable changes requests by the End-user and manage the
expectations of the End-user and developers.
Wikipedia provides a useful introduction to Enterprise Mashups –
Wikipedia defines Enterprise Mashups as :-
“a Web
application that combines data from one or more sources … thereby creating
a new and distinct Web service that was not originally provided by either
source.
A Mashup [is] frequently done by APIs, [for example] data
from Google
Maps combined with location real estate
data.”
Best Practice will discuss
how to prepare a Business Case and create Mashups
Mashups are most frequently created by Developers, but they
are increasingly being produced using Web Sites like Boomi,
that offer high-level Tools that can be used by non-professional developers to
create Mashups.
Mashups are frequently associated with Widgets that are
placed on PC Pages or
Templates cover specification
of User Requirements and compliance with Data Governance.
Tools for creating Mashups are provided by a number of commercial Suppliers, for
example :-
i)
Boomi
– http://www.boomi.com/
ii)
Google Mashup Editor - http://en.wikipedia.org/wiki/Google_Mashup_Editor
iii)
Jackbe - http://en.wikipedia.org/wiki/JackBe
iv)
Microsoft
Popfly - http://en.wikipedia.org/wiki/Microsoft_Popfly
Tutorials will be derived
from a review of Mashup Data Architectures.
The best way to demonstrate
the potential is to produce something quickly that meets a genuine business
requirement.
The best way to identify the
requirement is to engage with Stakeholders to determine what kind of data they
need that they are not currently getting, especially data that comes from more
than one source .
To be successful in creating Mashups it is necessary to possess a number of qualities :-
* have some level of
experience as a developer
* to
like to work with users
* to
enjoy seeing results produced quickly that meet a clearly defined User
requirement.
* to
enjoy keeping up-to-date with innovations in fast-moving technology
* to
be able to recognize innovations that are important and have lasting power.
Martin Fowler is a
highly-respected writer in State-of-the-Art topics and has a Catalog of
Enterprise Architectures –
http://martinfowler.com/eaaCatalog/
Mike Walker’s Blog is worth a look –
Modelling Tool and Consulting for IT Architects with download
Trial -
http://www.mega.com/index.asp/l/en/c/ea
MEGA Database Builder –
http://www.mega.com/index.asp/l/en/c/product/p/mega-modeling-suite/p2/mega-database-builder
Modelling Suite for
http://www.mega.com/index.asp/l/en/c/product/p/mega-modeling-suite
Blog for Architects - http://chiefarchitect.squarespace.com/ea/
* Popfly
on Architectures –
* http://www.popfly.com/users/tooh/Enterprise%20Architecture%20Mashup
From this Blog - http://togaforblunder.blogspot.com/2007/10/how-to-develop-complete-picture-of.html
* Wikipedia
on Data Governance - http://en.wikipedia.org/wiki/Data_governance
* Alignment of
If you are active in this
area, you should consider joining a professional organizational.
This helps you to network
with your peer group and will encourage you to keep up-to-date in knowledge and
professional practice.
Here are two organisations that are planning active roles in Data Governance :-
i) The Data Governance Institute (Membership starts at
$150 for individuals) :-
http://www.datagovernance.com/
ii) The Data Governance and Stewardship Community of
Practice ($150/year) :-
- http://www.datastewardship.com/
It includes coverage of some very useful Case Studies :-
http://www.datastewardship.com/content.aspx?page_id=22&club_id=885168&module_id=37956
It also maintains a Data
Governance Software Web Site :-
http://www.datagovernancesoftware.com/
and Sarbanes-Oxley Web Site - http://www.sox-online.com/
Successful Data Governance
requires successful completion of many Tasks, most of which are not technical
and all of which are complex and challenging.
The process of initially
establishing Data Governance will always take much longer than anticipated.
Therefore, management of expectations is very important.
It is necessary to aim for
collaboration and by-in from the start.
This buy-in must be top-down
and bottom-up.
The top-down buy-in manifests
itself by, for example, a realization that ‘Data Quality is an Enterprise
Issue’.
The bottom-up buy-in is
important because quality of Source data is critical and must be good,
otherwise any Data Governance issue will fail.
Auditing, for example, of
database changes, is a basic requirement.
The Steps involved are as follows :-
Step 1. Determine the principles and goals of Data
Governance.
For example ‘Data Quality is an Enterprise Issue’.
Step 2. Clarify Roles and responsibilities, both top-down and
bottom-up.
Step 3. Obtain agreement on how success will be measured.
Step 4. Determine Data Quality standards at all phases in ETL
Step 4. Avoid a demanding schedule.
Step 5. Establish a spirit of collaboration and ownership of
a common problem.
This will help obtain timely resolution of all
issues.
* Standard documents (Intranet
Forms) for sign-off to plan and control Data Management.
* Data Governance - http://datagovernance.com/dgi_framework.pdf
The basis for a Self-Assessment
checklist is in place.
Data Standardisation
in Public Sector –
- http://www.databaseanswers.org/presentations/data_standards_in_public_sector.ppt
Strategy for Information Mgt -
http://www.databaseanswers.org/presentations/info_mgt_strategy.ppt
It is essential to get key
Stakeholders involved and one good way to achieve this is by arranging
Facilitated Workshops. Where necessary, you can supplement these with and
one-on-one discussions between you or your team and specific individuals who
have concerns that are best resolved before holding Workshops.
To be successful in Data
Governance it is important to be comfortable working with senior management and
to have a clear idea of what is achievable in large organizations and how to
establish agreement over overall goals for an enterprise.
Someone who works in this
area is happy to work with senior management and will understand just enough
about technology to make an effective contribution and to ensure that the role
of technology is used to the maximum effect.
The Information Catalogue plays
a very important role in supporting three activities related to systems generation :-
In addition, the Information
Catalogue is used to record Users with Roles and Responsibilities and to keep
track of progress in signing-off milestones, Project documents and so on.
The Information Catalogue
will be stored ‘in the Clouds’ so that it will be accessible to anyone, at any
time and from any place using any device.
This diagram shows the Data
Model for the Information Catalogue, including an interface to Salesforce Objects and Custom Objects.

.
This Case Study is taken from
Local Government is based on the experience of Database Answers Ltd. gained
over a five year period working with Local Authorities in the
This Case Study can be applied
to Microsoft’s ‘peopleready’ Project
:-
http://www.microsoft.com/business/peopleready/en-us/
It is presented in a
step-by-step sequence, from Data Sources to Data Governance.
The Steps are
:-
·
Data
Sources -
o
Identify
the Data Sources
o
Create
Data Models
·
Data
Integration –
o
Design
Target ERD Data Model for combined Data Sources
·
Mapping
–
o
Map
Entities
o
Map
Attributes
o
Define
Rules for Relationships and Field validation
·
Data
Quality (DQ) –
o
Produce
DQ Profiles
o
Agree
required DQ Standards
o
Repeat
Data Validation and Clean-Up as necessary
·
Design
the Data Mart
·
Performance
Reports -
o
Agree
KPIs with Users
o
Agree
Top-Level Summary Reports
o
Agree
Detailed Reports
·
Internet
Mashups -
o
Determine
the requirements for Mashups
o Design and Build Mashups