This page contains some very useful material that helps to answer the question "What is a Data Architect?".
If you have any comments, please feel free to email us.
The first three entries are job ads I found on the Internet, and the ones that follow are responses to a question I posted on LinkedIn.
1) A Job Requirement for a Data Architect :-
Data Architect, client data, reference data architect, static data, information/pricing data, CRM.
A Data Architect is required to fulfil a contract position at a city based bank in London.
The successful applicant must have strong knowledge of client data, amounting to knowing how it functions
within a banking environment, how to set up architecture, and how the system interfaces.
The candidate will also be categorising data needing to be stored centrally.
Overview of Requisite Skills:
- Client Data
- Information/Pricing Data
- Static Data
Job reference information
Advertiser: Astbury Marsden & Partners Limited
Contact Name: Matthew Haynes
Telephone: 020 7065 1195 (Please reference IT Job Board when calling)
2) Another Job Requirement :-
Enterprise Data Architect
- Responsible for responding quickly to requests for technical analysis and research
- To work with colleagues within other departments to identify information flows and data models
- To determine information requirements and delivery methods and then produce information flow diagrams
- Responsible for identifying viable alternative data integration and dissemination methods and approaches
- To ensure the integrity of information and the associated validation process
- To create and maintain information delivery artefacts such as models, taxonomies, dictionaries, diagrams, etc.
Key skills / knowledge required:
- Business and technical awareness with numerical ability
- A wide knowledge and understanding of data modelling, collection and integration methodologies, analysis, principles and techniques
- A logical approach to researching and analysing processes and mechanisms
- Excellent report writing, analytical and project management skills, with excellent attention to detail
- At least 5 years' experience of data analysis across a variety of industries.
- Ability to recommend changes to drive business objectives.
- Ability to understand and define entities and data objects
This is a 4 month contract starting ASAP.
Day rate is £450-500 per day. If you have the relevant skills and / or experience please forward your CV to firstname.lastname@example.org.
Premier is acting as a recruitment agency with regards to this role.
Job reference information
Advertiser: Premier Group
Contact Name: Sean
Telephone: 0207 367 5200 (Please reference IT Job Board when calling)
Salary: £450.00 - £500.00 per day
Job Type: Contract
Date Posted: 26-Aug-2011 17:37
3) A Job Requirement was posted for a Data Architect on the IT Job Board
I am looking for a Data Architect to join a top tier global investment bank in the city.
As a Data Architect, you'll develop and implement the data architecture essential for business intelligence applications.
This will include gathering data requirements, developing the data architecture design, advising on development of logical data models and implementing the physical data models.
You shall have extensive expertise in logical data modelling, physical database design, business intelligence, master data management or data integration technologies.
Requirements for the role:
Experience in data architecture
Experience with data-modelling and/or object-relational mapping
Strong data warehousing experience
The candidate MUST have strong INVESTMENT BANKING knowledge.
The right candidate will have proven experience performing these actions at a high level.
Responses to our Question on LinkedIn
We posted a question in the LinkedIn Group: Data Architecture Professionals Group
"What exactly is a Data Architect?"
4) Hal replied :-
I really like your mental machinations relative to the future of data.
However I tend to think in more concrete terms like data integration, which is the bread and butter of most data architects today (I would argue).
I like to think that our responsibilities start with envisioning a future state where data in its many forms (email, XML documents, and data models,
which equate to unstructured, semi-structured and structured data) can be easily accessed and organized in ways that are meaningful at the corporate (metadata registry),
business-unit (data repository) and individual level (i.e. desktop search).
I think this kind of out of the box thinking is what the best data architects are capable of doing and you clearly are in this class!
Great work. Hope we can collaborate sometime in the future.
William ("Hal") Williford
5) Moshe J replied :-
I've always thought of Data Architecture as an enterprise-wide view of data design. It starts with the recognition that data is an
enterprise asset rather than a local project afterthought.
The architecture lays out and describes consistent definitions for key entities (such as customer, product, etc) and wrestles with
disparate versions and views of those entities across the enterprise.
It establishes standards and rules by which individual development projects design their data structures, so that data is ensured
to be an enterprise asset with the appropriate security, access and utilization to bring value to the larger scope.
The Data Architect is the individual with the enterprise-wide view and authority to make the architecture a reality.
Is this different from what you were thinking?
6a) Bob Frasca replied :-
Data modeling is only one of the skills required of a data architect.
In fact, that's a relatively small component of the architectural process.
Anyone who understands the rules of normalization (or the concept of facts and dimensions) can model data structures.
These are the skills that I highlight when I go on an interview.
1. It's not enough to understand the data for a specific application.
A data architect needs to understand how data flows through the enterprise,
i.e. how it is captured, who uses it, who adds value to it, and how it can be leveraged for business intelligence purposes.
This usually means that the data architect has a profound understanding of the business and its processes.
2. Expert level knowledge of at least one relational data engine, e.g. Oracle, SQL Server etc.
It isn't necessary to understand all of them, as the principles of implementation are basically the same for all of them.
For example, if I'm designing a large scale database and I think data partitioning or sharding will be necessary for
scalability and performance then my architecture will take that into consideration.
The method of implementation of partitioning might be different between Oracle and SQL Server but the principle is the same.
3. This leads to the requirement for expert level performance tuning skills.
An architecture that doesn't perform and isn't scalable is a waste of time and money.
Many projects fail because these considerations aren't addressed at design time.
The proper use of surrogate keys, referential integrity, and indexing strategies is as much a part of the architecture as the table definitions.
This includes infrastructure considerations such as configuring SANs, identifying the proper RAID configuration, optimizing file distribution
to ensure I/O performance, and server and memory configuration options.
4. Understanding the consumer. Reporting consumers have a very different set of requirements than transactional system users.
This is why a "typical" design might consist of an OLTP database optimized for transactional use, a separate but similar structure, usually called
an Operational Data Store (ODS), optimized for operational reporting ONLY, and finally a third set of structures, often using a dimensional model
and usually referred to as a data mart, optimized for aggregation and analytical type reporting.
This may use an existing Data Warehouse but each application must feed that warehouse.
5. Performance is a recurring theme.
An OLTP system is meant to CAPTURE data not report on it so the emphasis is on write performance.
An ODS is about operational reporting so the emphasis is on read performance.
A data mart is about optimized aggregation reporting and data mining.
Often these aggregations are performed separately, i.e. building cubes etc.
The point is that the data architect must understand the fundamental differences between each use case and design for it.
6. ETL. Understanding that the data will be used downstream and that value may be added to it as it flows through the enterprise is critical.
The process of Extract, Transform, and Load can become quite convoluted, and data latency issues as well as data availability must be taken into account.
A sales order in a transactional system is just another order but by the time it gets to a data mart there might be Dun and Bradstreet data
and/or consumer demographics information applied to it.
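To make the sales order example concrete, here is a minimal Python sketch of the Extract, Transform and Load steps. It is our illustration, not Bob's code: the function run_etl, the table fact_sales and the sample fields (order_id, customer_id, amount, segment) are all hypothetical names, and an in-memory SQLite database stands in for the data mart.

```python
import sqlite3


def run_etl(orders, demographics):
    """Enrich order rows with a demographic segment, load them into a
    mart table, and return the total sales amount per segment.

    orders       -- list of dicts with order_id, customer_id, amount
                    (stands in for rows extracted from the OLTP system)
    demographics -- dict mapping customer_id to a segment label
                    (stands in for externally sourced data)
    """
    conn = sqlite3.connect(":memory:")  # assumption: SQLite as the "data mart"
    conn.execute(
        "CREATE TABLE fact_sales ("
        "order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
        "segment TEXT, amount REAL)"
    )

    # Transform: add the segment; customers with no match become "unknown".
    rows = [
        (
            order["order_id"],
            order["customer_id"],
            demographics.get(order["customer_id"], "unknown"),
            order["amount"],
        )
        for order in orders
    ]

    # Load: write the enriched rows into the mart table.
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", rows)

    # Aggregate: the kind of query a reporting consumer would run.
    totals = dict(
        conn.execute("SELECT segment, SUM(amount) FROM fact_sales GROUP BY segment")
    )
    conn.close()
    return totals
```

The order leaves the transactional system as "just another order" and only gains its segment on the way to the mart, which is where the value is added.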
This is fairly high level. Obviously, I could have drilled much further into each of these ideas.
The nature of the use cases can vary and the methods of getting data from the OLTP system to the ODS can vary from transactional replication, to triggers, to log shipping and more.
In short, the data architect needs to understand data at the granular level and at the 50,000 foot level.
Often, I think organizations make the mistake of thinking that data architecture is a tactical concept.
It's not. It's strategic.
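Point 2 above says the principle of partitioning is the same whatever the engine. One common form of that principle, hash partitioning, can be sketched in a few lines of Python. This is an illustrative assumption on our part and not how Oracle or SQL Server implement it; the names shard_for and partition_rows are hypothetical.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard number in the range 0..shard_count-1.

    A stable hash (SHA-256) is used so that the same key always lands
    on the same shard, from one run of the program to the next.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def partition_rows(rows, key_field, shard_count):
    """Split a list of dict rows into shard_count lists by hashing key_field."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[shard_for(row[key_field], shard_count)].append(row)
    return shards
```

Every row goes to exactly one shard, and all rows sharing a key go to the same shard, so a lookup by that key only ever has to touch one partition.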
6b) Bob also posted this (excellent) response to a follow-up question that asked whether a Database Architect is the same as a Data Architect :-
Well, in my opinion, a data architect can fill the role of a database architect although a database architect may not have the breadth of
experience required to be a true data architect.
I view the data architect as a more strategic position, i.e. an enterprise level role while the database architect is more tactical,
i.e. a project specific position.
Frequently, I fill both roles as I am doing now in the project I'm working on.
I'm designing the ODS, DW, and multiple data marts but I'm also evangelizing the value of data governance and data lineage
as well as identifying opportunities for adding value to the data from external sources.
I'm also working with the systems group for identifying the appropriate hardware infrastructure including storage arrays and implementing the reporting tool of choice.
I think another important part of the data architect role is getting the business excited about what they will be able to do and converting the naysayers who don't like change into supporters.
6c) Bob Frasca said :-
Sorry, I didn't mean to be so long winded but this is a topic near and dear to my heart.
As a matter of fact, I've decided to delve into this a little deeper and produce an article based on this discussion.
Even companies that hire data architects often aren't sure what the definition is.
Some think it's only about data modelling, some view the role as a hybrid DBA.
Of course, then we can have a talk about what exactly a DBA is.
Some think it's only about database design and some think it's only about performance.
I think the topic deserves some clarity.
7) Dick Brummer said :-
I agree with Bob and William but would like to extend the architecture vision with the fact that Data Architecture is one of,
if not the, most important and critical domains in Enterprise Architecture as referenced in the TOGAF - TMF Industry model and framework.
With this said, I believe that a Data Architect lives outside the IT and Business entities and represents the Enterprise Architecture view
of Application and Information that enables integrated processes, utilising industry best practices, standards and policies to define and
establish a "foundation architecture" for enterprise data and information activities to be shared as one version of the truth.
This includes the Enterprise metadata repository to record and manage all artefacts for the Solution Development Lifecycle from concept to
retirement, including but not limited to BI, DQM, MDM, LOB systems, Enterprise extended solutions, ECM, Repository services, Archiving,
Classification, retention, Access control and security, Auditing and tracking, and then Data Governance to ensure that controls to
measure success and maturity are in place.
We should create a clear understanding of the difference between an Enterprise Data Architect, Data Architect, Data Analyst, Data Steward
and DBA as they all play an important part in managing data/information as an asset for unstructured as well as structured data and Information.
I think you will agree that once the Foundation Data Architecture is in place we can move on to align IT and Business for existing
capabilities as well as change requests that will enable Business Demand now and in the future.
8) Harshendu Desai said :-
Please see this formula: DA = DA+DQ+DI+2MDM+DG+BI+DBA
Where DA = Data Architecture,
DA = Data Analysis,
DQ = Data Quality,
DI = Data Integration,
2MDM = Master Data Management and Metadata Management,
DG = Data Governance,
BI = Business Intelligence and
DBA = Database Administration
These are the components of data architecture.
In short, anything related to data and data process is data architecture.
Bottom-line is, someone needs to know the guts of the data.
8) Bryan said :-
I would expect to be able to have a long, thoughtful and valuable discussion about data architecture with a data architect in which we
never use the words "database", "IT", "server" or any vendor's name. And darned few acronyms.
I would expect to hear words like "usage", "ownership", "meaning" (or "semantics"), "quality", "control" and "revision" used a lot.
Perhaps in a follow-on conversation, the words "warehouse", "dimension", "entity" and "model" might work their way in.
I think this ties in with some other thoughts expressed here by others.
9) Andy Graham said :-
Interesting discussion, and although I don't normally get much time to add to these discussion boards, this one is very near and
dear to me so I felt obliged to chip in.
Over the last few weeks I've been interviewing potential Data Architects and I have a few observations which the group may
(or may not) find interesting.
Firstly, the majority of the candidates are highly technology focused. Although having an understanding of database, data
warehousing and MDM technology is useful to a data architect it is not the primary focus.
The ability to step back from the technology and understand the design and how that design fits within the wider enterprise
context has to be one of the primary skills needed.
Just to be clear when I refer to design I am talking about data design. If I hear another person rattle on for 10 minutes
about table partitioning or Oracle vs Netezza I will scream.
Yes, I do understand both, but in the former case I would expect a DBA to take the lead and in the latter a technical/systems architect.
Secondly, the majority of candidates think data architecture is all about the design and forget that there are a number of other
highly important aspects of a data architect's day job: data governance, etc.
A wider discussion on this subject can be found in this article on my blog.
Thirdly, questions which I would consider common knowledge for a data architect that has spent years working (allegedly) in the
data warehousing space (as all my candidates have been) such as:
* What is a Junk Dimension and how do you use it?
* What is a Conformed Dimension and how do you use it?
* What is a Slowly Changing Dimension and what are the different types?
* Compare and contrast the approaches of Inmon vs Kimball?
Seem to be an alien language, aahhhh.
Rant over and hopefully this isn't too far off track from the main theme of the discussion.
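For readers who have not met the third of Andy's questions, a Type 2 Slowly Changing Dimension keeps history by closing the current row and adding a new one, whereas Type 1 simply overwrites the old value. A minimal Python sketch of Type 2 follows; the CustomerVersion class, its field names and the apply_scd2 function are hypothetical names chosen for illustration, and a real warehouse would do this in the database, not in memory.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerVersion:
    """One row of a Type 2 customer dimension."""
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_scd2(history: List[CustomerVersion], customer_id: int,
               new_city: str, change_date: date) -> List[CustomerVersion]:
    """Record a change of city while preserving the earlier versions."""
    current = next(
        (v for v in history
         if v.customer_id == customer_id and v.valid_to is None),
        None,
    )
    if current is not None:
        if current.city == new_city:
            return history  # nothing changed, so no new version is needed
        current.valid_to = change_date  # close the old version
    # Add the new current version (also covers a brand new customer).
    history.append(CustomerVersion(customer_id, new_city, change_date))
    return history
```

A fact recorded before the change can still be joined to the version that was valid at that time, which is the reason for keeping the extra rows.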
10) Dave Poole said :-
There has to be at least some facet that is extracting coherent requirements from the business.
The ability to demonstrate or illustrate the art of the possible and recognise that which is not possible....yet.
Being able to show what could be achieved without introducing scope gallop.
Being able to steer the business towards choosing a suitable strategic data solution.
Being able to propose data solutions with a full cost (financial or otherwise) and benefit (financial or otherwise).
Keeping abreast of developments in the data field and sorting the hype from the reality.
Having the confidence to say NO, the persistence to keep going against adversity.
The ability to sell a data solution at all levels of the organisation to both technical and non-technical audiences.
11) Alan Freedman said :-
"Anyone who understands the rules of normalization (or the concept of facts and dimensions) can model data structures"
Although I agree with Bob's statement as a disconnected fact, I'm not sure I'm comfortable using it to support the
conclusion that data modeling is a small part of data architecture.
I recently opened an in-house seminar by stating that data modeling is ridiculously simple in concept but extremely subtle.
And done well, it can often obviate the need for complex solutions to resolve performance, scalability and quality issues.
Also, data modeling has gone beyond normalized ERDs and multi-dimensional models for some time, and we have to model data in
XML schemas, column-oriented structures, key-value pairs, serialized blobs, etc.
And as Dave Poole pointed out, your model must reflect the business requirements and domain, which can be extremely complex
and hard to extract.
I know it was one little sentence in a great post, but I felt I had to stick up for the importance of data modeling and the
skill involved in doing it well.