420px|thumb|The data modeling process. The figure illustrates the way data models are developed and used today . A [[Conceptual schema|conceptual data model is developed based on the data requirements for the application that is being developed, perhaps in the context of an activity model. The data model will normally consist of entity types, attributes, relationships, integrity rules, and the definitions of those objects. This is then used as the start point for interface or database design. The data requirements are initially recorded as a conceptual data model which is essentially a set of technology independent specifications about the data and is used to discuss initial requirements with the business stakeholders. The conceptual model is then translated into a logical data model, which documents structures of the data that can be implemented in databases. Implementation of one conceptual data model may require multiple logical data models. The last step in data modeling is transforming the logical data model to a physical data model that organizes the data into tables, and accounts for access, performance and storage details. Data modeling defines not just data elements, but also their structures and the relationships between them.
Data modeling techniques and methodologies are used to model data in a standard, consistent, predictable manner in order to manage it as a resource. The use of data modeling standards is strongly recommended for all projects requiring a standard means of defining and analyzing data within an organization, e.g., using data modeling:
- to assist business analysts, programmers, testers, manual writers, IT package selectors, engineers, managers, related organizations and clients to understand and use an agreed-upon semi-formal model that encompasses the concepts of the organization and how they relate to one another
- to manage data as a resource
- to integrate information systems
- to design databases/data warehouses (aka data repositories)
Data modelling may be performed during various types of projects and in multiple phases of projects. Data models are progressive; there is no such thing as the final data model for a business or application. Instead, a data model should be considered a living document that will change in response to a changing business. The data models should ideally be stored in a repository so that they can be retrieved, expanded, and edited over time. Whitten et al. (2004) determined two types of data modelling:
Topics
Data models
[[File:3-4 Data model roles.svg|thumb|320px|How data models deliver benefit.
Some common problems found in data models are:
- Business rules, specific to how things are done in a particular place, are often fixed in the structure of a data model. This means that small changes in the way business is conducted lead to large changes in computer systems and interfaces. So, business rules need to be implemented in a flexible way that does not result in complicated dependencies, rather the data model should be flexible enough so that changes in the business can be implemented within the data model in a relatively quick and efficient way.
- Entity types are often not identified, or are identified incorrectly. This can lead to replication of data, data structure and functionality, together with the attendant costs of that duplication in development and maintenance. Therefore, data definitions should be made as explicit and easy to understand as possible to minimize misinterpretation and duplication.
- Data models for different systems are arbitrarily different. The result of this is that complex interfaces are required between systems that share data. These interfaces can account for between 25 and 70% of the cost of current systems. Required interfaces should be considered inherently while designing a data model, as a data model on its own would not be usable without interfaces within different systems.
- Data cannot be shared electronically with customers and suppliers, because the structure and meaning of data have not been standardised. To obtain optimal value from an implemented data model, it is very important to define standards that will ensure that data models will both meet business needs and be consistent.
- Conceptual schema: describes the semantics of a domain (the scope of the model). For example, it may be a model of the interest area of an organization or of an industry. This consists of entity classes, representing kinds of things of significance in the domain, and relationship assertions about associations between pairs of entity classes. A conceptual schema specifies the kinds of facts or propositions that can be expressed using the model. In that sense, it defines the allowed expressions in an artificial "language" with a scope that is limited by the scope of the model. Simply described, a conceptual schema is the first step in organizing the data requirements.
- Logical schema: describes the structure of some domain of information. This consists of descriptions of (for example) tables, columns, object-oriented classes, and XML tags. The logical schema and conceptual schema are sometimes implemented as one and the same.]]
In the context of business process integration (see figure), data modeling complements business process modeling, and ultimately results in database generation. only two modeling methodologies stand out, top-down and bottom-up:
- Bottom-up models or View Integration models are often the result of a reengineering effort. They usually start with existing data structures forms, fields on application screens, or reports. These models are usually physical, application-specific, and incomplete from an enterprise perspective. They may not promote data sharing, especially if they are built without reference to other parts of the organization.]]
There are several notations for data modeling. The actual model is frequently called "entity–relationship model", because it depicts data in terms of the entities and relationships described in the data.]]
Generic data models are generalizations of conventional data models. They define standardized general relation types, together with the kinds of things that may be related by such a relation type.
The definition of the generic data model is similar to the definition of a natural language. For example, a generic data model may define relation types such as a 'classification relation', being a binary relation between an individual thing and a kind of thing (a class) and a 'part-whole relation', being a binary relation between two things, one with the role of part, the other with the role of whole, regardless the kind of things that are related.
Given an extensible list of classes, this allows the classification of any individual thing and to specification of part-whole relations for any individual object. By standardization of an extensible list of relation types, a generic data model enables the expression of an unlimited number of kinds of facts and will approach the capabilities of natural languages. Conventional data models, on the other hand, have a fixed and limited domain scope, because the instantiation (usage) of such a model only allows expressions of kinds of facts that are predefined in the model.
Semantic data modeling
The logical data structure of a DBMS, whether hierarchical, network, or relational, cannot totally satisfy the requirements for a conceptual definition of data because it is limited in scope and biased toward the implementation strategy employed by the DBMS. That is unless the semantic data model is implemented in the database on purpose, a choice which may slightly impact performance but generally vastly improves productivity.
[[File:A2 4 Semantic Data Models.jpg|thumb|320px|Semantic data models.
See also
References
Further reading
<!-- * J.H. ter Bekke (1991). Semantic Data Modeling in Relational Environments -->
- John Vincent Carlis, Joseph D. Maguire (2001). Mastering Data Modeling: A User-driven Approach.
- Alan Chmura, J. Mark Heumann (2005). Logical Data Modeling: What it is and how to Do it.
- Martin E. Modell (1992). Data Analysis, Data Modeling, and Classification.
- M. Papazoglou, Stefano Spaccapietra, Zahir Tari (2000). Advances in Object-oriented Data Modeling.
- G. Lawrence Sanders (1995). Data Modeling
- Graeme C. Simsion, Graham C. Witt (2005). Data Modeling Essentials
- Matthew West (2011) Developing High Quality Data Models
External links
- Agile/Evolutionary Data Modeling
- Data modeling articles
- Database Modelling in UML
- Data Modeling 101
- Semantic data modeling
- System Development, Methodologies and Modeling Notes on by Tony Drewry
- Request For Proposal - Information Management Metamodel (IMM) of the Object Management Group
- Data Modeling is NOT just for DBMS's Part 1 Chris Bradley
- Data Modeling is NOT just for DBMS's Part 2 Chris Bradley
