Data Modeling and Databases II Entity-Relationship (ER) Model Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich
Database design
Requirements Engineering (information requirements, processing requirements) -> Book of duty
Conceptual Modeling -> Conceptual design (ER)
Logical Modeling (DBMS) -> Logical design (schema)
Physical Modeling (Hardware/OS) -> Physical design
Book of Duty
Describe information requirements
- Objects used (e.g., student, professor, lecture)
- Domains of attributes of objects
- Identifiers, references / relationships
Describe processes
- E.g., examination, degree, register course
Describe processing requirements
- Cardinalities: how many students?
- Distributions: skew of lecture attendance
- Workload: how often a process is carried out
- Priorities and service level agreements
Entity/Relationship (ER) Model
- Entity (object), e.g., Student
- Relationship (connection), e.g., attends
- Attribute (property), e.g., Name
- Key, e.g., MatrNr
- Role, e.g., enrolls
Example
Figure (ER diagram): Student (Legi, Name, Semester) attends Lecture (Nr, Title, CP). Roles: Attendant, Course.
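Example: University
Figure (ER diagram).
Entities: Student (Legi, Name, Semester), Lecture (Nr, CP, Title), Professor (PersNr, Name, Level, Room), Assistant (PersNr, Name, Area).
Relationships: attends (Student, Lecture); requires (Lecture, Lecture; roles prerequisite and follow-up); gives (Professor, Lecture); works-with (Assistant, Professor); tests (Student, Lecture, Professor; attribute Grade).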
What does this say?
- Students have a Legi, Name and Semester. The Legi identifies a student uniquely.
- Lectures have a Nr, CP and Title. The Nr identifies a lecture uniquely.
- Professors have a PersNr, Name, Level and Room. The PersNr identifies a professor uniquely.
- Assistants have a PersNr, Name and (research) Area. The PersNr identifies an assistant uniquely.
- Students attend lectures.
- Lectures can be prerequisites for other lectures.
- Professors give lectures.
- Assistants work with professors.
- Students are tested by professors about lectures.
- Students receive grades as part of these tests.
What it does not say
- How assistants are related to lectures
- How students interact with assistants
- How a student gets the credits for a lecture
- In which area a professor works
Models should capture what is known and what is needed. No more, no less.
Avoid redundancy
When modeling, pay special attention to avoiding redundancy (introducing the same concept twice).
Figure (ER diagram): Assistant works-with Professor; attributes shown: PersNr, Adviser, Level, Name, Room, Area.
Keep it simple
Complicated: Assistant works-with Adviser acts-as Professor
Simple: Assistant works-with Professor
Two Binary vs. One Ternary Relationship
A professor recommends a book for a course.
- Model as two binary relationships
- Model as one ternary relationship
What is better?
A professor suggests a textbook for a course
Figure: one ternary relationship SUGGESTS connecting PROFESSOR, COURSE and TEXTBOOK.
Figure: PROFESSOR SUGGESTS TEXTBOOK; TEXTBOOK USED IN COURSE. Something has been added: there is more information. Something has been removed: one relationship is now lost.
Figure: PROFESSOR SUGGESTS TEXTBOOK; PROFESSOR TEACHES COURSE. Something has been added: there is more information. Something has been removed: one relationship is now lost.
Figure: TEXTBOOK USED IN COURSE; PROFESSOR TEACHES COURSE. Something has been added: there is more information. Something has been removed: one relationship is now lost.
Figure: three binary relationships: PROFESSOR SUGGESTS TEXTBOOK; TEXTBOOK USED IN COURSE; PROFESSOR TEACHES COURSE.
Figure: a new entity RECOMMENDATION with three binary relationships: RECOMMENDATION REFERS TO TEXTBOOK; RECOMMENDATION TO BE USED IN COURSE; PROFESSOR MAKES RECOMMENDATION.
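The loss of information when a ternary relationship is split into binary ones can be illustrated with a small sketch. The sample tuples (p1, b1, c1, ...) and the helper names `project` and `rejoin` are made up for illustration; they are not part of the slides.

```python
# Hypothetical instance of the ternary relationship SUGGESTS:
# (professor, textbook, course)
suggests = {
    ("p1", "b1", "c2"),
    ("p1", "b2", "c1"),
    ("p2", "b1", "c1"),
}

def project(rel, i, j):
    """Keep only positions i and j of every tuple (a binary relationship)."""
    return {(t[i], t[j]) for t in rel}

def rejoin(pb, bc, pc):
    """Combine the three binary relationships back into triples."""
    return {
        (p, b, c)
        for (p, b) in pb
        for (b2, c) in bc
        if b == b2 and (p, c) in pc
    }

prof_book = project(suggests, 0, 1)    # PROFESSOR SUGGESTS TEXTBOOK
book_course = project(suggests, 1, 2)  # TEXTBOOK USED IN COURSE
prof_course = project(suggests, 0, 2)  # PROFESSOR TEACHES COURSE

# Triples that the binary model admits but the ternary one never stated.
spurious = rejoin(prof_book, book_course, prof_course) - suggests
print(spurious)
```

In this instance the three binary relationships together cannot tell whether p1 suggested b1 for c1, although every pairwise fact is present.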
Rules of thumb
Attribute vs. Entity
- Entity if the concept has more than one relationship
- Attribute if the concept has only one 1:1 relationship
Partitioning of ER Models
- Most realistic models are larger than a page
- Partition by domains (library, research, finances, ...)
- No good automatic graph partitioning tool
Good vs. Bad models
- Do not model redundancy or tricks to improve performance
- Fewer entities is better (the fewer, the better!)
- Remember the C4 rule: concise, correct, complete, comprehensive
Cardinalities
Binary relationship $R \subseteq E_1 \times E_2$ between entity types $E_1$ and $E_2$.
Possible cardinalities: 1:1, 1:N, N:1, N:M
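As a rough illustration, the sketch below reports the tightest cardinality that a given set of pairs is consistent with. The function name `classify` and the sample data are assumptions made for this example. A real cardinality is a constraint on all possible instances, so a single instance can only refute a cardinality, not prove it.

```python
from collections import Counter

def classify(pairs):
    """Return '1:1', '1:N', 'N:1' or 'N:M' for a set of (e1, e2) pairs."""
    pairs = set(pairs)  # a relationship is a set; ignore duplicates
    per_e1 = Counter(e1 for e1, _ in pairs)  # partners in E2 per e1
    per_e2 = Counter(e2 for _, e2 in pairs)  # partners in E1 per e2
    e1_single = all(n == 1 for n in per_e1.values())
    e2_single = all(n == 1 for n in per_e2.values())
    if e1_single and e2_single:
        return "1:1"
    if e2_single:   # each e2 has one e1, but some e1 has several e2
        return "1:N"
    if e1_single:   # each e1 has one e2, but some e2 has several e1
        return "N:1"
    return "N:M"

# Hypothetical instances: (Professor, Lecture) and (Student, Lecture)
gives = {("p1", "l1"), ("p1", "l2"), ("p2", "l3")}
attends = {("s1", "l1"), ("s1", "l2"), ("s2", "l1")}
print(classify(gives), classify(attends))
```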
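Cardinalities of n-ary relationships
Figure: relationship $R$ connecting $E_1$ (N), $E_2$ (M), ..., $E_k$ (1), ..., $E_n$ (P).
A 1 at $E_k$ means that $R$ defines a function: $R : E_1 \times \dots \times E_{k-1} \times E_{k+1} \times \dots \times E_n \to E_k$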
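Example: seminar
Figure: ternary relationship supervise (attribute Grade) connecting Student (N), Professor (1) and Topic (1).
$\text{supervise} : \text{Professor} \times \text{Student} \to \text{Topic}$
$\text{supervise} : \text{Topic} \times \text{Student} \to \text{Professor}$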
Constraints
The cardinalities rule out anything that violates the following:
1. A student may do at most one seminar with a given professor.
2. A student may work on a given topic at most once.
The following remains possible:
- Professors may recycle topics and assign the same topic to several students.
- The same topic may be supervised by several professors.
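The two functional constraints of the seminar example can be checked on a concrete set of triples. This is a minimal sketch: the tuple layout (professor, student, topic), the helper name `violations` and the sample data are assumptions made for illustration.

```python
def violations(triples, key_idx, val_idx):
    """Return the keys that are mapped to more than one value."""
    seen = {}
    bad = set()
    for t in triples:
        key = tuple(t[i] for i in key_idx)
        if seen.setdefault(key, t[val_idx]) != t[val_idx]:
            bad.add(key)
    return bad

# Hypothetical instance, layout: (professor, student, topic)
legal = [
    ("p1", "s1", "t1"),
    ("p1", "s2", "t1"),  # p1 recycles topic t1: allowed
    ("p2", "s3", "t1"),  # t1 supervised by another professor: allowed
]

# s1 does a second seminar with p1: breaks Professor x Student -> Topic
two_seminars = legal + [("p1", "s1", "t2")]
# s1 works on t1 a second time: breaks Topic x Student -> Professor
same_topic_twice = legal + [("p3", "s1", "t1")]

print(violations(legal, (0, 1), 2), violations(legal, (2, 1), 0))
print(violations(two_seminars, (0, 1), 2))
print(violations(same_topic_twice, (2, 1), 0))
```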
Example
Figure: instances of Professor (p1 to p4), Student (s1 to s4) and Topic (t1 to t4), connected through supervise instances b1 to b6. Dashed lines represent illegal references.
Cardinalities
Figure (University example with cardinalities):
- attends: Student (N), Lecture (M)
- requires: Lecture (N), Lecture (M)
- gives: Professor (1), Lecture (N)
- works-for: Assistant (N), Professor (1)
- tests (attribute Grade): Student (N), Lecture (M), Professor (1)
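(min, max)-notation
For a relationship $R \subseteq E_1 \times \dots \times E_i \times \dots \times E_n$, each entity type $E_i$ is annotated with a pair $(\mathit{min}_i, \mathit{max}_i)$.
For all $e_i \in E_i$:
- at least $\mathit{min}_i$ records $(\dots, e_i, \dots)$ exist in $R$, AND
- at most $\mathit{max}_i$ records $(\dots, e_i, \dots)$ exist in $R$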
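The definition can be turned into a small checker. This is a sketch only: the function name `check_min_max`, the use of `None` for an unbounded maximum (written `*` in the diagrams) and the sample data are assumptions made for illustration.

```python
from collections import Counter

def check_min_max(entities, relation, position, lo, hi=None):
    """Return the entities whose number of records in `relation`
    (counted at tuple index `position`) lies outside [lo, hi].
    hi=None stands for '*', i.e. no upper bound."""
    counts = Counter(t[position] for t in set(relation))
    return {
        e for e in entities
        if counts[e] < lo or (hi is not None and counts[e] > hi)
    }

# Hypothetical instances
lectures = {"l1", "l2", "l3"}
gives = {("l1", "p1"), ("l2", "p1")}                 # (lecture, professor)
attends = {("s1", "l1"), ("s2", "l1"), ("s3", "l1"),
           ("s1", "l2")}                             # (student, lecture)

# Lecture annotated (1,1) in gives: exactly one professor per lecture.
print(check_min_max(lectures, gives, 0, 1, 1))
# Lecture annotated (3,*) in attends: at least three students per lecture.
print(check_min_max(lectures, attends, 1, 3))
```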
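Geometric Modelling
Figure (ER diagram with cardinalities):
- covers: Polyhedron (1), Surface (N)
- boundary: Surface (N), Edge (M)
- StartEnd: Edge (N), Node (M)
Attributes: PolyID (Polyhedron), SurfaceID (Surface), EdgeID (Edge), X, Y, Z (Node)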
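Geometric Modelling
Figure (same diagram with (min, max) annotations added):
- covers: Polyhedron (4,*), Surface (1,1)
- boundary: Surface (3,*), Edge (2,2)
- StartEnd: Edge (2,2), Node (3,*)
Attributes: PolyID (Polyhedron), SurfaceID (Surface), EdgeID (Edge), X, Y, Z (Node)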
Weak Entities
Figure: Building (BldNr, Size) 1 located N Room (RoomNr, Size).
The existence of a room depends on the existence of the associated building.
Why must such relationships be N:1 (or 1:1)?
RoomNr is only unique within a building.
Key of a room: BldNr and RoomNr
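One possible relational rendering of the weak entity Room is sketched below with Python's built-in `sqlite3` module. The column types, the `ON DELETE CASCADE` rule and the sample rows are assumptions made for illustration; the slide only fixes the attributes and the composite key (BldNr, RoomNr).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only if enabled
conn.executescript("""
CREATE TABLE Building (
    BldNr INTEGER PRIMARY KEY,
    Size  INTEGER
);
CREATE TABLE Room (
    BldNr  INTEGER NOT NULL REFERENCES Building(BldNr) ON DELETE CASCADE,
    RoomNr INTEGER NOT NULL,
    Size   INTEGER,
    PRIMARY KEY (BldNr, RoomNr)  -- RoomNr alone is not unique
);
""")

conn.executemany("INSERT INTO Building VALUES (?, ?)", [(1, 5000), (2, 3000)])
# The same RoomNr may appear in two different buildings.
conn.executemany("INSERT INTO Room VALUES (?, ?, ?)", [(1, 101, 20), (2, 101, 35)])

# Removing a building removes the rooms that depend on it.
conn.execute("DELETE FROM Building WHERE BldNr = 1")
remaining = conn.execute("SELECT BldNr, RoomNr FROM Room").fetchall()

# A room cannot exist without its building.
try:
    conn.execute("INSERT INTO Room VALUES (9, 1, 10)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(remaining, orphan_rejected)
```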
Exams depend on the student
Figure: Student (Legi) 1 takes N Exam (Part, Grade); Exam N covers M Lecture (Nr); Exam N gives M Professor (PersNr).
Can the existence of an entity depend on several other entities? (E.g., exam on student and prof?)
Corner Case 1
A human cannot exist without a heart. A heart cannot exist without a human.
Anne lives on Bob's heart. Bob lives on Anne's heart. Possible?
Figure: Heart 1 owns 1 Person
Corner Case 2
A human can only survive with at least one kidney. Not expressible with ER! (Why not?)
Figure: Kidney N owns 1 Person
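Figure: ternary relationship supervise (attribute Grade) connecting Student (N), Professor (1) and Topic (1).
Figure: alternative with a new entity SEMINAR: Student TAKES SEMINAR (attribute Grade), Professor TEACHES SEMINAR, SEMINAR has THEME Topic; the relationships carry 1 and N cardinality labels.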
Generalization
Figure:
- Uni-Member (Name) generalizes Student (Legi) and Employee (PersNr) via is-a
- Employee generalizes Assistant (Area) and Professor (Level, Room) via is-a
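The is-a hierarchy maps naturally onto class inheritance: a subtype inherits every attribute of its supertype. The sketch below uses Python dataclasses; the field types and the sample values are assumptions, only the attribute names come from the diagram.

```python
from dataclasses import dataclass

@dataclass
class UniMember:
    name: str

@dataclass
class Student(UniMember):    # Student is-a Uni-Member
    legi: int

@dataclass
class Employee(UniMember):   # Employee is-a Uni-Member
    pers_nr: int

@dataclass
class Assistant(Employee):   # Assistant is-a Employee
    area: str

@dataclass
class Professor(Employee):   # Professor is-a Employee
    level: str
    room: str

# Hypothetical instance: Name and PersNr are inherited attributes.
prof = Professor(name="Example Prof", pers_nr=42, level="Full", room="R1")
print(isinstance(prof, UniMember), prof.name, prof.pers_nr)
```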
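Figure (University example with (min, max) annotations and generalization):
- attends: Student (0,*), Lecture (3,*)
- requires: Lecture (0,*), Lecture (0,*)
- gives: Lecture (1,1), Professor (0,*)
- tests (attribute Grade): Student (0,*), Lecture (0,*), Professor (0,*)
- Works-for: Assistant (1,1), Professor (0,*)
- Assistant (Area) and Professor (Level, Room) is-a Employee (PersNr, Name)
Attributes: Student (Legi, Name, Semester), Lecture (Nr, CP, Title)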
Aggregation
Figure: Frame and Wheel are Part-of Bicycle; Frame and Wheel each have further parts (Part-of), elided in the figure.
Aggregation and Generalization
Figure:
- Manual vehicle and Motor vehicle is-a Vehicle
- Tricycle and Bicycle is-a Manual vehicle
- Motorcycle and Car is-a Motor vehicle
- Frame and Wheel are Part-of Bicycle; each has further parts (Part-of), elided in the figure
Why is ER modelling so difficult?
Figure: View 1, View 2, View 3 and View 4 are consolidated into one Global Schema.
- No redundancy
- No conflicts
- Avoid synonyms
- Avoid homonyms
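Consolidation Hierarchies
Problem: How to achieve multi-lateral consensus?
Figure, left (one view at a time): $S_1$ and $S_2$ merge into $S_{1,2}$; $S_{1,2}$ and $S_3$ merge into $S_{1,2,3}$; $S_{1,2,3}$ and $S_4$ merge into $S_{1,2,3,4}$.
Figure, right (balanced): $S_1$ and $S_2$ merge into $S_{1,2}$; $S_3$ and $S_4$ merge into $S_{3,4}$; $S_{1,2}$ and $S_{3,4}$ merge into $S_{1,2,3,4}$.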
Example: Professor View
Figure: entities Student, Assistant, Professor, Master thesis (Title), PhD thesis (Title); relationships do, write, supervise (twice).
Example: Library View
Figure: entities Department, Library, Document, Uni-Member; relationships owns, manages, lends; attributes Signature, Authors, Title, Year, Date.
Example: Lecture View
Figure: entities Lecture, Textbook, Lecturer; relationship recommends; attributes Authors, Title, Year, Publisher.
Observations
- Lecturer and Professor are synonyms.
- Uni-Member is a generalization of Student, Professor and Assistant.
- However, libraries are managed by Employees. (View 2 is imprecise in this respect.)
- Dissertations, Master theses and Books are different species of Document. All are held in libraries.
- Do and Write are synonyms in View 1.
Things get complicated very quickly, which is why consolidation requires engineers:
- The result is not unique
- Need to invent new concepts
- Need to compromise (e.g., authorship of documents)
Figure (consolidated schema): Document (Signature, Title, Year) generalizes Master Thesis, Dissertation and Book (Publisher). Other elements shown: faculty, Library, Person, Uni-Member, Student, Employee, Assistant, Professor, Lecturer; relationships writes, keeps, lends (Date), supervise (twice), recommends, manages.
Models lead to schemas
Building a model (and eventually a schema) is costly and takes time.
There are use cases where one simply cannot model the data -> Key Value Stores
Data Modelling with UML
Unified Modelling Language (UML)
- De facto standard for object-oriented design
- Data modelling is done with class diagrams
  - Class in UML ~ Entity in ER
  - Attribute in UML ~ Attribute in ER
  - Association in UML ~ Relationship in ER
  - Composition in UML ~ Weak Entity in ER
  - Generalization in UML ~ Generalization in ER
Key differences between UML class diagrams and ER
- Methods are associated to classes in UML
- Keys are not modeled in UML
- UML explicitly models aggregation (part-of)
- UML supports the modelling of instances (object diagrams)
UML has much more to offer: use cases, sequence diagrams, ...
Class: Professor
Professor
- PersNr: Integer
+ Name: String
- Level: String
+ promote()
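The class box translates roughly into the following Python sketch. Python has no enforced visibility, so the UML `-` (private) members are marked with a leading underscore by convention. The slide does not say what `promote()` does; the rank ladder `LEVELS` and the behaviour shown here are purely hypothetical.

```python
class Professor:
    # Hypothetical rank ladder; the slide does not define Level values.
    LEVELS = ["Assistant", "Associate", "Full"]

    def __init__(self, pers_nr: int, name: str, level: str):
        self._pers_nr = pers_nr  # "- PersNr: Integer" (private)
        self.name = name         # "+ Name: String"    (public)
        self._level = level      # "- Level: String"   (private)

    def promote(self) -> str:
        """Assumed behaviour: move one step up the ladder, stop at the top."""
        i = self.LEVELS.index(self._level)
        self._level = self.LEVELS[min(i + 1, len(self.LEVELS) - 1)]
        return self._level

prof = Professor(1, "Example Prof", "Associate")
new_level = prof.promote()
print(new_level)
```

Unlike an ER entity, the class bundles behaviour (`promote`) with the data, and nothing marks PersNr as a key.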
Associations (directed, undirected)
Functionalities & Multiplicities
Multiplicities
- Every instance of A is associated with 4 to 6 instances of B.
- Every instance of B is associated with 2 to 5 instances of A.
- Be careful: Flipped around as compared to ER.
- Be careful: Cannot be used for n-ary relationships.
Functionalities
- Represented as UML multiplicities: 1, *, 1..*, 0..*, or 0..1
- Otherwise, the same as in ER.
Aggregation
Generalization