|
30 | 30 | \gitExecSQLraw{}{}{normalization/1nf/anomaly_composite}{cleanup.sql}{}{}{}% |
31 | 31 | % |
32 | 32 | In \cref{fig:anomalyComposite}, we illustrate a part of a logical model that relates student records to address records. |
33 | | -In this relationship, each student has exactly one address. |
34 | | -In \cref{lst:1nf:anomaly_composite:03_public_address_table_5071,lst:1nf:anomaly_composite:04_public_student_table_5075}, we create the two tables, while leaving the \db\ and constraint creation to your imagination. |
| 33 | +In this model, each student has exactly one address and a name. |
| 34 | +Addresses are text stored in a separate table. |
| 35 | + |
| 36 | +The auto-generated script \cref{lst:1nf:anomaly_composite:03_public_address_table_5071} creates the table~\sqlil{address}. |
| 37 | +It has two columns, namely the surrogate primary key~\sqlil{id} and the column~\sqlil{full_address}, which is a variable-length string of up to 255~characters. |
| 38 | +As the name suggests, we will store the addresses of the students in this table. |
| 39 | + |
| 40 | +The script \cref{lst:1nf:anomaly_composite:04_public_student_table_5075} then creates the table~\sqlil{student}. |
| 41 | +This table, too, has a surrogate primary key~\sqlil{id}. |
| 42 | +It also sports the column~\sqlil{name} storing the name of the students. |
| 43 | +Column~\sqlil{address} is a foreign key reference to the \sqlil{id}~column of table~\sqlil{address}. |
| 44 | +This is secured by a \sqlil{REFERENCES} constraint that we create with another script. |
| 45 | +We do not print that one here and neither do we print the script for generating the \db, as they do not contribute much to the understanding of the scenario. |
| 46 | + |
35 | 47 | We then insert some data into the \db\ in \cref{lst:1nf:anomaly_composite:insert}. |
| 48 | +We first create four address records, one for our Hefei University~(合肥大学) located in the beautiful city of Hefei~(合肥市) in China, one address in my hometown Chemnitz, Germany, one address located in the Chinatown of New York, USA, and, finally, one address in Quanzhou city~(泉州市), China. |
| 49 | +We then create four \sqlil{student} records for Mr.~Bibbo, Mr.~Bebbo, Mrs.~Bibbi, and Mr.~Babbo. |
| 50 | +Via their foreign key, these are linked to the above addresses in that order. |
36 | 51 | At first glance, all looks well. |
37 | 52 |
|
38 | | -And all could be well, if we would treat the address of a student always as a single text string. |
| 53 | +And all could be well, if we would treat the address of a student always as a single unstructured text string. |
39 | 54 | However, this is not necessarily true, especially not true in our teaching management platform example. |
40 | 55 | In our example, Mr.~Bibbo lives directly in our Hefei University whereas Mr.~Babbo comes from Quanzhou~(泉州市) in the Fujian province~(福建省). |
41 | | -Mr.~Bebbo and Ms.~Bibbi, however, are foreign exchange students~(留学生) from Germany and the USA, respectively. |
| 56 | +Mr.~Bebbo and Mrs.~Bibbi, however, are foreign exchange students~(留学生) from Germany and the USA, respectively. |
42 | 57 | Assume that this table would be much larger. |
43 | 58 | What would happen if we wanted to know who of our students have a valid address in China? |
44 | 59 | How would we do that?% |
45 | 60 | % |
46 | 61 | \begin{sloppypar}% |
47 | 62 | As a matter of fact, we encountered this very same situation back in \cref{sec:factory:table:customer:insert}. |
48 | | -Back then, we used the \sqlilIdx{ILIKE} expression and we do so again here: |
| 63 | +Back then, we used the \sqlilIdx{ILIKE} expression~\cite{PGDG:PD:PM} and we do so again here: |
49 | 64 | In \cref{lst:1nf:anomaly_composite:select}, we combine the tables~\sqlil{student} and \sqlil{address} by using an \sqlilIdx{INNER JOIN} statement. |
50 | | -We then only keep the rows \sqlil{WHERE full_address ILIKE '\%china\%'}, in other words, where the word \inQuotes{china} occurs anywhere in the \sqlil{full_address} columns, regardless of its casing. |
| 65 | +We then only keep the rows \sqlil{WHERE full_address ILIKE '\%china\%'}. |
| 66 | +In other words, we retain only the rows where the word \inQuotes{china} occurs anywhere in the \sqlil{full_address} column, regardless of its case. |
51 | 67 | \inQuotes{China,} \inQuotes{china,} \inQuotes{CHINA,} \inQuotes{cHina} -- all are OK. |
52 | | -Doing this will yield the two students Mr.~Bibbo and Ms.~Bibbi. |
| 68 | +Doing this will yield the two students Mr.~Bibbo and Mrs.~Bibbi. |
53 | 69 | Mrs.~Bibbi, however, is a foreign exchange student. |
54 | 70 | She lives in \emph{China}town, New York. |
55 | | -Also, Mr.~Babbo was not listed, as he declared his address to be in the PRC, i.e., the People's Republic of China.% |
| 71 | +Also, Mr.~Babbo was not listed, as he declared his address to be in the PRC, i.e., the People's Republic of China~(中华人民共和国).% |
56 | 72 | \end{sloppypar}% |
57 | 73 | % |
58 | 74 | We are faced with two problems: |
|
63 | 79 | Here, we did not model the attribute for the address as a composite attribute. |
64 | 80 | We modelled it as an atomic attribute, which turned out to be wrong, because now we want to access its components. |
65 | 81 | An atomic attribute does not have components. |
| 82 | +So the error did occur in the conceptual modeling phase. |
| 83 | +It became apparent only after we finished implementing the logical model. |
66 | 84 |
|
67 | 85 | Now our \db\ can still \inQuotes{work}. |
68 | 86 | We can construct the second query in \cref{lst:1nf:anomaly_composite:select}, which deals with both of the special cases mentioned above. |
|
105 | 123 | A proper solution can only be to model the address as a composite attribute. |
106 | 124 | At least the country needs to be split off. |
107 | 125 | Maybe also the province, because that could come in handy, too. |
108 | | -We probably also want a postal code. |
| 126 | +And while we are at it, we probably also want to know the city and postal code. |
| 127 | +The components of this composite attribute then will become separate columns in a table. |
109 | 128 | We apply these ideas to create the improved logical model in \cref{fig:fixedComposite}. |
110 | 129 |
|
111 | 130 | The attribute~\sqlil{full_address} now no longer exists when we create the table \sqlil{address} in \cref{lst:1nf:fixed_composite:03_public_address_table_5071}. |
112 | | -Instead, we have the columns~\sqlil{country}, \sqlil{province}, \sqlil{city}, \sqlil{postal_code}, and~\sqlil{street_address}, all of which are of type \sqlilIdx{VARCHAR} of appropriate lengths. |
113 | | -We permit \sqlil{province} to be \sqlilIdx{NULL}, because some countries maybe do not have provinces, whereas all other fields must be~\sqlilIdx{NOT NULL}. |
| 131 | +Instead, we have the columns~\sqlil{country}, \sqlil{province}, \sqlil{city}, \sqlil{postal_code}, and~\sqlil{street_address}. |
| 132 | +All of them are of type \sqlilIdx{VARCHAR} of appropriate lengths. |
| 133 | +We permit \sqlil{province} to be \sqlilIdx{NULL}, because some countries may not have provinces. |
| 134 | +All other fields must be~\sqlilIdx{NOT NULL}. |
114 | 135 | Nothing else changes, the table~\sqlil{student} can stay as it is. |
| 136 | +We thus do not reproduce its creation here as a listing. |
115 | 137 |
|
116 | 138 | When we insert the data into our \db\ in \cref{lst:1nf:fixed_composite:insert}, we of course also need to split the addresses properly over the columns. |
117 | 139 | This also shows us a slight drawback that is inherent to all normal forms: |
118 | 140 | They break compound data into independent pieces. |
119 | 141 | If we later need the complete data again, we need to reassemble the pieces. |
120 | 142 | Thus, if we need the full address string, we first must reassemble it, probably using the string concatenation operator~\sqlil{||}\sqlIdx{\textbar\textbar}~\cite{PGDG:PD:SFAO}. |
121 | 143 |
|
| 144 | +Anyway. |
| 145 | +This time, we create five student records, adding Ms.~Bebbe to the mix. |
| 146 | +The addresses of the other students stay basically the same, but are broken down into their components. |
| 147 | +Ms.~Bebbe lives somewhere in Beijing~(北京), China. |
| 148 | + |
122 | 149 | As you can see in \cref{exec:1nf:fixed_composite:select}, we now can indeed obtain the list of all students with addresses in China much more easily. |
123 | | -It is a given that we still have to deal with the fact that different people may use different names for the country, but at least we cannot accidentally classify someone from Chinatown in San Francisco as a PRC resident. |
| 150 | +It is a given that we still have to deal with the fact that different people may use different names for the country. |
| 151 | +But at least we cannot accidentally classify someone from Chinatown in San Francisco as a PRC resident. |
124 | 152 |
|
125 | 153 | While we are here, let's do a small excursion that just fits nicely in this topic but is otherwise unrelated to the \pgls{1NF}. |
126 | 154 | If you read \cref{lst:1nf:fixed_composite:select}, you notice that reassembling the full address was a bit complicated and went beyond simply using~\sqlil{||}\sqlIdx{\textbar\textbar}. |
|
140 | 168 | When reading the query, you will also find one additional change when checking the country: |
141 | 169 | We could have used the logical \sqlilIdx{OR} to combine the three conditions \sqlil{country ILIKE '\%china\%'}, \sqlil{country ILIKE '\%PRC\%'}, and \sqlil{country ILIKE '\%P.R.C.\%'}\sqlIdx{ILIKE}. |
142 | 170 | Instead, we wrote \sqlil{country ILIKE ANY(ARRAY['\%china\%', '\%PRC\%', '\%P.R.C.\%'])}\sqlIdx{ANY}\sqlIdx{ARRAY}, which is equivalent to that~\cite{PGDG:PD:RAAC,PGDG:PD:A}: |
143 | | -We can declare an array of the values \sqlil{a}, \sqlil{b}, \sqlil{c}, and~\sqlil{d} as \sqlil{ARRAY[a, b, c, d]}\sqlIdx{ARRAY}. |
| 171 | +We can declare an array of the values \sqlil{a}, \sqlil{b}, \sqlil{c}, and~\sqlil{d} via \sqlil{ARRAY[a, b, c, d]}\sqlIdx{ARRAY}. |
144 | 172 | The expression \sqlil{XXX operator ANY(ARRAY[...])}\sqlIdx{ANY} becomes \sqlil{TRUE} if \sqlil{XXX operator YYY} is \sqlil{TRUE} for any, i.e., at least one, \sqlil{YYY} in the array~\cite{PGDG:PD:RAAC}. |
145 | 173 | In our case, \sqlil{XXX} is \sqlil{country} and \sqlil{operator} is \sqlilIdx{ILIKE}. |
146 | 174 | (Similarly, the expression \sqlil{XXX operator ALL(ARRAY[...])}\sqlIdx{ALL} becomes \sqlil{TRUE} if \sqlil{XXX operator YYY} is \sqlil{TRUE} for every single, i.e., all, \sqlil{YYY} in the array~\cite{PGDG:PD:RAAC}.) |
|