BCNF and relational algebra

BCNF and relational algebra

Post by Perceptio » Tue, 24 Feb 2004 05:00:52


Hi all,

I have the following relations (expressed as functional dependencies) which
I consider to be in BCNF (please correct me if I'm wrong):

Patient Identifier -> Patient Name, Ward
Date
Doctor -> Clinic

Now the problem here is that I am unable to deduce how to make any real
query on these set of data because there are no common attributes between
the relations. Is it even possible for relations under BCNF to have no
common attributes? Note that the original relation was identical expect it
had the additional functional dependency: Patient Identifier, Date,
Doctor -> Patient Name, Ward, Clinic (i.e. the attributes on the LHS were
the primary key). I decomposed the relation as I have shown above given that
I knew each Patient Identifier was unique, a patient would never change
wards but could attend several clinics, on each occasion a patient would see
one doctor, doctor names were unique and that doctors were only associated
with a single clinic. Now I believe that I have made a reasonable
interpretation of this information both in my choice of primary key in the
original relation (i.e. Patient Identifier, Date and Doctor) and in my
decompositions into BCNF (every determinant is a candidate key).

Thus, if I wanted to list, say, all patient identifiers and names for all
patients attending a certain clinic and having not been seen by a certain
doctor how on earth would this be possible in relational algebra given that
I cannot join over any common attribute?? So either I have made a mistake in
my decomposition or there are some techniques that I am not aware of (that
can link the FDs in some way).

In either case I would appreciate any help or clarification!

Thanks in anticipation.
 
 
 

BCNF and relational algebra

Post by Bob Badou » Tue, 24 Feb 2004 05:15:01


which
that
see
that
in

What do you suppose the unary relation of dates means?

 
 
 

BCNF and relational algebra

Post by Perceptio » Tue, 24 Feb 2004 08:10:02


Thanks for the reply.

To answer your question: the unary relation means that every date is
associated with more than one value for every other attribute (that is, Date
only determines itself).

Could you perhaps have hinted that I should do a cartesian/cross product of
Date with my other two normalised relations and then join them? More
specifically, if I were to call my relations A, B and C where

A:Patient Identifier -> Patient Name, Ward
B:Date
C:Doctor -> Clinic

And if I wanted to achieve the example query I mentioned before, namely list
all patient identifiers and names for all patients attending a certain
clinic and having not been seen by a certain doctor, would the following be
legal:

First, AxB and BxC (where x means 'cross'). Then join the new relations over
'Date'. Then select over the condition Doctor/='certain doctor' and
Clinic='certain clinic'. And finally project over patient identifier and
patient name.

Would that give me the correct answer? The reason I ask is because first I
wasn't sure if my relations were in BCNF because there were no common
attributes among them (although every determinant was a candidate key) and
second whether I can cross two relations with no common attributes (although
I think I can). If the answer to those two questions is yes then will I have
computed what I was looking for? If I have gone on a completely wrong
tangent please let me know, I'm pretty new to this database stuff!

Thanks again.
 
 
 

BCNF and relational algebra

Post by Bob Badou » Tue, 24 Feb 2004 09:24:28


Date

That tells me what Date determines, but it does not tell me what it means.
Is every date for all time in the unary relation? What does it mean for a
specific date if it is in the relation? What does it mean for a specific
date if it is not in the relation?
 
 
 

BCNF and relational algebra

Post by mAsterda » Tue, 24 Feb 2004 10:00:23

Hi,
I am completely new to this group, but I'll give it a try.




Start with providing the predicates which
would let the population data convey true propositions.

Example:
A:
A:Patient Identifier -> Patient Name, Ward
------------------ ------------ ----
1 Johnson IC
2 Perception RE

What do these rows *mean* ?
That should give a clue to the predicate for A.
 
 
 

BCNF and relational algebra

Post by Perceptio » Tue, 24 Feb 2004 10:56:31


Sorry for the misunderstanding. Specific dates correspond to when a patient
visited a doctor. Thus if a date is in the relation, it means a patient had
seen a doctor/attended a clinic on that date.

The original un-normalised datebase had tuples of the form:

Patient Id, Patient Name, Ward, Date, Clinic, Doctor

with data of the form, for instance

3842, Smith, 2, 6.7.03, 4, Anderson
9.9.03, 3, Haswell

corresponding to the above attributes for one particular patient.
 
 
 

BCNF and relational algebra

Post by Bob Badou » Tue, 24 Feb 2004 11:43:19


is,
means.
a
patient
had

How does a unary relation with only a date attribute say anything about a
patient or a doctor or a clinic?
 
 
 

BCNF and relational algebra

Post by Jonathan L » Tue, 24 Feb 2004 13:36:54


Your BCNF breakdown bears little or no resemblance to the original
data - you have lost lots of information, but the decomposition
process is supposed to be lossless.

What are the unspoken rules about the data? Can a single patient see
more than one doctor on a given date? Can they see the same doctor
several times at the same clinic, in the same ward on the same date?
Given the example data, it seems we can infer that both clinics 3 and
4 have a ward 2. We can clearly create a 1NF relation by replicating
the patient ID, patient name, and ward values for the second row. We
are then missing all the information that would help us deduce how
this data can be normalized further.

The chances are good that Patient ID --> Patient Name.

Every other FD in your initial solution is disputable. A given doctor
may work at several clinics. There is unlikely to be a rule that once
a patient has been admitted to ward 2 at some clinic, thereafter, he
must always be admitted to ward 2 at every clinic. As already
discussed in other postings, the unary relation date is not significant.

You still have not given us all the information we need to do your
homework for you, I'm afraid.

Note that relational databases work very largely by comparing data in
relations (tables) on the basis of comparable attributes (columns).
Depending on the ideological purity of your RTD (relational theory of
data), the column names and types of the attributes (columns) should
be the same in the two relations being compared. In your breakdown
into 3 relations, there are no common attributes between any pair of
the relations. As Codd delighted in saying, a database is
partitionable into subsets c and C if the relations in c use no common
domains (types) with those in C. Your initial solution is fully
partionable into three relations. As you said in your original post,
it is not possible to frame any queries between such disjoint tables.

--
Jonathan Leffler #include <disclaimer.h>
Email: XXXX@XXXXX.COM , XXXX@XXXXX.COM
Guardian of DBD::Informix v2003.04 -- http://www.yqcomputer.com/
 
 
 

BCNF and relational algebra

Post by rkc » Tue, 24 Feb 2004 23:14:21


un-normalised datebase had tuples of the form:



In a nutshell, you skipped 2nf and 3nf to get to what you somehow
determined to be bcnf. How'd ya do that?
 
 
 

BCNF and relational algebra

Post by Perceptio » Wed, 25 Feb 2004 03:43:15

Jonathan Leffler" < XXXX@XXXXX.COM > wrote in message
news:Grf_b.5791$ XXXX@XXXXX.COM ...


All yes.


I actually mentioned in my first post that a doctor is only associated with
one clinic, and that patients will not change wards during their stay (and
that doctor names are unique). These were not an assumption on my part, but
facts that came with the data. Sorry if I wasn't clearer.


Yes exactly, which is why this is all so confusing. If I left the data in
1NF then the FDs would be:

Patient Id, Date, Doctor -> Patient Name
Patient Id, Date, Doctor -> Ward
Patient Id, Date, Doctor -> Clinic

since (Patient Id, Date, Doctor) must be the primary key. But also I can
deduce that Patient Id -> Patient Name, and Patient Id -> Ward and finally
Doctor -> Clinic from the info I mentioned a couple of lines above. Also
Date on its own doesn't determine anything. And basically that was how I
decomposed my data into BCNF, by having three relations where Patient Id ->
Patient Name, Ward and Doctor -> Clinic and Date which followed from the
above observations. Clearly this is BCNF because the determinants are
candidate keys (and obviously 3NF since full functional dependence and no
transitivity). The puzzling thing is that I can't make any query on this (as
mentioned), so either I've missed a functional dependence somewhere or there
is an alternate way to express the data in BCNF (but I can't see how at
all). Or perhaps I can make a query if I do a cross product of date with the
other two relations and then join the results over date--is this
possible/make sense (since the cross product operaton does not require
common attributes)?

Thank you again.


 
 
 

BCNF and relational algebra

Post by Perceptio » Wed, 25 Feb 2004 03:50:42


Could you please elaborate on what you mean by true propositions?
 
 
 

BCNF and relational algebra

Post by mAsterda » Wed, 25 Feb 2004 08:12:44


Well, you are supposed to be the domain expert.

Domain in both the broad sense of 'universe of discourse', what are we
talking about, what kind of facts should the content of these rows
reflect, and in a more detailed sense: what is the valid set of values
for each attribute/column/field/item/property/whathaveyou (a.k.a. it's
type).


I can only take a wild guess from this.
You should be able to make a more meaningful set, but you could take
this to start with.
A first try could be:

[Attributes]

Patient Identifer: type=integer.
It comes into existence when <some meaningful event in the universe of
discourse> (up to you)

Patient Name: type=text.
Given name, manually copied from his id-card when hen the patient is
admitted to a ward (... just fantasizing to give you an idea, ehrrrm
elaborate)

Ward: type=two characters
The code at the entrance door of the ward.


[Predicate]

A possible predicate could be:

A Patient, identified by "Patient Identifier", with the given name
"Patient Name", has been released from our Ward "Ward" on last tuesday.

(still improvising, that much should be clear)

[Propositions]

The propositions would now become:

Patient 1, Johnson was released from the IC Ward on tuesday.
Patient 2, Perception was released from the RE Ward on tuesday.

Now you may not like the 'on tuesday' part (or other parts). I added it
to illustrate that *if* this would be the predicate, this would be the
set of propositons conveyed by the population. It might be a nice list
for the billing department (they would need mor, of course).

In short: what does it *mean*.



HTH
 
 
 

BCNF and relational algebra

Post by Jonathan L » Wed, 25 Feb 2004 15:14:00

erception wrote:



In that case, you don't have a primary key for the schema you outline
(unless your interpretation of 'date' includes a sufficiently precise
time component). If the 'date' only identifies the day, then the only
way to record the fact that the same patient sees the same doctor
several times at the same clinic in the same ward on the same date is
by using some sort of duplicate record - a possibility which the RM
(relational model) denies.



OK - I didn't read that bit carefully enough; sorry.

So, we have an FD that Doctor --> Clinic. We also appear to have an
FD that { Patient ID, Date } --> { Clinic, Ward }, though I'm not
quite sure about the Clinic/Ward interrelationships.


That is, I suppose, an FD, but it is not a minimal FD. The minimal FD
is Patient ID --> Patient Name. The Date and Doctor have no effect on
things.


You have not said that a given doctor only works on one ward at the
given clinic.


Again, you have a non-minimal FD; Doctor --> Clinic you said, so the
augmented determinant { Patient ID, Date, Doctor } --> Clinic is also
an FD, but it is not minimal.


Not according to what you said at the start of this - a single patient
can be visited several times on a given date by a given doctor.

However, I'm willing to think that you misunderstood what I was asking
and that this system only records that a given patient saw the given
doctor on the given date, but not how many times the doctor actually
visited the patient (or, equivalently, that the doctor only sees the
patient once on any given date). Under that modified defnition, {
Patient ID, Date, Doctor } is the primary key.


No; { Patient ID, Date } --> Ward.


Agreed.


In another posting, you said:

So, we have:

{ Patient ID, Date, Doctor } --> { Patient Name, Ward, Clinic }

But this is nowhere near BCNF because there are all sorts of other FDs
too. Notably:
PATIENT: { Patient ID } --> { Patient Name }
DOCTOR: { Doctor } --> { Clinic }
PATIENT_VISIT: { Patient ID, Date } --> { Ward }

There is still an FD associating Patient, Date and Doctor:
DOCTOR_VISIT: { Patient ID, Date, Doctor } --> { }

I'm not clear about how clinics and wards are related, if at all.

Given the four relations implied by the FDs named above (primary key
on the LHS of the FD), it is possible to reconstruct your original
data. There might also be some extra unidentified constraints related
to clinics and wards - there's also a constraint that the Patient ID,
Date combination in DOCTOR_VISIT must appear in PATIENT_VISIT; this is
a foreign key constraint.

PATIENT: 3842 --> Smith
DOCTOR: Haswell --> 3
Anderson --> 4
PATIENT_VISIT: 3842, 6.7.03 --> 2
3842, 9.9.03 --> 3
DOCTOR_VISIT: 3842, 6.7.03, Anderson --> { }
3842, 9.9.03, Haswell --> { }

These four relations can be joined on their various sets of common
columns to recreate the original data. Oh well, I guess that sorts
out your homework for you...

--
Jonathan Leffler #include <disclaimer.h>
Email: XXXX@XXXXX.COM , XXXX@XXXXX.COM
Guardian of DBD::Informix v2003.04 -- http://dbi.perl.org/

 
 
 

BCNF and relational algebra

Post by Perceptio » Fri, 27 Feb 2004 07:28:33


<SNIP>

Much appreciated. Thank you for your help.