Wednesday, March 7, 2012

Hierarchically arranged attributes...mission impossible?

Struggling over the design of a database. Basically, we have a long
list of attributes. We've decided to arrange them hierarchically so
users can find them more easily. The problem is, this doesn't seem very
easy to implement! Let me try to explain what I'm talking about a
little more specifically.
I have a list of attributes organized hierarchically. For example:
CarAttributes
--Configuration
--Doors
--EngineLocation
--DriveWheels
--Engine
--Type
--Cylinders
--Arrangement
--Specs
--Size
--Power
--Torque
--Transmission
--Type
--Gears
I also have a lot of attribute data (for leaf attributes only). For
example:
CarID = 1, Doors = 4
CarID = 1, EngineLocation = 'Front'
CarID = 1, DriveWheels = 'Front'
CarID = 1, Cylinders = 6
CarID = 1, Arrangement = 'Inline'
CarID = 1, Size = '3.0'
...
CarID = 2, Doors = 2
CarID = 2, EngineLocation = 'Mid'
Etc.
QUESTION: How do I set up a proper relational database to support this
scheme?
I was thinking I could set up an adjacency table for the attributes.
The conundrum is how to store the attribute data.
Couple possibilities:
===+===+===+===+===+===+===+===+===
Idea #1 - I could set up my CarData table like this:
Table: CarData
CarID int
AttributeID int
AttributeValue nvarchar(100)
But then my attribute values, whether they are numeric, category, or
string, are all getting stored as a string. That seems like a big
divantage.
===+===+===+===+===+===+===+===+===
Idea #2 - Alternatively, I could set up my CarData table like this:
Table: CarData
CarID int
Doors int
EngineLocation EngineLocationID (references EngineLocation table)
DriveWheels DriveWheelsID
Cylinders int
EngineArrangement EngineArrangementID (references EngineArrangement
table)
EngineSize float
EnginePower int
EngineTorque int
TransmissionType TransmissionTypeID (reference TransmissionType
table)
TransmissionGears int
Everything is now explicitly typed but there's no apparent way to set
up a link between an attribute in the CarAttributes tree and its
corresponding column in the CarData table. For example, say the user is
looking at data on CarID = 1 and clicks on the "Engine/Type/Cylinders"
leaf. Now I need a select statement that looks something like this...
SELECT Cylinders FROM CarData WHERE CarID = 1
...but the only way I can think to get the word "Cylinders" into that
query would be through dynamic code or some hideously ugly CASE
statment, which also seems like a big divantage.
===+===+===+===+===+===+===+===+===
So, is there a "right" way to do this so I can have both properly typed
attribute values *and* be able to reference attribute values without
resorting to dynamic queries?
Thanks!
-DanHave you gotten a copy of TREES & HIERARCHIES IN SQL for various
approaches to this kind of problem?|||LOL, there is a book about his problem. Daniel, you are lucky!
"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:1141348474.658630.45940@.v46g2000cwv.googlegroups.com...
> Have you gotten a copy of TREES & HIERARCHIES IN SQL for various
> approaches to this kind of problem?
>|||"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:1141348474.658630.45940@.v46g2000cwv.googlegroups.com...
> Have you gotten a copy of TREES & HIERARCHIES IN SQL for various
> approaches to this kind of problem?
>
Look at the problem: it's not a hierarchy or tree problem:
CarAttributes
--Configuration
--Doors
--EngineLocation
--DriveWheels
--Engine
--Type
--Cylinders
--Arrangement
--Specs
--Size
--Power
--Torque
--Transmission
--Type
--Gears
These are all attributes of a single entity: the car.
Perhaps an Engine is shared among multiple Cars and should be factored into
a seperate table. But other than that this is just a simple table:
create table CAR
(
NAME VARCHAR(50) PRMARY KEY,
DOORS INT CHECK (DOORS < 6),
ENGINE_LOCATION CHAR(1) CHECK (ENGINE_LOCATION IN ('F','B')),
DRIVE_WHEELS CHAR(1) CHECK (ENGINE_LOCATION IN ('F','B')),
ENGINE_CYLINDERS INT CHECK (ENGINE_CYNINDERS < 12),
...
dAVID|||Actually, Joe, I haven't, but I did read this...
http://blogs.msdn.com/anthonybloesc...r />
2005.aspx
...which seems to suggest that a good ol' adjacency table is just fine.
Besides being simple, it apparently runs a lot faster than the "Nested
Sets" technique.
My question, however, is not how to implement/traverse a hierarchy.
It's more about how to store the actual attribute data. If I get to a
particular leaf of the hierarchy, how do I query that particular column
in the table?
Be great to get your take on this...I hope you'll give my post a read
sometime.
Thanks,
-Dan|||Yes, David, they are all attributes of a car, but the attributes
themselves are arranged in a hierarchy. The table you are proposing
(which is basically like my Idea #2) cannot be linked to the hierarchy.
At least, not as far as I can tell.
See what I'm getting at?
Thanks,
-Dan|||"Daniel Manes" <danthman@.cox.net> wrote in message
news:1141367733.531758.185340@.e56g2000cwe.googlegroups.com...
> Yes, David, they are all attributes of a car, but the attributes
> themselves are arranged in a hierarchy. The table you are proposing
> (which is basically like my Idea #2) cannot be linked to the hierarchy.
> At least, not as far as I can tell.
> See what I'm getting at?
>
I think so. If so the relational model just doesn't quite get you there.
The relational model doesn't have support for the notion of attribute
groupings on an entity. You can get a bit of that by factoring the entity
into smaller, related entities, but that's just a hack.
You will have to supply and store that additional metadata yourself.
I would probably use a single flat table, and also provide an XSD schema
which mapped to the table and provided the hierarchicial relationships among
the attributes.
Or adopt a naming convention for your columns which embeds the attribute
relationships
eg
SELECT ENGINE_TYPE_CYLINDERS FROM CarData WHERE CarID = 1
David|||Hi David,
Knowing that I'm not missing some really obvious relational concept is
actually a relief. I think I'm going to go this route:
1. Put the attribute data in one flat table (e.g., Car).
2. Put the attribute names (which will correspond to the column names
in the Car table) and hierarchy information in an adjacency table
(e.g., CarAttribute).
If my app needs data on a particular car, I'll just write a query to
grab the whole row (all the attributes) and another query to grab the
particular column names that should be displayed. Then I'll write code
in my .NET app to handle the rest.
Thanks for the help,
-Dan|||>> Look at the problem: it's not a hierarchy or tree problem: <<
Yes, it is. It is called a parts explosion or Bill of Materials. Get
the book and use teh code for this. I think that you might be thinking
in OO terms and not RDBMS.|||"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:1141531629.790826.299470@.p10g2000cwp.googlegroups.com...
> Yes, it is. It is called a parts explosion or Bill of Materials. Get
> the book and use teh code for this. I think that you might be thinking
> in OO terms and not RDBMS.
>
The example chosen (Car) might be mistaken for a Bill of Materials problem.
But I think it's a typical entity model, but with the the entity attributes
"grouped" for the UI.
In a Bill of Materials problem an entity is composed of other entities, each
of which is composed of other entities, etc. Moreover all the entities
belong to a single domain (eg Parts). Here there's only one entity and a
fixed list of attributes. The problem concerns how to group related
attributes.
<quote>
Basically, we have a long list of attributes. We've decided to arrange them
hierarchically so
users can find them more easily.
</quote>
Some entities have a lot of attributes. Sometimes there are logical
groupings of attributes. For instance a Person may have Height, Weight,
Age, Income, NumberOfPets. Height, Weight and Age are all demographic
atributes, and it would sometimes be convenient to group them together.
They shouldn't be extracted into a seperate PersonDemographicData table, but
short of that the relational model doesn't help.
David

No comments:

Post a Comment