dslreports logo
site
 
    All Forums Hot Topics Gallery
spc

spacer




how-to block ads


Search Topic:
uniqs
402
share rss forum feed

jfmezei
Premium
join:2007-01-03
Pointe-Claire, QC
kudos:23

Indexing fields with multiple values

In a database (mysql for instance) if you have a record with a field that can contain multiple tags that are not part of a pre-defined list of tags:

(for instance: category= cake, chocolate, fattening) what is the best strategy to handle this ?

Are there field types that contain multiple values
for instance mytag= "cake", "chocolate", "fattening"
which create an index entry for each value, making it efficient to pull out all records that have "fattening" in their tags ?

Or does this require brute force doing a free form wildcard search within the "mytag" field of every record in the database ?

(I am aware of "enum" and "set" datatypes but both require a pre-defined list of valid values in the field)


cdru
Go Colts
Premium,MVM
join:2003-05-14
Fort Wayne, IN
kudos:7
A properly normalized table would have the tags split out into it's own table. Commonly you just have a table with a minimum of two fields, one that points to the main record's ID, another that has the tag.

The only thing you'd have to do is verify that the particular tag doesn't exist prior to inserting it.

If you want to dynamically grow your list of tags, so that all posts, recipies, whatever share a common list of tags, you'd need 3 tables. One for your main records that includes a main_id. A tag table that has a tag_id and tag name, and the 3rd a mapping, that has two fields, main_id and tag_id. If you need to add a new tag, you create the tag record, return the tag_id, then insert the main_id along with the tag_id in the mapping table. If the tag already exists, then you just need to insert the mapping. To remove a tag, delete the mapping, optionally deleting the tag when no more mappings exist for a given tag.

The actual implementation can be done in your code straight up, business objects, database stored procs, trigger...lots of options.

jfmezei
Premium
join:2007-01-03
Pointe-Claire, QC
kudos:23
Thanks. That makes sense.

So, if one updates a record to remove 1 of 4 tags and add 2 more, what is the common way to handle this when writing the record back ? Delete all previous tags and recreate a new list of tags ? Or go through existing tags one by one, deciding if they stay or get deleted, and then add only those that were not present in the original record ?

(I know these questions sound like programing 101, but I am just trying to translate my old skills to SQL terminology and techniques).


cdru
Go Colts
Premium,MVM
join:2003-05-14
Fort Wayne, IN
kudos:7
said by jfmezei:

So, if one updates a record to remove 1 of 4 tags and add 2 more, what is the common way to handle this when writing the record back ? Delete all previous tags and recreate a new list of tags ? Or go through existing tags one by one, deciding if they stay or get deleted, and then add only those that were not present in the original record ?

Depends on how you are maintaining the list. I personally would have a way to add mutiple (potentially) new values via a comma separated list, but existing records are deleted individually.

However you could always maintain them as just a CSV list. It would just be more work as you'd have to delete existing values that weren't in the updated list. The update process wouldn't be as atomic as what individualized deletes would be, or inserting a list individually.


drew
Radiant
Premium
join:2002-07-10
Port Orchard, WA
kudos:6
reply to jfmezei
DELETE FROM recipe_tags
WHERE tag_id IN (@TAG_IDs)
AND recipe_id = @RECIPE_ID
 

Provide the parameters and it's easy...

A similar INSERT INTO... would be simple.
--
flickr | 'Cause I've been waiting, all my life just waiting
For you to shine, shine your light on me


cdru
Go Colts
Premium,MVM
join:2003-05-14
Fort Wayne, IN
kudos:7
said by drew:

[code]
DELETE FROM recipe_tags
WHERE tag_id IN (@TAG_IDs)
AND recipe_id = @RECIPE_ID
[/code]

Provide the parameters and it's easy...

MS SQL won't allow this, as written. Passing in a CSV list as a parameter won't work unless you use a CTE or some other function to break the list into a rowset first.