In the following presentation, the overall system design of flickr.com is described:
http://www.niallkennedy.com/blog/uploads/flickr_php.pdf
Especially Page 27 raised my attention, as I encountered that kind of thinking around MySQL users a lot. As I’m just turning von MySQL to PostgreSQL, I’m curious what people which are using „real“ databases saying about the following statements on that sheet:
– JOIN’s are slow
– Normalised data is for sissies
– Keep multiple copies of data around
– Makes searching faster
– Have to ensure consistence in the application logic
That last item frightens me most, as I know from my own experiences, that this is a very difficult task. I currently prefer a clean database design with aggregation of data done by triggers.
All the others things in that presentation do not sound like magic. It’s quite normal to move SELECT statements to a read only copy of the database to avoid too much requests to the main database. Especially for some MySQL table types this is a major issue as MySQL only supports table locking on those types (all except InnoDB and BDB). Flickr uses both types (row and table locking) and seems to do SELECTs only on the table locking. So this is no problem for them. But all this shows how you have to design your system around MySQL. I no longer feel good about this, as there is no other database behaving like MySQL but with better (in term of more professional) quality. I hope to have a possible upgrade path from PostgreSQL to DB2 or Oracle – without currently knowing. So if you have to share your opinion, leave a comment.
Unfortunately, the numbers on page 22 of this presentation are not further specified. So you cannot tell in what period of time they take place:
– 44,220,588 SELECTS
– 1,349,234 INSERTS
– 1,755,503 UPDATES
Nice to see, flickr.com is not much more than a bigger one-person-project, at least only one programmer… (Page 7).
[via Lummaland]