It’s not about platform, it’s about architecture
After the one offered by Windows Live, another example of SQL Server scalability comes this month from an interview with three MySpace data managers published in SQL Server Magazine (Hala Al-Adwan, vice president for data; Christa Stelzmuller, chief data architect; and George Tevelde, director of database administration).
What emerges from this interview is really interesting! We are talking about no fewer than 450 servers for a total of 1,200 databases (currently SQL 2005, with a migration to 2008 in the works), with 130 million active users per month! The interviewees briefly recount the evolution of the architecture, initially based on SQL Server 2000 (!??!?!?) and transactional replication:
When we started in 2003, we had one instance of SQL Server running on one server. Everything was fine until we experienced more and more growth. Our first approach to scaling was a master/slave model, using transactional replication. We replicated our master read/write node to all of the slaves, which worked fine for us until we reached about two million users.
Then, as replication latency problems set in, the architecture was partitioned vertically across different servers by feature:
Around our two million user mark, we experienced latency issues with replication. [...] We went with a vertical partitioning approach: We separated our features onto different servers; that worked until we hit the four million user mark.
Later it was also partitioned horizontally, one database for every "million users":
We partition those functionally into different groups, and within each of those groups we partition again by user ranges. So we have databases for every million users, and we add more databases for every new set of million users. The application tier is aware of that and routes activity to the appropriate database depending on the user that is requesting it.
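The routing described in the quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the interview does not say how user IDs map to ranges or how the databases are named, so the function `database_for`, the contiguous ID ranges starting at 1, and the `<feature_group>_<range>` naming scheme are all hypothetical.

```python
USERS_PER_DATABASE = 1_000_000  # "databases for every million users", per the interview


def database_for(feature_group: str, user_id: int) -> str:
    """Return a hypothetical database name for a feature group and a user ID.

    Assumes user IDs are positive integers assigned contiguously from 1.
    """
    if user_id < 1:
        raise ValueError("user_id must be a positive integer")
    # Users 1..1,000,000 fall in range 0, the next million in range 1, and so on.
    range_index = (user_id - 1) // USERS_PER_DATABASE
    return f"{feature_group}_{range_index:04d}"


if __name__ == "__main__":
    # The application tier would call something like this before opening a connection.
    print(database_for("profiles", 1_000_001))
```

Under this scheme, supporting the next million users only means provisioning one more database per feature group; the mapping for existing users does not change.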
Then, once the site passed 100 million users, replication showed its limits, which led to the introduction of the newborn Service Broker (!!!), and that quickly supplanted transactional replication:
Certain types of data were replicated to every single one of the user profile databases. This worked well for us, until the 100 million user range. SQL Server replication works great when it works great, but when you have failures they tend to be dramatic and have cascading effects. [...] Data transactions weren’t atomic—they would succeed in one location and not succeed in another, which led to a really bad end-user experience. [...] So we started looking at Service Broker. I’m pretty sure MySpace is handling one of the most extensive Service Broker implementations in SQL Server right now. [...] We loved Service Broker because it gave us the opportunity to write data in a way that we could handle it after the fact.
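The idea of writing data so that it can be "handled after the fact" can be illustrated with a small in-memory queue. To be clear, this is not the Service Broker API and not MySpace's implementation: the `DeferredWriter` class and its methods are hypothetical names, and the sketch only shows the general pattern of queuing one delivery per target and retrying the ones that fail, rather than losing them.

```python
from collections import deque


class DeferredWriter:
    """Toy stand-in for queued delivery; not the Service Broker API."""

    def __init__(self, targets):
        # targets: mapping of target name -> callable that applies a message
        self.targets = targets
        self.pending = deque()

    def send(self, message):
        # Enqueue one delivery per target instead of writing synchronously.
        for name in self.targets:
            self.pending.append((name, message))

    def drain(self):
        """Attempt each pending delivery once and return how many remain queued."""
        failed = deque()
        while self.pending:
            name, message = self.pending.popleft()
            try:
                self.targets[name](message)
            except Exception:
                # Keep the failed delivery so a later drain() can retry it.
                failed.append((name, message))
        self.pending = failed
        return len(failed)
```

A delivery that fails against one target stays in the queue while the others go through, so a later call to `drain()` can bring the lagging target up to date.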
The article closes with this remark from Al-Adwan:
We have a motto here—it’s not about platform, it’s about architecture. It’s platform agnostic—you just have to know how to set it up.