Cloud service providers build their software as SaaS applications that support multiple tenants within the same instance. A public cloud deployment may have to serve thousands of tenants across a large number of clustered instances, so tenant allocation per instance must be managed carefully. Loading every tenant into every cluster at random is not feasible, because it would increase resource utilization on every cluster. Tenants need to be partitioned across clusters to achieve optimal results. The following diagram shows an unmanaged deployment vs. a tenant-partitioned cluster.
Load balancing is a key function in a tenant-partitioned deployment, because each request must be routed to the cluster that hosts its tenant. Hence the term “tenant-aware” load balancing. Extensive research has been done on tenant-aware load balancing, and the research paper presents a reference architecture.
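
To make the routing idea concrete, here is a minimal sketch in Python. It is not the reference architecture from the research paper. It assumes that each tenant is identified by a tenant domain string, that the tenant-to-cluster assignment is already known, and that requests are spread round-robin over the nodes inside a cluster. The class name, the cluster names, and the node names are all hypothetical.

```python
import itertools


class TenantAwareLoadBalancer:
    """Illustrative only: routes a tenant's request to the cluster that hosts it."""

    def __init__(self, clusters, tenant_to_cluster):
        # clusters: cluster name -> list of node addresses (hypothetical names)
        # tenant_to_cluster: tenant domain -> cluster name (assumed to be known)
        for name, nodes in clusters.items():
            if not nodes:
                raise ValueError(f"cluster {name!r} has no nodes")
        unknown = set(tenant_to_cluster.values()) - set(clusters)
        if unknown:
            raise ValueError(f"tenants mapped to unknown clusters: {sorted(unknown)}")
        self._tenant_to_cluster = dict(tenant_to_cluster)
        # One independent round-robin iterator per cluster.
        self._rotation = {
            name: itertools.cycle(list(nodes)) for name, nodes in clusters.items()
        }

    def route(self, tenant_domain):
        """Return (cluster name, node) for the given tenant's next request."""
        try:
            cluster = self._tenant_to_cluster[tenant_domain]
        except KeyError:
            raise LookupError(f"no cluster assigned to tenant {tenant_domain!r}") from None
        return cluster, next(self._rotation[cluster])


if __name__ == "__main__":
    balancer = TenantAwareLoadBalancer(
        clusters={"cluster-a": ["a1", "a2"], "cluster-b": ["b1"]},
        tenant_to_cluster={"foo.com": "cluster-a", "bar.com": "cluster-b"},
    )
    print(balancer.route("foo.com"))  # ('cluster-a', 'a1')
    print(balancer.route("foo.com"))  # ('cluster-a', 'a2')
    print(balancer.route("bar.com"))  # ('cluster-b', 'b1')
```

The point of the sketch is the two-step decision: the tenant determines the cluster, and only then is a node chosen inside that cluster. A tenant therefore never lands on a cluster where it is not loaded.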