Microservices, Sharding and Consolidation with Oracle Database

As the #1 database leader overall, Oracle is also the best database for microservices architectures and the best database for sharding. Microservices and Sharding are application development and data modeling decisions. Whether to consolidate those databases and how to consolidate them is a “macro-level” infrastructure management decision. In this article, we will explore these topics:

  • Microservices Design Pattern: Database Per Service
  • Sharding: Subdividing a Large Database into Shards
  • Why Isolate Databases?
  • How to Isolate Databases
  • Don’t Over-Isolate: Limit Virtual Sprawl
  • Stop Noisy Neighbors: Use Resource Manager
  • Use Resource Shapes for Simplicity
  • Reducing Blast Area Still Means a Blast!
  • HA is Still Important With Microservices & Sharding
  • Data Driven Apps: Benefits of Converged Database

Oracle is a Converged Database, providing the full range of database capabilities for modern Data Driven Applications. Oracle Database provides industry leading capabilities that exceed what point-solution databases can deliver.  You can deploy Oracle Databases as separate, dedicated databases supporting specific applications or microservices, or consolidate them to make more efficient use of system resources and administrative staff. Oracle Database can scale much larger than other databases, but can also scale-down to handle even the smallest microservices or sharding needs.

Microservices Design Pattern: Database Per Service

The Database Per Service Design Pattern says that each microservice has it’s own “database”, which means a collection of data and does not necessarily refer to a dedicated Oracle Database. In fact, (as outlined by Chris Richardson here) there are 3 ways for each microservice to have it’s own set of data as follows:

  • Private Tables per Service
  • Schema per Service
  • Oracle Database per Service

Each microservice has it’s own set of data, and that data is only accessed through API’s belonging to the microservice. Whether a particular microservice has a set of tables, a dedicated schema, or an entire dedicated database generally depends on the SIZE of data, volume of activity and need for isolation. Microservices with larger amounts of data will tend to have their own databases, whereas smaller amounts of data and activity would argue for a dedicated schema or even simply a number of tables within a schema. Of course an Oracle Database can be a stand-alone database, but multiple databases can also be combined together as Pluggable Databases within a Container Database. We will look at this and other isolation topics in later sections of this post.

Sharding: Sub-Dividing a Large Database into Shards

I have worked on multiple applications where we sub-divided a massive logical database into many physical Oracle databases going back to the 1990s. The sharding concept is really nothing new, but Oracle Database now offers Sharding as built-in, out-of-the-box services that make building sharded databases and applications much simpler. Sharding features are built into the Oracle Database so that application developers don’t need to implement those features in application code.

At the end of the day, when you shard a database into multiple smaller pieces, you still have the same amount of data. If you shard a 70TB database into 10 shards of 7TB each, it’s still 70TB. If you shard that data into 100 shards, it’s still 70TB. Oracle is the KING of managing massive amounts of data, and it doesn’t matter if that’s 1 database or 100.

The point of sharding is that each database will be smaller and therefore easier to operate. You can then isolate those databases in order to improve operations.  Let’s look at WHY it’s sometimes necessary to isolate databases, then we will look at HOW to isolate them.

Why to Isolate Databases

Multiple microservices can reside within a single database, or might have dedicated databases (or Pluggable Databases within Containers), while shards are always spread across databases. Once microservice databases and shards are defined at the database layer, there are 5 factors that govern why those databases need to be isolated vs. consolidated as follows:

  • Administrative Separation
  • Security Separation
  • Maintenance (patching and upgrades)
  • Blast Radius (scope of failure impact)
  • Resource Management

Administrative Separation refers to organizations that have multiple DBA teams or other needs to separate databases from an administrative standpoint. This is also common in SaaS (Software as a Service) providers, where customers of the service provider have administrative access, or the service provider has separate teams that administer databases. In either case, the multiple administrative teams means those databases need to be isolated from each other.

Security Separation refers to use of dedicated networking infrastructure to service specific databases (quarantine LAN, etc.), and subjecting certain databases to higher security standards such as for regulatory reasons (HIPAA, PCI, PII, etc.). Databases must be isolated in cases where security requirements differ widely between them.

Maintenance considerations include patching and upgrades of servers, databases, and any supporting infrastructure. As microservices are consolidated together more closely, any maintenance must be coordinated among those microservices and testing is often required to ensure compatibility. The biggest concern with maintenance is major version upgrades at the infrastructure layer (O/S, database, etc.), where there is greater potential for functional changes that might impact how services operate.

Blast radius (or fault isolation) considerations refer to the scope of impact any failure has. An outage of any single microservice can jeopardize the function of the application as a whole. However, smaller databases can be repaired more quickly than larger database, so limiting the scope of impact can result in faster service restoration during a failure.

Resource Management refers to ensuring each database receives the resources it requires, as well as guarding against the “noisy neighbor” problem. One approach to resource management is to isolate databases onto dedicated physical or virtual machines, but we recommend to simply use Oracle Resource Manager instead.

How to Isolate Databases

In addition to the reasons why to isolate databases, there is also the question of how the Database per Service Design Pattern and sharded databases are deployed at the physical infrastructure level. With Oracle, we have 4 options for how to isolate databases as follows:

  • Dedicated Physical Servers per Database
  • Dedicated Virtual Machine per Database
  • Multiple Databases per Physical/Virtual Machine
  • Multiple Pluggable Databases Per Container
  • Oracle Resource Manager (DBRM & IORM)

We need to make sure developers have the autonomy they’re looking for, but then balance how they are deployed in the infrastructure. Managers of the infrastructure must determine how and when to isolate vs. consolidate databases. Excessive isolation such as the “one database per virtual machine” approach results in high cost of operation, so we recommend a more balanced approach.

Don’t Over-Isolate: Limit Virtual Sprawl

Data Centers in the past often suffered from sprawl (physical sprawl), where each application and it’s database(s) ran on their own physical servers. Virtual Machine technology has allowed IT organizations to stem the tide of physical sprawl, but this has often resulted in virtual sprawl. Increasing compute and storage density has also allowed more workload to be laid on top of the same physical footprint, using virtualization to maintain the same (or even greater) isolation between workloads.

You will quickly build an administrative nightmare if each microservice uses a dedicated database, and if each database uses a dedicated Virtual Machine. You can easily meet the needs of your application developers by taking a more judicious approach to deploying databases by using the full range of options at your disposal, using the right tool for the task such as using Oracle Resource Manager to manage resources.

Stop Noisy Neighbors: Use Resource Manager

There is no reason to isolate Oracle Databases simply for the purposes of resource management. We can easily ensure each database receives the appropriate amount of resources using Oracle Database Resource Manager (DBRM).  There are 4 primary resources that need to be managed in any system:

  • CPU
  • Memory
  • Processes
  • I/O

CPU is managed using Oracle Database Resource Manager (DBRM), using shares & limits inside of a Container Database, Instance Caging (CPU_COUNT) across containers and non-pluggable databases, and within databases using Consumer Groups.  DBRM gives us control of ALL types of databases and workloads, and it’s integrated with IORM (see below for detail).

Memory usage for Oracle databases is managed by controlling SGA (System Global Area) and PGA (Process Global Area) settings.

Processes within Oracle databases are managed by controls for sessions and parallel query servers.

I/O resources are much simpler to manage on Exadata because we have IORM (I/O Resource Manager). Exadata is integrated hardware and software, spanning both compute and storage. Using IORM Objective “auto” allows IORM to inherit the same resource ratios set for CPU in DBRM, so you have one set of controls that handles everything.

See our demo series on Resource Management (starting here) that shows how you can manage resources between pluggable databases, between container databases and non-pluggable databases, and even within databases using Consumer Groups.

Use Resource Shapes for Simplicity

Of course we recommend establishing a standardized set of “resource shapes” to choose from rather than making each database a unique and special snowflake. It’s much easier to manage a large population of databases if they all follow some standards, including the resources assigned to them. Using Resource Shapes means you establish shapes with the same ratios for the following resources:

  • CPU
  • Memory (SGA & PGA)
  • Processes (Sessions & PQ Servers)
  • I/O

I covered this in detail in the MAA Best Practices white paper on Database Consolidation available here. The white paper includes different allocations for DW vs. OLTP databases, so you’ll see 2 different tables of Resource Shapes. DW systems need larger PGA memory and more processes are devoted to parallel processing than on OLTP systems.  You will see those differences in the resource shape tables.

Reducing Blast Area Still Means a BLAST!

Reducing the “blast area” by dividing the data by each microservice and sharding large databases doesn’t ELIMINATE the blast (meaning “failure”). What a smaller blast area gives you is pieces that are smaller and easier to fix (because they are smaller). It’s also important to realize that a piece of an application dead in the water is still dead in the water.  If that piece of the application (a microservice) is critical, the entire application might also be dead in the water.

Sharding is another design that reduces the “blast area” of a failure, but an outage of a shard is still an outage.  It’s just easier to fix because it’s smaller, but it still needs to be fixed! In the best case scenario, you have SOME of your users dead in the water because the data they need (and the microservice) isn’t available.  In the worst case scenario, you will have some microservices or portions of your application that need ALL of the shards available for it to function, and those portions of the application are dead in the water.

In short, reducing the blast area doesn’t necessarily increase availability, but might help in reducing the duration of an outage.

HA is still Important with Microservices & Sharding

High Availability (HA) systems are still important, even for microservices and sharded databases. There can still be a “blast” that takes down a database, which takes down a microservice or shard, which will impact at least some of your users if not all of them depending on how your application uses that data.

Just because you have things broken down into small pieces doesn’t mean there’s no impact of an outage on any piece. It’s still important to provide HA, which we do with these tools:

  • Core HA Features of Oracle Database
  • Oracle Real Application Clusters
  • Oracle Active Data Guard
  • Oracle Application Continuity

Core HA features of Oracle Database have become something that we almost don’t even consider these days, but are still huge differentiators compared to other databases on the market. Oracle experts often take for granted things like online index rebuilds, online table move, and other features that have been developed over the years. I am often surprised to find other databases haven’t caught up, and some of the newer database engines are still years behind.

Oracle Real Application Clusters (RAC) is still unique in the market after all of these years. We often think of RAC for scalability, but it’s a huge part of the Oracle Database availability solution as well.

Oracle Active Data Guard is the Oracle solution for providing Read Replicas  (or “reader farms”), but it’s also a critical part of the Oracle Database availability solution. Active Data Guard provides for availability when a “blast” (failure) occurs, but it’s also used to provide availability during proactive maintenance.

Oracle Application Continuity hasn’t gotten enough press over the past 5+ years, but it’s yet another completely unique capability that Oracle offers. People seem to remember TAF (Transparent Application Failover), but Application Continuity is a completely different animal.  TAF required application code changes, while AC has essentially pushed much of that complexity into the SQL driver for Oracle.

Oracle Database has long exceeded the capabilities of applications, and Application Continuity closes the gap. Applications need to tolerate rolling outages of a RAC Cluster, and Application Continuity is what makes this happen. Application Continuity is implemented through CONFIGURATION changes, not within application code.

Data Driven Apps: Benefits of Converged Database

The Oracle Converged Database brings a number of critical advantages for data driven applications compared to point-solution databases.  See Juan Loaiza’s talk on Data Driven Apps at OOW London 2019 for a great talk on this topic. Although point-solution databases are sometimes referred to as “best of breed”, many of them don’t live up to that claim. You would suppose a database with a narrow focus might do that job better, but that’s not necessarily the case. The Oracle Converged Database advantages include:

  • Data Modeling Flexibility
  • Simplified Data Movement
  • Portable Developer Skills
  • Increased Developer Productivity
  • Any Workload, Any Data
  • Simplified Consolidation & Isolation
  • Any Size, Any Scale

Data Modeling Flexibility comes from having a single database that provides the full range of data modeling needs. While developers might make the perfect decision and choose the exact point-solution database needed, requirements often change. A development team might choose a relational database, only to find that some data really needs a different modeling approach such as JSON Documents.

While Oracle Databases can be deployed using a SINGLE modeling technique (Relational, Document, Star Schema, Property Graph, etc.) you can also use MULTIPLE modeling approaches within the same Oracle Database. I have worked with data that didn’t fit easily into the relational model, so it’s great to be able to use JSON Documents or another technique that fits better for certain portions of your data model. It would be a huge pain (show-stopper really) if you had to switch databases or use another database for a small component like a single microservice.

Simplified Data Movement comes from using the same Oracle Database on both the sending and receiving side. That data movement is made vastly easier if the data resides with a single shared database, but is easier between Oracle Databases because both databases use the same tools, same drivers, same datatypes, etc. Data movement can be completely eliminated in cases where applications share the same data, such as combined OTLP and Reporting in a single database (yes, this is possible and quite common with Oracle). Data Movement can also be quite costly in Cloud environments, which is another benefit of using Oracle Converged Database rather than multiple point-solution databases.

Portable Developer Skills are a key benefit of the Oracle Converged Database, allowing developers to work on ANY microservice or application that uses the same database without extensive re-training. Developers who write analytic or Data Mining code can easily move into a development team working on transactional applications or vice-versa.

Increased Developer Productivity comes from having a common set of interfaces for all databases, regardless of what features are used within each database. Any feature provided by the database represents code that DOES NOT have to be written by a developer.

Any Workload, Any Data includes traditional OLTP and Analytic applications, but also includes Machine Learning, block chain, property graphs, Time Series analysis, Spatial, Internet of Things, and Event Processing. Oracle Database includes robust capabilities in these areas that have been proven over many deployments at customer sites worldwide.

Simplified Consolidation & Isolation comes by having databases that all run the same Converged Database. Administrators have the flexibility to isolate databases where needed, or consolidate them to ease the administrative burden as well as to improve resource efficiency. Oracle Databases can even be consolidated into a Container Database, for much greater density and lower cost of operation. Yes, this also applies to Cloud environments, where costs can easily escalate. The bottom line is you simply can’t do this with multiple divergent database engines such as one for Relational OLTP, one for Relational Star Schema DW, one for JSON Documents, etc.

Any Size, Any Scale database or workload can be handled by Oracle Database. It’s amazing to me that other databases haven’t caught up to Oracle after all these years. We still encounter customers who are faced with a database migration (to Oracle) because their chosen database engine simply can’t keep up. Oracle Database is able to easily support databases at least 10X larger than our competitors can handle, and the Exadata platform simply takes that number higher.

Microservices and Sharding increases the number of databases, while also reducing the maximum size of databases, but it’s great to know your can scale a database without worry if it’s running Oracle Database. You have time to adjust if you need to re-shard your databases or if your microservices databases grow unexpectedly. Having size limits that are 10X or even 100X higher than other databases gives you flexibility.

Microservices and Sharding DO NOT reduce the overall amount of data! In fact, the volume of data can increase somewhat if you’re duplicating any data across microservices or shards. In any case, the volume of data certainly isn’t less.

Summary

Oracle Converged Database gives you flexibility to deploy databases as needed, using microservices and sharding. Oracle is the #1 most widely used database overall, but also the #1 most widely used for microservices and sharding. Oracle Databases can be easily consolidated or isolated as needed to meet the needs of modern Data Driven Applications.

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s