🧩 Modern Database Concepts: ACID, BASE, CAP & Sharding

Databases are the backbone of modern applications. This post dives into ACID, BASE, CAP theorem, Hibernate, JDBC, and NoSQL concepts, with practical examples and tips for Java developers. 🚀

1. 🧩 Consistency Models: ACID vs BASE

ACID (Relational Databases)
- ⚡ Atomicity: Each transaction is all-or-nothing.
- ⚡ Consistency: DB remains consistent after transactions.
- ⚡ Isolation: Transactions do not interfere concurrently.
- ⚡ Durability: Committed data persists.
BASE (NoSQL / Distributed Systems)
- Basically Available
- Soft State
- Eventually Consistent

2. 🌐 CAP Theorem

In a distributed system, when a partition occurs:

⚖️ Can only choose Consistency (C) or Availability (A) for the system.
💡 Example: Two nodes disconnected; you must either accept stale/inconsistent reads or deny availability.

3. 🏷 Indexes

Indexes speed up queries by storing key info separately.

CREATE INDEX index_name ON table(column)
⚡ Good for large read-heavy tables
⚠️ Can slow down writes due to updates on indexed columns

4. 🐘 Hibernate & ORM

Hibernate maps Java objects to database tables:

Core Concepts: Config, SessionFactory, Session, Transaction, Query
Session Methods:
- createSQLQuery = HQL queries
- saveOrUpdate → updates if PK exists
- save → returns ID; persist → returns nothing
- get → returns null if not found; load → throws exception if not found
Object States:
- Transient → new, not saved
- Persistent → saved and session open
- Detached → session closed, can reattach with merge
Loading Types:
- Eager → fetch at initialization
- Lazy → fetch on demand
Dirty Checking → reduces DB writes by updating only changed entities
Automatic session closing (with Spring or try-with-resources)
Exception handling automated via HibernateException

5. 🔒 Locking Strategies

Optimistic → assumes rare conflicts, checks version before commit
Pessimistic → locks row/table exclusively during transaction

6. 💾 JDBC vs JPA vs Hibernate

JDBC → low-level API, manual mapping between Java objects & DB
JPA → specification, ORM abstraction, manages entities & persistence
Hibernate → implementation of JPA + caching, session management, advanced features
Spring Data JPA → requires JPA provider (Hibernate)
JTA → manages distributed transactions (@Transactional, multiple resources)

7. ⚡ SQL vs NoSQL

SQL → vertically scalable, predefined schemas, multi-row transactions
NoSQL → horizontally scalable, dynamic schemas (JSON, documents), sharding for high traffic

8. 📌 Key Concepts: Sharding vs Partitioning vs Replication

Partitioning → organizing data within a single database into smaller, manageable pieces.
- Can be horizontal (rows split across partitions) or vertical (columns split across partitions).
- Purpose: improve **query performance**, maintenance, and manage large tables.
- Example: splitting orders by month within the same database.
Sharding → horizontally partitioning data **across multiple database nodes**.
- Each shard holds a subset of the data based on a **shard key** (e.g., user ID, region).
- Purpose: improve **write scalability** and **horizontal scaling**.
- Example: Users 1–1000 → Shard 1, Users 1001–2000 → Shard 2.
- ⚡ Key difference from partitioning: **sharding is distributed across multiple servers**, partitioning can exist inside a single DB.
Replication → duplicating data across nodes for **read scaling** and **fault tolerance**.
- Read replicas reduce load on primary nodes.
- Writes go to master; replicas are eventually consistent.
- Example: Reporting DB or caching layer served by read replicas.
Decision Guide
- ✅ Read-heavy → Replication
- ✅ Write-heavy → Sharding / Partitioning
- ⚡ Mixed workloads → combine sharding with replication (replicated shards)

💡 Practical Tips for Java Developers

Use Spring Data JPA with **entity mappings** and **transaction boundaries** that respect sharding/partitioning keys.
For read replicas, annotate service methods with @Transactional(readOnly = true) to avoid accidental writes.
Pick **stable shard keys** to avoid hotspots and uneven data distribution.
Partition tables in the same DB for performance optimization; shard across multiple nodes for scaling write-heavy workloads.

💡 Practical Tip for Java Developers

Use Spring Data JPA with proper **entity mappings** and **transaction boundaries** for sharded databases.
For read replicas, consider **@Transactional(readOnly = true)** on service methods to avoid accidental writes.
Partitioning keys (e.g., user ID, account ID) should be stable and evenly distributed to avoid hot spots.

9. 📝 Example: Hibernate Save vs Persist


Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

User user = new User();
user.setName("Alice");

// save → returns generated ID
Serializable id = session.save(user);

// persist → returns void
session.persist(user);

tx.commit();
session.close();

10. 🧾 Transaction Isolation & Dirty Reads

Dirty Read: Transaction reads uncommitted changes; risk if rollback occurs.
Non-Repeatable Read: Data changes between reads in the same transaction.
Phantom Read: New rows appear between repeated queries in a transaction.

11. 🗂 SQL Clauses & Queries

WHERE: Filters rows before grouping
HAVING: Filters groups after aggregation
GROUP BY: Aggregates rows by specified columns
JOIN: Combines tables using ON table1.col = table2.col


SELECT department, COUNT(*) 
FROM employees
WHERE active = 1
GROUP BY department
HAVING COUNT(*) > 5
ORDER BY department;

SELECT e.name, d.name 
FROM employees e
JOIN departments d
ON e.department_id = d.id;

12. 📊 Summary Table

Concept	Definition / Purpose	Notes
ACID	Transaction guarantees in RDBMS	Atomicity, Consistency, Isolation, Durability
BASE	Eventual consistency in NoSQL	Basically Available, Soft State, Eventually Consistent
Sharding	Partitioning data across nodes	Improves scalability
Replication	Duplicate data on multiple nodes	Improves availability and fault tolerance
Hibernate States	Transient, Persistent, Detached	Determines object lifecycle in session
Dirty Read	Reading uncommitted data	Can lead to inconsistencies if rollback occurs

13. 🖼 Database Architecture Diagram

Shard 1

Subset of data (Partition)

Shard 2

Subset of data (Partition)

Replication Node

Exact copy for availability

Consistency / ACID

Transaction guarantees in relational DB

BASE / Eventual

Distributed, NoSQL systems

14. 🚀 Tips for Java Developers

Use JPA/Hibernate to simplify DB interactions and avoid boilerplate JDBC code.
Carefully choose between eager vs lazy loading to optimize performance.
Monitor transaction locks and isolation levels for concurrency-sensitive applications.
Consider indexes for large read-heavy tables but avoid over-indexing write-heavy tables.
Leverage NoSQL for flexible, high-throughput scenarios, and SQL for relational, transactional workloads.

15. 🔄 Database Normalization & Denormalization

Normalization and denormalization are techniques to organize data in databases for consistency, integrity, and performance.

📌 Normalization

Normalization reduces redundancy and ensures data integrity by splitting data into multiple related tables.

Example: Instead of storing customer name in every order row, create a Customers table and reference it via customer_id in the Orders table.
Benefits:
- Minimizes duplicate data
- Improves consistency and maintainability
Trade-offs:
- Requires joins for queries → can impact read performance

📦 Denormalization

Denormalization combines data into fewer tables to optimize read performance, often at the cost of some redundancy.

Example: Store customer name directly in the Orders table to speed up reporting queries without joining the Customers table.
Benefits:
- Faster reads
- Simpler queries for reporting and analytics
Trade-offs:
- More storage used
- Potential inconsistency if duplicated data is not kept in sync

💡 Practical Tip for Java Developers

Use JPA/Hibernate mappings carefully: decide which relationships to normalize vs denormalize for performance vs integrity.
Consider denormalization for read-heavy reporting tables or cache tables in distributed systems.
Normalize transactional tables to prevent anomalies and maintain data integrity.

Labels: Java, Databases, Hibernate, JPA, JDBC, SQL, NoSQL, ACID, BASE, CAP Theorem, Sharding, Replication, Java Performance, Dirty Read, Transaction Isolation, GROUP BY, HAVING

Search This Blog

The Code Hut