VDB Metadata Caching System

Overview

The VDB (Virtual Database) Metadata Caching System is a performance optimization that reduces database round-trips when VDB runs in an application server environment. It caches three types of database metadata so that the DatabaseMetaData calls otherwise issued on every file open are not repeated.

Architecture

The caching system implements a three-tier metadata cache:

1. Connection Metadata Cache

  • Purpose: Caches database type, version, and connection-specific settings
  • Default TTL: 7 days (604,800,000 ms)
  • Cached Data:
    • Database product name and version
    • JDBC driver version
    • Savepoint support capabilities
    • Database-specific settings (BLOB types, cascade options, pagination limits)
    • DBMS enum type for optimizations

2. Table Metadata Cache

  • Purpose: Caches table existence checks and schema information
  • Default TTL: 24 hours (86,400,000 ms)
  • Cached Data:
    • Table existence (including negative results)
    • Actual table names (handling case variations)
    • View vs. table detection
    • Catalog and schema information

3. Column Metadata Cache

  • Purpose: Caches column structure information for VDB tables
  • Default TTL: 24 hours (86,400,000 ms)
  • Cached Data:
    • Column names, types, and sizes
    • VDB-specific column patterns (DATA, IDX*, CI_*, REL, SEQ)
    • Column metadata for table structure analysis

Performance Benefits

Before Caching

  • Every file open triggered 10+ database metadata calls:
    • Multiple DatabaseMetaData.getTables() calls (table name variations)
    • DatabaseMetaData.getColumns() for table structure
    • DatabaseMetaData.getDatabaseProductName() for DB type
    • DatabaseMetaData.getDatabaseMajorVersion() for version
    • Additional capability checks

After Caching

  • First access: Performs metadata calls and caches results
  • Subsequent accesses: Uses cached data (99%+ reduction in database calls)
  • Typical improvement: 100-1000x reduction in metadata queries per minute
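
The difference between first and subsequent accesses can be illustrated with a small cache-aside sketch. This is not the VDB implementation: the class MetadataLookupSketch and the injected lookup function are hypothetical stand-ins for a real DatabaseMetaData call, and the counter exists only to show how many lookups reach the database.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration of cache-aside lookups; not the VDB implementation.
class MetadataLookupSketch {
    private final ConcurrentHashMap<String, Boolean> tableExists = new ConcurrentHashMap<>();
    private final AtomicInteger databaseCalls = new AtomicInteger();
    private final Function<String, Boolean> databaseLookup; // stands in for a metadata query

    MetadataLookupSketch(Function<String, Boolean> databaseLookup) {
        this.databaseLookup = databaseLookup;
    }

    // The first call per table runs the lookup; later calls are served from the map.
    // A result of false is stored too, so negative results are cached as well.
    boolean tableExists(String tableName) {
        return tableExists.computeIfAbsent(tableName, name -> {
            databaseCalls.incrementAndGet();
            return databaseLookup.apply(name);
        });
    }

    int databaseCalls() {
        return databaseCalls.get();
    }
}
```

With this sketch, one hundred checks for the same table cost a single lookup, which is the effect the figures above describe.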

Configuration

Automatic Detection

The cache is automatically enabled when running in detected application server environments:

  • Tomcat: catalina.home system property
  • JBoss/WildFly: jboss.home.dir system property
  • WebLogic: weblogic.home system property
  • WebSphere: was.install.root system property
  • Jetty: jetty.home system property
  • GlassFish: glassfish.home system property
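
A detection check along these lines could look like the sketch below. The property names come from the list above; the class and method names are hypothetical, and the actual detection logic inside VDB may differ.

```java
import java.util.List;
import java.util.Properties;

// Hypothetical sketch of application server detection; not the VDB implementation.
class AppServerDetectionSketch {
    // Marker properties taken from the list in this article.
    static final List<String> MARKERS = List.of(
            "catalina.home", "jboss.home.dir", "weblogic.home",
            "was.install.root", "jetty.home", "glassfish.home");

    // Returns true if any marker property is present in the given properties.
    static boolean runningInAppServer(Properties properties) {
        return MARKERS.stream().anyMatch(marker -> properties.getProperty(marker) != null);
    }
}
```

In a running JVM the check would be called as AppServerDetectionSketch.runningInAppServer(System.getProperties()).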

Configuration Priority

Properties are loaded in the following priority order:

  1. System Properties (-D flags): Highest priority, override all other settings
  2. deploy.properties: Standard VDB configuration file on classpath
  3. Defaults: Built-in default values
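
The priority order can be sketched as a three-step lookup. The class CacheSettingSketch and its resolve method are hypothetical names used for illustration; how VDB loads deploy.properties internally is not shown here.

```java
import java.util.Properties;

// Hypothetical sketch of the lookup order: system property, deploy.properties, default.
class CacheSettingSketch {
    static String resolve(String key, Properties deployProperties, String defaultValue) {
        // 1. A -D system property overrides everything else.
        String fromSystem = System.getProperty(key);
        if (fromSystem != null) {
            return fromSystem;
        }
        // 2. Otherwise use deploy.properties, 3. falling back to the built-in default.
        return deployProperties.getProperty(key, defaultValue);
    }
}
```

For example, if deploy.properties sets vdb.metadata.cache.ttl=3600000 and the JVM is started with -Dvdb.metadata.cache.ttl=60000, this lookup returns 60000.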

deploy.properties Configuration

Add VDB metadata cache settings to your existing deploy.properties file:

# VDB Metadata Cache Configuration in deploy.properties
vdb.metadata.cache.enabled=auto
vdb.metadata.cache.ttl=86400000
vdb.metadata.cache.connection.ttl=604800000
vdb.metadata.cache.max.entries=10000
vdb.metadata.cache.log.stats=false

# Other VDB settings like enable.cursor also go here
enable.cursor.CUSTOMER_TABLE=true

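System Properties Configuration

The same keys can be passed as JVM system properties by prefixing each with -D (for example, -Dvdb.metadata.cache.enabled=auto). The keys, accepted values, and defaults are:
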
# Enable/disable caching
# Values: true, false, auto (default)
vdb.metadata.cache.enabled=auto

# Cache Time-To-Live for table/column metadata (milliseconds)
# Default: 86400000 (24 hours)
vdb.metadata.cache.ttl=86400000

# Connection metadata TTL (milliseconds)
# Default: 604800000 (7 days)
vdb.metadata.cache.connection.ttl=604800000

# Maximum cache entries per cache type
# Default: 10000
vdb.metadata.cache.max.entries=10000

# Enable statistics logging
# Default: false
vdb.metadata.cache.log.stats=false
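
The three values of vdb.metadata.cache.enabled can be read as a small decision function. The sketch below is an assumption about how such a setting might be interpreted, not the documented behavior: the class name is hypothetical, and treating a missing or unrecognized value as auto is a choice made for this example.

```java
import java.util.Locale;

// Hypothetical sketch of interpreting the enabled setting; not the VDB implementation.
class EnabledModeSketch {
    static boolean isEnabled(String setting, boolean appServerDetected) {
        String mode = setting == null ? "auto" : setting.trim().toLowerCase(Locale.ROOT);
        switch (mode) {
            case "true":
                return true;              // forced on
            case "false":
                return false;             // forced off
            default:
                return appServerDetected; // "auto" (and, by assumption, anything else)
        }
    }
}
```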

Alternative Configuration Methods

If not using deploy.properties, you can also configure via:

  • System properties at JVM startup
  • Environment-specific property files
  • Application server configuration

Monitoring and Management

Programmatic Access

// Check if caching is enabled
boolean enabled = VDBMetadataCache.isEnabled();

// Get comprehensive statistics
String stats = VDBMetadataCache.getCacheStatistics();
System.out.println(stats);

// Clear all caches
VDBMetadataCache.clearAllCaches();

// Clear specific table from caches
VDBMetadataCache.clearTable("CUSTOMER_TABLE");

// Reset statistics counters
VDBMetadataCache.resetStatistics();

Statistics Output Example

VDB Metadata Cache Statistics:
  Enabled: true
  TTL: 86400000 ms, Connection TTL: 604800000 ms
  Max Entries: 10000
  Table Cache: size=45, hits=1250, misses=48
  Column Cache: size=32, hits=890, misses=35
  Connection Cache: size=3, hits=2140, misses=4
  Overall Hit Rate: 98.01%
  Evictions: 0

JMX Management

The cache exposes a JMX MBean for runtime management:

MBean Name: com.heirloomcomputing:type=VDBMetadataCache

Available Operations:

  • clearAllCaches(): Clear all cache entries
  • clearTableCache(String tableName): Clear specific table
  • getCacheStatistics(): Get current statistics
  • isEnabled(): Check cache status
  • setEnabled(boolean): Enable/disable cache
  • getCacheTTL(): Get current TTL
  • setCacheTTL(long): Set cache TTL
  • getMaxCacheEntries(): Get max entries
  • setMaxCacheEntries(int): Set max entries
  • resetStatistics(): Reset counters

JMX Usage Example

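import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Connect to the platform MBean server (same JVM as the cache)
MBeanServer server = ManagementFactory.getPlatformMBeanServer();
ObjectName name = new ObjectName("com.heirloomcomputing:type=VDBMetadataCache");

// Get statistics via JMX (no-argument operation: params and signature may be null)
String stats = (String) server.invoke(name, "getCacheStatistics", null, null);

// Clear cache via JMX
server.invoke(name, "clearAllCaches", null, null);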
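
Operations that take arguments, such as clearTableCache(String tableName), need a parameter array and a matching signature array when invoked through MBeanServer.invoke. The sketch below shows that calling convention against a stand-in MBean so it can run without VDB on the classpath. The interface DemoCacheMBean, its String return value, and the object name example.sketch:type=DemoCache are all hypothetical; only the javax.management calls are standard.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

// Hypothetical sketch of invoking a one-argument JMX operation.
class JmxInvokeSketch {
    // Stand-in management interface; the real MBean's return type may differ.
    public interface DemoCacheMBean {
        String clearTableCache(String tableName);
    }

    static String invokeClearTable(String tableName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // Demo-only object name, deliberately different from the real one.
            ObjectName name = new ObjectName("example.sketch:type=DemoCache");
            DemoCacheMBean impl = table -> "cleared " + table;
            server.registerMBean(new StandardMBean(impl, DemoCacheMBean.class), name);
            try {
                // The signature array lists the fully qualified parameter types.
                return (String) server.invoke(name, "clearTableCache",
                        new Object[] {tableName},
                        new String[] {String.class.getName()});
            } finally {
                server.unregisterMBean(name);
            }
        } catch (Exception e) {
            throw new IllegalStateException("JMX invocation failed", e);
        }
    }
}
```

Against the real MBean, the same invoke call would use the object name com.heirloomcomputing:type=VDBMetadataCache and skip the registration step.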

Troubleshooting

Cache Not Working

  1. Verify Environment Detection:

    System.out.println("Cache enabled: " + VDBMetadataCache.isEnabled());
  2. Force Enable:

    -Dvdb.metadata.cache.enabled=true
  3. Check Logs: Look for cache hit/miss messages in application logs when logging is enabled.

Performance Issues

  1. Monitor Hit Rates:

    • Target hit rate: >95%
    • A low hit rate may indicate that the TTL is too short or the cache is too small
  2. Adjust Cache Size:

    -Dvdb.metadata.cache.max.entries=20000
  3. Tune TTL:

    # Shorter TTL for development (1 hour)
    vdb.metadata.cache.ttl=3600000

    # Longer connection TTL for stable production (14 days)
    vdb.metadata.cache.connection.ttl=1209600000

Memory Considerations

  • Typical cache entry size: 100-500 bytes
  • 10,000 entries: ~1-5 MB memory usage
  • Automatic eviction: When a cache is full, the oldest entries are removed (FIFO; see Cache Eviction)
  • Monitor evictions: High eviction rates indicate undersized cache

Cache Consistency Issues

If you experience stale cache entries after DDL operations:

  1. Check Automatic Invalidation:

    // Ensure DDL operations use VDB methods that trigger cache clearing
    // Manual DDL outside of VDB won't trigger automatic invalidation
  2. Manual Cache Clearing:

    // Clear specific table after external DDL
    VDBMetadataCache.clearTable("TABLENAME");
    
    // Clear all caches after schema changes
    VDBMetadataCache.clearAllCaches();
  3. Enable Invalidation Logging:

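    // Enable FINE logging to see cache invalidation
    // (the log handler's level must also allow FINE for the messages to appear)
    java.util.logging.Logger.getLogger("com.heirloomcomputing.ecs.isamsql.VDB").setLevel(java.util.logging.Level.FINE);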

Development vs Production

Development Environment:

# Disable caching for development
vdb.metadata.cache.enabled=false

# Or use a short TTL for testing (1 minute)
vdb.metadata.cache.ttl=60000

Production Environment:

# Auto-detection works well
vdb.metadata.cache.enabled=auto

# Standard production settings: 24-hour table/column TTL, 7-day connection TTL
vdb.metadata.cache.ttl=86400000
vdb.metadata.cache.connection.ttl=604800000
vdb.metadata.cache.max.entries=10000

# Enable monitoring
vdb.metadata.cache.log.stats=true

Implementation Details

Cache Keys

  • Table Cache: URL|Username|Catalog|Schema|TableName
  • Column Cache: URL|Username|TableName
  • Connection Cache: URL|Username
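
The key layouts above can be sketched as simple string joins. The class and method names are hypothetical, and the sketch assumes the parts are joined as-is; any escaping or case normalization the real implementation applies is not shown.

```java
// Hypothetical sketch of the three key layouts; not the VDB implementation.
class CacheKeySketch {
    // Table Cache: URL|Username|Catalog|Schema|TableName
    static String tableKey(String url, String user, String catalog, String schema, String table) {
        return String.join("|", url, user, catalog, schema, table);
    }

    // Column Cache: URL|Username|TableName
    static String columnKey(String url, String user, String table) {
        return String.join("|", url, user, table);
    }

    // Connection Cache: URL|Username
    static String connectionKey(String url, String user) {
        return String.join("|", url, user);
    }
}
```

Because every key starts with the JDBC URL and username, two users connected to the same database do not share entries.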

Automatic Cache Invalidation

The cache system automatically invalidates entries when DDL operations occur:

Table/View Deletion:

  • When VDB.deleteFile() is called (explicit table deletion)
  • When tables are dropped via emptyOrRemoveTableIfRequired()
  • Automatically calls VDBMetadataCache.clearTable(tableName)
  • Removes both table metadata and column metadata entries
  • Handles both DROP TABLE and DROP VIEW operations

Cache Consistency:

  • Cache invalidation occurs immediately after successful DDL execution
  • Case-insensitive table name matching for cache clearing
  • Connection metadata remains cached (database type doesn't change)
  • Logging confirms cache invalidation operations
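
The invalidation rules above can be sketched as follows. This is an illustration under assumptions, not the actual clearTable code: the class is hypothetical, values are plain strings, and keys are assumed to end with the table name as in the Cache Keys layouts.

```java
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of per-table invalidation; not the VDB implementation.
class InvalidationSketch {
    final ConcurrentHashMap<String, String> tableCache = new ConcurrentHashMap<>();
    final ConcurrentHashMap<String, String> columnCache = new ConcurrentHashMap<>();
    final ConcurrentHashMap<String, String> connectionCache = new ConcurrentHashMap<>();

    // Removes table and column entries for the table, matching the name case-insensitively.
    void clearTable(String tableName) {
        String suffix = "|" + tableName.toUpperCase(Locale.ROOT);
        tableCache.keySet().removeIf(key -> key.toUpperCase(Locale.ROOT).endsWith(suffix));
        columnCache.keySet().removeIf(key -> key.toUpperCase(Locale.ROOT).endsWith(suffix));
        // connectionCache is left alone: the database type does not change on DDL.
    }
}
```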

Thread Safety

  • Uses ConcurrentHashMap for lock-free reads
  • Atomic counters for statistics
  • Thread-safe cache operations
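
As a rough illustration of atomic statistics counters, the sketch below tracks hits and misses with AtomicLong and derives a hit rate as hits divided by total lookups. The class and method names are hypothetical, and the hit rate formula is an assumption about how the reported figure is computed.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of lock-free statistics counters; not the VDB implementation.
class CacheStatsSketch {
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    void recordHit() {
        hits.incrementAndGet();
    }

    void recordMiss() {
        misses.incrementAndGet();
    }

    // Hit rate as a percentage of all lookups; 0.0 when nothing was recorded yet.
    double hitRatePercent() {
        long h = hits.get();
        long total = h + misses.get();
        return total == 0 ? 0.0 : 100.0 * h / total;
    }

    void reset() {
        hits.set(0);
        misses.set(0);
    }
}
```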

Cache Eviction

  • When: Cache size exceeds max.entries
  • Strategy: Simple FIFO (removes oldest 10% of entries)
  • Future: Could be enhanced with proper LRU
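
A FIFO policy that drops the oldest 10% of entries could be sketched as below. This is not the actual eviction code: the class is hypothetical, it uses a synchronized LinkedHashMap rather than ConcurrentHashMap to keep insertion order simple, and rounding the 10% down with a minimum of one entry is an assumption.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Hypothetical sketch of FIFO eviction; not the VDB implementation.
class FifoEvictionSketch<K, V> {
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(); // keeps insertion order
    private final int maxEntries;
    private long evictions;

    FifoEvictionSketch(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    synchronized void put(K key, V value) {
        entries.put(key, value);
        if (entries.size() > maxEntries) {
            // Remove the oldest 10% of the configured capacity (at least one entry).
            int toRemove = Math.max(1, maxEntries / 10);
            Iterator<K> oldestFirst = entries.keySet().iterator();
            for (int i = 0; i < toRemove && oldestFirst.hasNext(); i++) {
                oldestFirst.next();
                oldestFirst.remove();
                evictions++;
            }
        }
    }

    synchronized boolean containsKey(K key) {
        return entries.containsKey(key);
    }

    synchronized int size() {
        return entries.size();
    }

    synchronized long evictions() {
        return evictions;
    }
}
```

Unlike LRU, this policy ignores how recently an entry was read, so a frequently used table can still be evicted if it was cached early.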

Expiration Handling

  • Lazy expiration: Entries checked for expiration on access
  • Automatic cleanup: Expired entries removed and statistics updated
  • No background threads: Keeps implementation simple
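
Lazy expiration can be sketched as a timestamp check inside the read path. The class below is a hypothetical illustration, not the VDB code; the clock is injected as a LongSupplier so the behavior can be exercised without waiting, where real code would typically pass System::currentTimeMillis.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch of lazy TTL expiration; not the VDB implementation.
class LazyExpirySketch<V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;

    LazyExpirySketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(String key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    // Returns null on a miss or when the entry has expired; expired entries are removed here.
    V get(String key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key, entry); // remove only the entry that was inspected
            return null;
        }
        return entry.value;
    }

    int size() {
        return entries.size();
    }
}
```

An entry that is never read again stays in memory until eviction removes it, which is the trade-off of having no background cleanup thread.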

Best Practices

Deployment

  1. Enable in application servers: Use auto detection
  2. Disable in development: For debugging and testing
  3. Monitor statistics: Track hit rates and performance
  4. Size appropriately: Balance memory usage vs. hit rates

Troubleshooting

  1. Start with defaults: Most environments work well with default settings
  2. Monitor first: Use statistics to identify issues
  3. Adjust gradually: Small TTL/size changes first
  4. Clear when needed: Clear the affected table (or all caches) programmatically after DDL performed outside VDB

Security

  • No sensitive data cached: Only metadata structure information
  • Connection-specific: Cache keys include the JDBC URL and username, so entries are not shared across database users
  • Automatic cleanup: Expired entries are automatically removed

Migration Guide

Existing Applications

No code changes are required; caching is transparent to existing VDB applications.

Integration with deploy.properties

The VDB metadata cache now reads configuration from the same deploy.properties file used for other VDB settings like enable.cursor. This provides a centralized configuration approach where all VDB-related settings are managed in one place.

New Applications

Consider these optimizations:

  1. Connection pooling: Reuse connections to maximize cache effectiveness
  2. Stable table names: Avoid dynamic table name generation when possible
  3. Monitor performance: Use JMX or programmatic statistics

Version History

  • Initial Release: Basic table and column metadata caching
  • Enhanced Version: Added connection metadata caching
  • Current: Full three-tier caching with JMX management and comprehensive statistics