IBM Data Engineering: Relational Database Administration

This course provided a solid foundation in relational database administration using MySQL, Db2, and PostgreSQL. It covered the database lifecycle: how to manage databases effectively, monitor their performance, and secure access. It’s not a track I expect to specialize in, but it was helpful to learn the basics and build some confidence for future troubleshooting or system-level data work.


Module 1: Introduction to Database Management

This module introduced the core responsibilities of a DBA and walked through the lifecycle of managing relational databases.

Key tasks of a DBA:

  • Designing and planning databases

  • Managing server and DB objects

  • Setting up backup and recovery procedures

  • Handling user access and security

  • Monitoring and optimizing performance

  • Automating routine tasks

Database lifecycle phases:

  1. Requirement analysis – define scope, users, and purpose

  2. Design and planning – schema modeling, ERDs, access control

  3. Implementation – build objects, configure access, set up automation

  4. Monitoring & maintenance – patching, upgrades, log review, compliance

Important concepts:

  • System catalogs: store DB metadata and can be queried like ordinary tables (see the sketch after this list)

  • Instances → Databases → Schemas → Objects: the standard hierarchy

  • Tablespaces: logical storage containers mapped to physical containers

  • Storage types: hot/cold data, partitioning, storage groups

  • MySQL storage engines: e.g. InnoDB, MyISAM, MEMORY, ARCHIVE
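
The system catalog really is just tables you can query. A minimal sketch in MySQL, where information_schema also exposes each table's storage engine ('mydb' is a placeholder database name):

```sql
-- Query the system catalog: every table in one database,
-- along with the storage engine backing it (InnoDB, MyISAM, ...).
SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE table_schema = 'mydb';
```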


Module 2: Managing Databases

This module covered backup strategies, recovery techniques, and system-level security and encryption.

Backup types:

  • Full – complete snapshot

  • Differential – changes since last full

  • Incremental – changes since last backup (full or incremental)

  • Point-in-time recovery – replays logs to restore the DB to a specific moment (see the sketch below)
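
As a small illustration of the log-based idea behind point-in-time recovery: PostgreSQL can mark a named position in its write-ahead log that a later restore can target. This assumes WAL archiving is already configured; the label is hypothetical:

```sql
-- Create a named WAL position to restore to later,
-- e.g. taken just before a risky migration.
SELECT pg_create_restore_point('before_q3_migration');
```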

Physical vs. logical backups:

  • Physical – faster, file-level, less flexible

  • Logical – SQL-based, granular, slower

Backup policies include:

  • Hot (online) vs. Cold (offline)

  • Compression, encryption, frequency, scheduling

  • Cloud options: automated or manual

Security layers:

  • Authentication – identity verification

  • Authorization – permissions

  • Auditing – tracks access and actions

  • Encryption – at rest and in transit

  • Roles & groups – simplify access management
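
A minimal sketch of how roles simplify authorization, in PostgreSQL, assuming a hypothetical salesdb database and a read-only reporting team (all names are placeholders):

```sql
CREATE ROLE reporting NOLOGIN;                  -- group role, no direct login
GRANT CONNECT ON DATABASE salesdb TO reporting;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO reporting;

CREATE ROLE alice LOGIN PASSWORD 'change-me';   -- individual user
GRANT reporting TO alice;                       -- alice inherits read-only access
```

Granting to the group role once and adding users to it is what keeps access manageable as the team grows.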

Encryption types:

  • Symmetric – faster, shared key (e.g. AES)

  • Asymmetric – public/private keys (e.g. RSA)

  • Transparent Data Encryption – automatic full-DB encryption
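
A minimal MySQL sketch of two of these: symmetric encryption of a single value, and at-rest encryption of a whole table. The table-level option assumes a keyring component is configured on the server; all names are placeholders:

```sql
-- Symmetric (AES): the same shared key encrypts and decrypts.
SELECT AES_ENCRYPT('card-number', 'shared-secret-key');

-- Encryption at rest: InnoDB encrypts the table's tablespace on disk.
CREATE TABLE payments (
  id INT PRIMARY KEY,
  card_token VARBINARY(255)
) ENCRYPTION = 'Y';
```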


Module 3: Monitoring and Optimization

This module focused on keeping a database healthy, responsive, and efficient over time.

Monitoring strategies:

  • Reactive – respond after problems occur

  • Proactive – detect anomalies early using alerts and baselines

Key metrics:

  • Throughput (queries/sec)

  • Response time, availability, resource usage

  • Contention, lock waits, buffer pool usage
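
A sketch of where numbers like these come from, using only built-in status commands and views:

```sql
-- MySQL: statements executed since startup (the basis for queries/sec).
SHOW GLOBAL STATUS LIKE 'Questions';

-- PostgreSQL: sessions by state, then anything currently waiting on a lock.
SELECT state, count(*) FROM pg_stat_activity GROUP BY state;
SELECT pid, relation::regclass, mode FROM pg_locks WHERE NOT granted;
```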

Tools:

  • MySQL: Workbench, Query Profiler

  • PostgreSQL: pgAdmin, pganalyze

  • Db2: Data Management Console, Snapshot monitors

  • 3rd-party: Datadog, SolarWinds, PRTG

Optimization techniques:

  • MySQL: OPTIMIZE TABLE

  • PostgreSQL: VACUUM, REINDEX, ANALYZE

  • Db2: RUNSTATS, REORG
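
A quick sketch of these maintenance commands against a placeholder orders table (Db2's RUNSTATS and REORG are command-line utilities, shown here as comments):

```sql
-- PostgreSQL: reclaim dead rows, refresh planner statistics, rebuild indexes.
VACUUM (ANALYZE, VERBOSE) orders;
REINDEX TABLE orders;

-- MySQL: defragment the table and rebuild its indexes.
OPTIMIZE TABLE orders;

-- Db2 (CLP): RUNSTATS ON TABLE myschema.orders
--            REORG TABLE myschema.orders
```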

Query tuning:

  • Use EXPLAIN plans to understand execution

  • Avoid SELECT *, use indexes, rewrite slow queries

  • Consider buffer sizes and schema design

  • Use tools like EverSQL, dbForge, shell benchmarking
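
A minimal example of the first two points, with placeholder table and column names: ask the optimizer for its plan instead of guessing, and select only what you need:

```sql
EXPLAIN
SELECT customer_id, total   -- explicit columns instead of SELECT *
FROM orders
WHERE customer_id = 42      -- filter column is a natural index candidate
ORDER BY created_at DESC;
```

If the plan shows a full table scan here, an index on (customer_id, created_at) is the usual next step.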

Indexing:

  • Boosts read speed, slows down writes

  • Choose columns based on usage (WHERE, JOIN, ORDER BY)

  • Types: Primary, Unique, Full-text, Prefix

  • Poor indexing = common bottleneck
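
A sketch of indexes matched to actual query patterns (all names are placeholders; the FULLTEXT syntax is MySQL's):

```sql
-- Composite index serving WHERE customer_id = ? ... ORDER BY created_at.
CREATE INDEX idx_orders_customer ON orders (customer_id, created_at);

-- Unique index: enforces uniqueness and speeds up login-style lookups.
CREATE UNIQUE INDEX idx_users_email ON users (email);

-- Full-text index for keyword search over a text column.
CREATE FULLTEXT INDEX idx_posts_body ON posts (body);
```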


Module 4: Troubleshooting and Automation

The final module focused on system reliability, error investigation, and reducing manual work through scripts and alerts.

Troubleshooting:

  • Ask the right questions (what, where, when, conditions, reproducibility)

  • Check logs: error logs, trace logs, event logs

  • Use platform tools (e.g. SHOW STATUS, db2pd, pg_isready)

  • Error codes + documentation are essential
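
For a first look at server state, the MySQL built-ins mentioned above go a long way:

```sql
-- How many clients are connected now, and how many connection attempts failed.
SHOW GLOBAL STATUS LIKE 'Threads_connected';
SHOW GLOBAL STATUS LIKE 'Aborted_connects';

-- Deadlock history, buffer pool, and I/O detail in one report.
SHOW ENGINE INNODB STATUS;
```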

Logs contain:

  • Severity, timestamp, user, event description

  • Usually plain text or tool-readable formats

  • Need to be checked frequently

Automation:

  • Automate time-consuming, repetitive tasks

  • Use scripts for backup, cleanup, health checks

  • Schedule with cron jobs or external tools

  • Keep everything under version control
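
Besides cron, some engines can schedule work themselves. A minimal sketch using MySQL's event scheduler; the table name and 90-day retention window are placeholder assumptions:

```sql
SET GLOBAL event_scheduler = ON;   -- often set in the server config instead

-- Nightly cleanup: drop audit rows older than 90 days.
CREATE EVENT purge_old_audit_rows
  ON SCHEDULE EVERY 1 DAY
  DO DELETE FROM audit_log WHERE created_at < NOW() - INTERVAL 90 DAY;
```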

Alerting and reports:

  • Reports = system health over time

  • Alerts = real-time problems (e.g. failed jobs, low disk space)

  • Thresholds: warning vs. critical

  • Customize content and frequency
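
A sketch of the kind of threshold check an alerting script might run, here in PostgreSQL with an assumed 50 GB warning threshold:

```sql
-- Flag any database that has crossed the warning size.
SELECT datname,
       pg_size_pretty(pg_database_size(datname)) AS size,
       pg_database_size(datname) > 50::bigint * 1024 * 1024 * 1024 AS over_warning
FROM pg_database;
```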


Final Thoughts

While I don’t expect to work as a full-time DBA, this course gave me a much clearer understanding of:

  • How relational databases are structured

  • What it takes to maintain their performance and integrity

  • How to back them up, secure them, and tune them for better speed

It’s a useful foundation for anyone working in data engineering, backend infrastructure, or even analytics roles that need to interact with production databases.

Course Certificate: View on Coursera

All notes and opinions are personal interpretations of the IBM Relational Database Administration course on Coursera.
