IBM Data Engineering: Relational Database Administration
This course provided a solid foundation in relational database administration using MySQL, Db2, and PostgreSQL. It covered the lifecycle of a database: how to manage it effectively, monitor performance, and secure access. It's not a track I expect to specialize in, but it was helpful to learn the basics and build some confidence for future troubleshooting or system-level data work.
Module 1: Introduction to Database Management
This module introduced the core responsibilities of a DBA and walked through the lifecycle of managing relational databases.
Key tasks of a DBA:
Designing and planning databases
Managing server and DB objects
Setting up backup and recovery procedures
Handling user access and security
Monitoring and optimizing performance
Automating routine tasks
Database lifecycle phases:
Requirement analysis – define scope, users, and purpose
Design and planning – schema modeling, ERDs, access control
Implementation – build objects, configure access, set up automation
Monitoring & maintenance – patching, upgrades, log review, compliance
Important concepts:
System catalogs: store DB metadata and can be queried (see the sketch after this list)
Instances → Databases → Schemas → Objects: the standard hierarchy
Tablespaces: logical storage containers mapped to physical containers
Storage types: hot/cold data, partitioning, storage groups
MySQL storage engines: e.g. InnoDB, MyISAM, MEMORY, ARCHIVE
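Since system catalogs are queryable and the server can list its own engines, here's a minimal sketch of both in MySQL; the schema name 'mydb' is a placeholder, not from the course:

```sql
-- Query the system catalog (information_schema) for table metadata;
-- 'mydb' is a placeholder schema name.
SELECT table_name, engine, table_rows
FROM information_schema.tables
WHERE table_schema = 'mydb';

-- List the storage engines this server supports (InnoDB, MyISAM, MEMORY, ...).
SHOW ENGINES;
```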
Module 2: Managing Databases
This one was about backup strategies, recovery techniques, and system-level security and encryption.
Backup types:
Full – complete snapshot
Differential – changes since the last full backup
Incremental – changes since the last backup of any type (e.g. after a Sunday full backup, Wednesday's differential covers Monday through Wednesday, while Wednesday's incremental covers only changes since Tuesday's backup)
Point-in-time recovery – restores DB to a specific moment using logs
Physical vs. logical backups:
Physical – faster, file-level, less flexible
Logical – SQL-based, granular, slower
Backup policies include:
Hot (online) vs. Cold (offline)
Compression, encryption, frequency, scheduling
Cloud options: automated or manual
Security layers:
Authentication – identity verification
Authorization – permissions
Auditing – tracks access and actions
Encryption – at rest and in transit
Roles & groups – simplify access management (sketch after this list)
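To make the roles-and-groups idea concrete, here's a minimal PostgreSQL sketch; the role names, schema, and password are illustrative assumptions, not course material:

```sql
-- A group role bundles permissions; login roles inherit them via membership.
CREATE ROLE reporting_ro NOLOGIN;                             -- group role (authorization)
GRANT SELECT ON ALL TABLES IN SCHEMA public TO reporting_ro;  -- read-only permissions
CREATE ROLE alice LOGIN PASSWORD 'change-me';                 -- login identity (authentication)
GRANT reporting_ro TO alice;                                  -- membership simplifies management
```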
Encryption types:
Symmetric – faster, shared key (e.g. AES; sketch after this list)
Asymmetric – public/private keys (e.g. RSA)
Transparent Data Encryption – automatic full-DB encryption
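As a hedged illustration of symmetric encryption at the column level, here's a sketch using PostgreSQL's pgcrypto extension; the course covered encryption conceptually, so the values and key below are made up:

```sql
-- Symmetric encryption: the same shared key encrypts and decrypts.
CREATE EXTENSION IF NOT EXISTS pgcrypto;
SELECT pgp_sym_encrypt('4111-1111-1111-1111', 'a-shared-secret');  -- returns ciphertext (bytea)
SELECT pgp_sym_decrypt(
         pgp_sym_encrypt('hello', 'a-shared-secret'),
         'a-shared-secret');                                       -- round trip back to 'hello'
```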
Module 3: Monitoring and Optimization
This module focused on keeping a database healthy, responsive, and efficient over time.
Monitoring strategies:
Reactive – respond after problems occur
Proactive – detect anomalies early using alerts and baselines
Key metrics:
Throughput (queries/sec)
Response time, availability, resource usage
Contention, lock waits, buffer pool usage (sample checks sketched below)
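A few of these metrics can be pulled by hand in MySQL; the counters below are standard server status variables (a sketch, not the course's exact workflow):

```sql
SHOW GLOBAL STATUS LIKE 'Questions';                 -- statements executed (throughput proxy)
SHOW GLOBAL STATUS LIKE 'Threads_connected';         -- current client connections
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';  -- reads that missed the buffer pool
```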
Tools:
MySQL: Workbench, Query Profiler
PostgreSQL: pgAdmin, pganalyze
Db2: Data Management Console, Snapshot monitors
3rd-party: Datadog, SolarWinds, PRTG
Optimization techniques (usage sketched after this list):
MySQL: OPTIMIZE TABLE
PostgreSQL: VACUUM, REINDEX, ANALYZE
Db2: RUNSTATS, REORG
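A usage sketch for the commands above, assuming a table named orders (and a Db2 schema myschema, both placeholders); each statement runs on its own platform:

```sql
OPTIMIZE TABLE orders;        -- MySQL: rebuild the table and reclaim space
VACUUM ANALYZE orders;        -- PostgreSQL: reclaim dead rows, refresh planner stats
REINDEX TABLE orders;         -- PostgreSQL: rebuild bloated indexes
CALL SYSPROC.ADMIN_CMD('RUNSTATS ON TABLE myschema.orders');  -- Db2: refresh optimizer statistics
```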
Query tuning:
Use EXPLAIN plans to understand execution (sketch after this list)
Avoid SELECT *, use indexes, rewrite slow queries
Consider buffer sizes and schema design
Use tools like EverSQL, dbForge, shell benchmarking
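A small before/after sketch of plan-driven tuning in PostgreSQL syntax; the table, columns, and index name are illustrative:

```sql
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;     -- likely a sequential scan without an index
CREATE INDEX idx_orders_customer ON orders (customer_id);
EXPLAIN SELECT order_id, total                           -- narrower column list than SELECT *
FROM orders
WHERE customer_id = 42;                                  -- should now use the index
```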
Indexing:
Boosts read speed, slows down writes
Choose columns based on usage (WHERE, JOIN, ORDER BY)
Types: Primary, Unique, Full-text, Prefix (examples sketched below)
Poor indexing = common bottleneck
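Examples of those index types in MySQL syntax (table and column names are made up; a primary index comes from the table's PRIMARY KEY definition):

```sql
CREATE UNIQUE INDEX idx_users_email ON users (email);         -- unique index
CREATE INDEX idx_cust_lastname ON customers (last_name(10));  -- prefix index on the first 10 chars
CREATE FULLTEXT INDEX idx_articles_body ON articles (body);   -- full-text index
```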
Module 4: Troubleshooting and Automation
The final module focused on system reliability, error investigation, and reducing manual work through scripts and alerts.
Troubleshooting:
Ask the right questions (what, where, when, conditions, reproducibility)
Check logs: error logs, trace logs, event logs
Use platform tools (e.g. SHOW STATUS, db2pd, pg_isready; first checks sketched after this list)
Error codes + documentation are essential
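A sketch of first checks from a MySQL prompt (db2pd and pg_isready are shell-level tools on their respective platforms, so they're omitted here):

```sql
SHOW PROCESSLIST;                             -- active sessions and their current statements
SHOW GLOBAL STATUS LIKE 'Aborted_connects';   -- failed connection attempts
SHOW ENGINE INNODB STATUS;                    -- deadlock and I/O diagnostics
```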
Logs contain:
Severity, timestamp, user, event description
Usually plain text or tool-readable formats
Need to be checked frequently
Automation:
Automate time-consuming, repetitive tasks
Use scripts for backup, cleanup, health checks
Schedule with cron jobs or external tools (a SQL-side alternative sketched below)
Keep everything under version control
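As a SQL-side alternative to cron, here's a sketch using MySQL's event scheduler; the event name, table, and 30-day retention are illustrative assumptions:

```sql
SET GLOBAL event_scheduler = ON;        -- scheduler must be enabled (requires privileges)
CREATE EVENT purge_old_sessions
  ON SCHEDULE EVERY 1 DAY
  DO DELETE FROM session_logs           -- hypothetical housekeeping table
     WHERE created_at < NOW() - INTERVAL 30 DAY;
```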
Alerting and reports:
Reports = system health over time
Alerts = real-time problems (e.g. failed jobs, low disk space)
Thresholds: warning vs. critical
Customize content and frequency
Final Thoughts
While I don’t expect to work as a full-time DBA, this course gave me a much clearer understanding of:
How relational databases are structured
What it takes to maintain their performance and integrity
How to back them up, secure them, and tune them for better speed
It’s a useful foundation for anyone working in data engineering, backend infrastructure, or even analytics roles that need to interact with production databases
Course Certificate: View on Coursera
All notes and opinions are personal interpretations of the IBM Relational Database Administration course on Coursera.