Rosaiah Chowdary

Data Engineer • ETL Specialist

IBM DataStage · Python · SQL · Kafka · AWS · GCP

About Me

I am a passionate Data Engineer and ETL (IBM DataStage) consultant with extensive hands-on experience designing, developing, and optimizing data integration solutions. Over my career, I have worked on projects spanning performance tuning, operational support, and migrations.

  • Expertise: IBM DataStage development, performance tuning, troubleshooting, and administration
  • Delivered end-to-end data warehousing and migration projects
  • Integrated data from relational DBs, files, and cloud
  • Problem-solving focus: data quality, governance, automation
  • Modernized legacy ETL and migrated workloads to the cloud

Production Support & Operations
Managed over 90,000 automated and scheduled jobs, ensuring smooth operations, minimal downtime, and adherence to strict SLAs. Monitored batch windows, handled failovers, and coordinated cross-team incident responses.
Issue Resolution & RCA
Conducted in-depth root cause analysis and quickly resolved critical incidents and change requests to maintain business continuity. Authored RCA reports and implemented preventive actions.
DataStage Job Development
Designed robust, high-performing ETL jobs per mapping documents and business requirements, ensuring quality, parallelism and efficient resource usage.
Data Extraction & Integration
Orchestrated data flows from various sources (databases, Excel, CSV, flat files), ensuring data accuracy throughout integration and implementing transformations reliably.
Documentation & Knowledge Management
Developed technical specifications, transformation/logic documentation, and runbooks for efficient knowledge transfer and swift onboarding of new team members.
Testing & Quality Assurance
Defined and executed comprehensive system, integration, and UAT test plans to ensure data and process quality, including performance benchmarking and regression testing.
Client & Stakeholder Engagement
Managed client incidents, led regular status reviews, initiated automation projects, and ensured stakeholder satisfaction through clear communication and on-time delivery.
Deployment & Change Management
Coordinated code migrations, job scheduling, and production cutover in collaboration with cross-functional teams, maintaining rollback plans and validation checks.
ETL Development & Support
Designing, building, and managing Extract, Transform, Load (ETL) processes for reliable movement of data across systems. Ensuring high-quality and timely data delivery with minimal errors.
IBM DataStage
Advanced skill in IBM DataStage, including scalable ETL job design, parallel processing, job tuning, troubleshooting, and ongoing operational support for data integration.
SQL & PL/SQL
Writing and optimizing queries, stored procedures, and data transformation scripts for secure, high-performance and accurate data processing in relational databases.
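As a minimal illustration of the set-based query work described here (table and column names are hypothetical, and an in-memory SQLite database stands in for a real relational database):

```python
import sqlite3

# In-memory SQLite stands in for the relational database; the table
# and column names are illustrative, not from a real project.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",  # parameterized inserts avoid SQL injection
    [(1, "EU", 120.0), (2, "EU", 80.0), (3, "US", 200.0)],
)

# A set-based aggregation keeps the work inside the database engine
# instead of processing rows one at a time in application code.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('EU', 200.0), ('US', 200.0)]
```

The same pattern (parameterized statements plus set-based aggregation) carries over to stored procedures and larger transformation scripts.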
Python for Data Engineering
Automating data workflows, custom ETL scripting, and integrating Python into broader data engineering projects to improve speed and flexibility.
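A toy extract-transform-load flow, sketched with the standard library only; the source data, field names, and target structure are all illustrative stand-ins for real files and warehouse tables:

```python
import csv
import io

# Hypothetical CSV source; in practice this would be a file or API feed.
raw = "customer,amount\nacme,100\nglobex,250\nacme,50\n"

def extract(text):
    # Parse CSV rows into dictionaries keyed by column name.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Aggregate amounts per customer.
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0) + int(row["amount"])
    return totals

def load(totals):
    # In a real pipeline this step would write to a warehouse table.
    return sorted(totals.items())

result = load(transform(extract(raw)))
print(result)  # [('acme', 150), ('globex', 250)]
```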
Data Warehousing
Designing robust data models, developing data warehouses and data marts for business analytics, and supporting large-scale analytical workloads.
Apache Kafka & Messaging
Integrating real-time data streams with Apache Kafka and message queues, enabling event-driven architectures and near-real-time processing.
Cloud Platforms (AWS, GCP)
Migrating on-premise data solutions to the cloud, leveraging AWS and GCP managed services for ETL, data storage, automation, and scalability.
CI/CD & DevOps
Managing source code, automating deployments and streamlining development pipelines with Git, Jenkins, and related DevOps tooling.
Data Warehouse Migration
Led the migration of on-prem ETL to cloud-native pipelines, improving load times 4x and cutting infrastructure costs by 35%. Designed resilient data flows, optimized batch runtimes, created transformation specs, and executed phased cutovers.
Real-time Stream Integration
Implemented Kafka-based streaming pipelines enabling near-real-time analytics, alerts and event-driven processing. Built idempotent consumers, schema evolution strategies, and monitoring dashboards for throughput and lag.
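The idempotent-consumer pattern mentioned above can be sketched without a live Kafka cluster: track processed message IDs so that redeliveries (common under at-least-once semantics) are applied at most once. The message shape and in-memory ID store are illustrative stand-ins for real Kafka records and a durable store:

```python
# Sketch of an idempotent consumer: a processed-ID store makes
# duplicate deliveries harmless. In production the store would be
# durable (e.g. a database table), not an in-memory set.
processed_ids = set()
state = {"total": 0}

def handle(message):
    msg_id = message["id"]
    if msg_id in processed_ids:
        return False  # duplicate delivery: skip side effects
    state["total"] += message["value"]
    processed_ids.add(msg_id)
    return True

# Simulate a redelivery of message 1.
for msg in [{"id": 1, "value": 10}, {"id": 2, "value": 5}, {"id": 1, "value": 10}]:
    handle(msg)

print(state["total"])  # 15, not 25: the duplicate had no effect
```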
Operational Automation
Automated daily maintenance, log cleanup, job restart checks, and validation workflows, reducing manual intervention by 70% and improving SLA stability. Integrated alerting and auto-remediation scripts for common failure classes.
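A job-restart check of the kind described above often reduces to bounded retries with exponential backoff before escalating. The job, retry counts, and delays below are hypothetical:

```python
import time

def run_with_retries(job, attempts=3, base_delay=0.01):
    """Retry a flaky job a bounded number of times with exponential
    backoff; re-raise on the final failure so it escalates to on-call."""
    for attempt in range(1, attempts + 1):
        try:
            return job()
        except RuntimeError:
            if attempt == attempts:
                raise  # out of retries: surface the failure
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulated job that fails once with a transient error, then succeeds.
calls = {"n": 0}
def flaky_job():
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retries(flaky_job)
print(result)  # ok (after one automatic retry)
```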
Data Quality & Observability
Designed and implemented data quality checks, drift detection, and lineage mapping. Built observability into pipelines using metrics, logs and dashboards to reduce silent failures and accelerate troubleshooting.
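One of the simplest data quality checks described here is a null-rate threshold on a column; the column name, sample values, and threshold below are purely illustrative:

```python
def null_rate(values):
    # Fraction of values that are missing; empty input counts as clean.
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)

def check_column(name, values, max_null_rate=0.1):
    # Returns a small report record a pipeline could emit as a metric.
    rate = null_rate(values)
    return {"column": name, "null_rate": rate, "ok": rate <= max_null_rate}

report = check_column("email", ["a@x.com", None, "b@y.com", None, "c@z.com"])
print(report)  # {'column': 'email', 'null_rate': 0.4, 'ok': False}
```

Feeding such reports into dashboards and alerts is one way the "silent failure" class of problems becomes visible.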