Rosaiah Chowdary

Data Engineer • ETL Specialist

IBM DataStage · Python · SQL · Kafka · AWS · GCP

About Me

I am a passionate Data Engineer and ETL (IBM DataStage) consultant with extensive hands-on experience designing, developing, and optimizing data integration solutions. Over my career, I have worked on projects spanning performance tuning, operational support, and migrations.

  • Expertise: IBM DataStage development, performance tuning, troubleshooting, and administration
  • Delivered end-to-end data warehousing and migration projects
  • Integrated data from relational DBs, files, and cloud
  • Problem-solving focus: data quality, governance, automation
  • Modernized legacy ETL and migrated workloads to the cloud

Production Support & Operations
Managed over 90,000 automated and scheduled jobs, ensuring smooth operations, minimal downtime, and adherence to strict SLAs. Monitored batch windows, handled failovers, and coordinated cross-team incident responses.
Issue Resolution & RCA
Conducted in-depth root cause analysis and quickly resolved critical incidents and change requests to maintain business continuity. Authored RCA reports and implemented preventive actions.
DataStage Job Development
Designed robust, high-performing ETL jobs per mapping documents and business requirements, ensuring quality, parallelism and efficient resource usage.
Data Extraction & Integration
Orchestrated data flows from various sources (databases, Excel, CSV, flat files), ensuring data accuracy throughout integration and implementing transformations reliably.
Documentation & Knowledge Management
Developed technical specifications, transformation/logic documentation, and runbooks for efficient knowledge transfer and swift onboarding of new team members.
Testing & Quality Assurance
Defined and executed comprehensive system, integration, and UAT test plans to ensure data and process quality, including performance benchmarking and regression testing.
Client & Stakeholder Engagement
Managed client incidents, led regular status reviews, initiated automation projects, and ensured stakeholder satisfaction through clear communication and on-time delivery.
Deployment & Change Management
Coordinated code migrations, job scheduling, and production cutover in collaboration with cross-functional teams, maintaining rollback plans and validation checks.
ETL Development & Support
Designing, building, and managing Extract, Transform, Load (ETL) processes for reliable movement of data across systems. Ensuring high-quality and timely data delivery with minimal errors.
IBM DataStage
Advanced skill in IBM DataStage, including scalable ETL job design, parallel processing, job tuning, troubleshooting, and ongoing operational support for data integration.
SQL & PL/SQL
Writing and optimizing queries, stored procedures, and data transformation scripts for secure, high-performance and accurate data processing in relational databases.
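As a minimal illustration of the set-based query work described here (table and column names are hypothetical, and an in-memory SQLite database stands in for a real relational database):

```python
import sqlite3

# In-memory SQLite stands in for the relational database; the table
# and column names are illustrative, not from a real project.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",  # parameterized inserts avoid SQL injection
    [(1, "EU", 120.0), (2, "EU", 80.0), (3, "US", 200.0)],
)

# A set-based aggregation keeps the work inside the database engine
# instead of processing rows one at a time in application code.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('EU', 200.0), ('US', 200.0)]
```

The same pattern (parameterized statements plus set-based aggregation) carries over to stored procedures and larger transformation scripts.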
Python for Data Engineering
Automating data workflows, custom ETL scripting, and integrating Python into broader data engineering projects to improve speed and flexibility.
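A toy extract-transform-load flow, sketched with the standard library only; the source data, field names, and target structure are all illustrative stand-ins for real files and warehouse tables:

```python
import csv
import io

# Hypothetical CSV source; in practice this would be a file or API feed.
raw = "customer,amount\nacme,100\nglobex,250\nacme,50\n"

def extract(text):
    # Parse CSV rows into dictionaries keyed by column name.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Aggregate amounts per customer.
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0) + int(row["amount"])
    return totals

def load(totals):
    # In a real pipeline this step would write to a warehouse table.
    return sorted(totals.items())

result = load(transform(extract(raw)))
print(result)  # [('acme', 150), ('globex', 250)]
```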
Data Warehousing
Designing robust data models, developing data warehouses and data marts for business analytics, and supporting large-scale analytical workloads.
Apache Kafka & Messaging
Integrating real-time data streams with Apache Kafka and message queues, enabling event-driven architectures and near-real-time processing.
Cloud Platforms (AWS, GCP)
Migrating on-premise data solutions to the cloud, leveraging AWS and GCP managed services for ETL, data storage, automation, and scalability.
CI/CD & DevOps
Managing source code, automating deployments and streamlining development pipelines with Git, Jenkins, and related DevOps tooling.
Data Warehouse Migration
Led the migration of on-prem ETL to cloud-native pipelines, improving load times 4x and cutting infrastructure costs by 35%. Designed resilient data flows, optimized batch runtimes, created transformation specs, and executed phased cutovers.
Real-time Stream Integration
Implemented Kafka-based streaming pipelines enabling near-real-time analytics, alerts and event-driven processing. Built idempotent consumers, schema evolution strategies, and monitoring dashboards for throughput and lag.
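The idempotent-consumer pattern mentioned above can be sketched without a live Kafka cluster: track processed message IDs so that redeliveries (common under at-least-once semantics) are applied at most once. The message shape and in-memory ID store are illustrative stand-ins for real Kafka records and a durable store:

```python
# Sketch of an idempotent consumer: a processed-ID store makes
# duplicate deliveries harmless. In production the store would be
# durable (e.g. a database table), not an in-memory set.
processed_ids = set()
state = {"total": 0}

def handle(message):
    msg_id = message["id"]
    if msg_id in processed_ids:
        return False  # duplicate delivery: skip side effects
    state["total"] += message["value"]
    processed_ids.add(msg_id)
    return True

# Simulate a redelivery of message 1.
for msg in [{"id": 1, "value": 10}, {"id": 2, "value": 5}, {"id": 1, "value": 10}]:
    handle(msg)

print(state["total"])  # 15, not 25: the duplicate had no effect
```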
Operational Automation
Automated daily maintenance, log cleanup, job restart checks, and validation workflows, reducing manual intervention by 70% and improving SLA stability. Integrated alerting and auto-remediation scripts for common failure classes.
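A job-restart check of the kind described above often reduces to bounded retries with exponential backoff before escalating. The job, retry counts, and delays below are hypothetical:

```python
import time

def run_with_retries(job, attempts=3, base_delay=0.01):
    """Retry a flaky job a bounded number of times with exponential
    backoff; re-raise on the final failure so it escalates to on-call."""
    for attempt in range(1, attempts + 1):
        try:
            return job()
        except RuntimeError:
            if attempt == attempts:
                raise  # out of retries: surface the failure
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulated job that fails once with a transient error, then succeeds.
calls = {"n": 0}
def flaky_job():
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retries(flaky_job)
print(result)  # ok (after one automatic retry)
```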
Data Quality & Observability
Designed and implemented data quality checks, drift detection, and lineage mapping. Built observability into pipelines using metrics, logs and dashboards to reduce silent failures and accelerate troubleshooting.
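One of the simplest data quality checks described here is a null-rate threshold on a column; the column name, sample values, and threshold below are purely illustrative:

```python
def null_rate(values):
    # Fraction of values that are missing; empty input counts as clean.
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)

def check_column(name, values, max_null_rate=0.1):
    # Returns a small report record a pipeline could emit as a metric.
    rate = null_rate(values)
    return {"column": name, "null_rate": rate, "ok": rate <= max_null_rate}

report = check_column("email", ["a@x.com", None, "b@y.com", None, "c@z.com"])
print(report)  # {'column': 'email', 'null_rate': 0.4, 'ok': False}
```

Feeding such reports into dashboards and alerts is one way the "silent failure" class of problems becomes visible.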