Open to Senior & Staff Data Engineer roles

Bhanu Prakash Reddy Rella

Senior Data Engineer at Meta  ·  AI & ML Data Platforms  ·  IEEE Senior Member

10+ years building cloud-native data platforms, streaming pipelines, and AI/ML-ready infrastructure at Meta (Ads), Walmart Global Tech, and TCS (Citibank). Architect of streaming-first systems processing 100B+ records/month at 5+ TB/day.

10+ Years in Data & AI
100B+ Records / Month Pipelined
5+ TB Daily Throughput
22+ Peer-Reviewed Papers
Selected experience: Meta · Walmart Global Tech · TCS / Citibank · IEEE

Engineering at scale.
Research with conscience.

I am a Senior Data Engineer at Meta, building measurement data infrastructure within Meta's Ads org — Python ETL on Spark/Hive powering campaign lift measurement, conversion-lift studies, and incrementality experiments consumed by Data Science and ML teams.

Before Meta, I spent 2.5 years as Lead Data Engineer at Walmart Global Tech, building supplier-facing analytics on Databricks + Iceberg + Kafka for an ecosystem of 240M+ weekly customers. Before that, 6+ years at Tata Consultancy Services delivering large-scale Azure and Hadoop platforms for Citibank under SOX/PCI-DSS controls, and an earlier stint as a Data Analyst at Flipkart.

In parallel, I am the founder of The Green AI Initiative, an IEEE Senior Member, and serve as Secretary of the IEEE Computer Society Santa Clara Valley Chapter. I am the author of Energy-Efficient Computing for Modern AI, inventor of a patented sparse neural network technique, and a peer reviewer for IEEE PAMI, Elsevier JPDC, and PLOS ONE.

I delivered the keynote on Green AI: Paving the Way for Sustainable Technology at IEEE Cloud Summit 2025 in Washington, D.C., and my recent peer-reviewed work appears in JoVE (April 2026), Springer IJIT (2026), and ACL NLP4DH 2025.

A decade of building at scale.

From Hadoop ETL at Citibank to streaming AdTech at Meta — a single arc of making large systems faster, more reliable, and more sustainable.

Jul 2025 — Present Current

Senior Data Engineer

Meta Platforms, Inc. · Menlo Park, CA
  • Architected Databricks Lakehouse pipelines on Azure & AWS consolidating advertising measurement, engagement, and experimentation datasets — enabling near real-time analytics for 500+ internal stakeholders and improving data availability 35%.
  • Engineered large-scale PySpark & Spark Structured Streaming workflows processing 8B+ ad interaction events monthly — reduced pipeline latency from 4–5 hours to under 20 minutes and improved campaign attribution accuracy.
  • Built metadata-driven ETL frameworks (Databricks, Delta Lake, Kafka, Airflow) with schema validation, partition pruning, Z-Ordering, OPTIMIZE/VACUUM, liquid clustering, salting, and reusable PySpark UDFs — reduced operational overhead 40%.
  • Implemented agentic AI data observability with LangChain and RAG over runbooks, metric definitions, and lineage metadata, orchestrating OpenAI, Llama, and Claude models to automate root-cause analysis; cut incident investigation time 55% and saved ~24,000 engineering hours annually.
  • Built scalable feature engineering pipelines supporting PyTorch-based ML models for lift measurement and advertiser optimization, accelerating experimentation cycles.
  • Established enterprise governance with Unity Catalog & Delta Sharing and automated data-quality controls — reduced critical data-quality incidents 30%.
Databricks · PySpark · Spark Structured Streaming · Delta Lake · Kafka · Airflow · Unity Catalog · PyTorch · OpenAI · LangChain · Terraform · Kubernetes
Dec 2022 — Jun 2025

Lead Data Engineer

Walmart Global Tech · Sunnyvale, CA
  • Led modernization of Walmart's supplier analytics platform on Databricks Lakehouse, Apache Iceberg, and Delta Lake, enabling scalable processing of 100B+ records/month and reducing reporting latency from over 24 hours to under 6 hours.
  • Directed Kafka + Spark Structured Streaming ingesting 5+ TB of supply-chain & inventory data daily — improving real-time visibility for 1,500+ suppliers serving 240M+ weekly customers.
  • Designed unified enterprise data architecture across Databricks, BigQuery, Snowflake, and GCS — improving accessibility and reducing duplicate data movement.
  • Implemented Infrastructure as Code with Terraform & Kubernetes on GCP/Azure — cut environment deployment time from days to under 4 hours.
  • Optimized PySpark/SQL pipelines via partitioning, caching, and Delta optimization — 35% lower compute cost, up to 80% runtime reduction, improved SLA compliance.
  • Drove code-quality governance (SonarQube, automated testing, CI/CD) — 10%+ quality-score lift; established dbt ELT frameworks with Great Expectations validation.
  • Mentored a team of engineers through architecture reviews and cloud-modernization programs — saving ~24,000 engineering hours annually.
Databricks · Delta Lake · Apache Iceberg · Unity Catalog · PySpark · Kafka · Kubernetes · Terraform · BigQuery · Snowflake · dbt · Airflow · GCP
Dec 2020 — Nov 2022

Senior Data Engineer

Tata Consultancy Services · Client: Citibank · Irving, TX
  • Designed cloud-native banking data platforms using Azure Data Factory, ADLS Gen2, Databricks, and Synapse Analytics, enabling scalable ingestion of regulatory and financial datasets under SOX/PCI-DSS controls.
  • Developed Databricks ETL (PySpark + Delta Lake) for high-volume transaction data — reduced end-to-end processing from 8 hours to under 5 hours while improving reliability for risk and compliance workloads.
  • Migrated legacy warehouses to Snowflake & Azure, achieving 35% lower storage costs and improved query performance for BI users.
  • Built reusable ingestion frameworks integrating APIs, relational databases, and financial apps into Azure Data Lake — faster onboarding, less manual effort.
  • Implemented data quality, lineage, and governance with Great Expectations — strengthening SOX and PCI-DSS compliance.
  • Delivered Power BI dashboards supporting fraud detection, customer insights, and operational reporting; automated provisioning via Terraform & Azure DevOps.
Azure Data Factory · ADLS · Synapse · Databricks · PySpark · Snowflake · Terraform · AKS · Power BI
Oct 2016 — Nov 2020

Data Engineer

Tata Consultancy Services · Client: Citibank · Hyderabad, India
  • Built enterprise-scale ETL with Python, Spark, Hadoop, Hive, and Kafka processing banking transaction and customer datasets — improving data availability for analytics, compliance, and reporting.
  • Developed real-time streaming pipelines with Kafka, Spark Streaming, and Apache Flink — reducing batch dependencies and enabling faster operational insights.
  • Implemented ingestion frameworks using Apache NiFi, Sqoop, and HDFS to automate movement of structured and semi-structured data from multiple banking systems.
  • Optimized Spark workloads via partitioning, resource tuning, and query enhancements — 25% faster execution and improved cluster utilization.
Python · Spark · Spark Streaming · Flink · Kafka · NiFi · HDFS · Oozie · Prometheus
Aug 2015 — Sep 2016

Data Analyst

Flipkart · Hyderabad, India
  • Analyzed customer behavior, sales trends, and product performance (SQL, Excel), producing actionable insights that drove merchandising and business-growth initiatives.
  • Designed dimensional data models and ETL workflows for e-commerce reporting — improving data consistency and KPI accuracy.
  • Built dashboards for inventory, order fulfillment, and customer-engagement metrics; partnered with cross-functional teams on data quality and metric reconciliation.

The tools behind the work.

Pulled from production systems at Meta, Walmart, and Citibank — not a list of buzzwords but a working stack.

Lakehouse & Warehousing

Databricks · Delta Lake · Snowflake · BigQuery · Unity Catalog · Iceberg · Redshift · Azure Synapse · Hive Metastore

Streaming & Distributed Systems

Apache Spark · PySpark · Kafka · Spark Structured Streaming · Apache Flink · Presto/Trino · Hive · Hadoop/HDFS

AI / ML & LLM Apps

LangChain · RAG · PyTorch · MLflow · Feature Stores · Llama · Claude · OpenAI · Hugging Face · AWS Bedrock · pgvector · Pinecone

Cloud Platforms

GCP · AWS · Azure · Dataproc · GCS · Pub/Sub · S3 · EMR · Glue · Athena · ADF · ADLS

DevOps & Infrastructure

Kubernetes · Terraform · Docker · Helm · Git · GitHub Actions · Jenkins · Maven · GKE · EKS · AKS

Orchestration & Transformation

dbt · Apache Airflow · Databricks Workflows · Dagster · Automic · Oozie · Great Expectations

Programming

Python · SQL · Scala · Java · Shell · pandas · NumPy

Governance & BI

Data Contracts · Schema Versioning · Lineage · PII/PHI · SOX · GDPR · Power BI · Tableau · Looker

Peer-reviewed work indexed in SCIE, ACL, Scopus & PubMed.

22+ peer-reviewed publications, 1 book, 1 patent, 7 research chapters. Below: the most recent and most relevant.

2026

Energy-Efficient Optimization of Retail Transaction Time-Series using Wavelet Transform & Sparse Neural Architecture

B. P. R. Rella, S. C. Konduru, N. Kolli, R. K. Konduru, N. Kakani, L. P. Maram Reddy

International Journal of Information Technology, Springer · Accepted

First / Corresponding

Complete publication list on Google Scholar →

22+ peer-reviewed papers across AI, distributed systems, energy-efficient computing, and signal processing

141+ citations · Indexed in IEEE Xplore, Scopus, Web of Science, ACL Anthology


Beyond the day job.

Leadership, recognition, and contributions to the broader engineering and research community.

IEEE Senior Member

Elevated grade — top 10% of IEEE

Awarded for sustained contributions to AI, distributed systems, and energy-efficient computing. Active in Computer Society and Signal Processing Society.

2025 · Computer Society · SPS

Chapter Officer

Secretary, IEEE Computer Society SCV Chapter C16

Elected unanimously as Secretary of the Santa Clara Valley Chapter for the 2026 term, supporting chapter operations, events, and member outreach in Silicon Valley.

2026 Term · Santa Clara Valley

Advisory Role

IEEE DataPort Site Enhancement Subcommittee

Advising on the platform's usability, AI/ML innovations, and next-generation data science features to strengthen IEEE DataPort as a premier global research data platform.

June 2025 — Present

Keynote Speaker

IEEE Cloud Summit 2025 — Washington, D.C.

Delivered keynote on Green AI: Paving the Way for Sustainable Technology — energy and water footprint of large AI systems, and infrastructure that scales without compromising the planet.

2025 · Also: ICMCTC · AI Week

Founder

The Green AI Initiative

Global platform for sustainable AI — federated learning, sparse networks, structured pruning, water-footprint optimization, ESG-aligned ML.

Founder & Technical Lead

Peer Reviewer

IEEE PAMI · Elsevier JPDC · PLOS ONE

Active peer reviewer for top-tier journals in pattern analysis, parallel & distributed computing, and open science. Session Chair / Program Committee Member at multiple IEEE and Springer conferences.

Ongoing

Author — Book

Energy-Efficient Computing for Modern AI

A 200+ page practitioner's guide to pruning, sparsity, distillation, federated learning, and the systems engineering that makes sustainable AI ship.

Author · International edition

Inventor — Patent

Sparse Neural Network Techniques

Patented methods for sparse representation in large-scale neural architectures — reducing inference cost and energy consumption while preserving accuracy at deployment scale.

1 patent · 7 research chapters

Formal credentials.

Education

Doctor of Business Administration (Energy-Efficient AI)
Golden Gate University, San Francisco
In Progress
M.S., Management Information Systems
University of Memphis · Grade: A+
B.Tech., Electrical & Electronics Engineering
Jawaharlal Nehru Technological University

Certifications

AWS Cloud Practitioner
Amazon Web Services
Azure Data Engineer Associate
Microsoft
Databricks Certified Data Engineer
Databricks
SnowPro Core Certification
Snowflake
GCP Professional Data Engineer
Google Cloud
Oracle Certified Professional, Java SE 6
Oracle
Alteryx · Data Analytics in Technology
Industry credentials

Let's talk.

Open to Senior & Staff Data Engineer roles at top tech and AI companies. Also available for technical advising, keynote speaking, and research collaborations on energy-efficient AI and large-scale data infrastructure.