Dheeraj R - Sr. Data Engineer
[email protected]
Location: Remote, USA
Relocation: Yes
Visa: GC
Resume file: Dheeraj Resume DE_1744833653714.docx
Experience writing AWS Lambda functions in Python to invoke scripts that perform transformations and analytics on large datasets within EMR clusters (see the Lambda-to-EMR sketch after the skills list below)
Skilled in utilizing the Amazon Web Services (AWS) cloud platform, including services such as EC2, S3, VPC, ELB, DynamoDB, CloudFront, CloudWatch, Route 53, Security Groups, Redshift, and CloudFormation
Migrated an existing on-premises application to AWS, utilizing EC2 and S3 for processing and storage of small datasets; experienced in maintaining Hadoop clusters on AWS EMR
Proficient in leveraging Azure Databricks and Apache Spark for distributed data processing and transformation tasks
Skilled in ensuring data quality and integrity through effective validation, cleansing, and transformation operations
Hands-on experience with Azure Cloud services such as Azure Data Factory, Azure Databricks, Logic Apps, Azure Function Apps, Snowflake, and Azure DevOps
Working experience with Azure Stack, moving data from Data Lake to Azure Blob Storage
Strong background in data load and integration using Azure Data Factory (ADF)
Experience building ETL pipelines in Azure Databricks using PySpark and Spark SQL
Experience developing Spark pipelines in both Scala and PySpark
Experience working with Azure Logic Apps for integration use cases
Implemented Azure Functions, Azure Storage, and Service Bus queues for large-scale ERP integration systems
Experienced in creating and managing CI/CD pipelines using Azure DevOps, ensuring seamless deployment and integration
Proficient in data pipeline development, data modeling, and Snowflake features such as Multi-Cluster Warehouses, Cloning, and Time Travel
Contributed to the development, improvement, and maintenance of Snowflake database applications
Built logical and physical data models in Snowflake based on evolving business requirements
Defined roles and privileges for secure access to Snowflake database objects
Strong understanding of the Snowflake database, including schemas and table structures
Collaborative approach to working with data analysts and stakeholders to implement suitable data models and structures
Expertise in Spark job optimization and in using Azure Synapse Analytics for large-scale data processing and analytics
Proven track record in performance optimization and capacity planning for efficient, scalable solutions
Experienced in developing CI/CD frameworks for data pipelines and working with DevOps teams on automated deployment
Proficient in scripting with Python and Scala
Skilled in working with Hive, Spark SQL, Kafka, and Spark Streaming for ETL and real-time data processing
Experience with Hadoop, HDFS, MapReduce, Hive, Tez, Python, and PySpark
Hands-on experience building large-scale data pipelines using Spark and Hive
Experience using Apache Sqoop to import and export data between HDFS, Hive, and relational databases
Hands-on experience setting up workflows with Apache Oozie for managing and scheduling Hadoop jobs
Optimized Hive query performance using bucketing and partitioning techniques, and extensively tuned Spark jobs (see the Hive partitioning sketch after this list)
Highly proficient in Agile methodologies, with hands-on experience using JIRA for project management and reporting
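
Illustrative only, not taken from the resume: a minimal sketch of the Lambda-to-EMR pattern referenced in the first bullet, using boto3 to submit a Spark step to a running cluster. The cluster ID, bucket, and script path are hypothetical placeholders.

    import boto3

    emr = boto3.client("emr")

    def lambda_handler(event, context):
        # Submit a Spark step to an existing EMR cluster; the cluster ID
        # and S3 script path below are hypothetical placeholders.
        response = emr.add_job_flow_steps(
            JobFlowId="j-XXXXXXXXXXXXX",
            Steps=[{
                "Name": "transform-dataset",
                "ActionOnFailure": "CONTINUE",
                "HadoopJarStep": {
                    # command-runner.jar is EMR's standard entry point for
                    # running spark-submit as a cluster step.
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "s3://example-bucket/scripts/transform.py"],
                },
            }],
        )
        # Return the step IDs so callers can poll step status if needed.
        return {"StepIds": response["StepIds"]}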
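
Illustrative only, not taken from the resume: a minimal sketch of the Hive bucketing and partitioning optimization mentioned above, issued through Spark SQL with Hive support; the table and column names are hypothetical.

    from pyspark.sql import SparkSession

    # Requires a Spark build with Hive support enabled.
    spark = (SparkSession.builder
             .appName("hive-optimization")
             .enableHiveSupport()
             .getOrCreate())

    # Partitioning on a low-cardinality date column lets queries prune whole
    # directories; bucketing on the join key enables bucketed joins and
    # more efficient sampling.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS events_optimized (
            user_id BIGINT,
            event_type STRING,
            payload STRING
        )
        PARTITIONED BY (event_date DATE)
        CLUSTERED BY (user_id) INTO 32 BUCKETS
        STORED AS ORC
    """)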