Job Details

Urgent Hiring || Data Engineer / Big Data Engineer with AI Deployment || Remote, USA

http://bit.ly/4ey8w48
https://jobs.nvoids.com/job_details.jsp?id=2117753&uid=

Dear Vendors,

I hope this email finds you well.

Role: Data Engineer / Big Data Engineer
Location: Remote
Job description:

Job Overview:
We're seeking a highly skilled Data Engineer / Big Data Engineer to build scalable data pipelines, develop ML models, and integrate big data systems. You'll work with structured, semi-structured, and unstructured data, focusing on optimizing data systems, building ETL pipelines, and deploying AI models in cloud environments.
Key Responsibilities:
Data Ingestion: Build scalable ETL pipelines using Apache Spark, Talend, AWS Glue, Google Dataflow, Apache NiFi. Ingest data from APIs, file systems, and databases.
Data Transformation and Validation: Use Pandas, Apache Beam, and Dask for data cleaning, transformation, and validation. Automate data quality checks with pytest or unittest.
Big Data Systems: Process large datasets with Hadoop, Kafka, Apache Flink, Apache Hive. Stream real-time data using Kafka, Google Cloud Pub/Sub.
Task Queues: Manage asynchronous processing with Celery, RQ, RabbitMQ, or Kafka. Implement retry mechanisms and track task status.
Scalability: Optimize for performance with distributed processing (Spark, Flink), parallelization (joblib), and data partitioning.
Cloud and Storage: Work with AWS, Azure, GCP, Databricks. Store and manage data with S3, BigQuery, Redshift, Synapse Analytics, and HDFS.

Required Skills:
ETL and Data Processing: Expertise in Apache Spark, AWS Glue, Google Dataflow, Talend.
Big Data Tools: Proficient with Hadoop, Kafka, Apache Flink, Hive, Presto.
Databases: Strong experience with MySQL, PostgreSQL, MongoDB, Cassandra.
Machine Learning: Hands-on with TensorFlow, PyTorch, Scikit-learn, XGBoost.
Cloud Platforms: Experience with AWS, Azure, GCP, Databricks.
Task Management: Familiar with Celery, RQ, RabbitMQ, Kafka.
Version Control: Git for source code management.
Desirable Skills:
Real-time Data Processing: Experience with Apache Pulsar, Google Cloud Pub/Sub.
Data Warehousing: Familiarity with Redshift, BigQuery, Synapse Analytics.
Scalability and Optimization: Knowledge of load balancing (NGINX, HAProxy) and parallel processing.
Data Governance: Use of MLflow, DVC, or other tools for model and data versioning.

Tools and Technologies:
ETL: Apache Spark, Talend, AWS Glue, Google Dataflow.
Big Data: Hadoop, Kafka, Apache Flink, Presto.
Databases: MySQL, PostgreSQL, MongoDB, Cassandra.
Cloud: AWS, GCP, Azure, Databricks.
Storage: S3, BigQuery, Redshift, Synapse Analytics, HDFS.
Version Control: Git.

--

Thanks & Regards

Mohd Irfan

Sr. Technical Recruiter

--

02:19 AM 28-Jan-25

technicallead2336@gmail.com
