PYSPARK DSA | 4 TO 8 YEARS | PAN INDIA

Job Description

  • Bachelor’s or master’s degree in computer science, engineering, or a related field
  • 5-8 years of experience in data engineering and machine learning
  • Extensive experience with Python programming, including Pandas, NumPy, and various libraries for data manipulation and analysis
  • Hands-on experience with Spark architecture, including DataFrame operations, lazy evaluation, and UDFs
  • Strong understanding of DevOps principles and experience with tools like Docker, Kubernetes, and Jenkins
  • Familiarity with machine learning libraries such as Spark MLlib or Azure ML is a plus
  • Previous experience in product-oriented organizations or Tier 1 companies is preferred

Primary Skills

  • Utilize advanced SQL techniques such as joins with in-line views and self-joins for data manipulation and extraction
  • Demonstrate proficiency in Python programming, including but not limited to variables, functions, loops, conditions, and various data structures
  • Implement object-oriented programming concepts including polymorphism, abstract classes, and interfaces for code modularity and reusability
  • Develop and maintain data pipelines on Spark, with an understanding of nodes, clusters, lazy evaluation, and the DAG execution model
  • Perform DataFrame operations and write user-defined functions (UDFs) for efficient data processing in Spark
  • Integrate APIs for data retrieval and interaction with external systems
  • Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions
  • Apply best practices in software development and data engineering to ensure scalability, reliability, and performance of the ML pipelines

Secondary Skills

  • Good Communication Skills

Ref:  1795085
Posted on:  Apr 16, 2024
Experience level:  Experienced
Contract Type:  Permanent
Location:  Bangalore, KA, IN

Department:  Big Data & Analytics
