PYSPARK DSA | 4 TO 8 YEARS | PAN INDIA
Job Description
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
- 5-8 years of experience in data engineering and machine learning
- Extensive experience with Python programming, including Pandas, NumPy, and various libraries for data manipulation and analysis
- Hands-on experience with Spark architecture, including DataFrame operations, lazy evaluation, and UDFs
- Strong understanding of DevOps principles and experience with tools like Docker, Kubernetes, and Jenkins
- Familiarity with machine learning libraries such as Spark MLlib or Azure ML is a plus
- Previous experience in product-oriented organizations or Tier 1 companies is preferred
Primary Skills
- Utilize advanced SQL techniques such as joins with in-line views and self-joins for data manipulation and extraction
- Demonstrate proficiency in Python programming, including but not limited to variables, functions, loops, conditions, and various data structures
- Implement object-oriented programming concepts including polymorphism, abstract classes, and interfaces for code modularity and reusability
- Develop and maintain data pipelines on Spark, with a working understanding of nodes, clusters, lazy evaluation, and the DAG
- Perform DataFrame operations and write user-defined functions (UDFs) for efficient data processing in Spark
- Integrate APIs for data retrieval and interaction with external systems
- Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions
- Apply best practices in software development and data engineering to ensure the scalability, reliability, and performance of ML pipelines
Secondary Skills
- Good communication skills
Ref: 1795085
Posted on: Apr 16, 2024
Experience level: Experienced
Contract Type: Permanent
Location: Bangalore, KA, IN
Department: Big Data & Analytics