Mission

As a Spark Technical Solutions Engineer, you will provide deep technical and consulting solutions for challenging Spark/ML/AI/Delta/Streaming/Lakehouse issues reported by our customers, and resolve challenges involving the Databricks unified analytics platform using your comprehensive technical and customer communication skills. You will assist our customers in their Databricks journey and provide them with the guidance, knowledge, and expertise they need to realize value and achieve their strategic objectives using our products.



Responsibilities:

● Perform initial analysis and troubleshooting of Spark issues using Spark UI metrics, DAGs, and event logs for customer-reported job slowness.

● Troubleshoot, resolve, and perform deep code-level analysis of Spark to address customer issues related to Spark core internals, Spark SQL, Structured Streaming, Delta, Lakehouse, and other Databricks Runtime features.

● Assist customers in setting up reproducible Spark problems and solutions in the areas of Spark SQL, Delta, memory management, performance tuning, streaming, data science, and data integration.

● Participate in the Designated Solutions Engineer program and drive one or two strategic customers' day-to-day Spark and cloud issues.

● Plan and coordinate with Account Executives, Customer Success Engineers, and Resident Solutions Architects on customer issues and best-practices guidelines.

● Participate in screen-sharing meetings and Slack channel conversations with internal stakeholders and customers, helping drive major Spark issues as an individual contributor.

● Build an internal wiki and knowledge base with technical documentation and manuals for the support team and for customers. Participate in the creation and maintenance of company documentation and knowledge base articles.

● Coordinate with the Engineering and Backline Support teams to help identify and report product defects.

● Participate in weekend and weekday on-call rotations, run escalations during Databricks Runtime outages and incidents, multitask and plan day-to-day activities, and provide an escalated level of support for critical customer operational issues.

● Provide best-practices guidance on Spark runtime performance and the usage of Spark core libraries and APIs for custom solutions developed by Databricks customers.

● Be a true proponent of customer advocacy.

● Contribute to the development of tools and automation initiatives.

● Provide front-line support for third-party integrations with the Databricks environment.

● Review Engineering JIRA tickets and proactively notify the support leadership team to follow up on action items.

● Manage assigned Spark cases on a daily basis and adhere to committed SLAs.

● Meet and exceed the support organization's KPIs.

● Strengthen your AWS/Azure and Databricks platform expertise through continuous learning and internal training programs.


Competencies

● Minimum 6 years of experience designing, building, testing, and maintaining Python/Java/Scala applications in typical project-delivery and consulting environments.

● 3 years of hands-on experience developing two or more of the following at production scale: Big Data, Hadoop, Spark, machine learning, artificial intelligence, streaming, Kafka, data science, or Elasticsearch industry use cases. Spark experience is mandatory.

● Hands-on experience in performance tuning and troubleshooting of Hive- and Spark-based applications at production scale.

● Proven real-world experience with JVM and memory-management techniques such as garbage collection and heap/thread dump analysis is preferred.

● Working hands-on experience with SQL-based databases and data warehousing/ETL technologies such as Informatica, DataStage, Oracle, Teradata, SQL Server, and MySQL, including SCD-type use cases, is preferred.

● Hands-on experience with AWS, Azure, or GCP is preferred.

● Excellent written and oral communication skills.

● Linux/Unix administration skills are a plus.

● Working knowledge of data lakes, preferably including SCD-type use cases at production scale.

● Demonstrated analytical and problem-solving skills, particularly as applied to a distributed big data computing environment.