Data Engineer – Logs

Palo Alto/Bangalore

About Peritus

Peritus enables self-healing autonomous datacenters with automated, cognitive support for infrastructure software and hardware. It is a funded startup co-created at The Hive in Palo Alto, CA that delivers artificial intelligence-based virtual support expert systems for data center service fulfillment and incident resolution.

As datacenter vendors move from on-premises deployments to the cloud, their existing support systems lack the agility and cost-effectiveness needed for the cloud. Peritus significantly enhances the operational efficiency of existing support services and enables managed service providers and system vendors to offer new business continuity entitlements. Peritus assists and automates a wide spectrum of decisions in system support, including incident classification, routing, contract coverage, incident resolution recipes, and orchestration of incident management between subject matter experts (SMEs).

Peritus’ unique vectorization of system log data drives predictive modeling with highly granular feature extraction for early detection of system events. The platform’s advanced natural language processing (NLP) capabilities drive Peritus’ incident modeling and predictive capabilities. The core service fulfillment engine uses a combination of supervised and unsupervised methods to predict incident features from system log data. Peritus delivers automated orchestration of incident resolution through its close integration with existing incident management platforms.

Job Description

We are building a product that helps customers fulfill service requests as well as troubleshoot and diagnose infrastructure issues that cut across domains. The product needs to collect, collate and analyze logs, telemetry and monitoring data emanating from multi-vendor and multi-datacenter/cloud systems. The analysis should support timestamp-based correlation of events across systems, extraction of inferences from performance counters on multiple systems, and tracing of event sequences across layers of the data center stack.

Responsibilities

We are looking to hire an engineering technical leader with deep systems knowledge. The role requires a deep understanding of how data center systems operate, including familiarity with computing, storage, networking and virtualization products. Think DTrace, but with a scope that spans multiple systems rather than a single one. The logs infrastructure needs to scale horizontally and enable machine learning models to learn system behavior. Over time, that learning translates into automatic application of resolutions, reducing human involvement.

Can you help fill in the blanks for users handling complex systems issues that affect performance and/or bring systems down?

Qualifications & Expertise

The successful candidate will have a proven track record of building complex log analysis platforms:

  • Degree in Computer Science, Engineering, or related fields
  • Strong in data structures, algorithms, and systems
  • Deep understanding of systems behavior and affinity for logs
  • Experience with building infrastructure to collect, prepare and analyze logs, telemetry, and monitoring data
  • Experience with building and using tools like DTrace, BTrace, etc.
  • Experience with log analytics tools like Splunk, ELK, etc.
  • Knowledge of time series models and the use of systems like OpenTSDB, Graphite, etc.
  • Experience with handling complex production and/or support escalation issues
    • Heuristic problem solving with incomplete information
    • Deep domain knowledge and hands-on experience across a very broad spectrum of backend, frontend, cloud, AI and data infrastructure platforms

Please send your resume to jobs@peritus.ai

Job Overview

Experience: 5-6 years

Qualification: Degree in Computer Science

Position: Full Time

Location: Palo Alto/Bangalore
