17 July 2016

Job Recommendation Engines: Low- to No-Tech and Not Getting Better

[Screenshot: Job recommendation from ZipRecruiter, July 2016]

Let's face it: many recruiters are poorly trained. They choose -- or are asked -- to identify potential candidates based on slight domain knowledge, and that's putting it kindly. Meanwhile, employers (who, despite press to the contrary, still enjoy a buyer's market) have moved into an era of increased specialization.

To address this, or possibly to cut recruiters out of the process altogether, some recruiting firms have increased their use of job-matching algorithms. On the job description side, the available data is not canonical and the metadata is nonstandard -- and not getting better. On the resume side, applicants are increasingly forced into redundant cut-and-paste into employer-specific resume systems hosted by Taleo, PeopleSoft, SilkRoad, etc. There appears to be little payoff from this structuring: the resulting standardization fails to improve matching, and the resulting outlines have little to do with the jobs being matched.

The screenshot above, from the ZipRecruiter recommendation engine, shows matches offered to a senior cybersecurity specialist. At the least, the recommendation engine should survey the applicant about the suitability of the match. When the data is weak and you have a chance to ask for more, why not ask?

Commentary by @knowlengr  +Mark 

28 June 2016

Big Data Project with Microservices (and Task) Description

This description was collected and forwarded by Artech, a staffing firm.
  1. Build the Event Hubs integration with the Service Fabric microservices implementation, streaming the processed files from blobs into Event Hubs for downstream processing.
     a. Anonymized files (~1000 of them, to a size of ~GB) will be given as input.
     b. The Service Fabric code portion will be provided.
  2. Build the Spark processing reading off Event Hubs; an implementation in either Python or Scala would suffice.
     a. Look at the caching needs; leverage .cache to retain appropriate results from Spark 'Actions' in Spark executors.
  3. Our team will evaluate a set of data stores that would be a landing spot post-Spark, blobs being a required one. We will pick 1 or 2 from this list -- SQL DW, Azure SQL DB, Cassandra, and DocumentDB being the other candidate stores -- and we will have code snippets and/or guidance.
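The caching need called out in step 2 comes from Spark's lazy evaluation: transformations build a lineage that is recomputed on every action unless the intermediate result is persisted. The sketch below is a toy in-memory stand-in (not Spark, and not the Event Hubs API -- all class and method names here are hypothetical) that illustrates why calling `.cache` before repeated actions matters.

```python
# Toy illustration (not Spark) of why .cache matters: lazy pipelines
# recompute their lineage on every action unless results are persisted.
class LazyDataset:
    def __init__(self, source=None, fn=None, parent=None):
        self._source = source          # base data (only for the root dataset)
        self._fn = fn                  # transformation to apply at this stage
        self._parent = parent          # upstream dataset in the lineage
        self._cached = None            # materialized rows, if cache() was used
        self._cache_requested = False
        self.compute_count = 0         # how many times this stage actually ran

    def map(self, fn):
        # Transformations are lazy: nothing runs until an action is called.
        return LazyDataset(fn=fn, parent=self)

    def cache(self):
        # Mark this stage for persistence on first materialization.
        self._cache_requested = True
        return self

    def _materialize(self):
        if self._cached is not None:
            return self._cached
        self.compute_count += 1
        if self._parent is None:
            rows = list(self._source)
        else:
            rows = [self._fn(r) for r in self._parent._materialize()]
        if self._cache_requested:
            self._cached = rows
        return rows

    def count(self):
        # An action: forces the whole lineage to run.
        return len(self._materialize())

# Without cache(), two actions recompute the map stage twice;
# with cache(), the second action reuses the materialized result.
parsed = LazyDataset(source=range(5)).map(lambda x: x * 2)
parsed.count(); parsed.count()
uncached_runs = parsed.compute_count   # 2

parsed_cached = LazyDataset(source=range(5)).map(lambda x: x * 2).cache()
parsed_cached.count(); parsed_cached.count()
cached_runs = parsed_cached.compute_count   # 1
```

In real Spark the same trade-off applies: `.cache` (or `.persist`) trades executor memory for avoided recomputation, which is exactly the "caching needs" the task asks the contractor to evaluate.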
Integration & Deployment
  1. Integrate the items from above with completed items (Azure Data Factory with ARM provisioning, picking up from the ADF pipeline which lands files onto blobs).
  2. Apply best practices for capacity planning and deployment for E2E.
  3. Integrate the deployment with the existing set of tools and processes.
  4. Build a unit test framework that can test each building block in isolation (ADF → Blobs, Blobs → Service Fabric, Service Fabric → EH, EH → Spark, Spark → <Data Store>).
  5. Build an E2E test environment with telemetry on latency and throughput, with percentiles. Leverage APM tools as appropriate.
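The percentile telemetry in the last item is simple to compute once per-file latencies are collected. A minimal sketch, using simulated latency numbers (the samples and the function name are illustrative, not part of the project spec), with the nearest-rank percentile method:

```python
# Sketch of the latency-percentile telemetry the E2E test item asks for:
# summarize end-to-end latency samples as p50/p95/p99.
import random

def latency_percentiles(samples_ms, percentiles=(50, 95, 99)):
    """Nearest-rank percentiles over a list of latency samples (ms)."""
    ordered = sorted(samples_ms)
    n = len(ordered)
    result = {}
    for p in percentiles:
        # Nearest-rank method: rank = ceil(p/100 * n), 1-indexed.
        rank = max(1, -(-p * n // 100))
        result[f"p{p}"] = ordered[rank - 1]
    return result

random.seed(7)
# Simulated per-file pipeline latencies in milliseconds.
samples = [random.gauss(200, 40) for _ in range(1000)]
stats = latency_percentiles(samples)
```

In practice these numbers would come from timestamps emitted at each pipeline stage (blob landing, Event Hubs enqueue, Spark completion), so the same summary can be reported per hop as well as end to end.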