Data Transformation/Enrichment

The Need

Imagine you've conducted a large-scale survey as part of your market research or customer feedback initiatives. The survey responses contain a wealth of information including the names and email addresses of potential customers, as well as their job titles. This is an invaluable resource for your sales and marketing teams, as these contacts could be potential sales leads.

However, the raw survey data is not readily usable. The data is unstructured and the relevant information - such as names, email addresses, and job titles - is mixed in with other less pertinent details. Furthermore, the job titles provided in the survey data are often in various formats and styles, and don't readily indicate the seniority level or the specific role of the contact.

To effectively utilize this data, you need to transform and classify the data into a structured format that can be easily imported into your CRM or sales prospecting system. The transformed data will give your sales team direct insights into each lead's name, company, industry, and job role, enabling them to tailor their sales pitches and outreach efforts more effectively.

Transforming and classifying this data manually would be time-consuming and error-prone. Automating the process saves time and resources and can also make the classification more consistent. That's where PromptJoy comes in.

The Solution with PromptJoy

PromptJoy provides a simple yet powerful solution to this problem. By creating an API using the PromptJoy platform, you can automate the process of transforming your survey data into a format that your CRM can ingest. Not only can PromptJoy handle schema transformation, but it can also perform tasks like extracting the company name and industry from an email address and classifying job titles, all within the same API.

As an example, the API at https://preview.promptjoy.com/apis/jn8Cep was created with this simple prompt: "Given the original schema, transform it into the desired output. Please figure out the company name and industry from the email address. Please classify the job title appropriately. Use knowledge about the company and industry to inform the job title classification." Once you also provide the expected input schema and the desired output schema, the API is ready to perform the transformation.

Example: Using the Schema Transformation API

The Schema Transformation API takes an input JSON object with fields for name, email_address, and job_title. Here's an example of an input object:


{
  "name": "John Doe",
  "email_address": "[email protected]",
  "job_title": "Managing Director, Investment Banking"
}

The API transforms this input into a more detailed, structured format that's suitable for a CRM system. In the example below, the output includes fields for name (split into first and last), contact_info (which includes the email), company (which includes the name, size, and industry), and job (which includes the title, role, and function).

{
  "name": {
    "first": "John",
    "last": "Doe"
  },
  "contact_info": {
    "email": "[email protected]"
  },
  "company": {
    "name": "The Goldman Sachs Group",
    "size": "Large",
    "industry": "Financial Services"
  },
  "job": {
    "title": "Managing Director",
    "role": "Management",
    "function": "Investment Banking"
  }
}
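If you want to load this output into your own code before pushing it to a CRM, a small typed wrapper can help. The following is a minimal sketch in Python, not part of PromptJoy itself: the `Contact` class and `parse_contact` function are hypothetical names, and the field layout is assumed to match the example output above exactly.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class Contact:
    """Flat view of one transformed object (hypothetical helper type)."""
    first_name: str
    last_name: str
    email: str
    company_name: str
    company_size: str
    industry: str
    job_title: str
    job_role: str
    job_function: str


def parse_contact(payload: Mapping[str, Any]) -> Contact:
    """Flatten one output object; raises KeyError if an expected field is missing."""
    return Contact(
        first_name=payload["name"]["first"],
        last_name=payload["name"]["last"],
        email=payload["contact_info"]["email"],
        company_name=payload["company"]["name"],
        company_size=payload["company"]["size"],
        industry=payload["company"]["industry"],
        job_title=payload["job"]["title"],
        job_role=payload["job"]["role"],
        job_function=payload["job"]["function"],
    )
```

Failing loudly on a missing field is a deliberate choice here: because the output is generated by an LLM, it is safer to reject an incomplete record than to import it with silent gaps.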

Harnessing the power of large language models (LLMs), the PromptJoy API offers an intuitive way to extract and classify crucial information from the raw data. With just an email address and job title, the API is able to infer pertinent details such as the name, size, and industry of the associated company, as well as classify the job function and level of seniority.

However, it's important to note that while powerful, LLMs are not infallible. They can sometimes generate ("hallucinate") inaccurate information or rely on outdated data. This is particularly likely with precise numerical data such as exact company revenue. For enrichment, we therefore recommend asking for categories (such as industry or company size) rather than exact figures.
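One way to act on this recommendation is to check each categorical field against a fixed vocabulary and send anything unexpected to manual review. The sketch below is illustrative only: the `ALLOWED` vocabularies are assumptions (only "Large" and "Management" appear in the example output above), and `unexpected_categories` is a hypothetical helper, not a PromptJoy feature.

```python
from typing import Any, Dict, List, Mapping, Set, Tuple

# Assumed vocabularies. Replace them with the categories you actually
# request in your output schema.
ALLOWED: Dict[Tuple[str, str], Set[str]] = {
    ("company", "size"): {"Small", "Medium", "Large"},
    ("job", "role"): {"Individual Contributor", "Management", "Executive"},
}


def unexpected_categories(
    record: Mapping[str, Any],
    allowed: Mapping[Tuple[str, str], Set[str]] = ALLOWED,
) -> List[str]:
    """Return a description of every categorical field outside its vocabulary."""
    problems = []
    for (section, field), vocabulary in allowed.items():
        # A missing section or field yields None, which is never in a vocabulary.
        value = record.get(section, {}).get(field)
        if value not in vocabulary:
            problems.append(f"{section}.{field}={value!r}")
    return problems
```

An empty list means every checked field holds a known category. A non-empty list names the fields to inspect by hand before the record reaches your CRM.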

PromptJoy is actively developing a hallucination checking API. This tool aims to ensure higher accuracy by cross-verifying the inferences made by the LLM. It will make the use of LLMs even more reliable and effective in data transformation tasks.

Using the API in Bulk

To use this API in bulk, you can send a POST request to the API endpoint with an array of input objects in the request body. The API will process each object in the array and return an array of transformed objects.

The specific implementation would depend on your programming language of choice and your HTTP client library. However, the general process is the same: construct an HTTP POST request with the array of input objects in the request body, send the request to the API endpoint, and handle the response.
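The general process described above could look like the following in Python, using only the standard library. Treat it as a sketch under stated assumptions: `API_URL` is a placeholder for your own endpoint, any authentication your API requires is not shown, and splitting the input into batches of 50 is an arbitrary choice made here rather than a documented limit.

```python
import json
import urllib.request
from typing import Any, Dict, Iterator, List, Sequence

# Placeholder: substitute the endpoint URL of your own API.
API_URL = "https://example.invalid/your-promptjoy-api"

Record = Dict[str, Any]


def chunked(records: Sequence[Record], size: int) -> Iterator[List[Record]]:
    """Yield consecutive batches of at most `size` records."""
    for start in range(0, len(records), size):
        yield list(records[start:start + size])


def build_request(url: str, batch: List[Record]) -> urllib.request.Request:
    """Build a POST request whose body is the JSON array of input objects."""
    body = json.dumps(batch).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def transform_in_bulk(
    url: str,
    records: Sequence[Record],
    batch_size: int = 50,
    timeout: float = 30.0,
) -> List[Record]:
    """Send the records batch by batch and collect the transformed objects."""
    results: List[Record] = []
    for batch in chunked(records, batch_size):
        request = build_request(url, batch)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # The response is assumed to be a JSON array, one object per input.
            results.extend(json.load(response))
    return results
```

Calling `transform_in_bulk(API_URL, survey_rows)` with a list of input objects would return the transformed objects in the same order, assuming the API preserves the order of the array it receives.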

What This Replaces

Using PromptJoy's Schema Transformation API can replace several parts of your existing data pipeline:

  • dbt (data build tool): dbt is a tool for transforming data in your warehouse. While dbt is powerful, it requires writing SQL models with Jinja templating. In contrast, PromptJoy abstracts away these complexities behind a simple API.

  • Rule-Based Classifiers: If you're currently using rule-based classifiers to classify job titles or extract company information, the Schema Transformation API can do this automatically, reducing the need for manual rule creation and maintenance.

  • Clearbit: Clearbit is a data enrichment tool that can provide company information from an email address. However, it's a separate service with its own cost. PromptJoy can extract company information as part of the schema transformation, eliminating the need for a separate Clearbit integration.

  • In-House AI Models: If you've built in-house AI models to classify job titles or extract company information, you can replace them with the Schema Transformation API. This can reduce the cost and complexity of maintaining your own models and allow you to focus on your core business.

Remember that each business case is unique, and while PromptJoy provides a powerful and easy-to-use solution, it's essential to evaluate it in the context of your specific needs and existing systems. The Schema Transformation API is highly flexible and can be used in a wide range of scenarios, but it's always a good idea to test it thoroughly with your data and use case.

In conclusion, the Schema Transformation API from PromptJoy can simplify and automate the process of transforming and classifying data, making it a great tool for any data-intensive business. Happy data wrangling!
