A guide for data professionals to LLM/GPT Prompting for everyday tasks….

As an AI and data professional, I understand the importance of having efficient workflows and accurate data analysis. That’s why I’m excited to share with you a guide on how you can leverage the power of using LLMs in Microsoft OpenAI with Python to streamline your processes and enhance your data analysis capabilities.

Prompt design and structure play a critical role in maximizing the potential of Microsoft OpenAI GPT models. Whether you are using the Chat Completion API or the Completion API, understanding the nuances of prompt formatting is key to obtaining accurate and relevant responses. When designing prompts for the Chat Completion API, it is important to frame the conversation context clearly and concisely.

“Prompt design and structure significantly impact the quality and relevance of the generated text. By investing time and effort into crafting effective prompts, you can unlock the full potential of Microsoft OpenAI GPT models in your data analysis workflows.”

Tips for good prompt design:

  1. Be clear and concise – State the desired outcome or provide specific instructions.
  2. Use sample outputs – Including examples of the desired responses can guide the model’s understanding.
  3. Experiment with temperature – Adjusting the temperature parameter can influence the level of randomness in the generated responses.
  4. Iterate and refine – Prompt design is an iterative process, so don’t hesitate to experiment and fine-tune your prompts.”

There are two kinds of prompts we generally use:

  • General problem, general solution
  • Specific problem, specific solution

The first generally starts with the likes of:

“How do I…” … “What is the…” …”When does…”

Prompt Examples for data engineering

Here are some examples of this in the data sphere:

PROMPT: How do I select only three fields from an SQL database table when the select statement sends back all by default?

ANSWER:
To select only three specific fields (or columns) from a table in an SQL database, you need to specify those fields in your SELECT statement instead of using the wildcard *, which selects all fields. Here's the general syntax for selecting specific fields:

In this statement, replace field1, field2, and field3 with the actual names of the columns you want to retrieve, and your_table_name with the name of the table you’re querying.

For example, if you have a table named employees and you want to select the name, email, and department fields, your query would look like this:

This query will return only the values from the name, email, and department columns for all records in the employees table.

For a more specific prompt question, you need to supply a lot more detail – the LLM cannot read your mind! … you need to give it context, and remember, in natural language processing context is king!

Here is a pretty common structure that connects three tables together:

Table-A is customers, each customer can have many departments, and each department can have many users. They are connected to one another using a foreign-key. The question we want to know as a data engineer (albeit junior!) is how can we connect these in an SQL statement and get a list of usernames for a given customer? The main thing to know here is its not a simple ‘general’ problem, rather, its specific, and as such we need to GUIDE the LLM and give it CLUES by way of CONTEXT … so here’s how to do it:

Your ROLE is expert database engineer. I need help in writing SQL code. I have three tables (named A,B,C), each connected to the other using a logic of "A is connected to B, and B is connected to C". Give me an example SQL statement demonstrating how to select a field named 'UserName' from table C, where the ID from table A = "123"

1 – we tell the LLM the ROLE it should play – this helps set the perspective for it
2 – we state the GENERAL CONTEXT AREA by saying ‘we need help in writing SQL code
3 – next, we give it the BACKGROUND CONTEXT which DESCRIBES the detail of the structure we have (tables A..C), and importantly, telling it the LOGIC of how the tables are connected “A is connected to B, and B is connected to C”
4 – finally we we ask the question ‘Give me an example SQL statement…’

Providing all of the best CLUES to the LLM by way of context, allows it to give you the best answer it can predict, and hopefully, the one you are looking for:

ANSWER:

Then it EXPLAINS what it did:

Note what’s going on here – the LLM is ‘REASONING OUT’ what it is doing – giving you INSIGHT into its thinking – this is important, as if you don’t get the answer you want, you can tell it WHERE to adjust its reasoning to improve the answer.

Remember, the key to a successful prompt is GIVING CONTEXT and putting guide-rails around what you want the LLM to do.